Re: [HACKERS] ecpg/preproc/preproc.y now generates lots of warnings

2002-04-15 Thread Michael Meskes

On Sun, Apr 14, 2002 at 10:15:50PM -0400, Tom Lane wrote:
> Could this get cleaned up please?

Argh! Sorry, don't know how that typo made it in. I just fixed it.

Michael
-- 
Michael Meskes
[EMAIL PROTECTED]
Go SF 49ers! Go Rhein Fire!
Use Debian GNU/Linux! Use PostgreSQL!

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] [PATCHES] YADP - Yet another Dependency Patch

2002-04-15 Thread Bruce Momjian

Rod Taylor wrote:
> [ copied to hackers ]
> 
> > 1. I don't like the code that installs and removes ad-hoc
> dependencies
> > from relations to type Oid.  On its own terms it's wrong (if it were

Looks good to me.  I realize this is a huge chunk of code.  The only
ultra-minor thing I saw was the use of dashes for comment blocks when
not needed:

/* ----------
 *  word
 * ----------
 */

We use dashes like this only for comments that shouldn't be reformatted
by pgindent.

The one thing I would like to know is what it tracks and what it does
not.  Does it complete any TODO items, or do we save that for later?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] Importing Large Amounts of Data

2002-04-15 Thread Gavin Sherry

On Tue, 16 Apr 2002, Curt Sampson wrote:

> > Given the very low parsing and 'planning' overhead, the real cost would be
> > WAL (the bootstrapper could fail and render the database unusable) and the
> > subsequent updating of on-disk relations.
> 
> MS SQL Server, when doing a BULK INSERT or BCP, can do it as a fully or
> "minimally" logged operation. When minimally logged, there's no ability
> to roll-forward or recover inserted data, just the ability to go back
> to the state at the beginning of the operation. This technique can work
> even on an on-line database. A bit more information is available at

The other reason I say that this bootstrap tool would still use WAL is
that bypassing WAL would require writing a fairly significant amount of
code (unless the pre-WAL heap_insert() code could be used, with relevant
modification).

On the other hand, I would imagine it to be very difficult to implement
an 'interactive' roll back facility with the kind of tool I am
describing.

Gavin





Re: [HACKERS] Firebird 1.0 released

2002-04-15 Thread Tom Lane

"Christopher Kings-Lynne" <[EMAIL PROTECTED]> writes:
> The Firebird guys have gotten around to releasing 1.0.  If you read this
> front page spiel, you'll notice that they use MVCC, but with an overwriting
> storage manager.

Yup.  I've had a couple of long chats with Ann Harrison at the recent
"OSDB summit" meetings.  I think we each came away enlightened about the
other implementation, but not in any large hurry to change our own.

I did steal at least one idea from her, though.  (rummages in CVS logs)
ah, here's a hit:

2001-09-29 19:49  tgl

* src/backend/access/nbtree/nbtinsert.c: Tweak btree page split
logic so that when splitting a page that is rightmost on its tree
level, we split 2/3 to the left and 1/3 to the new right page,
rather than the even split we use elsewhere.  The idea is that when
faced with a steadily increasing series of inserted keys (such as
sequence or timestamp values), we'll end up with a btree that's
about 2/3rds full, not 1/2 full, which is much closer to the desired
steady-state load for a btree.  Per suggestion from Ann Harrison of
IBPhoenix.
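The rightmost-page rule in that log entry can be sketched as follows. This is a simplified, hypothetical function: the real nbtinsert.c logic chooses the split point by accumulated item sizes, not item counts.

```c
/* Hypothetical sketch of the split-point choice described above: an
 * even split for ordinary pages, but a 2/3-left / 1/3-right split when
 * the page being split is the rightmost on its btree level, so that a
 * steadily increasing key series leaves pages about 2/3rds full. */
static int
choose_split_point(int nitems, int is_rightmost)
{
    if (is_rightmost)
        return (nitems * 2) / 3;    /* keep 2/3 of the items on the left */
    return nitems / 2;              /* even split elsewhere */
}
```

With 90 items on a rightmost page, 60 stay on the left and 30 move to the new right page, instead of 45/45.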


regards, tom lane




Re: [HACKERS] [PATCHES] [SQL] 16 parameter limit

2002-04-15 Thread Bruce Momjian

Tom Lane wrote:
> Peter Eisentraut <[EMAIL PROTECTED]> writes:
> > How about this:  We store the first 16 parameters in some fixed array for
> > fast access like now, and when you have more than 16 then 17 and beyond
> > get stored in some variable array in pg_proc.
> 
> What's this going to cost us in the function lookup code paths?
> 
> If we can do it with little or no performance cost (at least for the
> "normal case" of fewer-than-N parameters) then I'm all ears.

OK, I have an idea.  Tom, didn't you just add code that allows the cache
to return multiple rows for a lookup?  I think you did it for schemas.

What if we look up on the first 16 params, then look at every matching
hit if there are more than 16 params supplied?  Another idea would be to
hash the function arg types and look that up rather than looking for
exact matches of oidvector.
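The second idea could be sketched like this. The function and the FNV-1a constants are illustrative, not PostgreSQL's actual lookup code: the point is just that one integer key can stand in for a full oidvector comparison.

```c
#include <stdint.h>

typedef uint32_t Oid;

/* Hypothetical sketch of hashing the vector of argument type OIDs to a
 * single key, so a function-lookup cache can probe on one integer
 * instead of comparing a whole oidvector.  Uses FNV-1a-style mixing,
 * applied per OID rather than per byte, purely for illustration. */
static uint32_t
hash_arg_types(const Oid *argtypes, int nargs)
{
    uint32_t h = 2166136261u;       /* FNV-1a offset basis */
    int      i;

    for (i = 0; i < nargs; i++)
    {
        h ^= argtypes[i];
        h *= 16777619u;             /* FNV-1a prime */
    }
    return h;
}
```

Equal argument lists hash equally; order and arity both change the hash, so collisions still need a final exact comparison, but only on the (rare) matching buckets.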

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] Importing Large Amounts of Data

2002-04-15 Thread Curt Sampson

On Tue, 16 Apr 2002, Gavin Sherry wrote:

> I don't see any straightforward way of modifying the code to allow a fast
> path directly to relations on disk. However, it should be possible to
> bypass locking, RI, MVCC etc with the use of a bootstrap-like tool.

That was my thought. I'm not asking for disk-speed writes through the
database server itself; I can make do with taking the server off-line,
doing the imports, and bringing it back again, as you suggest.

> Given the very low parsing and 'planning' overhead, the real cost would be
> WAL (the bootstrapper could fail and render the database unusable) and the
> subsequent updating of on-disk relations.

MS SQL Server, when doing a BULK INSERT or BCP, can do it as a fully or
"minimally" logged operation. When minimally logged, there's no ability
to roll-forward or recover inserted data, just the ability to go back
to the state at the beginning of the operation. This technique can work
even on an on-line database. A bit more information is available at

http://msdn.microsoft.com/library/en-us/adminsql/ad_impt_bcp_9esz.asp

(You may want to browse this with lynx; the javascript in it is going to
force your screen into a configuration with frames.) You can follow some
of the links in that page for further information.

Another option, for off-line databases, might just be not to log at all.
If you take a backup first, it may be faster to restore the backup and
start again than to try to roll back the operation, or roll it forward
to partial completion and then figure out where to restart your import.
This seems especially likely if you can restore only the files relating
to the table that was actually damaged.

cjs
-- 
Curt Sampson  <[EMAIL PROTECTED]>   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC





Re: [HACKERS] [PATCHES] [SQL] 16 parameter limit

2002-04-15 Thread Tom Lane

Peter Eisentraut <[EMAIL PROTECTED]> writes:
> How about this:  We store the first 16 parameters in some fixed array for
> fast access like now, and when you have more than 16 then 17 and beyond
> get stored in some variable array in pg_proc.

What's this going to cost us in the function lookup code paths?

If we can do it with little or no performance cost (at least for the
"normal case" of fewer-than-N parameters) then I'm all ears.

regards, tom lane




[HACKERS] Firebird 1.0 released

2002-04-15 Thread Christopher Kings-Lynne

The Firebird guys have gotten around to releasing 1.0.  If you read this
front page spiel, you'll notice that they use MVCC, but with an overwriting
storage manager.

http://www.ibphoenix.com/ibp_act_db.html

The relevant extract:

"Multi-version concurrency control uses back versions of modified and
deleted records to maintain a consistent view of data for read transactions.
Each record version is tagged with the identifier of the transaction that
created it. When a record is modified, the old version of the record is
reduced to a "delta record" - a set of differences from the new version -
and written to a new location, ordinarily on the same page where it was
originally stored. Then the new record overwrites the old. The new record
points to the old record. Unless the values of indexed fields are changed,
there's no need to update the index. Even if the values have changed, the
old values remain in the index to keep the record available to older
transactions.

The transaction identifier also permits update transactions to recognize
updates by concurrent transactions and allows Firebird to dispense with
write locks on records. When a transaction encounters a record updated by a
concurrent transaction, it waits for the other transaction to complete. If
the competing transaction commits, the waiting transaction gets an error. If
the competing transaction rolls back, the waiting transaction succeeds. If
the competing transaction attempts to update a record that the waiting
transaction has modified, a deadlock exists and one or the other will
receive an error.

Multi-version concurrency replaces a before-image (rollback) log with
versions stored in the database. When a transaction fails, its changes
remain in the database. The next transaction that reads that record
recognizes that the record version is invalid. Depending on the version of
Firebird and architecture, that transaction either replaces the invalid
record version with its back version or invokes a garbage collect thread. "

Chris





Re: [HACKERS] multibyte support by default

2002-04-15 Thread Tom Lane

Tatsuo Ishii <[EMAIL PROTECTED]> writes:
> In my understanding, our consensus was enabling multibyte support by
> default for 7.3. Any objection?

Uh, was it?  I don't recall that.  Do we have any numbers on the
performance overhead?

regards, tom lane




Re: [HACKERS] [PATCHES] [SQL] 16 parameter limit

2002-04-15 Thread Peter Eisentraut

Tom Lane writes:

> Neil Conway <[EMAIL PROTECTED]> writes:
> > My vote is to set the default # of function args to some
> > reasonable default (32 sounds good), and leave it at that.
>
> Bear in mind that s/32/16/ gives you the exact state of the discussion
> when we raised the limit from 8 to 16 ;-)

How about this:  We store the first 16 parameters in some fixed array for
fast access like now, and when you have more than 16 then 17 and beyond
get stored in some variable array in pg_proc.  This way procedures with
few arguments don't lose any performance but we could support an
"infinite" number of parameters easily.  It sounds kind of dumb, but
without some sort of break out of the fixed storage scheme we'll have this
argument forever.
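The scheme Peter describes might look roughly like this. It is a hypothetical layout sketch, not actual pg_proc code; the names and the fixed size of 16 are taken from the discussion above.

```c
#define FUNC_MAX_ARGS_FIXED 16

typedef unsigned int Oid;

/* Hypothetical sketch of the proposed layout: the first 16 argument
 * types live in a fixed array (the fast path), and arguments 17 and
 * beyond spill into a separately allocated overflow array. */
typedef struct FuncArgTypes
{
    int  nargs;                      /* total number of arguments */
    Oid  fixed[FUNC_MAX_ARGS_FIXED]; /* args 0..15 */
    Oid *overflow;                   /* args 16..nargs-1, or NULL */
} FuncArgTypes;

static Oid
func_arg_type(const FuncArgTypes *f, int i)
{
    if (i < FUNC_MAX_ARGS_FIXED)
        return f->fixed[i];          /* common case: no extra indirection */
    return f->overflow[i - FUNC_MAX_ARGS_FIXED];
}
```

Functions with 16 or fewer arguments never touch the overflow pointer, which is the "no performance loss for the common case" property the proposal is after.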

-- 
Peter Eisentraut   [EMAIL PROTECTED]





Re: [HACKERS] Operators and schemas

2002-04-15 Thread Tom Lane

Peter Eisentraut <[EMAIL PROTECTED]> writes:
> I had imagined that pg_dump would emit commands such as this:

> CREATE SCHEMA foo
>   CREATE TABLE bar ( ... )
>   CREATE otherthings
> ;

> which is how I read the SQL standard.  Are there plans to implement the
> CREATE SCHEMA command that way?  I think I recall someone from Toronto
> mentioning something along these lines.

We have portions of that now, but I don't think there is any serious
intent to support *all* Postgres CREATE statements inside CREATE SCHEMA.
Because there are no semicolons in there, allowing random statements in
CREATE SCHEMA tends to force promotion of keywords to full-reserved
status (so you can tell where each sub-statement starts).  My
inclination is to allow the minimum necessary for SQL spec compliance.

(Fernando, your thoughts here?)

>> Given the present semantics of
>> search_path, that will imply an implicit search of pg_catalog before
>> foo.

> Interesting ... Is that only temporary?  (since you say "present"
> semantics)

Only meant to imply "it hasn't been seriously reviewed, so someone
might have a better idea".  At the moment I'm happy with it.

regards, tom lane




Re: [HACKERS] regexp character class locale awareness patch

2002-04-15 Thread Manuel Sugawara

Tatsuo Ishii <[EMAIL PROTECTED]> writes:

> I don't think character classes are applicable for most multibyte
> encodings. Maybe the only exception is Unicode?

Maybe, and it is the only one I need ;-)

> 
> > Peter Eisentraut <[EMAIL PROTECTED]> writes:
> > >
> > > Basically, you manually preprocess the patch to include the
> > > USE_LOCALE branch and remove the not USE_LOCALE branch.
> > 
> > Yeah, that should work. You may also remove include/regex/cclass.h
> > since it will not be used any more.
> 
> But I don't like the cclass_init() routine running every time reg_comp()
> is called.

Actually it is called once per backend and only if it uses the regular
expression engine.

> In my understanding the result of cclass_init() is always the
> same.

Yes, if localization does not change. Karel once talked about the
possibility of being able to have different locales in the same
DB.

> What about running cclass_init() in postmaster, not postgres? Or
> even better at initdb time?

It might be, but ... I think that it would be nice if we leave the
door open to the possibility of having mixed locale configurations,
across databases or even across columns of the same table.

Regards,
Manuel.




Re: [HACKERS] Importing Large Amounts of Data

2002-04-15 Thread Gavin Sherry

On Tue, 16 Apr 2002, Curt Sampson wrote:

[snip]

> What I'm thinking would be really cool would be to have an "offline"
> way of creating tables using a stand-alone program that would write
> the files at, one hopes, near disk speed. 

Personally, I think there is some merit in this. Postgres can be used
for large scale data mining, an application which does not need
(usually) multi-versioning and concurrency but which can benefit from
postgres's implementation of SQL, as well as backend extensibility. 

I don't see any straightforward way of modifying the code to allow a fast
path directly to relations on disk. However, it should be possible to
bypass locking, RI, MVCC etc with the use of a bootstrap-like
tool.

Such a tool would only be able to be used when the database was
offline. It would read data from files passed to it in some format,
perhaps that generated by COPY.

Given the very low parsing and 'planning' overhead, the real cost would be
WAL (the bootstrapper could fail and render the database unusable) and the
subsequent updating of on-disk relations.

Comments?

Gavin





Re: [HACKERS] [PATCHES] [SQL] 16 parameter limit

2002-04-15 Thread Christopher Kings-Lynne

> > Anyway, how does one measure the performance impact of such a change?
> > By merely changing the constant definition, or also by actually using
> > long identifiers? I can do that if it's of any help, for various values
> > perhaps.
>
> I think I would measure disk size change in a newly created database,
> and run regression for various values.  That uses a lot of identifier
> lookups.

With schemas, maybe there'd be fewer name lookups and comparisons anyway,
since there's more reliance on oids instead of names?

Chris





Re: [HACKERS] [PATCHES] [SQL] 16 parameter limit

2002-04-15 Thread Bruce Momjian

Alvaro Herrera wrote:
> (Tom Lane says both SQL92 and SQL99 specify 128 as the maximum
> identifier length)
> 
> Anyway, how does one measure the performance impact of such a change?
> By merely changing the constant definition, or also by actually using
> long identifiers? I can do that if it's of any help, for various values
> perhaps.

I think I would measure disk size change in a newly created database,
and run regression for various values.  That uses a lot of identifier
lookups.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] [PATCHES] [SQL] 16 parameter limit

2002-04-15 Thread Neil Conway

On Mon, 15 Apr 2002 23:34:04 -0400
"Alvaro Herrera" <[EMAIL PROTECTED]> wrote:
> On Mon, 15 Apr 2002 23:19:45 -0400
> "Rod Taylor" <[EMAIL PROTECTED]> wrote:
> 
> > On the note of NAMEDATALEN, a view in the INFORMATION_SCHEMA
> > definition is exactly 2 characters over the current limit.
> > 
> > ADMINISTRABLE_ROLE_AUTHORIZATIONS
> > 
> > Not that it's a great reason, but it isn't a bad one for increasing
> > the limit ;)
> 
> http://archives.postgresql.org/pgsql-general/2002-01/msg00939.php
> 
> (Tom Lane says both SQL92 and SQL99 specify 128 as the maximum
> identifier length)
> 
> Anyway, how does one measure the performance impact of such a change?
> By merely changing the constant definition, or also by actually using
> long identifiers?

Name values are stored NULL-padded up to NAMEDATALEN bytes, so
there is no need to actually use long identifiers, just change
the value of NAMEDATALEN, recompile and run some benchmarks
(perhaps OSDB? http://osdb.sf.net).
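The fixed-width padding Neil describes can be sketched as below. The NAMEDATALEN value of 32 matches the default under discussion; the helper is an illustration of the idea, not the exact backend code.

```c
#include <string.h>

#define NAMEDATALEN 32   /* the default under discussion; raising this
                          * grows every stored name by the same amount */

/* A Name occupies exactly NAMEDATALEN bytes on disk, NUL-padded, no
 * matter how short the identifier is -- which is why changing the
 * constant alone (without using long identifiers) is enough to measure
 * the cost of raising it. */
typedef struct NameData
{
    char data[NAMEDATALEN];
} NameData;

/* Copy an identifier into fixed-width, NUL-padded storage, truncating
 * if necessary and always leaving a terminating NUL. */
static void
name_assign(NameData *name, const char *str)
{
    memset(name->data, 0, NAMEDATALEN);
    strncpy(name->data, str, NAMEDATALEN - 1);
}
```

sizeof(NameData) is NAMEDATALEN regardless of content, so a catalog full of short names still pays the full width per name.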

If you do decide to run some benchmarks (and some more data
would be good), please use the current CVS code. I sent in a
patch a little while ago that should somewhat reduce the
penalty for increasing NAMEDATALEN.

Cheers,

Neil

-- 
Neil Conway <[EMAIL PROTECTED]>
PGP Key ID: DB3C29FC




Re: [HACKERS] [PATCHES] [SQL] 16 parameter limit

2002-04-15 Thread Alvaro Herrera

On Mon, 15 Apr 2002 23:19:45 -0400
"Rod Taylor" <[EMAIL PROTECTED]> wrote:

> On the note of NAMEDATALEN, a view in the INFORMATION_SCHEMA
> definition is exactly 2 characters over the current limit.
> 
> ADMINISTRABLE_ROLE_AUTHORIZATIONS
> 
> Not that it's a great reason, but it isn't a bad one for increasing
> the limit ;)

http://archives.postgresql.org/pgsql-general/2002-01/msg00939.php

(Tom Lane says both SQL92 and SQL99 specify 128 as the maximum
identifier length)

Anyway, how does one measure the performance impact of such a change?
By merely changing the constant definition, or also by actually using
long identifiers? I can do that if it's of any help, for various values
perhaps.

-- 
Alvaro Herrera ()
"Things are good or bad according to how our opinion makes them" (Lysias)




Re: [HACKERS] [PATCHES] YADP - Yet another Dependency Patch

2002-04-15 Thread Rod Taylor

[ copied to hackers ]

> 1. I don't like the code that installs and removes ad-hoc dependencies
> from relations to type Oid.  On its own terms it's wrong (if it were
...
> explicit representation of pinning in the pg_depends table, perhaps it
> would work to create a row claiming that "table 0 / Oid 0 / subid 0"
> depends on a pinned object.

Yes, a pinned dependency makes much more sense.

int4, bool, varchar, and name are in the same boat.

I'll make it so dependCreate() will ignore adding any additional
dependencies on pinned types (class 0, Oid 0, SubID 0) and
dependDelete() will never allow deletion when that dependency exists.

> 2. Is it really necessary to treat pg_depends as a bootstrapped
> relation?  That adds a lot of complexity, as you've evidently already
> found, and it does not seem necessary if you're going to load the system
> dependencies in a later step of the initdb process.  You can just make
> the dependency-adding routines be no-ops in bootstrap mode; then create
> pg_depends as an ordinary system catalog; and finally load the entries
> post-bootstrap.

Ack...  All that work to avoid a simple if statement.

Ahh well... learning at its finest :)

> 3. Isn't there a better way to find the initial dependencies?  That
> SELECT is truly ugly, and more to the point is highly likely to break
> anytime someone rearranges the catalogs.  I'd like to see it generated
> automatically (maybe using a tool like findoidjoins); or perhaps we
> could do the discovery of the columns to look at on-the-fly.

I'm not entirely sure how to approach this, but it does appear that
findoidjoins would find all the relations.

So...  I could create a pg_ function which will find all oid joins,
and call dependCreate() for each entry it finds.  That way
dependCreate will ignore anything that was pinned (see above)
automagically.  It would also make initdb quite slow, and would add a
pg_ function that one should normally avoid in production.
Then again, I suppose it could be used to recreate missing
dependencies if a user was manually fiddling with that table.

initdb would call SELECT pg_findSystemDepends(); or something.

> 4. Do not use the parser's RESTRICT/CASCADE tokens as enumerated type
> values.  They change value every time someone tweaks the grammar.
> (Yes, I know you copied from extant code; that code is on my hitlist.)
> Define your own enum type instead of creating a lot of bogus
> dependencies on parser/parser.h.

All but one of those will go away once the functions are modified to
accept the actual RESTRICT or CASCADE bit.   That was going to be step
2 of the process but I suppose I could do it now, along with a rather
large regression test.  The only place that RESTRICT will be used is
dependDelete();  Nowhere else will care.  It'll simply pass on what
was given to it by the calling function from utility.c or a cascading
dependDelete.  Of course, gram.y will be littered with the
'opt_restrictcascade' tag.

The RESTRICT usage is more of a current placeholder.  I've marked the
includes as /* FOR RESTRICT */ for that reason, make them easy to
remove later.

> 6. The tests on relation names in dependDelete, getObjectName are (a)
> slow and (b) not schema-aware.  Can you make these into OID comparisons
> instead?

Ahh yes.  Good point.





Re: [HACKERS] [PATCHES] [SQL] 16 parameter limit

2002-04-15 Thread Rod Taylor

On the note of NAMEDATALEN, a view in the INFORMATION_SCHEMA
definition is exactly 2 characters over the current limit.

ADMINISTRABLE_ROLE_AUTHORIZATIONS

Not that it's a great reason, but it isn't a bad one for increasing
the limit ;)

--
Rod Taylor

> Are we staying at 16 as the default?   I personally think we can
> increase it to 32 with little penalty, and that we should increase
> NAMEDATALEN to 64.






Re: [HACKERS] [SQL] 16 parameter limit

2002-04-15 Thread Bruce Momjian


Here is an email I sent to patches, minus the patch.  I am sending to
hackers for comments.

---

> 
> The following patch adds --maxindfuncparams to configure to allow you to
> more easily set the maximum number of function parameters and columns
> in an index.  (Can someone come up with a better name?)
> 
> The patch also removes --def_maxbackends, which Tom reported a few weeks
> ago he wanted to remove.  Can people review this?  To test it, you have
> to run autoconf.
> 
> Are we staying at 16 as the default?   I personally think we can
> increase it to 32 with little penalty, and that we should increase
> NAMEDATALEN to 64.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] Stumbled upon a time bug...

2002-04-15 Thread Thomas Lockhart

> Is PostgreSQL broken? Or is it FreeBSD?

Both at most. FreeBSD at least ;)

The POSIX definition of mktime() insists that the function return -1
if it has an error, which also happens to correspond to 1 second earlier
than 1970-01-01, causing trouble for supporting *any* times before 1970.

PostgreSQL relies on a side effect of mktime(): the time zone
information pointed to in the tm structure *input* to mktime() gets
updated for output. I don't actually care about the time_t function
result, and don't care what it is set to, as long as the time zone info
gets filled in.

That happens on most every platform I know about, with the exception of
AIX (confirming for me its reputation as a strange relative of Unix best
left chained in the cellar).

Apparently glibc (and hence Linux) is at risk of getting this behavior
too, although I *hope* that the mods to glibc will return the -1
(if necessary) but still use the time zone database to fill in the
time zone information, even for dates before 1970.

I'm not sure I still have the info to include the glibc contact in this
thread. In any case, there is no excuse for refusing to return valid
info for a DST boundary time imho. Even if it requires an arbitrary
convention on how to jump the time forward or backward...
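The side effect being relied on can be demonstrated like this. It assumes glibc or a BSD libc, where struct tm carries the non-POSIX tm_gmtoff field that mktime() fills in even when its time_t result is the ambiguous -1.

```c
#define _DEFAULT_SOURCE          /* expose tm_gmtoff on glibc */
#include <stdlib.h>
#include <time.h>

/* Build 1969-12-31 23:59:59 local time under the given TZ and return
 * the zone offset that mktime() writes back into the struct tm.  Under
 * UTC this instant is exactly time_t -1 -- the same value POSIX
 * reserves for errors -- yet the zone info is still filled in, which
 * is the side effect the message above depends on. */
static long
zone_offset_via_mktime(const char *tz)
{
    struct tm tm = {0};

    setenv("TZ", tz, 1);
    tzset();

    tm.tm_year = 69;             /* years since 1900: 1969, pre-epoch */
    tm.tm_mon = 11;
    tm.tm_mday = 31;
    tm.tm_hour = 23;
    tm.tm_min = 59;
    tm.tm_sec = 59;
    tm.tm_isdst = -1;            /* let the library determine DST */

    (void) mktime(&tm);          /* return value deliberately ignored */
    return tm.tm_gmtoff;
}
```

On a libc without the tm_gmtoff extension (the AIX-style behavior complained about above), this sketch would not compile at all, which is exactly the portability problem being discussed.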

- Thomas




Re: [HACKERS] RFC: Generating useful names for foreign keys and checks

2002-04-15 Thread Christopher Kings-Lynne

> Actually I'm in favor of it.  I have a proposal outstanding to require
> constraints to have names that are unique per-table, for consistency
> with triggers (already are that way) and rules (will become that way,
> rather than having globally unique names as now).  AFAIR the only
> significant concern was making sure that the system wouldn't generate
> duplicate constraint names by default.

Yeah, that's what's giving me pain - foreign key names are generated in the
rewriter or something somewhere, so I'm not sure exactly what I have access
to for checking duplicates...

The other interesting issue is that the little suffix we append is just
part of the name, i.e. someone can create an index called '_pkey' and
cause confusion.

Chris





Re: [HACKERS] [PATCHES] ANSI Compliant Inserts

2002-04-15 Thread Thomas Lockhart

...
> OK, how about a NOTICE stating that the missing columns were filled in
> with defaults?

Yuck. There is a short path from that to rejecting the insert, but
printing the entire insert statement which would have been acceptable in
the error message ;)

   - Thomas




Re: [HACKERS] regexp character class locale awareness patch

2002-04-15 Thread Tatsuo Ishii

> According to POSIX -regex (7)-, the standard character classes are:
> 
>   alnum   digit   punct
>   alpha   graph   space
>   blank   lower   upper
>   cntrl   print   xdigit
> 
> Many of those classes are different in different locales, and currently
> all work as if the locale were C. Many of those tests have
> multibyte issues; however, with the patch postgres will work for
> one-byte encodings, which is better than nothing. If someone
> (Tatsuo?) gives some advice I will work on the multibyte version.

I don't think character classes are applicable for most multibyte
encodings. Maybe the only exception is Unicode?

> Peter Eisentraut <[EMAIL PROTECTED]> writes:
> >
> > Basically, you manually preprocess the patch to include the
> > USE_LOCALE branch and remove the not USE_LOCALE branch.
> 
> Yeah, that should work. You may also remove include/regex/cclass.h
> since it will not be used any more.

But I don't like the cclass_init() routine running every time reg_comp()
is called. In my understanding the result of cclass_init() is always the
same. What about running cclass_init() in postmaster, not postgres? Or
even better at initdb time?
--
Tatsuo Ishii




Re: [HACKERS] Bug #633: CASE statement evaluation does not short-circut

2002-04-15 Thread Thomas Lockhart

...
> I don't really consider this a bug; at least, fixing it would imply not
> const-simplifying the result expressions of CASEs, which is a cure far
> worse than the disease IMHO.  Does anyone think we *should* allow CASE
> to defeat const-simplification?

No. Constant-folding during parsing should *always* be allowed.

   - Thomas




Re: [HACKERS] regexp character class locale awareness patch

2002-04-15 Thread Manuel Sugawara

According to POSIX -regex (7)-, the standard character classes are:

  alnum   digit   punct
  alpha   graph   space
  blank   lower   upper
  cntrl   print   xdigit

Many of those classes are different in different locales, and currently
all work as if the locale were C. Many of those tests have
multibyte issues; however, with the patch postgres will work for
one-byte encodings, which is better than nothing. If someone
(Tatsuo?) gives some advice I will work on the multibyte version.
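A minimal illustration of why the class tables are locale-dependent: membership in [[:alpha:]] and friends is just what the <ctype.h> predicates report under the active LC_CTYPE locale, so a regex engine that builds its class tables once (as cclass_init() does per backend) bakes in whatever setlocale() was in effect at that moment. The helper below is purely illustrative.

```c
#include <ctype.h>
#include <locale.h>

/* Report whether byte c belongs to the [[:alpha:]] class under the
 * given locale.  A table built from these answers in one locale is
 * stale the moment the locale changes -- the crux of the "mixed
 * locales" concern above. */
static int
in_class_alpha(unsigned char c, const char *locale)
{
    setlocale(LC_CTYPE, locale);
    return isalpha(c) != 0;
}
```

In the C locale, byte 0xE9 (e-acute in Latin-1) is not alphabetic; under a Latin-1 locale it would be, so the same pattern [[:alpha:]] matches different strings depending on which locale the tables were built under.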

Peter Eisentraut <[EMAIL PROTECTED]> writes:
>
> Basically, you manually preprocess the patch to include the
> USE_LOCALE branch and remove the not USE_LOCALE branch.

Yeah, that should work. You may also remove include/regex/cclass.h
since it will not be used any more.

> However, if the no-locale branches have significant performance
> benefits then it might be worth pondering setting up some
> optimizations.

This is not the case.

Regards,
Manuel.




Re: [HACKERS] multibyte support by default

2002-04-15 Thread Peter Eisentraut

Tatsuo Ishii writes:

> In my understanding, our consensus was enabling multibyte support by
> default for 7.3. Any objection?

It was my understanding (or if I was mistaken, then it is my suggestion)
that the build-time option would be removed altogether and certain
performance-critical places (if any) would be wrapped into

if (encoding_is_single_byte(current_encoding)) { }

That's basically what I did with the locale support.
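The pattern could look something like the sketch below. The encoding codes, the encoding_is_single_byte() check, and the char_length() helper are all illustrative of the run-time-branch idea, not the actual backend API.

```c
#include <string.h>

/* Illustrative encoding codes (not PostgreSQL's actual values). */
enum { ENC_SQL_ASCII, ENC_LATIN1, ENC_UNICODE };

/* The run-time check that replaces the MULTIBYTE compile-time option:
 * single-byte encodings take a fast path, multibyte ones pay extra. */
static int
encoding_is_single_byte(int encoding)
{
    return encoding != ENC_UNICODE;
}

/* Count characters in a NUL-terminated string under the given
 * encoding.  For single-byte encodings bytes == characters, so plain
 * strlen() suffices; for UTF-8 we must skip continuation bytes. */
static size_t
char_length(int encoding, const char *s)
{
    size_t n = 0;

    if (encoding_is_single_byte(encoding))
        return strlen(s);                /* fast path */

    for (; *s; s++)
        if ((*s & 0xC0) != 0x80)         /* count non-continuation bytes */
            n++;
    return n;
}
```

The performance-critical branch costs one comparison when the encoding is single-byte, which is the argument for dropping the build-time option entirely.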

-- 
Peter Eisentraut   [EMAIL PROTECTED]


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Operators and schemas

2002-04-15 Thread Peter Eisentraut

Tom Lane writes:

> What I'm now envisioning is that pg_dump will explicitly set
>   set search_path = 'foo';
> when dumping or reloading schema foo.

I had imagined that pg_dump would emit commands such as this:

CREATE SCHEMA foo
  CREATE TABLE bar ( ... )
  CREATE otherthings
;

which is how I read the SQL standard.  Are there plans to implement the
CREATE SCHEMA command that way?  I think I recall someone from Toronto
mentioning something along these lines.

Obviously, this command style would be mostly equivalent to temporarily
setting the search path.  We'd also need alter schema, which SQL doesn't
have.

> Given the present semantics of
> search_path, that will imply an implicit search of pg_catalog before
> foo.

Interesting ... Is that only temporary?  (since you say "present"
semantics)

-- 
Peter Eisentraut   [EMAIL PROTECTED]


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



Re: [HACKERS] Importing Large Amounts of Data

2002-04-15 Thread Curt Sampson

On Mon, 15 Apr 2002, Bruce Momjian wrote:

> Can you check your load and see if there is a PRIMARY key on the table
> at the time it is being loaded.

There is not. I create the table with a PRIMARY KEY declaration,
but I drop that index before doing the import, and do an ALTER
TABLE to re-add the primary key afterwards.

At one point I tried doing a load with all indices enabled, but
after about eight or nine hours I gave up. (Typically the load
takes about 30 minutes. This is using about 2% of the sample data.)

> In the old days, we created indexes
> only after the data was loaded, but when we added PRIMARY key, pg_dump
> was creating the table with PRIMARY key then loading it, meaning the
> table was being loaded while it had an existing index.  I know we fixed
> this recently but I am not sure if it was in 7.2 or not.

Ah, I saw that fix. But I'm doing the load by hand, not using
pg_restore.

cjs
-- 
Curt Sampson  <[EMAIL PROTECTED]>   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Importing Large Amounts of Data

2002-04-15 Thread Neil Conway

On Mon, 15 Apr 2002 21:44:26 -0400 (EDT)
"Bruce Momjian" <[EMAIL PROTECTED]> wrote:
> In the old days, we created indexes
> only after the data was loaded, but when we added PRIMARY key, pg_dump
> was creating the table with PRIMARY key then loading it, meaning the
> table was being loaded while it had an existing index.  I know we fixed
> this recently but I am not sure if it was in 7.2 or not. 

It's not in 7.2 -- but it's fixed in CVS.

Cheers,

Neil

-- 
Neil Conway <[EMAIL PROTECTED]>
PGP Key ID: DB3C29FC




Re: [HACKERS] regexp character class locale awareness patch

2002-04-15 Thread Peter Eisentraut

Bruce Momjian writes:

> Tatsuo Ishii wrote:
> > > Whatever you do with this patch, remember that the USE_LOCALE symbol is
> > > gone.
> >
> > Then the patches should be modified.
>
> Yes, I am not quite sure how to do that.  I will research it unless
> someone else lends a hand.

Basically, you manually preprocess the patch to include the USE_LOCALE
branch and remove the non-USE_LOCALE branch.  However, if the no-locale
branches have significant performance benefits then it might be worth
pondering setting up some optimizations.

-- 
Peter Eisentraut   [EMAIL PROTECTED]





Re: [HACKERS] Importing Large Amounts of Data

2002-04-15 Thread Bruce Momjian

Curt Sampson wrote:
> On Mon, 15 Apr 2002, Tom Lane wrote:
> 
> > > I'm not looking for "runs a bit faster;" five percent either way
> > > makes little difference to me. I'm looking for a five-fold performance
> > > increase.
> >
> > You are not going to get it from this; where in the world did you get
> > the notion that data integrity costs that much?
> 
> Um...the fact that MySQL imports the same data five times as fast? :-)
> 
> Note that this is *only* related to bulk-importing huge amounts of
> data. Postgres seems a little bit slower than MySQL at building
> the indices afterwards, but this would be expected since (probably
> due to higher tuple overhead) the size of the data once in postgres
> is about 75% larger than in MySQL: 742 MB vs 420 MB. I've not done
> any serious testing of query speed, but the bit of toying I've done
> with it shows no major difference.

Can you check your load and see if there is a PRIMARY key on the table
at the time it is being loaded.  In the old days, we created indexes
only after the data was loaded, but when we added PRIMARY key, pg_dump
was creating the table with PRIMARY key then loading it, meaning the
table was being loaded while it had an existing index.  I know we fixed
this recently but I am not sure if it was in 7.2 or not. 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] Importing Large Amounts of Data

2002-04-15 Thread Curt Sampson

On Mon, 15 Apr 2002, Tom Lane wrote:

> > I'm not looking for "runs a bit faster;" five percent either way
> > makes little difference to me. I'm looking for a five-fold performance
> > increase.
>
> You are not going to get it from this; where in the world did you get
> the notion that data integrity costs that much?

Um...the fact that MySQL imports the same data five times as fast? :-)

Note that this is *only* related to bulk-importing huge amounts of
data. Postgres seems a little bit slower than MySQL at building
the indices afterwards, but this would be expected since (probably
due to higher tuple overhead) the size of the data once in postgres
is about 75% larger than in MySQL: 742 MB vs 420 MB. I've not done
any serious testing of query speed, but the bit of toying I've done
with it shows no major difference.

> Have you tried all the usual speedup hacks?  Turn off fsync, if you
> really think you do not care about crash integrity; use COPY FROM STDIN
> to bulk-load data, not retail INSERTs; possibly drop and recreate
> indexes rather than updating them piecemeal; etc.  You should also
> consider not declaring foreign keys, as the runtime checks for reference
> validity are pretty expensive.

Yes, I did all of the above. (This was all mentioned in my initial
message, except for turning off foreign key constraints--but the
table has no foreign keys.)

What I'm thinking would be really cool would be to have an "offline"
way of creating tables using a stand-alone program that would write
the files at, one hopes, near disk speed. Maybe it could work by
creating the tables in a detached tablespace, and then you'd attach
the tablespace when you're done. It might even be extended to be
able to do foreign key checks, create indices, and so on. (Foreign
key checks would be useful; I'm not sure that creating indices
would be any faster than just doing it after the tablespace is
attached.)

This would be particularly useful for fast restores of backups.
Downtime while doing a restore is always a huge pain for large
databases.

cjs
-- 
Curt Sampson  <[EMAIL PROTECTED]>   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC





[HACKERS] multibyte support by default

2002-04-15 Thread Tatsuo Ishii

In my understanding, our consensus was enabling multibyte support by
default for 7.3. Any objection?
--
Tatsuo Ishii




Re: [HACKERS] regexp character class locale awareness patch

2002-04-15 Thread Bruce Momjian

Tatsuo Ishii wrote:
> > Whatever you do with this patch, remember that the USE_LOCALE symbol is
> > gone.
> 
> Then the patches should be modified.

Yes, I am not quite sure how to do that.  I will research it unless
someone else lends a hand.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] JDBC build fails

2002-04-15 Thread Tatsuo Ishii

> Tatsuo,
> 
> Yes, ant version 1.4.1 or later is required to build the driver

Then it should be noted somewhere in the docs. Also, that should be
noted in the "incompatibility" section of the release note for 7.3.
--
Tatsuo Ishii

> see http://jakarta.apache.org/ant
> 
> Dave
> On Sat, 2002-04-13 at 23:28, Barry Lind wrote:
> > Dave,
> > 
> > This was your change I believe.  Can you respond?
> > 
> > thanks,
> > --Barry
> > 
> > Tatsuo Ishii wrote:
> > > Can someone please fix this? Building the JDBC stuff in current has been
> > > broken for a while (7.2.x is ok). Maybe the current JDBC build process
> > > requires a more recent version of ant than I have, I don't know. But if
> > > so, that should be stated somewhere in the docs explicitly, I think.
> > > 
> > > /usr/bin/ant -buildfile ./build.xml all \
> > >   -Dmajor=7 -Dminor=3 -Dfullversion=7.3devel -Ddef_pgport=5432 -Denable_debug=yes
> > > Buildfile: ./build.xml
> > > 
> > > all:
> > > 
> > > prepare:
> > > 
> > > BUILD FAILED
> > > 
> > > /usr/local/src/pgsql/current/pgsql/src/interfaces/jdbc/./build.xml:155: Could 
>not create task of type: condition. Common solutions are to use taskdef to declare 
>your task, or, if this is an optional task, to put the optional.jar in the lib 
>directory of your ant installation (ANT_HOME).
> > > 
> > > FYI, my build tools are:
> > > 
> > > Java: 1.3.0
> > > Ant: 1.3
> > > 
> > > 
> > > 
> > 
> > 
> > 
> 
> 

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] regexp character class locale awareness patch

2002-04-15 Thread Tatsuo Ishii

> Whatever you do with this patch, remember that the USE_LOCALE symbol is
> gone.

Then the patches should be modified.
--
Tatsuo Ishii




Re: [HACKERS] Operators and schemas

2002-04-15 Thread Tom Lane

Hiroshi Inoue <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> * Names in the current schema need be qualified only if they

> What does the current schema mean ?

In this case, it means the one pg_dump is trying to dump.

regards, tom lane




Re: [HACKERS] Operators and schemas

2002-04-15 Thread Hiroshi Inoue
Tom Lane wrote:
> 
> Peter Eisentraut <[EMAIL PROTECTED]> writes:
> > I imagine that pg_dump could be able to figure out that certain references
> > would be "local", so no explicit schema qualification is necessary.

[snip]

> * Names in the current schema need be qualified only if they

What does the current schema mean ?
Or What does "local" mean ?

regards,
Hiroshi Inoue



Re: [HACKERS] [GENERAL] Notify argument?

2002-04-15 Thread Bruce Momjian


Fix applied.

---

Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > The breakage will come when we lengthen NAMEDATALEN, which I plan to
> > tackle for 7.3.  We will need to re-order the NOTIFY structure and put
> > the NAMEDATALEN string at the end of the struct so differing namedatalen
> > backend/clients will work.  If you want to break it, 7.3 would probably
> > be the time to do it.  :-)  Users will need a recompile pre-7.3 to use
> > notify for 7.3 and later anyway.
> 
> If we're going to change the structure anyway, let's fix it to be
> independent of NAMEDATALEN.  Instead of
> 
> char   relname[NAMEDATALEN];
> int be_pid;
> 
> let's do
> 
> char   *relname;
> int be_pid;
> 
> This should require no source-level changes in calling C code, thanks
> to C's equivalence between pointers and arrays.  We can preserve the
> fact that freeing a PQnotifies result takes only one free() with a
> little hacking to make the string be allocated in the same malloc call:
> 
> newNotify = (PGnotify *) malloc(sizeof(PGnotify) + strlen(str) + 1);
> newNotify->relname = (char *) newNotify + sizeof(PGnotify);
> strcpy(newNotify->relname, str);
> 
> Thus, with one line of extra ugliness inside the library, we solve the
> problem permanently.
> 
>   regards, tom lane
> 
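Tom's one-extra-line trick generalizes to any struct carrying one variable-length string; a self-contained sketch (the struct here only mirrors the fields under discussion, it is not libpq's actual definition):

```c
#include <stdlib.h>
#include <string.h>

/* Mirror of the proposed layout: a pointer instead of a
 * NAMEDATALEN-sized array. */
typedef struct
{
    char   *relname;            /* notification relation name */
    int     be_pid;             /* sending backend's PID */
} PGnotify;

/* Allocate the struct and its string with one malloc; the string lives
 * just past the struct, so one free() still releases everything. */
PGnotify *
make_notify(const char *str, int be_pid)
{
    PGnotify *n = malloc(sizeof(PGnotify) + strlen(str) + 1);

    if (n == NULL)
        return NULL;
    n->relname = (char *) n + sizeof(PGnotify);
    strcpy(n->relname, str);
    n->be_pid = be_pid;
    return n;
}
```

Callers that release a PQnotifies result with a single free() keep working unchanged, which is the point of co-allocating the string.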

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] [PATCHES] [patch] fe-connect.c doesn't handle EINTR correctly

2002-04-15 Thread Bruce Momjian


Fix applied.  Thanks.

---

Bruce Momjian wrote:
> 
> David, sorry your patch didn't make it into 7.2.X.  That whole EINTR
> discussion was quite complicated so I am not surprised we missed it.
> 
> The attached patch implements your EINTR test in cases that seem to
> need it.  I have followed the method we used for EINTR in fe-misc.c.
> 
> 
> ---
> 
> David Ford wrote:
> > Last year we had a drawn out discussion about this and I created a patch 
> > for it.  I never noticed that the patch didn't go in until I installed 
> > 7.2 the other day and realised that fe-connect.c never was fixed.
> > 
> > Here is the patch again.  It is against CVS 3/16/2002.  This time I only 
> > rewrote the connect procedure at line 912, I leave it up to the regular 
> > hackers to copy it's functionality to the SSL procedure just below it.
> > 
> > In summary, if a software writer implements timer events or other events 
> > which generate a signal with a timing fast enough to occur while libpq 
> > is inside connect(), then connect returns -EINTR.  The code following 
> > the connect call does not handle this and generates an error message. 
> >  The sum result is that the pg_connect() fails.  If the timer or other 
> > event is right on the window of the connect() completion time, the 
> > pg_connect() may appear to work sporadically.  If the event is too slow, 
> > pg_connect() will appear to always work and if the event is too fast, 
> > pg_connect() will always fail.
> > 
> > David
> > 
> 
> > Index: src/interfaces/libpq/fe-connect.c
> > ===
> > RCS file: /projects/cvsroot/pgsql/src/interfaces/libpq/fe-connect.c,v
> > retrieving revision 1.181
> > diff -u -r1.181 fe-connect.c
> > --- src/interfaces/libpq/fe-connect.c   2001/11/11 02:09:05 1.181
> > +++ src/interfaces/libpq/fe-connect.c   2002/03/16 05:17:47
> > @@ -909,29 +909,48 @@
> >  * Thus, we have to make arrangements for all eventualities.
> >  * --
> >  */
> > -   if (connect(conn->sock, &conn->raddr.sa, conn->raddr_len) < 0)
> > -   {
> > -   if (SOCK_ERRNO == EINPROGRESS || SOCK_ERRNO == EWOULDBLOCK || SOCK_ERRNO == 0)
> > -   {
> > -   /*
> > -* This is fine - we're in non-blocking mode, and the
> > -* connection is in progress.
> > -*/
> > -   conn->status = CONNECTION_STARTED;
> > -   }
> > -   else
> > -   {
> > -   /* Something's gone wrong */
> > -   connectFailureMessage(conn, SOCK_ERRNO);
> > -   goto connect_errReturn;
> > +   do {
> > +   int e;
> > +   e = connect(conn->sock, &conn->raddr.sa, conn->raddr_len);
> > +
> > +   if (e < 0) {
> > +   switch (SOCK_ERRNO) {
> > +   case EINTR:
> > +   /*
> > +* Interrupted by a signal, keep trying.  This
> > +* handling is required because the user may have
> > +* turned on signals in his program.  Previously,
> > +* libpq would erroneously fail to connect if the
> > +* user's timer event fired and interrupted this
> > +* syscall.  It is important that we don't try to
> > +* sleep here because this may cause havoc with
> > +* the user program.
> > +*/
> > +   continue;
> > +   case 0:
> > +   case EINPROGRESS:
> > +   case EWOULDBLOCK:
> > +   /*
> > +* This is fine - we're in non-blocking mode, and
> > +* the connection is in progress.
> > +*/
> > +   conn->status = CONNECTION_STARTED;
> > +   break;
> > +   default:
> > +   /* Something's gone wrong */
> > +   connectFailureMessage(conn, SOCK_ERRNO);
> > +   goto connect_errReturn;
> > +   }
> > +   } else {
> > +   /* We're connected now */
> > +   conn->status = CONNECTION_MADE;
> > +   }
> > -   }
> > -   else
> > -   {
> > -   /* We're connected already */
> > -   conn->status = CONNECTION_MADE;
> > -   }
> > +   
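The essence of the patch (restart the call whenever a signal interrupts it, without sleeping) can be shown with a small self-contained sketch; fake_connect() below simulates connect() being interrupted by a signal and is purely illustrative:

```c
#include <errno.h>

static int fake_errno;          /* stands in for SOCK_ERRNO / errno */

/* Pretend to be connect(): fail with EINTR a given number of times,
 * then succeed. */
static int
fake_connect(int *interrupts_left)
{
    if (*interrupts_left > 0)
    {
        (*interrupts_left)--;
        fake_errno = EINTR;
        return -1;
    }
    fake_errno = 0;
    return 0;
}

/* EINTR-safe wrapper: retry only on EINTR, hand any other result back
 * to the caller unchanged. */
int
connect_retry(int interrupts)
{
    int rc;

    do
        rc = fake_connect(&interrupts);
    while (rc < 0 && fake_errno == EINTR);
    return rc;
}
```

No matter when the user's timer fires, the wrapper keeps retrying until the underlying call returns something other than an EINTR failure.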

Re: [HACKERS] regression in CVS HEAD

2002-04-15 Thread Tom Lane

Neil Conway <[EMAIL PROTECTED]> writes:
> backend> create table foo (c1 int);
> ERROR:  invalid relation "foo"; system catalog modifications are currently disallowed

I've committed a fix for this.

BTW, I dunno about other developers, but I never use standalone backends
for debugging.  It's a lot nicer to open two windows, run a regular psql
in one, and use the other to run gdb and "attach" to the backend
process.

regards, tom lane




Re: [HACKERS] Operators and schemas

2002-04-15 Thread Tom Lane

Peter Eisentraut <[EMAIL PROTECTED]> writes:
> I imagine that pg_dump could be able to figure out that certain references
> would be "local", so no explicit schema qualification is necessary.

Well, if it makes assumptions about the path then it can do that ... or
I guess it could explicitly set the path, and then it knows.  Yeah, that
will probably work well enough.  Okay, good ... the question of what
pg_dump should do about qualifying names was bugging me.

What I'm now envisioning is that pg_dump will explicitly set
set search_path = 'foo';
when dumping or reloading schema foo.  Given the present semantics of
search_path, that will imply an implicit search of pg_catalog before
foo.  Therefore, we have the following ground rules for schema
qualification in pg_dump:
* System (pg_catalog) names never need qualification.
* Names in the current schema need be qualified only if they
  conflict with system names.
* Cross-references to other schemas will always be qualified.

This seems workable.  Thoughts?

regards, tom lane




Re: [HACKERS] Operators and schemas

2002-04-15 Thread Peter Eisentraut

Tom Lane writes:

> But: do you really want to see all dumped rules, defaults, etc in that
> style?  Ugh... talk about loss of readability...

I imagine that pg_dump could be able to figure out that certain references
would be "local", so no explicit schema qualification is necessary.
Thus, the only weird-looking operator invocations would be those that were
also created in weird ways.  In general, pg_dump should try to avoid
making unnecessary schema qualifications on any object so that you can
edit the dump and only change the schema name in one place.

-- 
Peter Eisentraut   [EMAIL PROTECTED]





Re: [HACKERS] RFC: Generating useful names for foreign keys and checks

2002-04-15 Thread Tom Lane

"Christopher Kings-Lynne" <[EMAIL PROTECTED]> writes:
> I'm thinking of doing a patch to generate foo_fkey and foo_chk names for
> fk's and checks.  I know that this will make using DROP CONSTRAINT a whole
> heck of a lot easier.  There have also been a few people who've complained
> on the list about all the  foreign keys, etc.

> I know Tom had some fears, but I don't know if they still apply, or if
> they're any worse than the current situation?

Actually I'm in favor of it.  I have a proposal outstanding to require
constraints to have names that are unique per-table, for consistency
with triggers (already are that way) and rules (will become that way,
rather than having globally unique names as now).  AFAIR the only
significant concern was making sure that the system wouldn't generate
duplicate constraint names by default.

Actually, I was only thinking of CHECK constraints (pg_relcheck) in this
proposal.  In the long run it'd be a good idea to have a table that
explicitly lists all constraints --- check, unique, primary, foreign
key, etc --- and the index on such a table would probably enforce
name uniqueness across all types of constraints on one table.  Right now,
though, each type of constraint effectively has a separate namespace.
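A default-name generator along the proposed foo_fkey / foo_chk lines could be as simple as the sketch below; the suffix strings and the buffer size are assumptions for illustration, not the actual patch:

```c
#include <stdio.h>
#include <string.h>

#define NAMEDATALEN 32          /* assumed identifier length limit */

/* Build "table_suffix", e.g. orders_fkey or orders_chk, truncating to
 * the identifier limit the way generated names must. */
void
make_constraint_name(char *dst, const char *table, const char *suffix)
{
    snprintf(dst, NAMEDATALEN, "%s_%s", table, suffix);
}
```

Real code would also have to detect collisions and append a counter so that two constraints on the same table never get identical default names.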

regards, tom lane




Re: [HACKERS] [PATCHES] ANSI Compliant Inserts

2002-04-15 Thread Peter Eisentraut

Bruce Momjian writes:

> OK, how about a NOTICE stating that the missing columns were filled in
> with defaults?

Please not.

-- 
Peter Eisentraut   [EMAIL PROTECTED]





Re: [HACKERS] Operators and schemas

2002-04-15 Thread Tom Lane

Peter Eisentraut <[EMAIL PROTECTED]> writes:
> We could make some sort of escape syntax, like

> op1 myschema.operator(+) op2

I thought a little bit about that ... the above syntax does not work
but it looks like we could do something along the lines of

op1 OPERATOR(myschema.+) op2

where OPERATOR has to become a fully reserved keyword.

But: do you really want to see all dumped rules, defaults, etc in that
style?  Ugh... talk about loss of readability...

regards, tom lane




Re: [HACKERS] [PATCHES] ANSI Compliant Inserts

2002-04-15 Thread Bruce Momjian

Vince Vielhaber wrote:
> On Mon, 15 Apr 2002, Tom Lane wrote:
> 
> > Peter Eisentraut <[EMAIL PROTECTED]> writes:
> > > I recall that this was the behavior we agreed we wanted.  IMHO, it would
> > > be conditional on the INSERT ... VALUES (DEFAULT) capability being
> > > provided.  I'm not sure if that is there yet.
> >
> > That is there now.  Do you recall when this was discussed before?
> > I couldn't remember if there'd been any real discussion or not.
> 
> It has to be at least a year, Tom.  I brought it up in hackers after
> I got bit by it.  I had a rather long insert statement and missed a
> value in the middle somewhere which shifted everything by one.  It
> was agreed that it shouldn't happen but I don't recall what else was
> decided.

Yes, I do remember Vince's comment, and I do believe that is the time it
was added.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] [PATCHES] ANSI Compliant Inserts

2002-04-15 Thread Bruce Momjian

Peter Eisentraut wrote:
> Bruce Momjian writes:
> 
> > Peter, are you saying you don't want to require all columns to be
> > specified when INSERT doesn't list the columns?
> 
> Yes, that's what I'm saying.  Too much breakage and annoyance potential in
> that change.

OK, how about a NOTICE stating that the missing columns were filled in
with defaults?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] RFC: Generating useful names for foreign keys and checks

2002-04-15 Thread Bruce Momjian


Yes!  Please do something with those unnamed constraints.

---

Christopher Kings-Lynne wrote:
> Hi,
> 
> I'm thinking of doing a patch to generate foo_fkey and foo_chk names for
> fk's and checks.  I know that this will make using DROP CONSTRAINT a whole
> heck of a lot easier.  There have also been a few people who've complained
> on the list about all the  foreign keys, etc.
> 
> I know Tom had some fears, but I don't know if they still apply, or if
> they're any worse than the current situation?
> 
> Can I go ahead?
> 
> Chris
> 
> 
> 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




Re: [HACKERS] regression in CVS HEAD

2002-04-15 Thread Tom Lane

Neil Conway <[EMAIL PROTECTED]> writes:
> POSTGRES backend interactive interface 
> $Revision: 1.260 $ $Date: 2002/03/24 04:31:07 $

> backend> create table foo (c1 int);
> ERROR:  invalid relation "foo"; system catalog modifications are currently disallowed
> backend> create schema x;
> backend> create table x.bar (c1 int);
> backend>

> Is this the expected behavior?

It is at the moment but I'm planning to change it.  Currently, a
standalone backend defaults to pg_catalog being the target creation
namespace, which is needed by initdb; but I was planning to make initdb
explicitly set the search_path to pg_catalog, because it seems like a
bad idea for pg_catalog to ever be the default target.

In the meantime, try an explicit
set search_path = 'public';
then "create table foo" would create public.foo which will be allowed.

regards, tom lane




Re: [HACKERS] Operators and schemas

2002-04-15 Thread Rod Taylor

> 2. Use a restricted, perhaps fixed search-path for searching for
> operators.  For example, we might force the search path to have
> pg_catalog first even when this is not true for the table name
search
> path.  But I'm not sure what an appropriate definition would be.
> A restricted search path might limit the usefulness of private
operators
> to the point where we might as well have kept them database-wide.

Wanting to open a bucket of worms: what about having the system create
an operator with the schema name in it?

I.e., CREATE OPERATOR schema.+ would create:

'schema.+'  in pg_catalog
'+' in schema

This would require double the operator entries, but isn't really any
worse than the array types as related to their base type.

So, a user could type:
select col1 + col2 from schema.tab;
select col1 schema.+ col2 from tab;






Re: [HACKERS] Operators and schemas

2002-04-15 Thread Peter Eisentraut

Tom Lane writes:

> After some fooling around with gram.y, I have come to the conclusion
> that there's just no way to use a schema-qualified name for an operator
> in an expression.  I was hoping we might be able to write something like
>   operand1 schema.+ operand2
> but I can't find any way to make this work without tons of shift/reduce
> conflicts.  One counterexample suggesting it can't be done is that
>   foo.*
> might be either a reference to all the columns of foo, or a qualified
> operator name.

What about foo."*"?

> We can still put operators into namespaces and allow qualified names in
> CREATE/DROP OPERATOR.  However, lookup of operators in expressions would
> have to be completely dependent on the search path.  That's not real
> cool; among other things, pg_dump couldn't guarantee that dumped
> expressions would be interpreted the same way when reloaded.

We could make some sort of escape syntax, like

op1 myschema.operator(+) op2

-- 
Peter Eisentraut   [EMAIL PROTECTED]





Re: [HACKERS] Array Iterator functions

2002-04-15 Thread Tom Lane

"Rod Taylor" <[EMAIL PROTECTED]> writes:
> What would it take to make the array iterator functions a part of the
> standard base? (contrib/array)

To me, the main problem with contrib/array is that it doesn't scale:
you need more C functions for every array datatype you want to support.

At the very least it needs a way to avoid more per-datatype C code.
The per-datatype operator definitions are annoying too, but perhaps
not quite as annoying... one could imagine CREATE TYPE automatically
adding those along with the array type itself.

I'm not sure what it would take to avoid the per-datatype C code.
Clearly we want something like array_in/array_out, but how does the
extra information get to these functions?

It would also be good to have some idea of whether we could ever hope to
index queries using these functions.  The GIST stuff might provide that,
or it might not.  I don't insist that this work on day one, but I'd like
to see a road map, just to be sure that we are not shooting ourselves in
the foot by standardizing a not-quite-index-compatible definition.

regards, tom lane




Re: [HACKERS] Inefficient handling of LO-restore + Patch

2002-04-15 Thread Tom Lane

Bruce Momjian <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Anyone object if I turn off public read access to
>> pg_largeobject?

> Please do whatever you can to tighten it up. I thought we needed to keep
> read access so people could get to their large objects, but maybe not.

Yeah, right after sending that message I remembered that we had already
discussed this and concluded it would break clients :-(.

There's really no security for large objects anyway, since if you know
or can guess the OID of one you can read (or write!) it regardless.
Not much point in turning off read access on pg_largeobject unless we
rethink that ...

regards, tom lane




Re: [HACKERS] Inefficient handling of LO-restore + Patch

2002-04-15 Thread Bruce Momjian

Tom Lane wrote:
> "Mario Weilguni" <[EMAIL PROTECTED]> writes:
> > * select octet_length(data) from pg_largeobject where loid=OIDOFOBJECT and pageno=0
> 
> This really should not work if you're not superuser.  Right now it does,
> but I think that's an oversight in the default permissions settings for
> system tables.  Anyone object if I turn off public read access to
> pg_largeobject?

Please do whatever you can to tighten it up. I thought we needed to keep
read access so people could get to their large objects, but maybe not.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



[HACKERS] RFC: Generating useful names for foreign keys and checks

2002-04-15 Thread Christopher Kings-Lynne

Hi,

I'm thinking of doing a patch to generate foo_fkey and foo_chk names for
fk's and checks.  I know that this will make using DROP CONSTRAINT a whole
heck of a lot easier.  There have also been a few people who've complained
on the list about all the unnamed foreign keys, etc.

I know Tom had some fears, but I don't know if they still apply, or if
they're any worse than the current situation?

Can I go ahead?

Chris


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



[HACKERS] regression in CVS HEAD

2002-04-15 Thread Neil Conway

Hi all,

I'm seeing this on a fresh build from CVS:

$ ./configure && make && make check
...
$ cd src/test/regress/tmp_check
$ ./install/tmp/pgsql/bin/postgres -D data regression
LOG:  database system was shut down at 2002-04-15 15:03:58 EDT
LOG:  checkpoint record is at 0/160368C
LOG:  redo record is at 0/160368C; undo record is at 0/0; shutdown TRUE
LOG:  next transaction id: 4551; next oid: 139771
LOG:  database system is ready

POSTGRES backend interactive interface 
$Revision: 1.260 $ $Date: 2002/03/24 04:31:07 $

backend> create table foo (c1 int);
ERROR:  invalid relation "foo"; system catalog modifications are currently disallowed
backend> create schema x;
backend> create table x.bar (c1 int);
backend>

Is this the expected behavior? I haven't been following the schema
work very closely, but this was quite a surprise to me...

Cheers,

Neil

-- 
Neil Conway <[EMAIL PROTECTED]>
PGP Key ID: DB3C29FC

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] rules and default values

2002-04-15 Thread Neil Conway

On Mon, 15 Apr 2002 14:25:28 -0400
"Tom Lane" <[EMAIL PROTECTED]> wrote:
> Awhile back I said:
> > [EMAIL PROTECTED] (Neil Conway) writes:
> >> In other words, when the insert statement on the view is transformed by
> >> the rule, the "default value" columns are replaced by explicit NULL
> >> values (which is the default value for the columns of the pseudo-table
> >> created by CREATE VIEW). Is this the correct behavior?
> 
> > It's correct, from the point of view of the rule rewriter, but that
> > doesn't make the behavior useful.
> 
> > What'd make sense to me is to allow defaults to be attached to the
> > view columns, say by doing ALTER TABLE ADD DEFAULT on the view.
> > Unfortunately that won't do much in the current implementation,
> > because such defaults will never get applied (the planner certainly
> > won't see them as applicable).
> 
> > Maybe inserting defaults should be the first phase of rewriting, just
> > before rule substitution, rather than being left to the planner as it
> > is now.  We took it out of the parser for good reasons, but perhaps
> > we moved it too far downstream.
> 
> I recently moved the default-insertion phase to fix a different bug,
> so this is now possible.  Given the attached patch, it actually works.

Great!

> However I have not applied the patch because it needs (a) pg_dump
> support and (b) documentation, neither of which I have time for at the
> moment.  Anyone want to pick up the ball?

Sure, I'll do this stuff.

Cheers,

Neil

-- 
Neil Conway <[EMAIL PROTECTED]>
PGP Key ID: DB3C29FC

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Operators and schemas

2002-04-15 Thread Tom Lane

Fernando Nasser <[EMAIL PROTECTED]> writes:
> If some types are really important and operators are desired, it can be
> coordinated with the DBA, as operators would be a database-wide resource.
> (This would be the case if index extensions were involved anyway.)

No, there isn't any particular reason that index extensions should be
considered database-wide resources; if operators are named local to
schemas, then opclasses can be too, and that's all you need.

In practice maybe it doesn't matter; I doubt anyone would try to
implement an indexable datatype in anything but C, and to define
C functions you must be superuser anyway.  But this does not seem
to me to be a good argument why operator names should be global.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Operators and schemas

2002-04-15 Thread Fernando Nasser

Tom Lane wrote:
> 
> 1. Keep operators as database-wide objects, instead of putting them into
> namespaces.  This seems a bit silly though: if the types and functions
> that underlie an operator are private to a namespace, shouldn't the
> operator be as well?
> 

Not necessarily.  One can still create a type and functions to operate
on it.  Operators are a convenience, not a necessity (except for
index extensions).

If some types are really important and operators are desired, it can be
coordinated with the DBA, as operators would be a database-wide resource.
(This would be the case if index extensions were involved anyway.)

I would keep operators database-wide.  

-- 
Fernando Nasser
Red Hat - Toronto   E-Mail:  [EMAIL PROTECTED]
2323 Yonge Street, Suite #300
Toronto, Ontario   M4P 2C9

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] rules and default values

2002-04-15 Thread Tom Lane

Awhile back I said:
> [EMAIL PROTECTED] (Neil Conway) writes:
>> In other words, when the insert statement on the view is transformed by
>> the rule, the "default value" columns are replaced by explicit NULL
>> values (which is the default value for the columns of the pseudo-table
>> created by CREATE VIEW). Is this the correct behavior?

> It's correct, from the point of view of the rule rewriter, but that
> doesn't make the behavior useful.

> What'd make sense to me is to allow defaults to be attached to the
> view columns, say by doing ALTER TABLE ADD DEFAULT on the view.
> Unfortunately that won't do much in the current implementation,
> because such defaults will never get applied (the planner certainly
> won't see them as applicable).

> Maybe inserting defaults should be the first phase of rewriting, just
> before rule substitution, rather than being left to the planner as it
> is now.  We took it out of the parser for good reasons, but perhaps
> we moved it too far downstream.

I recently moved the default-insertion phase to fix a different bug,
so this is now possible.  Given the attached patch, it actually works.
However I have not applied the patch because it needs (a) pg_dump
support and (b) documentation, neither of which I have time for at the
moment.  Anyone want to pick up the ball?

regards, tom lane


Demonstration of defaults for views (with patch):

regression=# create table foo (f1 int);
CREATE
regression=# create view vv as select * from foo;
CREATE
regression=# create rule vvi as on insert to vv do instead
regression-# insert into foo select new.*;
CREATE
regression=# insert into vv default values;
INSERT 0 0
regression=# select * from vv;
 f1
----
 
(1 row)

regression=# alter table vv alter column f1 set default 42;
ALTER
regression=# insert into vv default values;
INSERT 0 0
regression=# select * from vv;
 f1
----
 
 42
(2 rows)


*** src/backend/commands/tablecmds.c~   Mon Apr 15 01:22:03 2002
--- src/backend/commands/tablecmds.cMon Apr 15 14:16:58 2002
***
*** 622,629 
  
rel = heap_open(myrelid, AccessExclusiveLock);
  
!   if (rel->rd_rel->relkind != RELKIND_RELATION)
!   elog(ERROR, "ALTER TABLE: relation \"%s\" is not a table",
 RelationGetRelationName(rel));
  
if (!allowSystemTableMods
--- 622,635 
  
rel = heap_open(myrelid, AccessExclusiveLock);
  
!   /*
!* We allow defaults on views so that INSERT into a view can have
!* default-ish behavior.  This works because the rewriter substitutes
!* default values into INSERTs before it expands rules.
!*/
!   if (rel->rd_rel->relkind != RELKIND_RELATION &&
!   rel->rd_rel->relkind != RELKIND_VIEW)
!   elog(ERROR, "ALTER TABLE: relation \"%s\" is not a table or view",
 RelationGetRelationName(rel));
  
if (!allowSystemTableMods

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



[HACKERS] Array Iterator functions

2002-04-15 Thread Rod Taylor

What would it take to make the array iterator functions a part of the
standard base? (contrib/array)

A number of people want this type of functionality (value = [ any
value of array ], and value = [ all values of array ] ).

The license needs to be changed (with authors permission), but other
than that?

-- README BELOW --
This loadable module defines a new class of functions which take
an array and a scalar value, iterate a scalar operator over the
elements of the array and the value, and compute a result as
the logical OR or AND of the iteration results.
For example array_int4eq returns true if some element
of an int4 array is equal to the given value:

 array_int4eq({1,2,3}, 1)  -->  true
 array_int4eq({1,2,3}, 4)  -->  false

If we have defined T array types and O scalar operators we can
define T x O x 2 array functions, each of which has a name like
"array_[all_]" and takes an array of type T,
iterating the operator O over all the elements. Note however
that some of the possible combinations are invalid, for example
array_int4_like, because there is no like operator for int4.

We can then define new operators based on these functions and use
them to write queries with qualification clauses based on the
values of some of the elements of an array.
For example to select rows having some or all element of an array
attribute equal to a given value or matching a regular expression:

 create table t(id int4[], txt text[]);

 -- select tuples with some id element equal to 123
 select * from t where t.id *= 123;

 -- select tuples with some txt element matching '[a-z]'
 select * from t where t.txt *~ '[a-z]';

 -- select tuples with all txt elements matching '^[A-Z]'
 select * from t where t.txt[1:3] **~ '^[A-Z]';

The scheme is quite general: each operator which operates on a base
type can be iterated over the elements of an array. It seems to work
well, but defining each new operator requires writing a different C
function. Furthermore, each function contains two hardcoded OIDs which
reference a base type and a procedure. Not very portable. Can anyone
suggest a better and more portable way to do it?
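The "define new operators based on these functions" step above would look
roughly like this (a sketch: the function and operator names follow the
contrib/array conventions, and MODULE_PATHNAME stands in for the actual
shared-library path):

```sql
-- Iterator function: true if any element of the int4 array equals the scalar
CREATE FUNCTION array_int4eq(_int4, int4) RETURNS bool
    AS 'MODULE_PATHNAME' LANGUAGE 'C';

-- Bind it to the *= operator used in the example queries above
CREATE OPERATOR *= (
    leftarg   = _int4,
    rightarg  = int4,
    procedure = array_int4eq
);
```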

See also array_iterator.sql for an example on how to use this module.
--
Rod Taylor



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



[HACKERS] Operators and schemas

2002-04-15 Thread Tom Lane

After some fooling around with gram.y, I have come to the conclusion
that there's just no way to use a schema-qualified name for an operator
in an expression.  I was hoping we might be able to write something like
operand1 schema.+ operand2
but I can't find any way to make this work without tons of shift/reduce
conflicts.  One counterexample suggesting it can't be done is that
foo.*
might be either a reference to all the columns of foo, or a qualified
operator name.
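The ambiguity shows up even in a plain query (illustrative only; no
schema-qualified operator syntax actually exists):

```sql
SELECT foo.* FROM foo;
-- Today "foo.*" can only mean "all columns of foo".  If operators could
-- be schema-qualified, the same token sequence could also be read as an
-- operator "*" belonging to a schema named "foo", and the parser would
-- have no way to tell the two apart.
```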

We can still put operators into namespaces and allow qualified names in
CREATE/DROP OPERATOR.  However, lookup of operators in expressions would
have to be completely dependent on the search path.  That's not real
cool; among other things, pg_dump couldn't guarantee that dumped
expressions would be interpreted the same way when reloaded.

Things we might do to reduce the uncertainty:

1. Keep operators as database-wide objects, instead of putting them into
namespaces.  This seems a bit silly though: if the types and functions
that underlie an operator are private to a namespace, shouldn't the
operator be as well?

2. Use a restricted, perhaps fixed search-path for searching for
operators.  For example, we might force the search path to have
pg_catalog first even when this is not true for the table name search
path.  But I'm not sure what an appropriate definition would be.
A restricted search path might limit the usefulness of private operators
to the point where we might as well have kept them database-wide.

Comments anyone?  I'm really unsure what's the best way to proceed.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] regexp character class locale awareness patch

2002-04-15 Thread Peter Eisentraut

Tatsuo Ishii writes:

> > Whatever you do with this patch, remember that the USE_LOCALE symbol is
> > gone.
>
> I thought we had some way to turn off locale support at configure
> time.

You do it at initdb time now.

-- 
Peter Eisentraut   [EMAIL PROTECTED]


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] That CREATE OPERATOR CLASS patch

2002-04-15 Thread Bill Studenmund

On Sun, 14 Apr 2002, Bruce Momjian wrote:

>
> Good question.  I see the thread at:
>
>   
>http://groups.google.com/groups?hl=en&threadm=Pine.LNX.4.30.0202262002040.685-10%40peter.localdomain&rnum=2&prev=/groups%3Fq%3Dcreate%2Boperator%2Bgroup:comp.databases.postgresql.*%26hl%3Den%26selm%3DPine.LNX.4.30.0202262002040.685-10%2540peter.localdomain%26rnum%3D2
>
> I asked the author to resubmit but did not see a reply.  Perhaps someone
> else can take it over and make the requested changes.  Thanks.

Yeah, the problem is:

1) the author was (is) feeling a bit burnt out over successive battles
over the command syntax. I proposed it, and Tom said, "that syntax sucks
[which it verily did], try this." So I did "this" and got the patch that
is lying around. Then when 7.3 was done, someone else chimed in (I think
it was Peter) that it should be different. While the newer-yet command
syntax is better, *why wasn't it proposed the first time we went through
this on hackers?* It's frustrating to ask, "what should I do," do it, and
then get told, "no, that's not right." I mean, now, how do I know that
once I get a new version ready, it won't get revised *again*?

2) I now work at a new job, which is taking up lots of my time doing other
things. It's really cool, but PostgreSQL hacking isn't a paid part of it
(like it was at Zembu). In a few weeks I can probably get time to update
this, but if Christopher wants to work on it, go for it.

Take care,

Bill

> ---
>
> Christopher Kings-Lynne wrote:
> > If Bruce is thinking of applying outstanding patches - whatever happened
> > with Bill Studenmund's CREATE OPERATOR CLASS patch?
> >
> > Chris
> >
> >
> > ---(end of broadcast)---
> > TIP 4: Don't 'kill -9' the postmaster
> >
>
> --
>   Bruce Momjian|  http://candle.pha.pa.us
>   [EMAIL PROTECTED]   |  (610) 853-3000
>   +  If your life is a hard drive, |  830 Blythe Avenue
>   +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026
>


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Inefficient handling of LO-restore + Patch

2002-04-15 Thread Mario Weilguni

And how about getting database internals via SQL functions - e.g. getting BLCKSZ,
LOBLKSIZE?

-----Original Message-----
From: Tom Lane [mailto:[EMAIL PROTECTED]]
Sent: Monday, 15 April 2002 16:32
To: Mario Weilguni
Cc: [EMAIL PROTECTED]
Subject: Re: [HACKERS] Inefficient handling of LO-restore + Patch 


"Mario Weilguni" <[EMAIL PROTECTED]> writes:
> * select octet_length(data) from pg_largeobject where loid=OIDOFOBJECT and pageno=0

This really should not work if you're not superuser.  Right now it does,
but I think that's an oversight in the default permissions settings for
system tables.  Anyone object if I turn off public read access to
pg_largeobject?

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Importing Large Amounts of Data

2002-04-15 Thread Tom Lane

"Christopher Kings-Lynne" <[EMAIL PROTECTED]> writes:
>> Yes, I know. I mean how does this affect performance? How can this change
>> planner decisions? Does it have any effect except a cosmetic one?

> Only cosmetic.  In the example he gave, he wanted a primary key, so I showed
> him how to make one properly.

The ALTER form will complain if any of the columns are not marked NOT
NULL, so the difference isn't completely cosmetic.
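A minimal illustration of that difference (table and column names are made
up):

```sql
CREATE TABLE t (id int);                  -- id is nullable

CREATE UNIQUE INDEX t_id_key ON t (id);   -- accepted as-is

ALTER TABLE t ADD PRIMARY KEY (id);       -- complains: id is not NOT NULL
```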

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Inefficient handling of LO-restore + Patch

2002-04-15 Thread Tom Lane

"Mario Weilguni" <[EMAIL PROTECTED]> writes:
> * select octet_length(data) from pg_largeobject where loid=OIDOFOBJECT and pageno=0

This really should not work if you're not superuser.  Right now it does,
but I think that's an oversight in the default permissions settings for
system tables.  Anyone object if I turn off public read access to
pg_largeobject?

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Importing Large Amounts of Data

2002-04-15 Thread Tom Lane

Curt Sampson <[EMAIL PROTECTED]> writes:
> On Mon, 15 Apr 2002, Christopher Kings-Lynne wrote:
>> CREATE TABLE WITHOUT OIDS ...

> As you can see from the schema I gave later in my message, that's
> exactly what I did. But does this actually avoid allocating the
> space in the on-disk tuples? What part of the code deals with this?
> It looks to me like the four bytes for the OID are still allocated
> in the tuple, but not used.

Curt is correct: WITHOUT OIDS does not save any storage.  Having two
different formats for the on-disk tuple header seemed more pain than
the feature was worth.  Also, because of alignment considerations it
would save no storage on machines where MAXALIGN is 8.  (Possibly my
thinking is colored somewhat by the fact that that's so on all my
favorite platforms ;-).)

However, as for the NULL values bitmap: that's already compacted out
when not used, and always has been AFAIK.

>> It's a bit hard to say "just turn off all the things that ensure your data
>> integrity so it runs a bit faster", if you actually need data integrity.

> I'm not looking for "runs a bit faster;" five percent either way
> makes little difference to me. I'm looking for a five-fold performance
> increase.

You are not going to get it from this; where in the world did you get
the notion that data integrity costs that much?  When the WAL stuff
was added in 7.1, we certainly did not see any five-fold slowdown.
If anything, testing seemed to indicate that WAL sped things up.
A lot would depend on your particular scenario of course.

Have you tried all the usual speedup hacks?  Turn off fsync, if you
really think you do not care about crash integrity; use COPY FROM STDIN
to bulk-load data, not retail INSERTs; possibly drop and recreate
indexes rather than updating them piecemeal; etc.  You should also
consider not declaring foreign keys, as the runtime checks for reference
validity are pretty expensive.
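Put together, those hacks amount to a load pattern something like this (a
sketch: object and file names are made up, and fsync would be disabled in
the server configuration rather than in SQL):

```sql
-- Drop the index so the load doesn't update it piecemeal
DROP INDEX bigone_pkey;

-- Bulk-load with COPY instead of per-row INSERTs
COPY bigone FROM '/tmp/bigone.dat';

-- Recreate the index in one pass afterwards
CREATE UNIQUE INDEX bigone_pkey ON bigone (rec_no);
```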

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



[HACKERS] Stumbled upon a time bug...

2002-04-15 Thread Rod Taylor

Is PostgreSQL broken? Or is it FreeBSD?

--

http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/36954

PostgreSQL does not currently check the results of mktime().  On a
border condition, mktime() fails and would populate invalid data
into the database.

Most other *NIXes seem to automatically account for this and
adjust the value when it gets passed to mktime(&tm).
Should FreeBSD have its mktime() in libc updated?

CREATE TABLE tt ( tt TIMESTAMP );
INSERT INTO tt VALUES ('2002-4-7 2:0:0.0');

--
Rod Taylor

Your eyes are weary from staring at the CRT. You feel sleepy. Notice
how restful it is to watch the cursor blink. Close your eyes. The
opinions stated above are yours. You cannot imagine why you ever felt
otherwise.



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [GENERAL] [HACKERS] [Fwd: AW: More UB-Tree patent information]

2002-04-15 Thread postgresql


Hannu Krosing wrote:
> 
> Have you found out _what_ exactly is patented ?
> 
> Is it just his concrete implementation of "UB-Tree" or something
> broader, like using one multi-dimensional index instead of multiple
> one-dimensional ones ?

(I know it is OT, please reply in private, I can summarize any reactions 
to the list ...)
 
Patents are supposed to be only applicable to an industrial application 
(with external side-effects).  So ideas in themselves are not patentable.

Anyway, this is once more a good example of the danger of software patents 
- you know what to reply when people say "software patents promote 
innovation"

IANAL, just my 0,02 Euro.

see also : http://www.gnu.org/philosophy/savingeurope.html (also 
interesting for non-europeans, of course !)

-- 
Tycho Fruru [EMAIL PROTECTED]
"Prediction is extremely difficult. Especially about the future."
  - Niels Bohr



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [GENERAL] [HACKERS] [Fwd: AW: More UB-Tree patent information]

2002-04-15 Thread postgresql

On Wed, 10 Apr 2002 [EMAIL PROTECTED] wrote:

> Anyway, this is once more a good example of the danger of software patents 
> - you know what to reply when people say "software patents promote 
> innovation"

We (AEL, an association promoting freedom in every sense) have just now
put on-line a page which contains known software patents and the behaviour
of their respective owners wrt Free Software. (We have several
patent-related queries outstanding right now)

http://www.ael.be/node.php?id=52

We hope that this becomes a valuable source of information on which
patents' implementations are available for Free Software work, and which
aren't.  We also include some contact information to ease communication
with the patent holder.

Of course, we encourage people on the "incompatible with Free Software"  
category to inform us of any licensing changes they implement which
facilitate Free Software implementations, so that we can promptly update
the page accordingly.

Best Regards,
Tycho

-- 
Tycho Fruru [EMAIL PROTECTED]
"Prediction is extremely difficult. Especially about the future."
  - Niels Bohr


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] pg_dump is broken in CVS tip

2002-04-15 Thread Neil Conway

On Fri, 12 Apr 2002 13:28:34 -0400
"Tom Lane" <[EMAIL PROTECTED]> wrote:
> pg_dumping a table having a primary key yields commands like
> 
> --
> -- TOC Entry ID 2 (OID 139812)
> --
> -- Name: table1 Type: TABLE Owner: postgres
> --
> 
> CREATE TABLE "table1" (
>   "column10" character varying(255) NOT NULL,
>   "column1" character varying(255) NOT NULL,
>   "column2" smallint NOT NULL,
>   "column6" numeric,
>   "column7" "char",
>   Constraint "table1_pkey" Primary Key ("column10", "column1", "column2")
> );
> 
> [snip]
> 
> --
> -- TOC Entry ID 5 (OID 139817)
> --
> -- Name: "table1_pkey" Type: CONSTRAINT Owner: postgres
> --
> 
> Alter Table "table1" Add Constraint "table1_pkey" Primary Key ("column10", 
>"column1", "column2");
> 
> which on execution quite properly complains about duplicate primary
> keys.

Thanks for finding this Tom -- my apologies, this is likely my bug.

However, when I created a table using the commands above and then
dumped it again, I got a dump that worked properly: there was no
Constraint within the table definition itself, just an ALTER
TABLE at the end of the dump to add the PK (i.e. the patch worked
as intended and the table could be restored properly).

If you can give me a reproducible test-case, I'll fix the bug.

Cheers,

Neil

-- 
Neil Conway <[EMAIL PROTECTED]>
PGP Key ID: DB3C29FC

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] PostgreSQL 7.2.1-2PGDG RPMs available for

2002-04-15 Thread Hannu Krosing

On Sun, 2002-04-14 at 08:48, Lamar Owen wrote:
> 
> Incidentally, the 7.2.93 (skipjack) public beta is a serious improvement over 
> RHL 7.2, and I personally recommend it, as KDE 3 is worth the upgrade, even 
> to a beta.

Is the 7.2.93 (skipjack) public beta an improvement in raw postgresql
performance or just in added stuff like KDE ?
 

Hannu



---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [GENERAL] [HACKERS] [Fwd: AW: More UB-Tree patent information]

2002-04-15 Thread Jean-Luc Lachance

A while ago I used xbase2pg and pg2xbase.
You should be able to find it on the net.

[EMAIL PROTECTED] wrote:
> 
> Hannu Krosing wrote:
> >
> > Have you found out _what_ exactly is patented ?
> >
> > Is it just his concrete implementation of "UB-Tree" or something
> > broader, like using one multi-dimensional index instead of multiple
> > one-dimensional ones ?
> 
> (I know it is OT, please reply in private, I can summarize any reactions
> to the list ...)
> 
> Patents are supposed to be only applicable to an industrial application
> (with external side-effects).  So ideas in themselves are not patentable.
> 
> Anyway, this is once more a good example of the danger of software patents
> - you know what to reply when people say "software patents promote
> innovation"
> 
> IANAL, just my 0,02 Euro.
> 
> see also : http://www.gnu.org/philosophy/savingeurope.html (also
> interesting for non-europeans, of course !)
> 
> --
> Tycho Fruru [EMAIL PROTECTED]
> "Prediction is extremely difficult. Especially about the future."
>   - Niels Bohr
> 
> ---(end of broadcast)---
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [GENERAL] [HACKERS] [Fwd: AW: More UB-Tree patent information]

2002-04-15 Thread Tycho Fruru

> Hannu Krosing wrote:
> > 
> > Have you found out _what_ exactly is patented ?
> > 
> > Is it just his concrete implementation of "UB-Tree" or something
> > broader, like using one multi-dimensional index instead of multiple
> > one-dimensional ones ?

(I know it is OT, please reply in private, I can summarize any reactions 
to the list ...)

Patents are supposed to be only applicable to an industrial application 
(with external side-effects).  So ideas in themselves are not patentable.

Anyway, this is once more a good example of the danger of software patents 
- you know what to reply when people say "software patents promote 
innovation"

IANAL, just my 0,02 Euro.

see also : http://www.gnu.org/philosophy/savingeurope.html (also 
interesting for non-europeans, of course !)

-- 
Tycho Fruru [EMAIL PROTECTED]
"Prediction is extremely difficult. Especially about the future."
  - Niels Bohr


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] more on large oids

2002-04-15 Thread Tom Lane

Daniel Kalchev <[EMAIL PROTECTED]> writes:
> getTables(): SELECT (for PRIMARY KEY) failed on table config_2002_03_02.  
> Explanation from backend: ERROR:  dtoi4: integer out of range

> Any idea how to fix this? This is on 7.1.3. Will the 7.2 pg_dump handle this 
> database?

Yes.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Importing Large Amounts of Data

2002-04-15 Thread Christopher Kings-Lynne

> Yes, I know. I mean how does this affect performance? How can this change
> planner decisions? Does it have any effect except a cosmetic one?

Only cosmetic.  In the example he gave, he wanted a primary key, so I showed
him how to make one properly.

Chris


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



[HACKERS] deletion of attributes

2002-04-15 Thread Michael Meskes

I was just being asked if we are working on deletion of attributes in an
existing table. Something like ALTER TABLE foo DROP COLUMN bar. Is
anyone working on this, or are there design problems with it?

Michael
-- 
Michael Meskes
[EMAIL PROTECTED]
Go SF 49ers! Go Rhein Fire!
Use Debian GNU/Linux! Use PostgreSQL!

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] [PATCHES] ANSI Compliant Inserts

2002-04-15 Thread Vince Vielhaber

On Mon, 15 Apr 2002, Tom Lane wrote:

> Peter Eisentraut <[EMAIL PROTECTED]> writes:
> > I recall that this was the behavior we agreed we wanted.  IMHO, it would
> > be conditional on the INSERT ... VALUES (DEFAULT) capability being
> > provided.  I'm not sure if that is there yet.
>
> That is there now.  Do you recall when this was discussed before?
> I couldn't remember if there'd been any real discussion or not.

It has to be at least a year, Tom.  I brought it up in hackers after
I got bit by it.  I had a rather long insert statement and missed a
value in the middle somewhere which shifted everything by one.  It
was agreed that it shouldn't happen but I don't recall what else was
decided.
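The INSERT ... VALUES (DEFAULT) capability being discussed looks like this
(table and columns are made up for illustration):

```sql
CREATE TABLE items (
    id    serial,
    name  text,
    price numeric DEFAULT 0
);

-- Every column is named, so nothing can silently shift by one;
-- DEFAULT explicitly requests the column's default value.
INSERT INTO items (name, price) VALUES ('widget', DEFAULT);
```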

Vince.
-- 
==
Vince Vielhaber -- KA8CSH    email: [EMAIL PROTECTED]    http://www.pop4.net
 56K Nationwide Dialup from $16.00/mo at Pop4 Networking
Online Campground Directory    http://www.camping-usa.com
   Online Giftshop Superstore    http://www.cloudninegifts.com
==




---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Inefficient handling of LO-restore + Patch

2002-04-15 Thread Mario Weilguni

>"Mario Weilguni" <[EMAIL PROTECTED]> writes:
>> And I did not find out how I can detect the large object
>> chunksize, either from getting it from the headers (include
>> "storage/large_object.h" did not work)
>
>Why not?
>
>Still, it might make sense to move the LOBLKSIZE definition into
>pg_config.h, since as you say it's of some interest to clients like
>pg_dump.

I tried another approach to detect the LOBLKSIZE of the destination server:
* at restore time, create a LO large enough to be split in two parts (e.g. BLCKSZ+1)
* select octet_length(data) from pg_largeobject where loid=OIDOFOBJECT and pageno=0
* select lo_unlink(OIDOFOBJECT)

IMO this gives the advantage that the LOBLKSIZE is taken from the database I'm
restoring to, and not a constant defined at compile time. On the other hand, it wastes an OID.

Is there a way to get compile-time settings (such as BLCKSZ, LOBLKSIZE, and so on) via
functions - e.g.
select pginternal('BLCKSZ') or something similar?


I tested with and without my patch against 2 Gigabytes of LO's using MD5, and got 
exactly the same result on all 25000 large objects. So I think my patch is safe. If 
there's interest for integration into pg_dump, I'll prepare a patch for the current 
CVS version.



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Importing Large Amounts of Data

2002-04-15 Thread Denis Perchine

On Monday 15 April 2002 05:15, Christopher Kings-Lynne wrote:
> > On Monday 15 April 2002 03:53, Christopher Kings-Lynne wrote:
> > > BTW, instead of:
> > >
> > > CREATE UNIQUE INDEX bigone_pkey ON bigone (rec_no);
> > >
> > > do:
> > >
> > > ALTER TABLE bigone ADD PRIMARY KEY(rec_no);
> >
> > I am sorry, could you please elaborate more on the difference?
>
> They have the same _effect_, it's just that the first syntax does not mark
> the index as the _primary_ index on the relation.

Yes, I know. I mean how does this affect performance? How this can change
planner decision? Does it have any effect except cosmetical one?

--
Denis





Re: [HACKERS] Importing Large Amounts of Data

2002-04-15 Thread Christopher Kings-Lynne

> On Monday 15 April 2002 03:53, Christopher Kings-Lynne wrote:
> > BTW, instead of:
> >
> > CREATE UNIQUE INDEX bigone_pkey ON bigone (rec_no);
> >
> > do:
> >
> > ALTER TABLE bigone ADD PRIMARY KEY(rec_no);
>
> I am sorry, could you please elaborate more on the difference?

They have the same _effect_, it's just that the first syntax does not mark
the index as the _primary_ index on the relation.

Chris





Re: [HACKERS] Importing Large Amounts of Data

2002-04-15 Thread Denis Perchine

On Monday 15 April 2002 03:53, Christopher Kings-Lynne wrote:
> BTW, instead of:
>
> CREATE UNIQUE INDEX bigone_pkey ON bigone (rec_no);
>
> do:
>
> ALTER TABLE bigone ADD PRIMARY KEY(rec_no);

I am sorry, could you please elaborate more on the difference?

--
Denis





Re: [HACKERS] Importing Large Amounts of Data

2002-04-15 Thread Christopher Kings-Lynne

> b) In fact, at times I don't need that data integrity. I'm perfectly
> happy to risk the loss of a table during import, if it lets me do the
> import more quickly, especially if I'm taking the database off line
> to do the import anyway. MS SQL Server in fact allows me to specify
> relaxed integrity (with attendant risks) when doing a BULK IMPORT; it
> would be cool if Postgres allowed that too.

Well I guess a TODO item would be to allow COPY to use relaxed constraints.
Don't know how this would go over with the core developers tho.

> Thanks. This is the kind of useful information I'm looking for. I
> was doing a vacuum after, rather than before, generating the indices.

That's because the indexes themselves are cleaned out with vacuum, as well
as the tables.

Chris





Re: [HACKERS] Importing Large Amounts of Data

2002-04-15 Thread Curt Sampson

On Mon, 15 Apr 2002, Christopher Kings-Lynne wrote:

> OK, well now it depends on what kind of selects you're doing.  Do you
> regularly select over a certain subset of the data, in which case using
> partial indices might give you significant speedup.

I believe from the information I've been given that we will indeed
be regularly selecting over certain subsets, based on day. (One of
the test queries I've been asked to use selects based on user_id
and a date range.) But I was intending to partition the tables
based on date range (to keep the index rebuild time from getting
completely out of hand), so that will handily take care of that
requirement anyway.
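One hedged sketch of what such date-range partitioning could look like, using one table per day so each daily import only ever rebuilds the newest slice. Table and column names are made up for illustration:

```sql
-- One table (and one index) per day's worth of data.
CREATE TABLE hits_20020415 (user_id integer, hit_date date, url text);
CREATE INDEX hits_20020415_user ON hits_20020415 (user_id);

CREATE TABLE hits_20020416 (user_id integer, hit_date date, url text);
CREATE INDEX hits_20020416_user ON hits_20020416 (user_id);

-- A query over a date range unions only the slices it needs:
SELECT user_id, count(*) FROM hits_20020415 WHERE user_id = 42 GROUP BY user_id
UNION ALL
SELECT user_id, count(*) FROM hits_20020416 WHERE user_id = 42 GROUP BY user_id;
```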

> Do you select functions of columns?

No.

> It depends on your definition.  You have to accept a certain overhead if
> you're to have data integrity and MVCC.  If you can't handle that overhead,
> then you can't have data integrity and vice versa.

Well, a few points:

 a) I am not convinced that data integrity should cost a five-fold
 decrease in performance,

 b) In fact, at times I don't need that data integrity. I'm perfectly
 happy to risk the loss of a table during import, if it lets me do the
 import more quickly, especially if I'm taking the database off line
 to do the import anyway. MS SQL Server in fact allows me to specify
 relaxed integrity (with attendant risks) when doing a BULK IMPORT; it
 would be cool if Postgres allowed that too.

> BTW, instead of:
>
> CREATE UNIQUE INDEX bigone_pkey ON bigone (rec_no);
>
> do:
>
> ALTER TABLE bigone ADD PRIMARY KEY(rec_no);
>
> And remember to run "VACUUM ANALYZE bigone;" or just "ANALYZE bigone;" after
> the COPY and before trying to use the table.  I'm not sure if it's better to
> analyze before or after the indexes are added, but it's definitely better to
> vacuum before the indexes are added.

Thanks. This is the kind of useful information I'm looking for. I
was doing a vacuum after, rather than before, generating the indices.

cjs
-- 
Curt Sampson  <[EMAIL PROTECTED]>   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC





Re: [HACKERS] Importing Large Amounts of Data

2002-04-15 Thread Christopher Kings-Lynne

> As you can see from the schema I gave later in my message, that's
> exactly what I did. But does this actually avoid allocating the
> space in the on-disk tuples? What part of the code deals with this?
> It looks to me like the four bytes for the OID are still allocated
> in the tuple, but not used.

OK, well I guess in that case they are - I'm no expert on the file format.

> But from the looks of it, the production system will be doing daily
> imports of fresh data ranging in size from a couple of million rows
> to a couple of tens of millions of rows.

Well that definitely makes a difference then...

> > It's a bit hard to say "just turn off all the things that ensure your data
> > integrity so it runs a bit faster", if you actually need data integrity.
>
> I'm not looking for "runs a bit faster;" five percent either way
> makes little difference to me. I'm looking for a five-fold performance
> increase.

> Anyway, from the looks of it, this is going to be fairly simple
> stuff.  (Unfortunately, I don't have details of the real application
> the client has in mind, though I sure wish I did.) What I'm trying
> to indicate when I say "OLAP" is that it's basically selecting
> across broad swaths of a large data set, and doing little or nothing
> in the way of updates. (Except for the daily batches of data, of
> course.)

OK, well now it depends on what kind of selects you're doing.  Do you
regularly select over a certain subset of the data, in which case using
partial indices might give you significant speedup.  Do you select functions
of columns?  If so, then you'll need functional indices.  MySQL doesn't have
either of these.  However, if you're always doing full table scans, then
MySQL will probably do these faster.

Now, here's another scenario.  Suppose you're often querying aggregate data
over particular subsets of the data.  Now instead of requerying all the
time, you can set up triggers to maintain your aggregates for you on the
fly.  This will give O(1) performance on select compared to O(n).  MySQL's
new query cache might help you with this, however.
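Both ideas can be sketched briefly. The table, index, and function names below are invented for illustration, and the trigger assumes a hit_counts row already exists for each user:

```sql
-- Partial index: covers only the subset you query regularly.
CREATE INDEX recent_hits_idx ON hits (user_id)
    WHERE hit_date >= '2002-04-01';

-- Trigger-maintained aggregate: keep a running per-user count so the
-- "aggregate select" is a single-row lookup instead of a scan.
CREATE TABLE hit_counts (user_id integer PRIMARY KEY, n bigint);

CREATE FUNCTION bump_hit_count() RETURNS opaque AS '
BEGIN
    UPDATE hit_counts SET n = n + 1 WHERE user_id = NEW.user_id;
    RETURN NEW;
END;
' LANGUAGE 'plpgsql';

CREATE TRIGGER hits_count AFTER INSERT ON hits
    FOR EACH ROW EXECUTE PROCEDURE bump_hit_count();
```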

> I don't want to start a flamewar here, because personally I don't
> even like MySQL and would prefer always to use PostgreSQL. But it
> makes it a lot harder to do so when people keep insisting that
> import speed is not important. Rather than say that, why don't we
> just admit that PostgreSQL is a fairly crap performer in this regard
> at the moment (at least the way I'm doing it), and work out ways
> to fix this?

It depends on your definition.  You have to accept a certain overhead if
you're to have data integrity and MVCC.  If you can't handle that overhead,
then you can't have data integrity and vice versa.

BTW, instead of:

CREATE UNIQUE INDEX bigone_pkey ON bigone (rec_no);

do:

ALTER TABLE bigone ADD PRIMARY KEY(rec_no);

And remember to run "VACUUM ANALYZE bigone;" or just "ANALYZE bigone;" after
the COPY and before trying to use the table.  I'm not sure if it's better to
analyze before or after the indexes are added, but it's definitely better to
vacuum before the indexes are added.
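Put together, the suggested load order might look like this (the table name is the one from the thread; its column list and the data file path are illustrative):

```sql
CREATE TABLE bigone (rec_no integer, payload text);

COPY bigone FROM '/tmp/bigone.dat';    -- bulk load first

VACUUM bigone;                         -- vacuum before any indexes exist

ALTER TABLE bigone ADD PRIMARY KEY (rec_no);

ANALYZE bigone;                        -- refresh planner statistics last
```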

Chris





Re: [PATCHES] unknownin/out patch (was [HACKERS] PQescapeBytea is

2002-04-15 Thread Tatsuo Ishii

I'm about to commit your patches with a small fix.
--
Tatsuo Ishii

From: Joe Conway <[EMAIL PROTECTED]>
Subject: Re: [PATCHES] unknownin/out patch (was [HACKERS] PQescapeBytea is
Date: Mon, 08 Apr 2002 22:57:47 -0700
Message-ID: <[EMAIL PROTECTED]>

> Joe Conway wrote:
> > Tatsuo Ishii wrote:
> >  >>> Tatsuo Ishii wrote:
> >  >>>
> >  
> >   Try a multibyte encoding database. For example,
> >  
> >   $ createdb -E EUC_JP test
> >   $ psql -c 'SELECT SUBSTRING('1234567890' FROM 3)' test
> >    substring
> >   -----------
> >    3456
> >   (1 row)
> >  
> >   Apparently this is wrong. -- Tatsuo Ishii
> >  >>>
> >  >>> This problem exists in CVS tip *without* the unknownin/out
> >  >>> patch:
> >  >>
> >  >> Sure. That has been broken for a while.
> >  >
> >  >
> >  > I guess this actually happened in 1.79 of varlena.c:
> >  >
> > Yes, I was just looking at that also. It doesn't consider the case
> > of n = -1 for MB. See the lines:
> > 
> > #ifdef MULTIBYTE
> >eml = pg_database_encoding_max_length ();
> > 
> >if (eml > 1)
> >{
> >   sm = 0;
> >   sn = (m + n) * eml + 3;
> >}
> > #endif
> > 
> > When n = -1 this does the wrong thing. And also a few lines later:
> > 
> > #ifdef MULTIBYTE
> >len = pg_mbstrlen_with_len (VARDATA (string), sn - 3);
> > 
> > I think both places need to test for n = -1. Do you agree?
> > 
> > 
> > Joe
> > 
> 
> The attached patch should fix the bug reported by Tatsuo.
> 
> # psql -U postgres testjp
> Welcome to psql, the PostgreSQL interactive terminal.
> 
> Type:  \copyright for distribution terms
> \h for help with SQL commands
> \? for help on internal slash commands
> \g or terminate with semicolon to execute query
> \q to quit
> 
> testjp=# SELECT SUBSTRING('1234567890' FROM 3);
>   substring
> -----------
>   34567890
> (1 row)
> 
> Joe




Re: [HACKERS] Importing Large Amounts of Data

2002-04-15 Thread Curt Sampson

On Mon, 15 Apr 2002, Christopher Kings-Lynne wrote:

> > ...the OID and
> > the NULL bitfield) are not used; would it be possible to avoid
> > allocating these in this relations that don't use them?
>
> CREATE TABLE WITHOUT OIDS ...

As you can see from the schema I gave later in my message, that's
exactly what I did. But does this actually avoid allocating the
space in the on-disk tuples? What part of the code deals with this?
It looks to me like the four bytes for the OID are still allocated
in the tuple, but not used.

> This conclusion seems to me to be remarkably shortsighted.  Does the initial
> data load into the database occur just once or quite often?

Well, I'm going to be doing the initial load (.5 billion tuples) quite
a few times, in order to test some different partitioning arrangements.
So I'll save quite a lot of time initially if I get a faster import.

But from the looks of it, the production system will be doing daily
imports of fresh data ranging in size from a couple of million rows
to a couple of tens of millions of rows.

> It's a bit hard to say "just turn off all the things that ensure your data
> integrity so it runs a bit faster", if you actually need data integrity.

I'm not looking for "runs a bit faster;" five percent either way
makes little difference to me. I'm looking for a five-fold performance
increase.

> Anyway, from what I understand an OLTP application is all about selects and
> memoising certain aggregate results.

I guess that was a typo, and you meant OLAP?

Anyway, from the looks of it, this is going to be fairly simple
stuff.  (Unfortunately, I don't have details of the real application
the client has in mind, though I sure wish I did.) What I'm trying
to indicate when I say "OLAP" is that it's basically selecting
across broad swaths of a large data set, and doing little or nothing
in the way of updates. (Except for the daily batches of data, of
course.)

> The fact that you can load stuff quicker in
> MySQL and it takes up less disk space seems totally irrelevant.

Yeah, everybody's telling me this. Let me try once again here:

1. Every day, I must import millions, possibly tens of
millions, of rows of data. Thus, speed of import is indeed
fairly important to me.

2. It looks, at least at this point, as if the application
will be doing only fairly simple selects out of the current
half-billion rows of data and whatever gets added in the
future. Thus, I don't think that using MySQL would be a
problem. (If I did, I wouldn't be proposing it.)

I don't want to start a flamewar here, because personally I don't
even like MySQL and would prefer always to use PostgreSQL. But it
makes it a lot harder to do so when people keep insisting that
import speed is not important. Rather than say that, why don't we
just admit that PostgreSQL is a fairly crap performer in this regard
at the moment (at least the way I'm doing it), and work out ways
to fix this?

> Just wait until your MySQL server crashes and your client finds that half
> his data is corrupted...

If there are no updates, why would anything be corrupted? At any
rate, I can always restore from backup, since little or nothing
would be lost.

cjs
-- 
Curt Sampson  <[EMAIL PROTECTED]>   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

