date:20090911

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Robert Haas

On Fri, Sep 11, 2009 at 6:56 PM, Emmanuel Cecchet  wrote:
> Robert Haas wrote:
>>
>> http://developer.postgresql.org/pgdocs/postgres/sql-explain.html
>>
>
> Just out of curiosity, it looks like I could write something like:
> EXPLAIN (ANALYZE TRUE, COSTS FALSE, VERBOSE TRUE, COSTS TRUE) statement
>
> What is the expected behavior if someone puts multiple time the same option
> with different values. The last value prevails?

Yes.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Josh Berkus

On 9/11/09 3:56 PM, Emmanuel Cecchet wrote:
> Robert Haas wrote:
>> http://developer.postgresql.org/pgdocs/postgres/sql-explain.html
>>   
> Just out of curiosity, it looks like I could write something like:
> EXPLAIN (ANALYZE TRUE, COSTS FALSE, VERBOSE TRUE, COSTS TRUE) statement
> 
> What is the expected behavior if someone puts multiple time the same
> option with different values. The last value prevails?

Yes.


-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COALESCE and NULLIF semantics

2009-09-11 Thread Sam Mason

On Fri, Sep 11, 2009 at 12:41:21PM -0500, Kevin Grittner wrote:
> Sam Mason  wrote:
>  
> > what you you want is full type-inference as it's only that which
> > will allow you to track back up the layers and assign consistent
> > types to arbitrary expressions like the above.
>  
> Well, obviously that would fix it; I'm not clear on why *only* that
> would fix it.

Because, I think, if you did come up with "another" solution and gave it
another name most type-theorists would call it type-inference anyway.

Type inference is just a general idea and is implemented in lots of
different ways depending on the specifics of the problem.  You could
argue that PG has a limited form of type inference already.

> It seemed to me that we wouldn't have to go back up
> like that if we deferred the assignment of a type in conditional
> expressions.  I've only scanned that part of the code, so it's well
> within the range of possibility that I misunderstood something, but I
> thought the type assigned to a CASE or COALESCE is used in the context
> of evaluating enclosing expressions on the way *down*, no?

Maybe we're using different terms; but when a literal is declared you
don't know what type it is, just that it needs at most one.  It's only
later on when the variable is actually used that you find out what its
type constraints are.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Emmanuel Cecchet


Robert Haas wrote:

http://developer.postgresql.org/pgdocs/postgres/sql-explain.html
  

Just out of curiosity, it looks like I could write something like:
EXPLAIN (ANALYZE TRUE, COSTS FALSE, VERBOSE TRUE, COSTS TRUE) statement

What is the expected behavior if someone puts multiple time the same 
option with different values. The last value prevails?
I know that this example looks stupid but when you have a lot of options 
it sometimes happen that you put twice an option with different values 
(that happens with some JDBC driver options...).


manu

--
Emmanuel Cecchet
Aster Data Systems
Web: http://www.asterdata.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COALESCE and NULLIF semantics

2009-09-11 Thread Sam Mason

On Fri, Sep 11, 2009 at 01:37:00PM -0400, Tom Lane wrote:
> "Kevin Grittner"  writes:
> > Yeah, I am.  When you have queries built based on which fields on a
> > QBE window are filled by a user, it's not hard to come up with a
> > clause like:
>  
> > AND (somedate < COALESCE(NULL, NULL) OR ...)
>  
> Right.  The only real way to fix that is to propagate the later
> discovery that type 'date' would be preferred back to the inputs of the
> COALESCE, which is what Sam Mason has been on about (IIUC).

Yup; sounds right.  The fact that the inputs to COALESCE here are just
simple NULL literals is making the example seem too simple.  The general
case is that of them being an arbitrary expression and you somehow need
to get the DATE type all the way back up to the top literal and make
sure that no other branch uses it as anything else.  This was what my
example with using a sub-select was about, maybe this would be better
though:

  SELECT date '2001-1-1' < COALESCE(NULL,v)
  FROM (SELECT NULL) x(v)
  WHERE v = 10;

This should fail to type check; v is being treated as both a DATE and
an INT.  Getting a compiler to do this is pretty easy (store the type
constraints some where else during type checking and then make sure they
all match at the end), but isn't the way PG works at the moment.

> I'm afraid
> that such a thing would make the behavior even more full of surprises
> than what we have now.  Resolving unknown from context is already
> "action at a distance", as it were, and the longer the distance involved
> the more chance for unexpected behavior.  Not to mention the
> implementation difficulties.

Most of my experience says that type-inference actually makes things
easier.  If you're dealing with dynamically *checked* languages then I
can see where you comment comes from; but for typed languages, where
everything has to be perfect at compile time, then this doesn't seem to
be true.  This is why people who use Haskell tend to be the ones saying
things like "when it type checks you can be pretty certain it's going to
work".  I'm not saying we should be going that complicated, just that
in my experience more complicated type systems imply simpler and more
understandable runtime behavior.

Implementation is a bit harder, but it won't be much more complicated
than what PG already has.  It's already dealing with most of the issues
(in a somewhat ad-hoc way) and I'd expect that getting type-inference in
would help clean other things up a bit.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Stephen Frost

* Robert Haas (robertmh...@gmail.com) wrote:
> Integrating hstore into core and then
> making COPY able to execute a subquery to get its options is certainly
> not easier than a straightforward grammar modification; it's taking a
> small project and turning it into several big ones.

To be honest, my comment was more intended to support hstore in core in
general than this proposal.  I would like to see it happen, if possible,
because I see alot of value in hstore.

Thanks,

Stephen

signature.asc
Description: Digital signature

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Emmanuel Cecchet


Greg Smith wrote:

The full set of new behavior here I'd like to see allows adjusting:

-Accept or reject rows with extra columns?
-Accept or reject rows that are missing columns at the end?
--Fill them with the default for the column (if available) or NULL?
-Save rejected rows?
--To a single system table?
--To a user-defined table?
--To the database logs?
The proposed patch save all rejected rows (with extra or missing 
columns) to a user-defined table (that can be created automatically). If 
you want to handle these bad rows on the fly, I guess you could have a 
trigger on the error table that does the appropriate processing 
depending on the data you are processing. In that case, having multiple 
error tables allows you to plug different triggers to handle possible 
error cases differently. The other option is to process the error table 
after COPY to handle the bad rows.
I guess the problem with extra or missing columns is to make sure that 
you know exactly which data belongs to which column so that you don't 
put data in the wrong columns which is likely to happen if this is fully 
automated.


I will try to re-read the thread on your proposal to better understand 
how you figure out which rows are missing or extra.


Emmanuel

--
Emmanuel Cecchet
Aster Data Systems
Web: http://www.asterdata.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COALESCE and NULLIF semantics

2009-09-11 Thread Jeff Davis

On Fri, 2009-09-11 at 17:35 -0400, Tom Lane wrote:
> Eh?  It's a null value of a composite type.  The above is a type
> violation.

The spec calls it "the null value" which is included in all domains
(Framework 4.4.2). However, in the same section, it mentions "the data
type of the null value", so apparently each null does have a specific
type.

It seems to me like the spec would have something to say about
ROW(NULL,NULL) versus NULL. Do other systems make a distinction?

Regards,
Jeff Davis

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Andrew Dunstan




Josh Berkus wrote:

Greg,

  

The performance of every path to get data into the database besides COPY
is too miserable for us to use anything else, and the current
inflexibility makes it useless for anything but the cleanest input data.



One potential issue we're facing down this road is that current COPY has
a dual purpose: for database restore, and for importing and exporting
data.  At some point, we may want to separate those two behaviors,
because we'll be adding bells and fringes to import/export which slow
down overall performance or add bugs.

  


Nothing that has been proposed will slow down existing behaviour AFAIK. 
The new behaviour will be slower in most cases, but that's a different 
matter.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COALESCE and NULLIF semantics

2009-09-11 Thread Kevin Grittner

Tom Lane  wrote:
> I'm expecting coerce_type to fail, along the lines of
> ERROR: failed to find conversion function from unknown to whatever

OK.  After playing with that and reading the code in more depth, I now
see what you've been saying.

[picks up lance and takes aim at windmill]

It still seems solvable without getting too extreme.

It seems to me that all of the conditional expressions besides the
ones using the CASE keyword *could* be resolved down to a literal of
unknown type at parse time, since they fall into this behavior *only*
when all parameters are NULL or literals of unknown type.  Behavior
would be deterministic at parse time.  The biggest argument against
doing this is that it would be bad to have behavior for the CASE
abbreviations which doesn't match the behavior of the CASE predicate
itself.  Less trivial than what I thought was needed, but doable.

[spurs donkey to a trot]

I think the explicit CASE predicate can be solved, too.

The thing which makes the CASE predicate harder, is that you can have
the information that the result is guaranteed to be NULL or an untyped
literal, but you might not know *which* of the values will be chosen
until run time.  (I don't know why it took me this long to see that
distinction.  Oh, well.)  We currently coerce all of the return values
to text for this.  OK.  But...

[breaks into a gallop]

... you know that it *will* be a NULL or a literal of unknown type,
and at parse time you can determine *which* typinput function will
need to be used once the actual value is determined at run time.

[reaches windmill]

So, couldn't the plan just include something to call the typinput at
run time against the text value we're already building up?

[brushes dust from clothing]

What did I miss this time?

-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Josh Berkus

Greg,

> The performance of every path to get data into the database besides COPY
> is too miserable for us to use anything else, and the current
> inflexibility makes it useless for anything but the cleanest input data.

One potential issue we're facing down this road is that current COPY has
a dual purpose: for database restore, and for importing and exporting
data.  At some point, we may want to separate those two behaviors,
because we'll be adding bells and fringes to import/export which slow
down overall performance or add bugs.

> The user-defined table for rejects is obviously exclusive of the system
> one, either of those would be fine from my perspective.

I've been thinking about it, and can't come up with a really strong case
for wanting a user-defined table if we settle the issue of having a
strong key for pg_copy_errors.  Do you have one?

> I wasn't really pleased with the "if it's not the most general solution
> possible we're not interested" tone of Andrew's other COPY-change thread
> this week. 

As someone who uses (and abuses) COPY constantly, I didn't leap at
Andrew's suggestion either because it wasn't *obviously* generally
applicable.  We don't want to accept patches which are designed only to
solve the specific problems faced by one user.  So for a feature
suggestion as specific as Andrew's, it's worth discussion ... out of
which came some interesting ideas, like copy to TEXT[].

Certainly we're not the project to add "quick hacks" where we can do better.

After some thought, I think that Andrew's feature *is* generally
applicable, if done as IGNORE COLUMN COUNT (or, more likely,
column_count=ignore).  I can think of a lot of data sets where column
count is jagged and you want to do ELT instead of ETL.  But I had to
give it some thought; as initially presented, the feature seemed very
single-user-specific.

> I don't think there's *that* many common requests here that
> they can't all be handled by specific implementations,

I disagree.  That way lies maintenance hell.

> and the scope
> creep of launching into a general framework for adding them is just
> going to lead to nothing useful getting committed.

As opposed to Tom, Peter and Heikki vetoing things because the feature
gain doesn't justify the maintnenance burden?  That's your real choice.
 Adding a framework for manageable syntax extensions means that we can
be more liberal about what we justify as an extension.

There is a database which allows unrestricted addition of ah-hoc
features.  It's called MySQL.  They have double the code lines count we
do, and around 100x the outstanding bugs.

>  If you want
> something really complicated, drop into a PL-based solution.  The stuff
> I list above I see regular requests for at *every* PG installation I've
> ever been involved in, and it would be fantastic if they were available
> out of the box.

I don't think that anyone is talking about not adding this to core.
It's just a question of how we add it.  In fact, it's mostly a question
of syntax.

> obviously go away.  (The main reason I haven't pushed for us to submit
> our customizations here is that I know perfectly well the GUC-based UI
> isn't acceptable, but I haven't been able to get a better one done yet)

Well, now you can help Aster.  ;-)

>   If I were reviewing this I'd just
> kick it back as "separate these cleanly into separate patches where the
> partitioning one depends on the logging one" before even starting to
> look at the code, it's too much stuff to consume properly in one gulp.

Well, Bruce was supposed to be helping them submit it.  And why *aren't*
you reviewing it?

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Tom Lane

Robert Haas  writes:
> On Fri, Sep 11, 2009 at 5:32 PM, Tom Lane  wrote:
>> Why?  We'd certainly still support the old syntax for existing options,
>> just as we did with EXPLAIN.

> None of the syntax proposals upthread had that property, which doesn't
> mean we can't do it.  However, we'd need some way to differentiate the
> old syntax from the new one. I guess we could throw an unnecessary set
> of parentheses around the option list (blech), but you have to be able
> to tell from the first token which kind of list you're reading if you
> want to support both sets of syntax.

No, you just have to be able to tell it before the first difference in
grammar reduction path.  If we did the syntax as keyword = value, for
instance, I believe the first equal sign would be a sufficient cue for
bison.

Not that parentheses would be such a terrible thing, especially if your
thoughts are leaning towards making COPY-embedded-in-SELECT be special
syntax rather than trying to force it into SRF syntax.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Robert Haas

On Fri, Sep 11, 2009 at 5:32 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> The biggest problem I have with this change is that it's going to
>> massively break anyone who is using the existing COPY syntax.
>
> Why?  We'd certainly still support the old syntax for existing options,
> just as we did with EXPLAIN.

None of the syntax proposals upthread had that property, which doesn't
mean we can't do it.  However, we'd need some way to differentiate the
old syntax from the new one. I guess we could throw an unnecessary set
of parentheses around the option list (blech), but you have to be able
to tell from the first token which kind of list you're reading if you
want to support both sets of syntax.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COALESCE and NULLIF semantics

2009-09-11 Thread Tom Lane

Jeff Davis  writes:
> To make that interpretation work I think you would need to say that
> ROW(NULL,NULL) _is_ the null value,

Right...

> and you would have to allow things like:

>   select 1 + row(null,null);

Eh?  It's a null value of a composite type.  The above is a type
violation.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Tom Lane

Robert Haas  writes:
> The biggest problem I have with this change is that it's going to
> massively break anyone who is using the existing COPY syntax.

Why?  We'd certainly still support the old syntax for existing options,
just as we did with EXPLAIN.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Greg Smith


On Fri, 11 Sep 2009, Tom Lane wrote:

If you believe that somebody might think of a new per-column COPY 
behavior in the future, then the same issue is going to come up again.


While Andrew may have given up on a quick hack to work around his recent 
request, I don't have that luxury.  We've already had to add two new 
behaviors here to COPY in our version and I expect more in the future. 
The performance of every path to get data into the database besides COPY 
is too miserable for us to use anything else, and the current 
inflexibility makes it useless for anything but the cleanest input data.


The full set of new behavior here I'd like to see allows adjusting:

-Accept or reject rows with extra columns?
-Accept or reject rows that are missing columns at the end?
--Fill them with the default for the column (if available) or NULL?
-Save rejected rows?
--To a single system table?
--To a user-defined table?
--To the database logs?

The user-defined table for rejects is obviously exclusive of the system 
one, either of those would be fine from my perspective.


I wasn't really pleased with the "if it's not the most general solution 
possible we're not interested" tone of Andrew's other COPY-change thread 
this week.  I don't think there's *that* many common requests here that 
they can't all be handled by specific implementations, and the scope creep 
of launching into a general framework for adding them is just going to 
lead to nothing useful getting committed.  If you want something really 
complicated, drop into a PL-based solution.  The stuff I list above I see 
regular requests for at *every* PG installation I've ever been involved 
in, and it would be fantastic if they were available out of the box.


But I think it's quite reasonable to say the COPY syntax needs to be 
overhauled to handle all these.  The two changes we've made at Truviso 
both use GUCs to control their behavior, and I'm guessing Aster did that 
too for the same reasons we did:  it's easier to do and makes for cleaner 
upstream merges.  That approach doesn't really scale well though to many 
options, and when considered for core the merge concerns obviously go 
away.  (The main reason I haven't pushed for us to submit our 
customizations here is that I know perfectly well the GUC-based UI isn't 
acceptable, but I haven't been able to get a better one done yet)


This auto-partioning stuff is interesting if the INSERT performance of it 
can be made reasonable.  I think Emmanuel is too new to the community 
process here to realize that there's little hope of those getting 
committed or even reviewed together.  If I were reviewing this I'd just 
kick it back as "separate these cleanly into separate patches where the 
partitioning one depends on the logging one" before even starting to look 
at the code, it's too much stuff to consume properly in one gulp.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Robert Haas

2009/9/11 Stephen Frost :
        Postgres has a hstore data type which seems well suited for passing
 key/value option pairs...
>>
>> Quite apart from any other reason, it is not builtin, so there is no way
>> that any builtin thing can use it.
>
> Clearly, that's fixable..  I think it's an interesting concept, but I
> don't know that I'd advocate it (using hstore for this in some way).

Unless I'm missing something, which is possible, this whole line of
conversation is based on a misunderstanding.  Data types, like hstore,
are things that have input/output functions, index methods, and
on-disk representations.  In-memory data structures require none of
these things, and can use techniques not suitable for on-disk
representations, such as pointers.  The parser already has an object
called a DefElem which is well-suited for exactly the kind of option
handling we're talking about here, and hstore would not be, not only
because it's the wrong kind of object (a data type rather than an
in-memory data structure), but because a DefElem can do things that
hstore can't, like store as the associated value a list of parse
nodes.

The original reference to hstore was a suggestion that it might be
possible to pass an hstore argument to COPY rather than having to
build up a command string and pass it to EXECUTE.  That may or may not
be a useful innovation - personally, I tend to think not - but it
seems to me that it would require COPY to execute an arbitrary
subquery and use the results as options.  We have no other commands
that work that way to my knowledge, but beyond that, Pierre Frédéric
Caillaud seemed to be suggesting this would be an "easier" way to
implement an options syntax.  Integrating hstore into core and then
making COPY able to execute a subquery to get its options is certainly
not easier than a straightforward grammar modification; it's taking a
small project and turning it into several big ones.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COALESCE and NULLIF semantics

2009-09-11 Thread Jeff Davis

On Fri, 2009-09-11 at 12:59 -0400, Tom Lane wrote:
> If so then ROW(NULL,NULL) would be
> indistinguishable from NULL and the semantic gripes seem to largely
> go away.  It would be a problem for anyone who actually wanted to
> distinguish those two cases, but how much do we care?

Does that violate the standard?

To make that interpretation work I think you would need to say that
ROW(NULL,NULL) _is_ the null value, and you would have to allow things
like:

  select 1 + row(null,null);

which seems strange to me.

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] drop tablespace error: invalid argument

2009-09-11 Thread Robert Creager



On Sep 11, 2009, at 2:35 PM, David E. Wheeler wrote:


On Sep 11, 2009, at 12:42 PM, Tom Lane wrote:


Well, 10.6.1 is out and it's still got the readdir() bug :-(.


Has someone filed a bug report about this with Apple?

   https://bugreport.apple.com/cgi-bin/WebObjects/RadarWeb.woa


Look at the history of this thread, and it's already submitted:

 http://www.nabble.com/drop-tablespace-error:-invalid-argument-td24992634.html

Later,
Rob

smime.p7s
Description: S/MIME cryptographic signature

Re: [HACKERS] drop tablespace error: invalid argument

2009-09-11 Thread Robert Creager



On Sep 11, 2009, at 2:35 PM, David E. Wheeler wrote:


On Sep 11, 2009, at 12:42 PM, Tom Lane wrote:


Well, 10.6.1 is out and it's still got the readdir() bug :-(.


Has someone filed a bug report about this with Apple?

   https://bugreport.apple.com/cgi-bin/WebObjects/RadarWeb.woa


If no one has (yet), I'll be happy to.  I just submitted one for an  
AirPort problem...  I guess I'll whip up an example program and just  
submit it anyway...  Anyone already written one?


Later,
Rob

smime.p7s
Description: S/MIME cryptographic signature

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Robert Haas

On Fri, Sep 11, 2009 at 4:02 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> Another approach would be to generalize what is allowable as an
>> optional parameter to include a parenthesized column list, but I don't
>> really think that has a lot to recommend it.
>
> Well, maybe it's worth doing.  If you believe that somebody might think
> of a new per-column COPY behavior in the future, then the same issue is
> going to come up again.

I can't immediately think of one, but I wouldn't bet against someone
else dreaming one up.

The biggest problem I have with this change is that it's going to
massively break anyone who is using the existing COPY syntax.  Really
simple examples might be OK (like if they're using 0 or 1 options),
but more complex things are going to just break.  How much do we care
about that?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Stephen Frost

* Andrew Dunstan (and...@dunslane.net) wrote:
> Robert Haas wrote:
>>>Postgres has a hstore data type which seems well suited for passing
>>> key/value option pairs...
>
> Quite apart from any other reason, it is not builtin, so there is no way  
> that any builtin thing can use it.

Clearly, that's fixable..  I think it's an interesting concept, but I
don't know that I'd advocate it (using hstore for this in some way).

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] drop tablespace error: invalid argument

2009-09-11 Thread David E. Wheeler


On Sep 11, 2009, at 12:42 PM, Tom Lane wrote:


Well, 10.6.1 is out and it's still got the readdir() bug :-(.


Has someone filed a bug report about this with Apple?

https://bugreport.apple.com/cgi-bin/WebObjects/RadarWeb.woa

Best,

David


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Tom Lane

Robert Haas  writes:
> Another approach would be to generalize what is allowable as an
> optional parameter to include a parenthesized column list, but I don't
> really think that has a lot to recommend it.

Well, maybe it's worth doing.  If you believe that somebody might think
of a new per-column COPY behavior in the future, then the same issue is
going to come up again.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Robert Haas

On Fri, Sep 11, 2009 at 12:28 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Fri, Sep 11, 2009 at 11:37 AM, Tom Lane  wrote:
>>> No, it *isn't* cheap.  Particularly not for COPY, for which we need an
>>> ad-hoc parser in psql.
>
>> :-(  That's definitely a complication, although it seems to me it will
>> need substantial work no matter what option we decide on.
>
> True :-(.  But I'm hoping we only have to revise it once more, not every
> time somebody thinks of another COPY option.

Yes, I completely agree.  But a special case for one existing option
won't have that result, so long as we're resolved not to accept any
more (and perhaps eventually to remove the special case once we're
confident the functionality isn't needed any longer).

Another approach would be to generalize what is allowable as an
optional parameter to include a parenthesized column list, but I don't
really think that has a lot to recommend it.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] drop tablespace error: invalid argument

2009-09-11 Thread Tom Lane

I wrote:
> Jan Otto  writes:
>> The bug in readdir() appeared in the final snow leopard too. Anybody
>> with Snow Leopard installed can check this, with simply doing the
>> regression tests (make check). The tablespace regression test is
>> failing.

>> The patch i sent in works around the issue. if it is not acceptable to
>> reread the tablespace-directory after every delete i can rewrite the
>> workaround.  Probably it is preferred that we write all entries of the
>> directory into an array and looping through that array after that
>> instead of looping with ReadDir()?

> I'm not really eager to put in a workaround for such a basic OS bug,
> especially not when the odds are good that it'll be fixed in 10.6.1.
> Let's wait a little bit for Apple to get their act together.

Well, 10.6.1 is out and it's still got the readdir() bug :-(.

It's likely that there'll be a 10.6.2 before very long, but I wonder if
we should go ahead with some sort of hack; at least as a temporary fix
in CVS HEAD so that we can get more useful buildfarm reports from Snow
Leopard machines.

Comments?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: On Windows, when a file is deleted and another process still has

2009-09-11 Thread Tom Lane

Magnus Hagander  writes:
> On Fri, Sep 11, 2009 at 10:44, Heikki Linnakangas
>> Here's a patch implementing that, and changing pgrename() to check for
>> ERROR_SHARING_VIOLATION and ERROR_LOCK_VIOLATION like pgwin32_open()
>> does, instead of ERROR_ACCESS_DENIED.

> I have definitely seen AV programs return access deniderather than
> sharing violation more than once for temporary errors. How about we
> keep the access denied one as well?

+1 ... presumably the original coding was tested in *some* environment.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Tom Lane

Emmanuel Cecchet  writes:
> Tom Lane wrote:
>> The important point is to look at the grammar, which doesn't have any
>> idea what the specific options are in the list.

> I understand the convenience from a developer perspective but I wonder 
> how this improves the user experience.

It's all in the eye of the beholder I suppose, but I don't find random
pseudo-English phrases to be particularly great to remember or work with
either.  The existing COPY syntax is a complete mess from a user's
viewpoint already, IMO.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: On Windows, when a file is deleted and another process still has

2009-09-11 Thread Magnus Hagander

On Fri, Sep 11, 2009 at 10:44, Heikki Linnakangas
 wrote:
> (moving to pgsql-hackers)
>
> Tom Lane wrote:
>> Heikki Linnakangas  writes:
>>> A completely different approach would be to treat any failure on all
>>> platforms as non-fatal. We shouldn't really cut the checkpoint short if
>>> recycling a WAL file fails, whatever the reason. That seems like a more
>>> robust approach than trying to guess which error codes are OK to ignore.
>>
>> I could live with that, as long as it gets logged.
>
> Here's a patch implementing that, and changing pgrename() to check for
> ERROR_SHARING_VIOLATION and ERROR_LOCK_VIOLATION like pgwin32_open()
> does, instead of ERROR_ACCESS_DENIED.

I have definitely seen AV programs return access deniderather than
sharing violation more than once for temporary errors. How about we
keep the access denied one as well? If we actually don't have
permissions in pg_xlog, we most likely never even got here...


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Emmanuel Cecchet


Tom Lane wrote:

Robert Haas  writes:
  

Or look at your CVS/git checkout.



The important point is to look at the grammar, which doesn't have any
idea what the specific options are in the list.  (Well, okay, it had
to have special cases for ANALYZE and VERBOSE because those are reserved
words :-(.  But future additions will not need to touch the grammar.
In the case of COPY that also means not having to touch psql \copy.)
  
I understand the convenience from a developer perspective but I wonder 
how this improves the user experience. If options are not explicitly 
part of the grammar:

- you cannot do automated testing based on the grammar
- it seems that it will be harder to document
- it still requires the same amount of work in psql and 3rd party tools 
to support command-completion and so on?
- why only COPY or EXPLAIN would use that syntax? what is the good limit 
between an option and something that is part of the grammar?


It looks like passing the current GUC variables as options to COPY. 
Isn't there a design problem with the parser if it is so hard to add a 
new option to a command?  In all cases, both the client and the server 
will have to support the new extension (and it will have to be 
documented) so it should not make a big difference whether it is 
explicitly part of the command grammar or a set of generic options.


manu

--
Emmanuel Cecchet
Aster Data Systems
Web: http://www.asterdata.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COALESCE and NULLIF semantics

2009-09-11 Thread Tom Lane

"Kevin Grittner"  writes:
> Tom Lane  wrote:
>> It's what happens afterwards that's the problem --- try it and see.
 
> Anything in particular I should test or be looking for, or will the
> error of my ways be glaringly obvious on any usage?

I'm expecting coerce_type to fail, along the lines of
ERROR: failed to find conversion function from unknown to whatever

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COALESCE and NULLIF semantics

2009-09-11 Thread Kevin Grittner

Tom Lane  wrote:
> "Kevin Grittner"  writes:
>> I was thinking of changing what is currently done, for example,
>> here:
>  
>> newc->coalescetype = select_common_type(pstate, newargs,
>> "COALESCE", NULL);
>  
>> Is that so late as you say, or is there a reason that can't work?
> 
> It's what happens afterwards that's the problem --- try it and see.
 
Anything in particular I should test or be looking for, or will the
error of my ways be glaringly obvious on any usage?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Tom Lane

David Fetter  writes:
> On Fri, Sep 11, 2009 at 11:37:33AM -0400, Tom Lane wrote:
>> No, it *isn't* cheap.  Particularly not for COPY, for which we need
>> an ad-hoc parser in psql.

> Is there some way we can use the regular parser, as plpgsql is moving
> to, or is there just too much wound into the server?

The reason that's an option for plpgsql is it's running inside the
server.  The amount of baggage gram.y would carry along is daunting.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread David Fetter

On Fri, Sep 11, 2009 at 11:37:33AM -0400, Tom Lane wrote:
> Robert Haas  writes:
> > While I'm at least as big a fan of generic options as the next
> > person, syntax is cheap.
> 
> No, it *isn't* cheap.  Particularly not for COPY, for which we need
> an ad-hoc parser in psql.

Is there some way we can use the regular parser, as plpgsql is moving
to, or is there just too much wound into the server?

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COALESCE and NULLIF semantics

2009-09-11 Thread Kevin Grittner

Sam Mason  wrote:

> what you you want is full type-inference as it's only that which
> will allow you to track back up the layers and assign consistent
> types to arbitrary expressions like the above.

Well, obviously that would fix it; I'm not clear on why *only* that
would fix it.  It seemed to me that we wouldn't have to go back up
like that if we deferred the assignment of a type in conditional
expressions.  I've only scanned that part of the code, so it's well
within the range of possibility that I misunderstood something, but I
thought the type assigned to a CASE or COALESCE is used in the context
of evaluating enclosing expressions on the way *down*, no?

-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] community decision-making & 8.5

2009-09-11 Thread Tom Lane

"Greg Sabino Mullane"  writes:
>> We also very occasionally step in and make a decision if -hackers (or
>> another group) is deadlocked over an issue. For example, the whole   
>> 'change the name' debate.

> I wouldn't really hold that up as a shining example of a core decision. :)

The point of core's action then was to put a stop to an unproductive
flamewar.  Which it did.  Could we please not restart that?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COALESCE and NULLIF semantics

2009-09-11 Thread Tom Lane

"Kevin Grittner"  writes:
> I was thinking of changing what is currently done, for example, here:

> newc->coalescetype = select_common_type(pstate, newargs, "COALESCE",
> NULL);

> Is that so late as you say, or is there a reason that can't work?

It's what happens afterwards that's the problem --- try it and see.

> Yeah, I am.  When you have queries built based on which fields on a
> QBE window are filled by a user, it's not hard to come up with a
> clause like:

> AND (somedate < COALESCE(NULL, NULL) OR ...)

Right.  The only real way to fix that is to propagate the later
discovery that type 'date' would be preferred back to the inputs of the
COALESCE, which is what Sam Mason has been on about (IIUC).  I'm afraid
that such a thing would make the behavior even more full of surprises
than what we have now.  Resolving unknown from context is already
"action at a distance", as it were, and the longer the distance involved
the more chance for unexpected behavior.  Not to mention the
implementation difficulties.

> We solved this by modifying our framework to pass down metadata about
> the values in addition to the values themselves.  We were always able
> to look at an object's class to generate the correct literal type -- a
> Date object would generate a DATE '2009-09-11' format literal; but a
> NULL had been bare in that situation.  We now generate CAST(NULL AS
> type) whenever we insert a NULL literal, so we are no longer burned by
> this. I'm just thinking that it would reduce pain for others.

Of course that's what the SQL spec would tell you to do anyway ;-)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RfD: more powerful "any" types

2009-09-11 Thread Pavel Stehule

2009/9/11 Tom Lane :
> Merlin Moncure  writes:
>> If you are going to use printf format codes, which is good and useful
>> being something of a standard, I'd call routine printf (not format)
>> and actually wrap vsnprintf.  The format codes in printf have a very
>> specific meaning: converting native C types to arrays of characters.
>> I think that a postgresql implementation should do exactly that:
>> attempt to convert the passed in datum to the c type in question if
>> possible (erroring if no cast exists) and then pass it down.
>
> I think this is a bit too restrictive.  Aside from the issue of loss of
> precision for NUMERIC, do we really want users to have to deal with the
> fact that "long" doesn't mean the same thing on every platform?  I don't
> want the same SQL to work on some platforms and fail on others because
> a particular datatype has a cast to int4 and not to int8, for instance.
>
> We should certainly leverage the C library as much as we can for this,
> but exposing users to every single idiosyncrasy of C is not quite the
> right thing IMHO.
>

I am thinking so PostgreSQL sprintf function that isn't real sprintf
function is really perfect idea. I see messages, sprintf doesn't
support format correctly ... And I will have three sprintf functions,
perl, c and postgres, ... still are you thinking, so this is good
idea?

regards
Pavel Stehule

>                        regards, tom lane
>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COALESCE and NULLIF semantics

2009-09-11 Thread Sam Mason

On Fri, Sep 11, 2009 at 12:26:45PM -0500, Kevin Grittner wrote:
> Tom Lane  wrote:
> > if that weren't true then we wouldn't be arguing about whether
> > COALESCE is wrong.
>  
> Yeah, I am.  When you have queries built based on which fields on a
> QBE window are filled by a user, it's not hard to come up with a
> clause like:
>  
> AND (somedate < COALESCE(NULL, NULL) OR ...)
>  
> We solved this by modifying our framework to pass down metadata about
> the values in addition to the values themselves.

You need a *much* more invasive change to fix this.  PG's type checker
only looks one level deep when choosing what types to replace "unknown"
with; what you you want is full type-inference as it's only that which
will allow you to track back up the layers and assign consistent types
to arbitrary expressions like the above.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COALESCE and NULLIF semantics

2009-09-11 Thread Sam Mason

On Fri, Sep 11, 2009 at 06:24:22PM +0100, Sam Mason wrote:
> One thing I've just realized these discussions have pointed out is
> that PG isn't doing the correct thing all the time with types.  When
> is it ever valid to see an "unknown" after type checking?  AFAICT, it
> shouldn't ever appear and hence doing:
> 
>   CREATE VIEW v AS SELECT 'foo';
> 
> Should be doing the normal default to TEXT type.  Is that right? or does
> "unknown" have more meaning besides just being something that needs to
> be fixed up during type checking.

Doh, sorry I shouldn't have sent that.  I wanted to spend some time to
see if I could find any other examples, but I hit send by accident.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COALESCE and NULLIF semantics

2009-09-11 Thread Kevin Grittner

Tom Lane  wrote:
> "Kevin Grittner"  writes:
>> I'm only proposing parse-time changes for conditional
>> expressions -- the CASE predicate and its abbreviations.
> 
> No, you are not; you are proposing run-time changes, specifically
> the need to coerce unknown to something else long after the point
> where the unknown is just a literal constant.

I was thinking of changing what is currently done, for example, here:

newc->coalescetype = select_common_type(pstate, newargs, "COALESCE",
NULL);

Is that so late as you say, or is there a reason that can't work?

> As far as I can see, this entire discussion turns on the complaint
> that IS NULL gives different results for plain NULL and
> ROW(NULL,NULL,...);

No, I'm not proposing any change to that.  (Others are, but that's not
my focus, personally.)

> if that weren't true then we wouldn't be arguing about whether
> COALESCE is wrong.

Yeah, I am.  When you have queries built based on which fields on a
QBE window are filled by a user, it's not hard to come up with a
clause like:

AND (somedate < COALESCE(NULL, NULL) OR ...)

We solved this by modifying our framework to pass down metadata about
the values in addition to the values themselves.  We were always able
to look at an object's class to generate the correct literal type -- a
Date object would generate a DATE '2009-09-11' format literal; but a
NULL had been bare in that situation.  We now generate CAST(NULL AS
type) whenever we insert a NULL literal, so we are no longer burned by
this. I'm just thinking that it would reduce pain for others.

-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COALESCE and NULLIF semantics

2009-09-11 Thread Sam Mason

On Fri, Sep 11, 2009 at 12:59:04PM -0400, Tom Lane wrote:
> "Kevin Grittner"  writes:
> > I'm only proposing parse-time changes for conditional
> > expressions -- the CASE predicate and its abbreviations.
> 
> No, you are not; you are proposing run-time changes, specifically the
> need to coerce unknown to something else long after the point where
> the unknown is just a literal constant.

One thing I've just realized these discussions have pointed out is
that PG isn't doing the correct thing all the time with types.  When
is it ever valid to see an "unknown" after type checking?  AFAICT, it
shouldn't ever appear and hence doing:

  CREATE VIEW v AS SELECT 'foo';

Should be doing the normal default to TEXT type.  Is that right? or does
"unknown" have more meaning besides just being something that needs to
be fixed up during type checking.

> I've been wondering whether it would be sensible to make the
> composite-datum constructors check for all-null fields and generate
> a plain NULL if so.  If so then ROW(NULL,NULL) would be
> indistinguishable from NULL and the semantic gripes seem to largely
> go away.  It would be a problem for anyone who actually wanted to
> distinguish those two cases, but how much do we care?

I'd prefer these semantics; it would make it do "the right thing" in
more cases than now.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RfD: more powerful "any" types

2009-09-11 Thread Merlin Moncure

On Fri, Sep 11, 2009 at 12:11 PM, Tom Lane  wrote:
> Merlin Moncure  writes:
>> If you are going to use printf format codes, which is good and useful
>> being something of a standard, I'd call routine printf (not format)
>> and actually wrap vsnprintf.  The format codes in printf have a very
>> specific meaning: converting native C types to arrays of characters.
>> I think that a postgresql implementation should do exactly that:
>> attempt to convert the passed in datum to the c type in question if
>> possible (erroring if no cast exists) and then pass it down.
>
> I think this is a bit too restrictive.  Aside from the issue of loss of
> precision for NUMERIC, do we really want users to have to deal with the
> fact that "long" doesn't mean the same thing on every platform?  I don't
> want the same SQL to work on some platforms and fail on others because
> a particular datatype has a cast to int4 and not to int8, for instance.
>
> We should certainly leverage the C library as much as we can for this,
> but exposing users to every single idiosyncrasy of C is not quite the
> right thing IMHO.

hmm. how about leaving the existing format codes alone and making some
safer additional ones that we advice the user to use?   It could
probably be all fixed up in the vsnprintf layer.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COALESCE and NULLIF semantics

2009-09-11 Thread Tom Lane

"Kevin Grittner"  writes:
> I'm only proposing parse-time changes for conditional
> expressions -- the CASE predicate and its abbreviations.

No, you are not; you are proposing run-time changes, specifically the
need to coerce unknown to something else long after the point where
the unknown is just a literal constant.

As far as I can see, this entire discussion turns on the complaint that
IS NULL gives different results for plain NULL and ROW(NULL,NULL,...);
if that weren't true then we wouldn't be arguing about whether COALESCE
is wrong.  We really ought to be focusing on that and not making random
adjustments to the behavior of "unknown".

I've been wondering whether it would be sensible to make the
composite-datum constructors check for all-null fields and generate
a plain NULL if so.  If so then ROW(NULL,NULL) would be
indistinguishable from NULL and the semantic gripes seem to largely
go away.  It would be a problem for anyone who actually wanted to
distinguish those two cases, but how much do we care?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COALESCE and NULLIF semantics

2009-09-11 Thread Kevin Grittner

Sam Mason  wrote: 
> On Wed, Sep 09, 2009 at 10:25:34AM -0400, Tom Lane wrote:
>> Now admittedly there's probably not any major technical obstacle to
>> making a runtime conversion happen --- it's merely delayed
>> invocation of the destination type's input function.  But I find it
>> really ugly from a theoretical point of view.  Doing calculations
>> with "unknown" values just seems wrong.

Agreed.  I did say that I didn't actually want to actually turn these
into functions -- I was just putting together a simple demonstration
(usable only for simple test cases) of what I thought the parse-time
behavior should be, to facilitate discussion.  I *was* thinking that
showing that COALESCE could behave that way for simple cases with two
one-line plpgsql functions might show that the semantics weren't
excessively bizarre.

I don't think that the suggestion could be a problem for COALESCE.  In
fact, I think somewhere in another thread, Tom conceded that much, but
(understandably) didn't want that one form of CASE behaving
differently than everything else did.  I'll try to address that.

The first point is that if any of the expressions used for any result
value in one of these predicates is typed, nothing at all would
change.  This is only about the behavior when each result value is
NULL or an untyped literal.  All subsequent comments assume that, to
avoid the tedium of restating it each time.

I don't think explicit CASE predicates in either form would be a
problem, because there is nothing to suggest a connection between a
literal in the expression *which chooses* a result value and a literal
*used as* a result value.

I think that LEAST and GREATEST are a lost cause in terms of changing
much, since there are obviously compares to be made using *some* type
before a value can be derived -- at least if there is more than one
non-NULL value.  Since these are PostgreSQL extensions which don't
even behave consistently with other products' extensions using the
same words, I'm not too concerned about them being "irregular".  (I'm
not sure what the justification for the current behavior would even be
-- since NULL means *unknown*, how can you declare that you know the
greatest or least value in a set of values when any are unknown?  It
seems like these should be named LEAST_KNOWN and GREATEST_KNOWN for
their current semantics.)  In any event, the current behavior is to
treat them as text; I don't think we can improve on that, beyond
perhaps using unknown if all values are NULL, or all but one are NULL
and the remaining one is an untyped literal.  Not sure whether that's
sane or worth it.

NULLIF presents a problem only with two arguments which are *both*
untyped literals.  That case currently resolves to text.  If both are
NULL, or one is NULL and the other is an untyped literal, I don't see
how there is a problem declaring the result type as unknown.  I think
it would be sane to continue using text with two untyped literals. 
This would require users to declare the type of one or both literals
if they want something else.  (Frankly, I've never had a use for
NULLIF; it seems like a kludge which is there to encourage
substitution of magic values for NULL and then allow those magic
values to be transformed back to NULL on demand.  Does anybody who
expects sane behavior really use this?)

Finally, there is one "minor" extension to what I said above.  Any of
these conditional expressions which evaluate to an untyped literal or
NULL would be considered the same as a bare untyped literal or NULL
for all purposes, including their use in an enclosing conditional
expression.  I don't *think* that adds a lot of complexity to the
issue, but I'm not sure on that one.

> It's pretty grim.  This seems to be some strange halfway house on
> the way to real type-inference, with broken semantics to boot.  How
> would it prevent weirdos like:
> 
>   SELECT 'msg'||v, date_trunc('year',v), v+10
>   FROM (SELECT 'hi ho') x(v);

I don't see where what I'm proposing would change the behavior of that
at all.  I'm only proposing parse-time changes for conditional
expressions -- the CASE predicate and its abbreviations.

I have looked at the code where the parser resolves types for these. 
I think it would be within my skill set to produce a patch if others
agree this makes sense; although so far such agreement doesn't seem
too likely.  :-(

-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Tom Lane

Robert Haas  writes:
> On Fri, Sep 11, 2009 at 11:37 AM, Tom Lane  wrote:
>> No, it *isn't* cheap.  Particularly not for COPY, for which we need an
>> ad-hoc parser in psql.

> :-(  That's definitely a complication, although it seems to me it will
> need substantial work no matter what option we decide on.

True :-(.  But I'm hoping we only have to revise it once more, not every
time somebody thinks of another COPY option.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RfD: more powerful "any" types

2009-09-11 Thread Tom Lane

Merlin Moncure  writes:
> If you are going to use printf format codes, which is good and useful
> being something of a standard, I'd call routine printf (not format)
> and actually wrap vsnprintf.  The format codes in printf have a very
> specific meaning: converting native C types to arrays of characters.
> I think that a postgresql implementation should do exactly that:
> attempt to convert the passed in datum to the c type in question if
> possible (erroring if no cast exists) and then pass it down.

I think this is a bit too restrictive.  Aside from the issue of loss of
precision for NUMERIC, do we really want users to have to deal with the
fact that "long" doesn't mean the same thing on every platform?  I don't
want the same SQL to work on some platforms and fail on others because
a particular datatype has a cast to int4 and not to int8, for instance.

We should certainly leverage the C library as much as we can for this,
but exposing users to every single idiosyncrasy of C is not quite the
right thing IMHO.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Robert Haas

On Fri, Sep 11, 2009 at 11:37 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> While I'm at least as big a fan of generic options as the next person,
>> syntax is cheap.
>
> No, it *isn't* cheap.  Particularly not for COPY, for which we need an
> ad-hoc parser in psql.

:-(  That's definitely a complication, although it seems to me it will
need substantial work no matter what option we decide on.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RfD: more powerful "any" types

2009-09-11 Thread Merlin Moncure

On Fri, Sep 11, 2009 at 10:38 AM, Tom Lane  wrote:
> Alvaro Herrera  writes:
>> Is this really all that hard?  I'm thinking it could be implemented by
>> using the real C sprintf underneath, passing one % specifier and its
>> corresponding parameter at a time, coerced to whatever the conversion
>> specifier specifies.
>
> The only disadvantage I can see of that is that it would lose precision
> for NUMERIC.  I'd really like to be able to write "%300.100f" and have it
> Do The Right Thing with a 300-digit numeric input.

that could be simply worked around by formatting the  numeric in sql
and passing it to printf as %s.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Andrew Dunstan




Robert Haas wrote:

   Postgres has a hstore data type which seems well suited for passing
key/value option pairs...
   

  


Quite apart from any other reason, it is not builtin, so there is no way 
that any builtin thing can use it.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RfD: more powerful "any" types

2009-09-11 Thread Merlin Moncure

On Fri, Sep 11, 2009 at 11:19 AM, Robert Haas  wrote:
> On Fri, Sep 11, 2009 at 10:30 AM, Tom Lane  wrote:
>> "Kevin Grittner"  writes:
>>> I think the main benefit of a sprintf type function for PostgreSQL is
>>> in the formatting (setting length, scale, alignment), not in making
>>> concatenation more pretty.
>>
>> Exactly, which is why I'm so distressed that this proposal not only
>> hasn't got that, but is designed so that it's impossible to add it
>> later.
>
> I like the idea of making concatenation more pretty, quite frankly.
> No one has really responded to Pavel's contention that this is what
> to_char() is for.  Twice the code paths = twice the bugs, twice the
> places that have to be updated when some new feature is added, etc.

If you are going to use printf format codes, which is good and useful
being something of a standard, I'd call routine printf (not format)
and actually wrap vsnprintf.  The format codes in printf have a very
specific meaning: converting native C types to arrays of characters.
I think that a postgresql implementation should do exactly that:
attempt to convert the passed in datum to the c type in question if
possible (erroring if no cast exists) and then pass it down.  The idea
is we are not adding new formatting routines but using a very high
quality existing one...why reinvent the wheel?

so if you did: select printf('%s %3.1f', foo::box, bar::circle);
the box to char* cast would work (using the text cast) but the second
cast would fail unless the user added a cast to float.  The code in
question is easy to imagine...parse the format string, and loop the
varargs using the appropriate looked up cast one by one...

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Tom Lane

Robert Haas  writes:
> While I'm at least as big a fan of generic options as the next person,
> syntax is cheap.

No, it *isn't* cheap.  Particularly not for COPY, for which we need an
ad-hoc parser in psql.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Andrew Dunstan




Tom Lane wrote:

Robert Haas  writes:
  

I don't see any reasonable way to sandwhich the FORCE NOT NULL syntax
into a keyword/value notation.



Any number of ways, for example "force_not_null = true" or multiple
occurrences of "force_not_null = column_name".  Andrew was on the verge
of admitting we don't need that option anymore anyway ;-), so I don't
think we should allow it to drive an exception to the simplified syntax.


  


We won't need it if we can use an expression on the source, like

   coalesce(t[4],"")

that gets applied on the way in. Until then it is useful, if ugly.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Ragged CSV import

2009-09-11 Thread Tom Lane

Andrew Dunstan  writes:
> Well, I think the objection was that it would slow COPY down to have to 
> go though the executor in the copy-as-source scenario. But maybe that 
> would happen anyway, and maybe we don't care, we'd just accept that it 
> wouldn't be nearly as fast as a raw copy.

I haven't heard complaints about the COPY (query) syntax, which is
the same thing in the opposite direction.  You can't expect that
flexibility costs zero.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Robert Haas

2009/9/11 Pierre Frédéric Caillaud :
>> I was thinking something like:
>>
>> COPY tablename [ ( column [, ...] ) ] FROM { 'filename' | STDIN }
>> [WITH] [option [, ...]]
>>
>> Where:
>>
>> option := ColId [Sconst] | FORCE NOT NULL (column [,...])
>>
>> I don't see any reasonable way to sandwhich the FORCE NOT NULL syntax
>> into a keyword/value notation.
>
>        Postgres has a hstore data type which seems well suited for passing
> key/value option pairs...
>        Why another syntax to remember, another parser to code, when almost
> everything is already there ?
>
>        Think about plpgsql code which generates some SQL COPY command
> string, then this is parsed and executed... wouldn't it be a lot simpler to
> just manipulate parameters in a hstore ?...

I doubt it very much.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Robert Haas

On Fri, Sep 11, 2009 at 11:26 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> I don't see any reasonable way to sandwhich the FORCE NOT NULL syntax
>> into a keyword/value notation.
>
> Any number of ways, for example "force_not_null = true" or multiple
> occurrences of "force_not_null = column_name".  Andrew was on the verge
> of admitting we don't need that option anymore anyway ;-), so I don't
> think we should allow it to drive an exception to the simplified syntax.

While I'm at least as big a fan of generic options as the next person,
syntax is cheap.  I don't see any reason to get worked up about one
exception to a generic options syntax.  If the feature is useless, of
course we can rip it out, but that's a separate discussion.  For what
it's worth, I think your proposed alternative is ugly and an abuse of
the idea of keyword-value pairs.  In the EXPLAIN-world, a later value
for the same option overrides a previous assignment earlier in the
list, and I am in favor of sticking with that approach.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Pierre Frédéric Caillau d


I was thinking something like:

COPY tablename [ ( column [, ...] ) ] FROM { 'filename' | STDIN }
[WITH] [option [, ...]]

Where:

option := ColId [Sconst] | FORCE NOT NULL (column [,...])

I don't see any reasonable way to sandwhich the FORCE NOT NULL syntax
into a keyword/value notation.


	Postgres has a hstore data type which seems well suited for passing  
key/value option pairs...
	Why another syntax to remember, another parser to code, when almost  
everything is already there ?


	Think about plpgsql code which generates some SQL COPY command string,  
then this is parsed and executed... wouldn't it be a lot simpler to just  
manipulate parameters in a hstore ?...



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RfD: more powerful "any" types

2009-09-11 Thread Tom Lane

Robert Haas  writes:
> I like the idea of making concatenation more pretty, quite frankly.
> No one has really responded to Pavel's contention that this is what
> to_char() is for.

[ shrug... ]  I regard this as a prettier replacement for to_char.
That thing has got nothing whatsoever to recommend it, other than being
bug-compatible with Oracle.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Ragged CSV import

2009-09-11 Thread Andrew Dunstan




Tom Lane wrote:

Andrew Dunstan  writes:
  

I wrote:


I'd love to be able to do something like

INSERT into foo (x,y,z) select t[3],[t2],[t57] from (COPY RETURNING
t FROM stdin CSV);
  


  
Some IRC discussion suggested ways we could do better than that syntax. 
I think my current preferred candidate is something like



  

COPY foo (a,b,c)
FROM stdin
CSV
AS t USING (t[3],t[2],[t57]);



[ scratches head... ]  What happened to seeing COPY as a data source in
a larger command?  If that syntax has anything at all to recommend it,
I'm not seeing what.  It's not extensible and it would require lots of
code duplication to implement the impoverished feature set it does have.


  


Well, I think the objection was that it would slow COPY down to have to 
go though the executor in the copy-as-source scenario. But maybe that 
would happen anyway, and maybe we don't care, we'd just accept that it 
wouldn't be nearly as fast as a raw copy.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Tom Lane

Robert Haas  writes:
> I don't see any reasonable way to sandwhich the FORCE NOT NULL syntax
> into a keyword/value notation.

Any number of ways, for example "force_not_null = true" or multiple
occurrences of "force_not_null = column_name".  Andrew was on the verge
of admitting we don't need that option anymore anyway ;-), so I don't
think we should allow it to drive an exception to the simplified syntax.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RfD: more powerful "any" types

2009-09-11 Thread Robert Haas

On Fri, Sep 11, 2009 at 10:30 AM, Tom Lane  wrote:
> "Kevin Grittner"  writes:
>> I think the main benefit of a sprintf type function for PostgreSQL is
>> in the formatting (setting length, scale, alignment), not in making
>> concatenation more pretty.
>
> Exactly, which is why I'm so distressed that this proposal not only
> hasn't got that, but is designed so that it's impossible to add it
> later.

I like the idea of making concatenation more pretty, quite frankly.
No one has really responded to Pavel's contention that this is what
to_char() is for.  Twice the code paths = twice the bugs, twice the
places that have to be updated when some new feature is added, etc.
On the other hand I don't really strongly object if someone else wants
to do the work, either.  I do think allowing for upward compatibility
with future extensions is probably smart, regardless of how simple or
complex the first version is.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Ragged CSV import

2009-09-11 Thread Tom Lane

Andrew Dunstan  writes:
> I wrote:
>> I'd love to be able to do something like
>> 
>> INSERT into foo (x,y,z) select t[3],[t2],[t57] from (COPY RETURNING
>> t FROM stdin CSV);

> Some IRC discussion suggested ways we could do better than that syntax. 
> I think my current preferred candidate is something like

> COPY foo (a,b,c)
> FROM stdin
> CSV
> AS t USING (t[3],t[2],[t57]);

[ scratches head... ]  What happened to seeing COPY as a data source in
a larger command?  If that syntax has anything at all to recommend it,
I'm not seeing what.  It's not extensible and it would require lots of
code duplication to implement the impoverished feature set it does have.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Tom Lane

Robert Haas  writes:
> Or look at your CVS/git checkout.

The important point is to look at the grammar, which doesn't have any
idea what the specific options are in the list.  (Well, okay, it had
to have special cases for ANALYZE and VERBOSE because those are reserved
words :-(.  But future additions will not need to touch the grammar.
In the case of COPY that also means not having to touch psql \copy.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Robert Haas

On Fri, Sep 11, 2009 at 10:18 AM, Tom Lane  wrote:
> Emmanuel Cecchet  writes:
>> The new syntax could look like:
>
>> COPY /tablename/ [ ( /column/ [, ...] ) ]
>>     FROM { '/filename/' | STDIN }
>>     [ [, BINARY ]
>>       [, OIDS ]
>>       [, DELIMITER [ AS ] '/delimiter/' ]
>>       [, NULL [ AS ] '/null string/' ]
>>       [, CSV [ HEADER ]
>>              [ QUOTE [ AS ] '/quote/' ]
>>              [ ESCAPE [ AS ] '/escape/' ]
>>              [ FORCE NOT NULL (/column/ [, ...]) ]
>>       [, ERRORS { SKIP |
>>                   LOG INTO { tablename | 'filename' }
>>                     [ LABEL label_name ]
>>                     [ KEY key_name ]
>>                     [ MAX ERRORS /count/ ] } ]
>
>> Is this what you had in mind?
>
> No. because that doesn't do a darn thing to make the option set less
> hard-wired into the syntax. I was thinking of a strict keyword/value
> format with non-wired-in keywords ... and only *one* keyword per value.
> See EXPLAIN.

I was thinking something like:

COPY tablename [ ( column [, ...] ) ] FROM { 'filename' | STDIN }
[WITH] [option [, ...]]

Where:

option := ColId [Sconst] | FORCE NOT NULL (column [,...])

I don't see any reasonable way to sandwhich the FORCE NOT NULL syntax
into a keyword/value notation.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Robert Haas

On Fri, Sep 11, 2009 at 10:53 AM, Emmanuel Cecchet  wrote:
> Tom,
>
> I looked at EXPLAIN
> (http://www.postgresql.org/docs/current/interactive/sql-explain.html) and
> there is not a single line of what you are talking about.
> And the current syntax is just EXPLAIN [ ANALYZE ] [ VERBOSE ] /statement
> /

http://developer.postgresql.org/pgdocs/postgres/sql-explain.html

Or look at your CVS/git checkout.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Ragged CSV import

2009-09-11 Thread Andrew Dunstan



I wrote:

I'd love to be able to do something like

   INSERT into foo (x,y,z) select t[3],[t2],[t57] from (COPY RETURNING
   t FROM stdin CSV);




Some IRC discussion suggested ways we could do better than that syntax. 
I think my current preferred candidate is something like


   COPY foo (a,b,c)
   FROM stdin
   CSV
   AS t USING (t[3],t[2],[t57]);

I'm not sure how we'd could plug a filter into that. Maybe a WHERE 
clause? My immediate need at least doesn't actually involve filtering 
anything - we load every line in the CSV into a temp table and then 
filter what we load into the main tables after applying the business rules.



cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Emmanuel Cecchet


Tom,

I looked at EXPLAIN 
(http://www.postgresql.org/docs/current/interactive/sql-explain.html) 
and there is not a single line of what you are talking about.

And the current syntax is just EXPLAIN [ ANALYZE ] [ VERBOSE ] /statement
/
If I try to decrypt what you said, you are looking at something like

COPY /tablename/ [ ( /column/ [, ...] ) ]
   FROM { '/filename/' | STDIN }
   [option1=value[,...]]

That would give something like:
COPY foo FROM 'input.txt' binary=on, oids=on, errors=skip, max_errors=10

If this is not what you are thinking, please provide an example.

Emmanuel
/

/

Emmanuel Cecchet  writes:
  

The new syntax could look like:



  

COPY /tablename/ [ ( /column/ [, ...] ) ]
FROM { '/filename/' | STDIN }
[ [, BINARY ]
  [, OIDS ]
  [, DELIMITER [ AS ] '/delimiter/' ]
  [, NULL [ AS ] '/null string/' ]
  [, CSV [ HEADER ]
 [ QUOTE [ AS ] '/quote/' ] 
 [ ESCAPE [ AS ] '/escape/' ]

 [ FORCE NOT NULL (/column/ [, ...]) ]
  [, ERRORS { SKIP | 
  LOG INTO { tablename | 'filename' }

[ LABEL label_name ]
[ KEY key_name ]
[ MAX ERRORS /count/ ] } ]



  

Is this what you had in mind?



No. because that doesn't do a darn thing to make the option set less
hard-wired into the syntax.  I was thinking of a strict keyword/value
format with non-wired-in keywords ... and only *one* keyword per value.
See EXPLAIN.

regards, tom lane
  



--
Emmanuel Cecchet
Aster Data Systems
Web: http://www.asterdata.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: On Windows, when a file is deleted and another process still has

2009-09-11 Thread Tom Lane

Heikki Linnakangas  writes:
> Here's a patch implementing that, and changing pgrename() to check for
> ERROR_SHARING_VIOLATION and ERROR_LOCK_VIOLATION like pgwin32_open()
> does, instead of ERROR_ACCESS_DENIED.

This looks sane in a quick once-over, though I haven't tested it.
One tiny stylistic suggestion:

if (err != ERROR_SHARING_VIOLATION &&
err != ERROR_LOCK_VIOLATION)
 #else
if (errno != EACCES)
 #endif
return -1;
if (++loops > 300)  /* time out after 30 sec */
return -1;

This is overly cute and will probably confuse both pgindent and ordinary
editors.  It's worth one extra line to keep each part of the #if
syntactically independent, ie

if (err != ERROR_SHARING_VIOLATION &&
err != ERROR_LOCK_VIOLATION)
return -1;
 #else
if (errno != EACCES)
return -1;
 #endif
if (++loops > 300)  /* time out after 30 sec */
return -1;

> I wonder if we should reduce the timeout in pgrename(). It's 30 s at the
> moment, but apparently it hasn't been working correctly, failing
> immediately instead if the file is locked.

I have a vague recollection that there was a specific reason for having
such a long timeout --- you might want to check the archives to see the
discussion before that code got committed.  However, if nothing turns
up, I wouldn't object to reducing it to 5 or 10 sec.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RfD: more powerful "any" types

2009-09-11 Thread Tom Lane

Alvaro Herrera  writes:
> Is this really all that hard?  I'm thinking it could be implemented by
> using the real C sprintf underneath, passing one % specifier and its
> corresponding parameter at a time, coerced to whatever the conversion
> specifier specifies.

The only disadvantage I can see of that is that it would lose precision
for NUMERIC.  I'd really like to be able to write "%300.100f" and have it
Do The Right Thing with a 300-digit numeric input.

> The only thing that breaks this idea is the $n positional specifiers, I
> think.

Yeah, that's a bit of a pain too.  But we have the logic for that in
src/port/.  It wouldn't be that much work to repurpose it.  Actually,
since a SQL implementation wouldn't be constrained to read the actual
arguments left-to-right, you could probably simplify it a great deal.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RfD: more powerful "any" types

2009-09-11 Thread Tom Lane

"Kevin Grittner"  writes:
> I think the main benefit of a sprintf type function for PostgreSQL is
> in the formatting (setting length, scale, alignment), not in making
> concatenation more pretty.

Exactly, which is why I'm so distressed that this proposal not only
hasn't got that, but is designed so that it's impossible to add it
later.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RfD: more powerful "any" types

2009-09-11 Thread Kevin Grittner

Alvaro Herrera  wrote:
 
> the format version is a lot better than the || alternative.
 
Well, if you're trying to tell me what is easier for me to read,
you're probably wrong.  I won't presume to try to tell you what you
find easier to read.
 
I think the main benefit of a sprintf type function for PostgreSQL is
in the formatting (setting length, scale, alignment), not in making
concatenation more pretty.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Tom Lane

Emmanuel Cecchet  writes:
> The new syntax could look like:

> COPY /tablename/ [ ( /column/ [, ...] ) ]
> FROM { '/filename/' | STDIN }
> [ [, BINARY ]
>   [, OIDS ]
>   [, DELIMITER [ AS ] '/delimiter/' ]
>   [, NULL [ AS ] '/null string/' ]
>   [, CSV [ HEADER ]
>  [ QUOTE [ AS ] '/quote/' ] 
>  [ ESCAPE [ AS ] '/escape/' ]
>  [ FORCE NOT NULL (/column/ [, ...]) ]
>   [, ERRORS { SKIP | 
>   LOG INTO { tablename | 'filename' }
> [ LABEL label_name ]
> [ KEY key_name ]
> [ MAX ERRORS /count/ ] } ]

> Is this what you had in mind?

No. because that doesn't do a darn thing to make the option set less
hard-wired into the syntax.  I was thinking of a strict keyword/value
format with non-wired-in keywords ... and only *one* keyword per value.
See EXPLAIN.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Why does LOG have higher priority than ERROR and WARNING?

2009-09-11 Thread Tom Lane

Itagaki Takahiro  writes:
> LOG messages have higher priority than ERROR and WARNING
> in log_min_messages (PANIC > FATAL > LOG > ERROR > WARNING) now.
> Can I reorder them to ERROR > WARNING > LOG ?

No.  That was an intentional decision.  LOG is for stuff that we
really want to get logged, in most cases.  ERROR is very often not
that interesting, and WARNING even more so.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Disable and enable of table and column constraints

2009-09-11 Thread Jan Wieck


On 9/10/2009 11:06 AM, Tom Lane wrote:

Christopher Browne  writes:

With the ALTER TABLE DISABLE TRIGGER functionality added in 8.3, we
already have the ability to do this with foreign key constraints.


That "feature" is a crock that should not be extended, because it
leaves it entirely on the user's shoulders whether the constraint
is actually true when the system thinks it is.  What is being discussed
here is ways to incrementally add real, proven-valid constraints.

(Indeed, given the thought that's being given to having the planner
assume that FK constraints hold, I rather think that we need to
reconsider ALTER DISABLE TRIGGER.)


The feature was originally intended to be a clean way of avoiding 
interferences of triggers and/or foreign keys with replication systems 
that work on the user level (like Bucardo, Londiste and Slony). The only 
way to break foreign keys in that scenario is to replicate a referencing 
table without replicating the corresponding PK table.


Note that Slony-I currently does apply updates in a fashion that would 
actually make checking of foreign keys on the replica possible, but does 
need the ability to disable regular user triggers. But for some future 
version of Slony, we may need to change that and apply changes within 
one replication group (SYNC) out of order with respect to multiple 
tables. Which means that Slony would need at least some mechanism to 
disable user triggers and force all foreign key constraints to be 
deferred, whether they are declared deferrable or not.



Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] autovacuum_max_workers docs

2009-09-11 Thread Joshua Tolley

The current docs for autovacuum_max_workers suggest it should be modifiable
with a reload, unless I'm reading in awfully silly ways this morning (which
isn't entirely out of the question). Anyway, in the 8.3.7 and 8.5devel
instances I've tried, autovacuum_max_workers can only be set at server start.
I propose this:

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 7c82835..26a8ddf 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3589,8 +3589,7 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH 
csv;

 Specifies the maximum number of autovacuum processes (other than the
 autovacuum launcher) which may be running at any one time.  The default
-is three.  This parameter can only be set in
-the postgresql.conf file or on the server command line.
+is three.  This parameter can only be set at server start.

   
  

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com


signature.asc
Description: Digital signature

Re: [HACKERS] RfD: more powerful "any" types

2009-09-11 Thread Alvaro Herrera

Kevin Grittner escribió:
> Pavel Stehule  wrote:
>  
> > what is more readable?
> > 
> > select 'i=' || i || ', b=' || b || ', c=' || c ..
> > 
> > or
> > 
> > select format('i=%, b=%, c=%', i, b, c ..)
>  
> Seriously, those are about dead even for me.  The concatenation
> might have a slight edge, particularly since I have the option, if
> it gets out of hand, to do:
>  
> select 'i=' || i
>   || ', b=' || b
>   || ', c=' || c
>   ..

That barely works for me, and then only because it's a trivial example.
In real uses it's never that clear-cut, and the format version is a lot
better than the || alternative.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Emmanuel Cecchet


Hi Robert,


I like this idea, perhaps not surprisingly (for those not following at
home: that was my patch).  Unfortunately, it looks to me like there is
no way to do this without overhauling the syntax.  If the existing
syntax required a comma between options (i.e. "copy blah to stdout
binary, oids" rather than "copy to stdout binary oids", this would be
pretty straightforward; but it doesn't even allow one).
  
Well some options like CSV FORCE ... take a comma separated list of 
columns. This would require all options to become reserved keywords or 
force parenthesis around option parameters.

I wonder if we should consider allowing COPY options to be
comma-separated beginning with 8.5, and then require it in 8.6.  Other
options include continuing to support the old syntax for the existing
options, but allowing some new syntax as well and requiring its use
for all new options (this is what we did with EXPLAIN, but there were
only two pre-existing options there), and just changing the syntax
incompatibly and telling users to suck it up.  But I'm not sure I like
either of those choices.
  
We could keep the current syntax for backward compatibility only (can be 
dropped in a future release) and have a new syntax (starting in 8.5). To 
avoid confusion between both, we could just replace WITH with something 
else (or just drop it) to indicate that this is the new syntax.


The new syntax could look like:

COPY /tablename/ [ ( /column/ [, ...] ) ]
   FROM { '/filename/' | STDIN }
   [ [, BINARY ]
 [, OIDS ]
 [, DELIMITER [ AS ] '/delimiter/' ]
 [, NULL [ AS ] '/null string/' ]
 [, CSV [ HEADER ]
[ QUOTE [ AS ] '/quote/' ] 
[ ESCAPE [ AS ] '/escape/' ]

[ FORCE NOT NULL (/column/ [, ...]) ]
 [, ERRORS { SKIP | 
 LOG INTO { tablename | 'filename' }

   [ LABEL label_name ]
   [ KEY key_name ]
   [ MAX ERRORS /count/ ] } ]


Is this what you had in mind?

Emmanuel

--
Emmanuel Cecchet
Aster Data Systems
Web: http://www.asterdata.com

Re: [HACKERS] RfD: more powerful "any" types

2009-09-11 Thread Hannu Krosing

On Thu, 2009-09-10 at 19:52 +0200, Pavel Stehule wrote:
> 2009/9/10 Tom Lane :
> > Alvaro Herrera  writes:
> >> alvherre=# select text_format('% was % at % and said % % times', 
> >> 'Pavel'::text, 'here'::unknown, now(), row('a','b','c'), '{42}'::int[]);
> >>  text_format
> >> -
> >>  Pavel was here at 2009-09-10 13:12:09.054653-04 and said (a,b,c) {42} 
> >> times
> >> (1 fila)
> >
> > Is that what's being proposed?  That pretty much sucks --- it's just
> > another way of concatenating some strings.  I thought the idea was to
> > provide the same power as sprintf, eg field width controls, numeric
> > formatting options, etc.
> >
> 
> I thing so this is enough - we can get simply message text - like
> raise notice statement. I thing so simple and clean function has more
> usability than heavy real sprintf function. We (c coders) are old
> dinosaurs - but  nobody else knows what sprintf function does.

They probably do, as at least PHP and perl have also (s)printf
functions, probably many others as well. But most likely each of them
has a slightly different syntax.

> I thing so the name only "format" is good, it's short. If you need
> some other formating, just you can use to_char function.

yes, format(...) is  generic enough that people won't expect it to
confirm to their favorite languages version of printf.

> Pavel
> 
> >regards, tom lane
> >

-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RfD: more powerful "any" types

2009-09-11 Thread Aidan Van Dyk

* Alvaro Herrera  [090910 23:32]:
 
> Is this really all that hard?  I'm thinking it could be implemented by
> using the real C sprintf underneath, passing one % specifier and its
> corresponding parameter at a time, coerced to whatever the conversion
> specifier specifies.

It's not "hard", but please, don't break this, to make it more "not
hard":

> The only thing that breaks this idea is the $n positional specifiers, I
> think.

And also, please work for user-defined types (meaning you need to use
the type and catalog system to lookup coercions, not hard-code
anything...

l-)

It's doable, but it's going got be a lot of explicit casting and
coercion, and going to require a lot of documentation and error
states...

Remember, users using sprintf are really going to want it to act exactly
as it would if they were using C, minus the crash part.

a.
-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.


signature.asc
Description: Digital signature

[HACKERS] Why does LOG have higher priority than ERROR and WARNING?

2009-09-11 Thread Itagaki Takahiro

LOG messages have higher priority than ERROR and WARNING
in log_min_messages (PANIC > FATAL > LOG > ERROR > WARNING) now.
Can I reorder them to ERROR > WARNING > LOG ?  It makes a difference
to "per-destination minimum message levels" feature that I working on.

LOG messages are often used for performance logging. On the other hand,
WARNING and ERROR messages report something bad. It should be no surprise
that users think ERRORs and WARNINGs are more important than LOGs.
So, I think we should allow users to set log_min_messages to report
only PANIC, FATAL, ERROR and WARNING messages in server logs.

Am I missing something?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: [COMMITTERS] pgsql: On Windows, when a file is deleted and another process still has

2009-09-11 Thread Heikki Linnakangas

(moving to pgsql-hackers)

Tom Lane wrote:
> Heikki Linnakangas  writes:
>> A completely different approach would be to treat any failure on all
>> platforms as non-fatal. We shouldn't really cut the checkpoint short if
>> recycling a WAL file fails, whatever the reason. That seems like a more
>> robust approach than trying to guess which error codes are OK to ignore.
> 
> I could live with that, as long as it gets logged.

Here's a patch implementing that, and changing pgrename() to check for
ERROR_SHARING_VIOLATION and ERROR_LOCK_VIOLATION like pgwin32_open()
does, instead of ERROR_ACCESS_DENIED.

I wonder if we should reduce the timeout in pgrename(). It's 30 s at the
moment, but apparently it hasn't been working correctly, failing
immediately instead if the file is locked. And no-one has complained
about that. But if we sleep in InstallXLogFileSegment(), we're holding
ControlFileLock, which can force other backends to wait, and that might
cause more harm than just failing outright. Something like 5 seconds
might be more appropriate, giving anti-virus and similar software some
time to give up the lock, but not too much to cause long delays. 5
seconds should be enough for anti-virus or backup software to process a
file under normal circumstances.

OTOH, pgwin32_open() uses 30 s, with the same potential for lockups, and
no-one has complained about that either. The bottom line is that if
another program keeps a file locked for any extended period of time,
you're going to have trouble one way or another.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***
*** 2262,2273  XLogFileInit(uint32 log, uint32 seg,
  *use_existent, &max_advance,
  use_lock))
  	{
! 		/* No need for any more future segments... */
  		unlink(tmppath);
  	}
  
- 	elog(DEBUG2, "done creating and filling new WAL file");
- 
  	/* Set flag to tell caller there was no existent file */
  	*use_existent = false;
  
--- 2262,2275 
  *use_existent, &max_advance,
  use_lock))
  	{
! 		/*
! 		 * No need for any more future segments, or InstallXLogFileSegment()
! 		 * failed to rename the file into place. If the rename failed, opening
! 		 * the file below will fail.
! 		 */
  		unlink(tmppath);
  	}
  
  	/* Set flag to tell caller there was no existent file */
  	*use_existent = false;
  
***
*** 2280,2285  XLogFileInit(uint32 log, uint32 seg,
--- 2282,2289 
  		   errmsg("could not open file \"%s\" (log file %u, segment %u): %m",
    path, log, seg)));
  
+ 	elog(DEBUG2, "done creating and filling new WAL file");
+ 
  	return fd;
  }
  
***
*** 2409,2418  XLogFileCopy(uint32 log, uint32 seg,
   * place.  This should be TRUE except during bootstrap log creation.  The
   * caller must *not* hold the lock at call.
   *
!  * Returns TRUE if file installed, FALSE if not installed because of
!  * exceeding max_advance limit.  On Windows, we also return FALSE if we
!  * can't rename the file into place because someone's got it open.
!  * (Any other kind of failure causes ereport().)
   */
  static bool
  InstallXLogFileSegment(uint32 *log, uint32 *seg, char *tmppath,
--- 2413,2421 
   * place.  This should be TRUE except during bootstrap log creation.  The
   * caller must *not* hold the lock at call.
   *
!  * Returns TRUE if the file was installed successfully. FALSE indicates that
!  * max_advance limit was exceeded, or an error occurred while renaming the
!  * file into place.
   */
  static bool
  InstallXLogFileSegment(uint32 *log, uint32 *seg, char *tmppath,
***
*** 2460,2490  InstallXLogFileSegment(uint32 *log, uint32 *seg, char *tmppath,
  	 */
  #if HAVE_WORKING_LINK
  	if (link(tmppath, path) < 0)
! 		ereport(ERROR,
  (errcode_for_file_access(),
   errmsg("could not link file \"%s\" to \"%s\" (initialization of log file %u, segment %u): %m",
  		tmppath, path, *log, *seg)));
  	unlink(tmppath);
  #else
  	if (rename(tmppath, path) < 0)
  	{
! #ifdef WIN32
! #if !defined(__CYGWIN__)
! 		if (GetLastError() == ERROR_ACCESS_DENIED)
! #else
! 		if (errno == EACCES)
! #endif
! 		{
! 			if (use_lock)
! LWLockRelease(ControlFileLock);
! 			return false;
! 		}
! #endif   /* WIN32 */
! 
! 		ereport(ERROR,
  (errcode_for_file_access(),
   errmsg("could not rename file \"%s\" to \"%s\" (initialization of log file %u, segment %u): %m",
  		tmppath, path, *log, *seg)));
  	}
  #endif
  
--- 2463,2488 
  	 */
  #if HAVE_WORKING_LINK
  	if (link(tmppath, path) < 0)
! 	{
! 		if (use_lock)
! 			LWLockRelease(ControlFileLock);
! 		ereport(LOG,
  (errcode_for_file_access(),
   errmsg("could not link file \"%s\" to \"%s\" (initialization of log file %u, segment %u): %m",
  		tmppath, path, *log, *seg)));
+ 		return false;
+ 	}
  	unlink(tmppath);
  #else
  	if (rename(tmppath, path) < 0)
  	{
! 		if (use_lo

Re: [HACKERS] Ragged CSV import

2009-09-11 Thread Dimitri Fontaine

Hi,

Andrew Dunstan  writes:
> I do like the idea of COPY returning a SETOF text[], but I am not at all
> clear on the mechanics of feeding STDIN to an SRF. ISTM that something like
> a RETURNING clause on COPY and the ability to use it in FROM clause or
> something similar might work better. 

I envisonned COPY "just" returning all what it reads (minus extra discarded
column as soon as your proposal gets implemented), in the FROM clause,
and the user sorting out what he wants in the SELECT clause.

> I understand the difficulties, but
> maybe we could place some restrictions on where it could be used so as to
> obviate at least some of those.

Maybe instead of opening FROM for COPY, having it accepted in WITH would
be better, the same way (from the user point of view) that DML returning
are worked on.

>INSERT into foo (x,y,z) select t[3],[t2],[t57] from (COPY RETURNING
>t FROM stdin CSV);

Would become:

 WITH csv AS (
   COPY t FROM stdin CSV
 )
 INSERT INTO foo(x, y, z) 
 SELECT t[3], t[2], mytimestamptz([5], [6], [7])
   FROM csv;

Now the text[] has a strange feeling, without it it'd be:

 WITH csv AS (
   COPY t(a, b, c, d, e, f, g) 
   FROM stdin 
CSV IGNORING EXTRA COLUMNS -- random nice syntax
MISSING COLUMNS DEFAULTS NULL  -- that needs some reality check
 )
 INSERT INTO foo(x, y, z) 
 SELECT c, b, mytimestamptz(e, f, g)
   FROM csv;

The function mytimestamptz(date text, time text, timezone text) will
accept input that PostgreSQL input types would have errored out on... so
you can process in one go strange formats from other products.

> The only thing that's been seriously on the table that isn't accounted for
> by something like this is the suggestion of making the header line have some
> semantic significance, and I'm far from sure that's a good idea.

Yeah, and it seems only useful when you don't have any way to play with
what COPY returns before it goes to a table (magic column reordering or
ignoring).

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

82 matches

Mail list logo