Re: [HACKERS] Per-function GUC settings: trickier than it looked

2007-09-03 Thread Michael Paesold

Florian G. Pflug wrote:

Tom Lane wrote:

So, to reiterate, my idea is
.) Make "SET TRANSACTION" a synonym for "SET LOCAL" at the SQL-Level.
.) In pl/pgsql, "SET TRANSACTION" sets a new value that is kept after 
the

function exits, even if the function has a matching SET-clause.
.) "SET LOCAL" in pl/pgsql set a new value that is kept if the function
has no matching SET-clause. If it has one, the value is restored.
In any case, we emit a warning that "SET LOCAL" is going away.
.) One day, make "SET LOCAL" in pl/pgsql mean "local to the surrounding
BEGIN/END block". Independent of any SET-clauses the function
might or might not have.


I don't think it's a good idea to change SET LOCAL now and plan on
changing it again later ;-).  If we really want BEGIN-block-local
SET capability, I'd prefer to think of some new keyword for that.
But I'm not convinced it's interesting --- given the proposed behavior
of function SET-clauses, attaching a SET to your function seems like
it'll cover the need for restoring outer values.


Hm... could we still have "SET TRANSACTION" as a synonym for "SET LOCAL"?
That would blend nicely with "SET TRANSACTION ISOLATION LEVEL" and
"SET TRANSACTION READ ONLY".


I don't think it's a very good idea to make SET TRANSACTION an alias for 
SET LOCAL, because SET TRANSACTION has already got its own meaning in the 
SQL spec - it sets transaction modes. Although I agree with you that 
variables set with SET LOCAL are also attached to the transaction (by 
definition), I would still rather separate transaction-local GUCs from 
spec-defined transaction modes.


As precedence, they have two separate reference pages already:
http://www.postgresql.org/docs/8.1/interactive/sql-set.html
http://www.postgresql.org/docs/8.1/interactive/sql-set-transaction.html


[ thinking... ] Hey, wait a moment. Regarding "SET TRANSACTION READ ONLY" -
This is not strictly speaking a GUC, but still, if we pretend that
there are no subtransaction, that command should too propage to the
outermost transaction on release, shouldn't it?


...


I believe that for consistencies sake, the "set transaction read only" 
should have propagated to the outermost transaction on "release s1".


Sounds reasonable to me. I understand SAVEPOINT/RELEASE come from the SQL 
standard. So does the SQL standard say anything about this?


Best Regards
Michael Paesold

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] tsearch filenames unlikes special symbols and numbers

2007-09-03 Thread Pavel Stehule
2007/9/4, Tom Lane <[EMAIL PROTECTED]>:
> "Ben Tilly" <[EMAIL PROTECTED]> writes:
> > On 9/3/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> >> There seems fairly clear use-case for allowing A-Z a-z 0-9 and
> >> underscore (while CVS head rejects 0-9 and underscore).
>
> > The problem with allowing uppercase letters is that on some
> > filesystems foo and Foo are the same file, and on others they are not.
> >  This may lead to obscure portability problems where code worked fine
> > on Unix and then fails when the database is running on Windows.
>
> Yeah, good point.  So far it seems that a-z 0-9 and underscore cover the
> real use-cases, so what say we just allow those for now?  It's a lot
> easier to loosen up later than tighten up ...
>
> regards, tom lane
>

It's system specific. I prefere a-z and A-Z. Clasic name for
dictionaries combine lower and upper characters .. for czech
cs_CZ_UTF8 etc.

dictfile = cs_CZ_UTF8  ... automatic convert to cs_cz_utf8.dict
dictfile = 'cs_CZ_UTF8' .. check and use cs_CZ_UTF8

Regards
Pavel Stehule

p.s. it's important on UNIX platforms and without any efect on windows.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Per-function GUC settings: trickier than it looked

2007-09-03 Thread Florian G. Pflug

Tom Lane wrote:

So, to reiterate, my idea is
.) Make "SET TRANSACTION" a synonym for "SET LOCAL" at the SQL-Level.
.) In pl/pgsql, "SET TRANSACTION" sets a new value that is kept after the
function exits, even if the function has a matching SET-clause.
.) "SET LOCAL" in pl/pgsql set a new value that is kept if the function
has no matching SET-clause. If it has one, the value is restored.
In any case, we emit a warning that "SET LOCAL" is going away.
.) One day, make "SET LOCAL" in pl/pgsql mean "local to the surrounding
BEGIN/END block". Independent of any SET-clauses the function
might or might not have.


I don't think it's a good idea to change SET LOCAL now and plan on
changing it again later ;-).  If we really want BEGIN-block-local
SET capability, I'd prefer to think of some new keyword for that.
But I'm not convinced it's interesting --- given the proposed behavior
of function SET-clauses, attaching a SET to your function seems like
it'll cover the need for restoring outer values.


Hm... could we still have "SET TRANSACTION" as a synonym for "SET LOCAL"?
That would blend nicely with "SET TRANSACTION ISOLATION LEVEL" and
"SET TRANSACTION READ ONLY".

[ thinking... ] Hey, wait a moment. Regarding "SET TRANSACTION READ ONLY" -
This is not strictly speaking a GUC, but still, if we pretend that
there are no subtransaction, that command should too propage to the
outermost transaction on release, shouldn't it?

This is what happens currently (CVS HEAD with at least your initial
function-SET-clause patch already in)
regression=# begin ;
BEGIN
regression=# savepoint s1 ;
SAVEPOINT
regression=# set transaction read only ;
SET
regression=# release s1 ;
RELEASE
regression=# create table test (id int) ;
CREATE TABLE
regression=# commit ;
COMMIT

compared to:
regression=# begin ;
BEGIN
regression=# set transaction read only ;
SET
regression=# create table test (id int) ;
ERROR:  transaction is read-only

I believe that for consistencies sake, the "set transaction read only" should
have propagated to the outermost transaction on "release s1".

greetings, Florian Pflug


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Hash index todo list item

2007-09-03 Thread Ben Tilly
On 9/3/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Ben Tilly" <[EMAIL PROTECTED]> writes:
> > That raises a very random thought.  One of the nicer features of
> > Oracle is the ability to have function-based indexes.  So you could
> > index, say, trim(lower(person.name)).
>
> > Is there any prospect of postgres aquiring that functionality?
>
> Uh, no, since it's already there; has been since Berkeley days ...

Nice!

I know of at least one DBA who is moving from Oracle to postgres who
will be *very* happy to hear that.

Ben

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Updatable cursor doubt

2007-09-03 Thread FAST PostgreSQL

In CVS HEAD

workspace=# begin;
BEGIN
workspace=# declare cu cursor for select * from t1 for read only;
DECLARE CURSOR
workspace=# fetch cu;
 a
---
 1
(1 row)

workspace=# delete from t1 where current of cu;
DELETE 1
workspace=# commit;
COMMIT

Is this the intended behaviour? If so should we remove the 'READ ONLY' 
clause from the allowable syntax?


The documentation does not have 'READ ONLY' as part of the cursor syntax 
anymore.


Rgds,
Arul Shaji



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Per-function GUC settings: trickier than it looked

2007-09-03 Thread Tom Lane
"Florian G. Pflug" <[EMAIL PROTECTED]> writes:
> It still seems a bit strange that "SET LOCAL" is undone at function-exit,
> if the function has a matching SET-clause. But we need that for backwards-
> compatibility of the secure-search_path workaround, right?

Yeah, I'm afraid we backed ourselves into a corner on that one.

> So, to reiterate, my idea is
> .) Make "SET TRANSACTION" a synonym for "SET LOCAL" at the SQL-Level.
> .) In pl/pgsql, "SET TRANSACTION" sets a new value that is kept after the
> function exits, even if the function has a matching SET-clause.
> .) "SET LOCAL" in pl/pgsql set a new value that is kept if the function
> has no matching SET-clause. If it has one, the value is restored.
> In any case, we emit a warning that "SET LOCAL" is going away.
> .) One day, make "SET LOCAL" in pl/pgsql mean "local to the surrounding
> BEGIN/END block". Independent of any SET-clauses the function
> might or might not have.

I don't think it's a good idea to change SET LOCAL now and plan on
changing it again later ;-).  If we really want BEGIN-block-local
SET capability, I'd prefer to think of some new keyword for that.
But I'm not convinced it's interesting --- given the proposed behavior
of function SET-clauses, attaching a SET to your function seems like
it'll cover the need for restoring outer values.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Per-function GUC settings: trickier than it looked

2007-09-03 Thread Florian G. Pflug

Tom Lane wrote:

"Florian G. Pflug" <[EMAIL PROTECTED]> writes:

Tom Lane wrote:

Clear to everyone?  Any objections?



That makes "SET LOCAL" completely equivalent to "SET", except
when used inside a function that has a corresponding SET-clause, right?


Maybe it wasn't clear :-(.  They aren't equivalent because in the
absence of rollback, SET's effects persist past main-transaction end;
SET LOCAL's don't.  That's the way they were defined originally
(pre-subtransactions) and it still seems to make sense.

Ah, OK - things make much more sense now.


So I think *if* this is done, "SET LOCAL" should be renamed to
"SET FUNCTION". This would also prevent confusion, because everyone
who currently uses SET LOCAL will have to change his code anyway,
since the semantics change for every use-case apart from functions
with SET-clauses, which don't exist in 8.2.


I'm not sure how many people have really written code that depends on
the behavior of SET LOCAL rolling back at successful subtransaction end.
I think we'd have heard about it if very many people had noticed,
because it's not what the manual says.

For the one use we've actually advocated (setting a temporary value
within a function and then reverting to the old setting before exit),
there isn't any visible change in behavior, since abandonment of the
restored value at subtransaction end still ends up with the same result.


And renaming SET LOCAL also emphasized that point that we are taking
away functionality here - even if that functionality might not seem
very useful.


We can't break the officially advocated solution for secure search_path.
However, that particular coding pattern will still work with the change
I'm proposing.  It's only where you *don't* manually restore the prior
value that you might notice a difference.


BTW, I *did* check the documentation before responding to Simon's original
mail, and I *did* read it as "SET LOCAL goes away a subtransaction end".
I figured that since there is no word on subtransactions in that part
of the documentation, transaction will apply equally to both toplevel
and subtransaction.


Yeah, but you know that it's subtransactions under the hood, whereas
someone who's thinking in terms of SAVEPOINT/RELEASE and BEGIN/EXCEPTION
probably hasn't a clue about that.

I plead guilty here ;-). That whole SAVEPOINT/RELEASE thing always seemed
strange to me - I just accepted it at some point, but still translated it
into something hierarchical I guess




   .) In pl/pgsql, that fact that "SET LOCAL" goes away after the current
  BEGIN/END block seems entirely logical.


I don't think so ... your other side-effects such as table updates don't
disappear, so why should SET's

I guess because LOCAL to me implies some lexical locality - like the
surrounding BEGIN/END block.


I'm not necessarily averse to inventing a third version of SET, but I
don't see a well-thought-out proposal here.  In particular, we should be
making an effort to *not* expose the concept of subtransaction at the
SQL level at all, because that's not what the spec has.

Thanks for your explanation - I can see your point now, after realizing
why the spec has SAVEPOINT/RELEASE and *not* nested BEGIN/COMMIT blocks.

So, at least on the SQL-level, I guess I agree - your new semantics fit
better with the sql spec, even if they seemed quite strange to me at
first sight. Though maybe we should add  "SET TRANSACTION" as a synonym for
"SET LOCAL"? - the former seems to convey your new semantics much better than
the later.

It still seems a bit strange that "SET LOCAL" is undone at function-exit,
if the function has a matching SET-clause. But we need that for backwards-
compatibility of the secure-search_path workaround, right? Maybe we could
make "SET TRANSACTION" different from "SET LOCAL" in pl/pgsql, and warn
if "SET LOCAL" is used? That would enable us either get rid of "SET LOCAL"
in the long term, or to really make it local to the surrounding BEGIN/END
block.

So, to reiterate, my idea is
.) Make "SET TRANSACTION" a synonym for "SET LOCAL" at the SQL-Level.
.) In pl/pgsql, "SET TRANSACTION" sets a new value that is kept after the
   function exits, even if the function has a matching SET-clause.
.) "SET LOCAL" in pl/pgsql set a new value that is kept if the function
   has no matching SET-clause. If it has one, the value is restored.
   In any case, we emit a warning that "SET LOCAL" is going away.
.) One day, make "SET LOCAL" in pl/pgsql mean "local to the surrounding
   BEGIN/END block". Independent of any SET-clauses the function
   might or might not have.

The last idea might seem to create a inconsistency between the SQL-level
and pl/pgsql, but I think it does not. "SET LOCAL" is local to the
surrounding BEGIN/{END|COMMIT} block in both cases - it's just that
you have nested such blocks in pl/pgsql, but not in plain SQL.

greetings, Florian Pflug

---(end of broadcast)---
TIP 2: Don't 'kill -9

Re: [HACKERS] Hash index todo list item

2007-09-03 Thread Tom Lane
"Ben Tilly" <[EMAIL PROTECTED]> writes:
> That raises a very random thought.  One of the nicer features of
> Oracle is the ability to have function-based indexes.  So you could
> index, say, trim(lower(person.name)).

> Is there any prospect of postgres aquiring that functionality?

Uh, no, since it's already there; has been since Berkeley days ...

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Hash index todo list item

2007-09-03 Thread Kenneth Marshall
On Mon, Sep 03, 2007 at 05:20:34PM -0700, Ben Tilly wrote:
> 
> That raises a very random thought.  One of the nicer features of
> Oracle is the ability to have function-based indexes.  So you could
> index, say, trim(lower(person.name)).  There are a *lot* of practical
> situations where that comes in handy.  The best workaround that I can
> think of for not having that is to have a column defined to hold the
> result of the function, maintain that column with a trigger, then
> index that column.  Which works, but is inelegant.  (It also requires
> storing completely redundant data.)
> 
> Is there any prospect of postgres aquiring that functionality?
> 
> Ben
> 
I believe that PostgreSQL already supports functional indexes. In fact,
one suggestion to address the egregiously poor performance of the current
hash index was to replace it with a functional index.

Regards,
Ken

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Hash index todo list item

2007-09-03 Thread Ben Tilly
On 9/3/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
>
> "Kenneth Marshall" <[EMAIL PROTECTED]> writes:
>
> > On Sun, Sep 02, 2007 at 10:41:22PM -0400, Tom Lane wrote:
> >> Kenneth Marshall <[EMAIL PROTECTED]> writes:
> >> > ... This is the rough plan. Does anyone see anything critical that
> >> > is missing at this point?
> >>
> >> Sounds pretty good.  Let me brain-dump one item on you: one thing that
> >> hash currently has over btree is the ability to handle index items up
> >> to a full page.  Now, if you go with a scheme that only stores hash
> >> codes and not the underlying data, you can not only handle that but
> >> improve on it;
>
> I think that would be a big selling point for hash indexes. It would let you
> index even toasted data which are larger than a page. I'm not sure whether you
> can make it work for unique indexes though. But for non-unique indexes I think
> it would be a solid win and mean you cover a set of use cases quite distinct
> from btree indexes.
>
> > - Hash lookup is O(1) while btree is O(logN).
>
> That's not really true. There's a tradeoff between insertion time and lookup
> time. In order to get O(1) lookups you need to work pretty hard to maintain
> the hash table including spending a lot of time reorganizing it when you grow
> it. If you don't want to spend the time on inserts then you end up with
> buckets and the hash table is basically just a linear speedup to whatever
> algorithm you use to scan the buckets.

These facts notwithstanding, average insert performance remains O(1)
if you grow the hash exponentially every time it needs to be grown.
Suppose, for example, that you use a power of 2 arrangement.  Then the
worst case scenario, right after a split, is that all of your keys had
to be inserted, all had to be moved once, half had to be moved twice,
a quarter 3 times, etc.  So the ratio of moves to keys is 1 + 1/2 +
1/4 + ... which is a well-known geometric series converging on 2.

True, when you cross the threshold a lot of work needs to be done.
Life would be simpler if you could just put up a lock while you split
the hash.  You can't do that for a busy transactional database though.
 But if you want to be clever about it, you build into your hash
implementation the intelligence to be able to have 1 or 2 hash
locations to search.  When they are both present, all inserts go into
one of them, all deletes and updates are performed against both.  Then
you're able to have a background job reorganize your hash while the
database continues to use it.

> > - What about multi-column indexes? The current implementation
> >   only supports 1 column.
>
> That seems kind of weird. It seems obvious that you mix the three hashes
> together which reduces it to the solved problem.

That raises a very random thought.  One of the nicer features of
Oracle is the ability to have function-based indexes.  So you could
index, say, trim(lower(person.name)).  There are a *lot* of practical
situations where that comes in handy.  The best workaround that I can
think of for not having that is to have a column defined to hold the
result of the function, maintain that column with a trigger, then
index that column.  Which works, but is inelegant.  (It also requires
storing completely redundant data.)

Is there any prospect of postgres aquiring that functionality?

Ben

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] tsearch filenames unlikes special symbols and numbers

2007-09-03 Thread Tom Lane
"Ben Tilly" <[EMAIL PROTECTED]> writes:
> On 9/3/07, Tom Lane <[EMAIL PROTECTED]> wrote:
>> There seems fairly clear use-case for allowing A-Z a-z 0-9 and
>> underscore (while CVS head rejects 0-9 and underscore).

> The problem with allowing uppercase letters is that on some
> filesystems foo and Foo are the same file, and on others they are not.
>  This may lead to obscure portability problems where code worked fine
> on Unix and then fails when the database is running on Windows.

Yeah, good point.  So far it seems that a-z 0-9 and underscore cover the
real use-cases, so what say we just allow those for now?  It's a lot
easier to loosen up later than tighten up ...

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] tsearch filenames unlikes special symbols and numbers

2007-09-03 Thread Tom Lane
"Ben Tilly" <[EMAIL PROTECTED]> writes:
> I don't know what you're discussing well enough to know if this is
> relevant, but what you just said is not always true.  If there is any
> way to pass arbitrary binary data into your function call, then
> someone can pass in a string with nul in it.

Not a problem here, because the passed-in data is considered
nul-terminated already.  (Sometimes, not being 8-bit-clean is
an advantage...)

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Hash index todo list item

2007-09-03 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> "Kenneth Marshall" <[EMAIL PROTECTED]> writes:
>> - What about multi-column indexes? The current implementation
>> only supports 1 column.

> That seems kind of weird. It seems obvious that you mix the three hashes
> together which reduces it to the solved problem. 

No, because part of the deal is that you can do lookups using only the
leading index columns.  At least, all the existing multicolumn index
types can do that.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] tsearch filenames unlikes special symbols and numbers

2007-09-03 Thread Ben Tilly
On 9/3/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> Gregory Stark <[EMAIL PROTECTED]> writes:
> > "Tom Lane" <[EMAIL PROTECTED]> writes:
> >> I'm not convinced that . is issue-free.  On most if not all versions of 
> >> Unix,
> >> you are allowed to open a directory as a file and read the filenames it
> >> contains.  While I don't say it'd be easy to manage that through
> >> tsearch, there's at least a potential for discovering the filenames
> >> present in . and .. --- how much do we care about that?
>
> > Actually I don't think that's true any more, most file systems on most 
> > Unixen
> > do not allow it. However it appears it's still the case for Solaris so it's
> > still a good point.
>
> Actually, now that I've woken up a bit more, it is not a problem as
> long as the tsearch code always appends some kind of file extension
> to what the user gives, such as ".dict".  It'll be impossible to name
> "." or ".." with that addition.

I don't know what you're discussing well enough to know if this is
relevant, but what you just said is not always true.  If there is any
way to pass arbitrary binary data into your function call, then
someone can pass in a string with nul in it.  When that hits the OS
API, your appended .dict won't be seen as part of the filename.

(This is a common security oversight when calling C APIs from
higher-level languages such as Perl.  See
http://artofhacking.com/files/phrack/phrack55/P55-07.TXT for more.)

[...]

Cheers,
Ben

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] tsearch filenames unlikes special symbols and numbers

2007-09-03 Thread Ben Tilly
On 9/3/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > On the other hand, this means the name has to be quoted if it would be
> > quoted as an SQL identifier, right?
>
> Something like that.  I wasn't planning on rejecting uppercase letters,
> though, which would be necessary if you wanted to be strict about
> matching unquoted identifiers.
>
> There seems fairly clear use-case for allowing A-Z a-z 0-9 and
> underscore (while CVS head rejects 0-9 and underscore).  There also seem
> to be good arguments for disallowing / \ : on various platforms, which
> leaves us with some other punctuation in question, as well as the whole
> matter of non-ASCII characters.  I'm not sure whether we want to touch
> the idea of non-ASCII; comments?

The problem with allowing uppercase letters is that on some
filesystems foo and Foo are the same file, and on others they are not.
 This may lead to obscure portability problems where code worked fine
on Unix and then fails when the database is running on Windows.

The approach that I'd suggest is allow a very restricted subset as an
immediate solution (say a-z and 0-9), and plan to later allow
arbitrary data to be passed in, then be encoded in some way before
hitting disk.  (And later need not be much later - such encodings are
not that hard to write.)

Cheers,
Ben

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Per-function GUC settings: trickier than it looked

2007-09-03 Thread Tom Lane
"Florian G. Pflug" <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Clear to everyone?  Any objections?

> That makes "SET LOCAL" completely equivalent to "SET", except
> when used inside a function that has a corresponding SET-clause, right?

Maybe it wasn't clear :-(.  They aren't equivalent because in the
absence of rollback, SET's effects persist past main-transaction end;
SET LOCAL's don't.  That's the way they were defined originally
(pre-subtransactions) and it still seems to make sense.

> So I think *if* this is done, "SET LOCAL" should be renamed to
> "SET FUNCTION". This would also prevent confusion, because everyone
> who currently uses SET LOCAL will have to change his code anyway,
> since the semantics change for every use-case apart from functions
> with SET-clauses, which don't exist in 8.2.

I'm not sure how many people have really written code that depends on
the behavior of SET LOCAL rolling back at successful subtransaction end.
I think we'd have heard about it if very many people had noticed,
because it's not what the manual says.

For the one use we've actually advocated (setting a temporary value
within a function and then reverting to the old setting before exit),
there isn't any visible change in behavior, since abandonment of the
restored value at subtransaction end still ends up with the same result.

> And renaming SET LOCAL also emphasized that point that we are taking
> away functionality here - even if that functionality might not seem
> very useful.

We can't break the officially advocated solution for secure search_path.
However, that particular coding pattern will still work with the change
I'm proposing.  It's only where you *don't* manually restore the prior
value that you might notice a difference.

> BTW, I *did* check the documentation before responding to Simon's original
> mail, and I *did* read it as "SET LOCAL goes away a subtransaction end".
> I figured that since there is no word on subtransactions in that part
> of the documentation, transaction will apply equally to both toplevel
> and subtransaction.

Yeah, but you know that it's subtransactions under the hood, whereas
someone who's thinking in terms of SAVEPOINT/RELEASE and BEGIN/EXCEPTION
probably hasn't a clue about that.

>.) In pl/pgsql, that fact that "SET LOCAL" goes away after the current
>   BEGIN/END block seems entirely logical.

I don't think so ... your other side-effects such as table updates don't
disappear, so why should SET's?

I'm not necessarily averse to inventing a third version of SET, but I
don't see a well-thought-out proposal here.  In particular, we should be
making an effort to *not* expose the concept of subtransaction at the
SQL level at all, because that's not what the spec has.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Per-function GUC settings: trickier than it looked

2007-09-03 Thread Florian G. Pflug

Tom Lane wrote:

"Florian G. Pflug" <[EMAIL PROTECTED]> writes:

And the rule becomes (I tend to forget things, so I like simple
rules that I can remember ;-) ) "For each SET-clause, there is
a pseudo-subtransaction affecting only *this* GUC".


The other question is whether we want to change the behavior of SET
LOCAL even in the absence of function SET-clauses.  The current rule
is that a LOCAL setting goes away at subtransaction commit, leading
to this behavior:

regression=# show regex_flavor;
 regex_flavor 
--

 advanced
(1 row)

regression=# begin;
BEGIN
regression=# savepoint x;
SAVEPOINT
regression=# set local regex_flavor to basic;
SET
regression=# release x;
RELEASE
regression=# show regex_flavor;
 regex_flavor 
--

 advanced
(1 row)

which makes some sense if you think of "release" as "subtransaction
end", but not a lot if you think of it as forgetting a savepoint.
Likewise, SET LOCAL within a plpgsql exception block goes away at
successful block exit, which is not the first thing you'd expect.
Neither of these behaviors are documented anywhere AFAIR; certainly
the SET reference page doesn't explain 'em.

I think we should probably take this opportunity to fix that, and
make SET LOCAL mean "persists until end of current top-level
transaction, unless rolled back earlier or within a function SET
clause".

So:

* Plain SET takes effect immediately and persists unless rolled back
or overridden by another explicit SET.  In particular the value will
escape out of a function that has a SET-clause for the same variable.

* SET LOCAL takes effect immediately and persists until rolled back,
overridden by another SET, or we exit a function that has a SET-clause
for the same variable.

* Rollback of a transaction or subtransaction cancels any SET or SET
LOCAL within it.  Otherwise, the latest un-rolled-back SET or SET LOCAL
determines the active value within a transaction, and the latest
un-rolled-back SET determines the value that will prevail after the
transaction commits.

* A function SET clause saves the entry-time value, and restores it at
function exit, except when overridden by an un-rolled-back SET (but not
SET LOCAL) within the function.

Clear to everyone?  Any objections?

That makes "SET LOCAL" completely equivalent to "SET", except
when used inside a function that has a corresponding SET-clause, right?
So I think *if* this is done, "SET LOCAL" should be renamed to
"SET FUNCTION". This would also prevent confusion, because everyone
who currently uses SET LOCAL will have to change his code anyway,
since the semantics change for every use-case apart from functions
with SET-clauses, which don't exist in 8.2.
Or am I overlooking something?

And renaming SET LOCAL also emphasized that point that we are taking
away functionality here - even if that functionality might not seem
very useful.

BTW, I *did* check the documentation before responding to Simon's original
mail, and I *did* read it as "SET LOCAL goes away a subtransaction end".
I figured that since there is no word on subtransactions in that part
of the documentation, transaction will apply equally to both toplevel
and subtransaction. It might very well be that I'm the only one who
read it that way, though ;-) And I must admin that I wasn't completely
sure, so I *did* try it out before I posted...

I'd strong prefer "SET LOCAL" to kept it's current semantics, only that "SET
LOCAL" changes will now be rolled back if the function has a matching 
SET-clause. For multiple reasons:

  .) It's useful to be able to temporarily change GUCs from a client, and
 being able to reset them afterwards. Using a subtransaction for this
 is maybe a bit wastefull, but at least it works.
  .) In pl/pgsql, that fact that "SET LOCAL" goes away after the current
 BEGIN/END block seems entirely logical.
  .) It doesn't take away existing functionality

greetings, Florian Pflug

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Hash index todo list item

2007-09-03 Thread Gregory Stark

"Kenneth Marshall" <[EMAIL PROTECTED]> writes:

> On Sun, Sep 02, 2007 at 10:41:22PM -0400, Tom Lane wrote:
>> Kenneth Marshall <[EMAIL PROTECTED]> writes:
>> > ... This is the rough plan. Does anyone see anything critical that
>> > is missing at this point?
>> 
>> Sounds pretty good.  Let me brain-dump one item on you: one thing that
>> hash currently has over btree is the ability to handle index items up
>> to a full page.  Now, if you go with a scheme that only stores hash
>> codes and not the underlying data, you can not only handle that but
>> improve on it; 

I think that would be a big selling point for hash indexes. It would let you
index even toasted data which are larger than a page. I'm not sure whether you
can make it work for unique indexes though. But for non-unique indexes I think
it would be a solid win and mean you cover a set of use cases quite distinct
from btree indexes.

> - Hash lookup is O(1) while btree is O(logN). 

That's not really true. There's a tradeoff between insertion time and lookup
time. In order to get O(1) lookups you need to work pretty hard to maintain
the hash table including spending a lot of time reorganizing it when you grow
it. If you don't want to spend the time on inserts then you end up with
buckets and the hash table is basically just a linear speedup to whatever
algorithm you use to scan the buckets.


> - What about multi-column indexes? The current implementation
>   only supports 1 column.

That seems kind of weird. It seems obvious that you mix the three hashes
together which reduces it to the solved problem. 

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Per-function GUC settings: trickier than it looked

2007-09-03 Thread Tom Lane
"Florian G. Pflug" <[EMAIL PROTECTED]> writes:
> And the rule becomes (I tend to forget things, so I like simple
> rules that I can remember ;-) ) "For each SET-clause, there is
> a pseudo-subtransaction affecting only *this* GUC".

The other question is whether we want to change the behavior of SET
LOCAL even in the absence of function SET-clauses.  The current rule
is that a LOCAL setting goes away at subtransaction commit, leading
to this behavior:

regression=# show regex_flavor;
 regex_flavor 
--
 advanced
(1 row)

regression=# begin;
BEGIN
regression=# savepoint x;
SAVEPOINT
regression=# set local regex_flavor to basic;
SET
regression=# release x;
RELEASE
regression=# show regex_flavor;
 regex_flavor 
--
 advanced
(1 row)

which makes some sense if you think of "release" as "subtransaction
end", but not a lot if you think of it as forgetting a savepoint.
Likewise, SET LOCAL within a plpgsql exception block goes away at
successful block exit, which is not the first thing you'd expect.
Neither of these behaviors are documented anywhere AFAIR; certainly
the SET reference page doesn't explain 'em.

I think we should probably take this opportunity to fix that, and
make SET LOCAL mean "persists until end of current top-level
transaction, unless rolled back earlier or within a function SET
clause".

So:

* Plain SET takes effect immediately and persists unless rolled back
or overridden by another explicit SET.  In particular the value will
escape out of a function that has a SET-clause for the same variable.

* SET LOCAL takes effect immediately and persists until rolled back,
overridden by another SET, or we exit a function that has a SET-clause
for the same variable.

* Rollback of a transaction or subtransaction cancels any SET or SET
LOCAL within it.  Otherwise, the latest un-rolled-back SET or SET LOCAL
determines the active value within a transaction, and the latest
un-rolled-back SET determines the value that will prevail after the
transaction commits.

* A function SET clause saves the entry-time value, and restores it at
function exit, except when overridden by an un-rolled-back SET (but not
SET LOCAL) within the function.

Clear to everyone?  Any objections?

As far as implementation, I think this can be made to happen by
rejiggering the value stacking and unstacking rules within guc.c.
I'm tempted to try to get rid of the "tentative" value slots at the
same time.  That's a hangover from the pre-subtransaction
implementation, when we only had to remember one inactive value for the
case of SET followed by SET LOCAL within a transaction.  Now that we
have a stack of saved values, it seems to make more sense to try to
handle this case by stacking the SET value when we hit SET LOCAL at the
same nesting level.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] tsearch filenames unlikes special symbols and numbers

2007-09-03 Thread Heikki Linnakangas
Tom Lane wrote:
> I'm not sure whether we want to touch
> the idea of non-ASCII; comments?

Non-ASCII filenames sounds like recipe for problems to me. We don't know
what encoding the filenames are in on disk.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] tsearch filenames unlikes special symbols and numbers

2007-09-03 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> On the other hand, this means the name has to be quoted if it would be
> quoted as an SQL identifier, right?

Something like that.  I wasn't planning on rejecting uppercase letters,
though, which would be necessary if you wanted to be strict about
matching unquoted identifiers.

There seems fairly clear use-case for allowing A-Z a-z 0-9 and
underscore (while CVS head rejects 0-9 and underscore).  There also seem
to be good arguments for disallowing / \ : on various platforms, which
leaves us with some other punctuation in question, as well as the whole
matter of non-ASCII characters.  I'm not sure whether we want to touch
the idea of non-ASCII; comments?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [pgsql-www] [HACKERS] \dF wrt text search

2007-09-03 Thread Stefan Kaltenbrunner
Decibel! wrote:
> On Sun, Sep 02, 2007 at 11:18:45PM -0300, Marc G. Fournier wrote:
>> Should be fixed now, running a manual run of it right now, give it about 15 
>> minutes or so ...
> 
> Is there now monitoring for it as well?

yes


Stefan


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] tsearch filenames unlikes special symbols and numbers

2007-09-03 Thread Alvaro Herrera
Tom Lane escribió:

> Possibly we could allow '.' as long as we forbade /, but the other
> trouble with allowing . is that it encourages people to try to specify
> the filetype suffix (as indeed Oleg was doing).  I'd prefer to keep the
> suffixes out of the SQL object definitions, with an eye to possibly
> someday migrating all the configuration data inside the database.
> There's a reasonable argument for restricting the names used for these
> things in the SQL definitions to be valid SQL identifiers, so that that
> will work nicely...

Well, if we were to use SQL identifiers, we couldn't forbade anything
too much, seeing as almost anything can be used as an identifier, so
long as it is properly quoted.

But it seems to me like we could just pick an convenient subset which
doesn't make any OS too angry about it (say, reject / \ . and :), and
when we get to using actual SQL identifiers, we can enlarge the
supported char set without creating any backwards-compatibility problem.

On the other hand, this means the name has to be quoted if it would be
quoted as an SQL identifier, right?

-- 
Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J
"Nunca confiaré en un traidor.  Ni siquiera si el traidor lo he creado yo"
(Barón Vladimir Harkonnen)

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] tsearch filenames unlikes special symbols and numbers

2007-09-03 Thread Florian Pflug

Trevor Talbot wrote:

On 9/3/07, Mark Mielke <[EMAIL PROTECTED]> wrote:

Tom Lane wrote:

Also,  says that Windows throws an error for ":" in the filename,
which means we needn't.



Windows doesn't fail - but it can do odd things. For example, try:

C:\> echo hi >foo:bar

If one then checks the directory, one finds a "foo".


: is used for naming streams and attribute types in NTFS filenames.
It's not very well-known functionality and tends to confuse people,
but I'm not aware of any situation where it'd be a problem for read
access.  (Creation is not a security risk in the technical sense, but
as most administrators aren't aware of alternate data streams and the
shell does not expose them, it's effectively hidden data.)

If any of you are familiar with MacOS HFS resource forks, NTFS
basically supports an arbitrary number of named forks.  A file is
collection of one or more data streams, the single unnamed stream
being default.


On MacOS (prior) to OSX, : was used as a directory seperator (Paths
looked like "My Harddisk:My Folder:Somefile"). In OSX, "/" is used,
but for backwards-compatibility the Finder translates "/" in filenames
to ":". So, of you do for example "touch 'my:test'" on the shell,
you see "my/test" in the Finder.

Thats another argument for staying away from : in filenames.

greetings, Florian Pflug

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Hash index todo list item

2007-09-03 Thread Kenneth Marshall
On Mon, Sep 03, 2007 at 10:33:54AM +0100, Simon Riggs wrote:
> > 
> > This is the rough plan. Does anyone see anything critical that
> > is missing at this point? Please send me any suggestions for test
> > data and various performance test ideas, since I will be working
> > on that first.
> 
> Sounds good.
> 
> I'd be particularly interested in large indexes, say ~ 0.5 - 2GB. There
> are likely to be various effects apparent as the indexes grow. It would
> be too easy to do all the tests with smaller indexes and miss things.
> 
> Other factors are:
> - volatility
> - concurrency
> 
> My general experience is that hash-based indexes are better when the
> range of inputs is relatively well-known, allowing a fast lookup. If
> that is the only benefit of hash indexes, a flexible hashing scheme may
> simply weaken the benefit-case for using them. If that's true, should
> the index build process examine the key values in the data to determine
> the best parameters to use? Kind of ANALYZE before build.
> 
> My current feeling is that they ought to be very good at handling
> read-mostly situations such as privilege checking or UPDATE-intensive
> situations such as Customer-Current-Credit tracking, when the number of
> customers is large.
> 
> It might also be worth looking at lossy hash indexes, i.e. the index
> stores only the block numbers. That would need to be part of the
> discussion around how lossy we will allow indexes to be.
> 
> We currently have two kinds of full text index with different
> concurrency use cases, so it should be acceptable to have hash indexes
> have a clear benefit in one use case but a clear loss in another.
> 
> -- 
>   Simon Riggs
>   2ndQuadrant  http://www.2ndQuadrant.com
> 

Simon,

Thank you for your input. I would like to include some tests with large
indexes too. Do you have any ideas for a test corpus or should we try
and generate the test data programatically? Many people in the literature
of text retrieval use the TREC* data for at least some of their runs. I
am going to check at work to see if the campus has access to the data,
otherwise I will do some web crawling to generate some sample data. I
have just posted a reply to Tom Lane with some further ideas for consideration
in the new hash index support. Like you, I suspect that volatile data that
results in many index changes may not work well with hash indexes, in general.
PostgreSQL has the additional burden of needing to access both the index and
the data heap. Obviously, the less I/O that is needed the better the
performance is likely to be. The new HOT functionality plus clustering the
table data on the hash index would effectively organize the table into the
"hash buckets" which could help with reducing both the churn in the index
as well as in the tables.

Regards,
Ken

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Per-function GUC settings: trickier than it looked

2007-09-03 Thread Florian G. Pflug

Tom Lane wrote:

"Florian G. Pflug" <[EMAIL PROTECTED]> writes:

At least for me, the least surprising behaviour would be to
revert it too. Than the rule becomes "a function is always
executed in a pseudo-subtransaction that affects only GUCs"


Only if it has at least one SET clause.  The overhead is too high
to insist on this for every function call.


In that case, I agree that only variables specified in a SET-clause
should be reverted. Otherwise, adding or removing
SET-clauses (e.g, because you chose a different implementation
of a function that suddenly doesn't need regexps anymore) will
cause quite arbitrary behavior changes.

And the rule becomes (I tend to forget things, so I like simple
rules that I can remember ;-) ) "For each SET-clause, there is
a pseudo-subtransaction affecting only *this* GUC".

greetings, Florian Pflug



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


[HACKERS] Code examples

2007-09-03 Thread Decibel!
Moving to -docs

On Sun, Sep 02, 2007 at 06:46:11PM -0400, Tom Lane wrote:
> > Another problem I see are broken examples of dictionary and parser in 
> > documentation:
> > http://momjian.us/main/writings/pgsql/sgml/textsearch-rule-dictionary-example.html
> > http://momjian.us/main/writings/pgsql/sgml/textsearch-parser-example.html
> 
> Yeah, I wanted to discuss that with you.  Code examples in sgml docs are
> a bad idea: they're impossible to use as actual templates, because of
> all the weird markup changes, and there's no easy way to notice if
> they're broken.  It would be better to remove these from the docs and
> set them up as contrib modules.

Couldn't we come up with some method of specifying code examples in the
docs and then having the doc build process actually run those examples
and put that into the doc build?

I wrote some code that does this back when I was thinking about writing
a book, if anyone wants to see it.
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)


pgpfIY6tWUOJ1.pgp
Description: PGP signature


Re: [HACKERS] Hash index todo list item

2007-09-03 Thread Kenneth Marshall
On Sun, Sep 02, 2007 at 10:41:22PM -0400, Tom Lane wrote:
> Kenneth Marshall <[EMAIL PROTECTED]> writes:
> > ... This is the rough plan. Does anyone see anything critical that
> > is missing at this point?
> 
> Sounds pretty good.  Let me brain-dump one item on you: one thing that
> hash currently has over btree is the ability to handle index items up
> to a full page.  Now, if you go with a scheme that only stores hash
> codes and not the underlying data, you can not only handle that but
> improve on it; but if you reduce the bucket size and don't remove the
> data, it'd be a step backward.  The idea I had about dealing with that
> was to only reduce the size of primary buckets --- if it's necessary to
> add overflow space to a bucket, the overflow units are still full pages.
> So an index tuple up to a page in size can always be accommodated by
> adding an overflow page to the bucket.
> 
> Just a thought, but AFAIR it's not in the archives anywhere.
> 
>   regards, tom lane
> 
Tom,

Thank you for the input. I agree that keeping the ability to accomodate
an index tuple up to a page is size worth keeping. I think that your
goal in reducing the bucket size is to improve the packing efficiency
of the index. Since the on disk page size remains the same, it may be
possible to use a different structure overlayed on the current bucket
size and still improve the packing efficiency of the index. After some
more mulling, here are some further thoughts on the improved hash table
implementation:

- Hash lookup is O(1) while btree is O(logN). Is there a value
  in optimizing the NOT case, i.e. the entry is not in the table?

- Space versus performance trade-off. This may tie into cache
  efficiency and use of L2/L3, shared buffers, main memory.
  Denser layouts with a higher load facter may be a bit slower
  in lookups but play much nicer in a multi-user system. Look
  at the possibility of a lossy mapping?

- Build time versus update time. How does concurrency enter into
  the picture regarding simultaneous updates, inserts, deletes,
  and lookups?

- Could a hybrid structure with some type of prefix compression
  give a more efficient layout and possibly improve performance?

- Index larger fields. Btree is limited to blocksize/3, the
  current hash implementation can go up to a full block.

- What about multi-column indexes? The current implementation
  only supports 1 column.

More ideas are welcome and I will add them to the list for
investigation.

Regards,
Ken

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] tsearch filenames unlikes special symbols and numbers

2007-09-03 Thread Trevor Talbot
On 9/3/07, Mark Mielke <[EMAIL PROTECTED]> wrote:
> Tom Lane wrote:
> > Also,  says that Windows throws an error for ":" in the filename,
> > which means we needn't.

> Windows doesn't fail - but it can do odd things. For example, try:
>
> C:\> echo hi >foo:bar
>
> If one then checks the directory, one finds a "foo".

: is used for naming streams and attribute types in NTFS filenames.
It's not very well-known functionality and tends to confuse people,
but I'm not aware of any situation where it'd be a problem for read
access.  (Creation is not a security risk in the technical sense, but
as most administrators aren't aware of alternate data streams and the
shell does not expose them, it's effectively hidden data.)

If any of you are familiar with MacOS HFS resource forks, NTFS
basically supports an arbitrary number of named forks.  A file is
collection of one or more data streams, the single unnamed stream
being default.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Per-function GUC settings: trickier than it looked

2007-09-03 Thread Tom Lane
"Florian G. Pflug" <[EMAIL PROTECTED]> writes:
> At least for me, the least surprising behaviour would be to
> revert it too. Than the rule becomes "a function is always
> executed in a pseudo-subtransaction that affects only GUCs"

Only if it has at least one SET clause.  The overhead is too high
to insist on this for every function call.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Per-function GUC settings: trickier than it looked

2007-09-03 Thread Florian G. Pflug

Tom Lane wrote:

"Florian G. Pflug" <[EMAIL PROTECTED]> writes:

Tom Lane wrote:

So it seems that only SET LOCAL within a function with per-function
GUC settings is at issue.  I think that there is a pretty strong
use-case for saying that if you have a per-function setting of a
particular variable foo, then any "SET LOCAL foo" within the function
ought to vanish at function end --- for instance a function could want
to try a few different search_path settings and automatically revert to
the caller's setting on exit.

Agreed.

> The question is what about SET LOCAL

on a variable that *hasn't* been explicitly SET by the function
definition.  Either approach we take with it could be surprising,
but probably having it revert at function end is more surprising...


At least for me, the least surprising behaviour would be to
revert it too. Than the rule becomes "a function is always
executed in a pseudo-subtransaction that affects only GUCs"

Since at least for pl/pgsql, a function body *alreay* is a
BEGIN/END block - and therefore syntactically even looks
like a subtransaction - this seems quite logical.

And it would mean that the semantics of "SET LOCAL" won't change,
just because you add an EXCEPTION clause to the function's toplevel
BEGIN/END block.

greetings, Florian Pflug


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] [ANN] SE-PostgreSQL 8.2.4-1.0 Released

2007-09-03 Thread KaiGai Kohei
This is announcement of the first official version of SE-PostgreSQL.


  SE-PostgreSQL 8.2.4-1.0 Released

SE-PostgreSQL development team released "SE-PostgreSQL 8.2.4-1.0" and
"The SE-PostgreSQL Security Guide (Japanese/English)".

You can get these packages from the following URL:
  http://code.google.com/p/sepgsql/downloads/list

NOTE: Any packages built for Fedora development edition (rawhide) will
  be distributed via Fedora mirrors soon.

* SE-PostgreSQL 8.2.4-1.0
  sepostgresql-8.2.4-1.0.fc7.i386.rpm
  sepostgresql-8.2.4-1.0.fc7.src.rpm
* The base security policy for Fedora 7
  selinux-policy-devel-2.6.4-38.sepgsql.fc7.noarch.rpm
  selinux-policy-targeted-2.6.4-38.sepgsql.fc7.noarch.rpm
  selinux-policy-2.6.4-38.sepgsql.fc7.noarch.rpm
  selinux-policy-2.6.4-38.sepgsql.fc7.src.rpm
* The official documentation
  sepgsql_security_guide.20070903.jp.pdf
  sepgsql_security_guide.20070903.en.pdf

See the following URL, for installation details.
* SE-PostgreSQL Installation Memo (Fedora 7)
  http://code.google.com/p/sepgsql/wiki/install_memo_Fedora7

The features of SE-PostgreSQL
-
Security-Enhanced PostgreSQL (SE-PostgreSQL) is a security extension built
in PostgreSQL. It enables to apply a unified security policy of SELinux to
both operating system and database management system.
In addition, it also provides fine-grained access control including column
and row level, and mandatory access control being non-bypassable, even if
privileged database users.
These features enables to build a database management system into information
flow control scheme integrated with operating system, and to protect our
information asset from threats like manipulation or leaking.

The position of this version

This is the first official version of SE-PostgreSQL based on PostgreSQL 8.2.4.
However, it does not have enough achievement of works compared to the original
PostgreSQL. Therefore, we recommend you to have enough evaluation and
verification on its introduction.
The series of SE-PostgreSQL 8.2.4-1.x got into maintenance phase after the
release of this version, and we don't have any plan to release new version
in this series, except for bug fixes.
The SE-PostgreSQL development team has a plan to develop next major version
of SE-PostgreSQL based on PostgreSQL 8.3, with several new features.

Acknowledgment
--
The development of SE-PostgreSQL is supported by Exploratory Software Project,
IPA(Information-technology Promotion Agency, Japan).

-- 
KaiGai Kohei <[EMAIL PROTECTED]>

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Per-function GUC settings: trickier than it looked

2007-09-03 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> ISTM that SET LOCAL is mostly superceded by per-function parameters.

Mostly, but not entirely.  The case where you still need SET LOCAL is
where the value you want to use locally has to be computed, or where you
need to change it more than once within the function.  Yet in such cases
it'd still be handy to let the SET-clause mechanism deal with the detail
of restoring the caller's value at exit.

There is also a fairly nasty backward-compatibility problem.  Suppose
that security definer function OldSD uses the recommended-up-to-now
method for setting a secure search path, which I quote from the 8.2
manual:

old_path := pg_catalog.current_setting('search_path');

PERFORM pg_catalog.set_config('search_path', 'admin, pg_temp', true);

-- Do whatever secure work we came for.

PERFORM pg_catalog.set_config('search_path', old_path, true);

(The set_config calls are equivalent to SET LOCAL.)  Also suppose that
security definer function NewSD uses the fancy new function-local-
SET-clause method to avoid all that tedious stuff there.  Now suppose
that NewSD calls OldSD.  If SET LOCAL overrides SET-clauses, this
happens:

* NewSD saves outer search path and sets its own.
* OldSD saves NewSD's search path, then sets its own with SET LOCAL.
* OldSD restores NewSD's search path using SET LOCAL.
* NewSD tries to restore outer search path, but silently fails
  because SET LOCAL takes precedence.
* We exit to the caller with NewSD's search path still in effect.

This scenario will surely happen in the field, and therefore I argue
that we *must* not allow SET LOCAL's effects to persist beyond the
exit from a surrounding function-local SET clause on the same variable.

I'm not sure what conclusions that leads to for other cases, though.
We don't necessarily have to be consistent between the case where
SET affects a variable and the case where it doesn't.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] tsearch filenames unlikes special symbols and numbers

2007-09-03 Thread Mark Mielke

Tom Lane wrote:

Also,  says that Windows throws an error for ":" in the filename,
which means we needn't.

  

Windows doesn't fail - but it can do odd things. For example, try:

   C:\> echo hi >foo:bar

If one then checks the directory, one finds a "foo".

Depending on *which* API one uses, the rules may change around a bit - 
but whatever the situation, as long as you prefix it with a valid path, 
the ":" is not going to cause you problems.



It might still be a good idea to restrict the names to be SQL
identifiers (ie, alphanumerics and underscores) for future-proofing,
but it wasn't clear whether anyone but me thought that was a good
argument.  I'm willing to make it just be no-dir-separators.
  

I think it is a good argument.

Cheers,
mark

--
Mark Mielke <[EMAIL PROTECTED]>

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] tsearch filenames unlikes special symbols and numbers

2007-09-03 Thread Mark Mielke

Tom Lane wrote:

Magnus Hagander <[EMAIL PROTECTED]> writes:
  

On Mon, Sep 03, 2007 at 07:47:14AM +0100, Gregory Stark wrote:


Actually I think in Windows \ : and . are problems (not allowed more
than one dot in dos).

\ and : are problems.


Is : really a problem, given that the name in question will be appended
to a known directory's path?
  
The file name shouldn't have a ':' in it. Accessing a path with multiple 
':' in it to open a file for reading should just fail normally. So yes, 
there should be no problem.



. is not a problem. We don't support 16-bit windows anyway, and multiple
dots works fine on any system we support.


I'm not convinced that . is issue-free.  On most if not all versions of Unix,
you are allowed to open a directory as a file and read the filenames it
contains.  While I don't say it'd be easy to manage that through
tsearch, there's at least a potential for discovering the filenames
present in . and .. --- how much do we care about that?
  
No more than discovering the file names in any other directory without 
using '.' or '..'? If it matters, check to ensure it is a regular file 
before opening it?


Cheers,
mark

--
Mark Mielke <[EMAIL PROTECTED]>



Re: [HACKERS] tsearch filenames unlikes special symbols and numbers

2007-09-03 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes:

> It might still be a good idea to restrict the names to be SQL
> identifiers (ie, alphanumerics and underscores) for future-proofing,
> but it wasn't clear whether anyone but me thought that was a good
> argument.  I'm willing to make it just be no-dir-separators.

I thought that was a good argument actually.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] tsearch filenames unlikes special symbols and numbers

2007-09-03 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> "Tom Lane" <[EMAIL PROTECTED]> writes:
>> I'm not convinced that . is issue-free.  On most if not all versions of Unix,
>> you are allowed to open a directory as a file and read the filenames it
>> contains.  While I don't say it'd be easy to manage that through
>> tsearch, there's at least a potential for discovering the filenames
>> present in . and .. --- how much do we care about that?

> Actually I don't think that's true any more, most file systems on most Unixen
> do not allow it. However it appears it's still the case for Solaris so it's
> still a good point.

Actually, now that I've woken up a bit more, it is not a problem as
long as the tsearch code always appends some kind of file extension
to what the user gives, such as ".dict".  It'll be impossible to name
"." or ".." with that addition.

Also, Magnus says that Windows throws an error for ":" in the filename,
which means we needn't.

So the bottom line seems to be that rejecting directory separators
is sufficient to prevent any unwanted file accesses.

It might still be a good idea to restrict the names to be SQL
identifiers (ie, alphanumerics and underscores) for future-proofing,
but it wasn't clear whether anyone but me thought that was a good
argument.  I'm willing to make it just be no-dir-separators.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] tsearch filenames unlikes special symbols and numbers

2007-09-03 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes:

> I'm not convinced that . is issue-free.  On most if not all versions of Unix,
> you are allowed to open a directory as a file and read the filenames it
> contains.  While I don't say it'd be easy to manage that through
> tsearch, there's at least a potential for discovering the filenames
> present in . and .. --- how much do we care about that?

Actually I don't think that's true any more, most file systems on most Unixen
do not allow it. However it appears it's still the case for Solaris so it's
still a good point.

I'm sure it's not true for modern versions of Linux and I thought it was false
for other modern OSes -- I'm surprised it's not for Solaris even.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] FW: [ADMIN] max_connections and shared_buffers

2007-09-03 Thread Tom Lane
"Scott Marlowe" <[EMAIL PROTECTED]> writes:
> Not sure with Windows.  I'm strictly a unix type of guy.  I'm guessing
> that Windows is detecting too many connections / out of memory and
> shutting down the service.

The whole thing is pretty strange.  "received fast shutdown request"
means that the postmaster got SIGINT --- a moment's look at the code
proves there is no other possibility.  Now what sent it SIGINT?
AFAICS there are only two possible paths: "pg_ctl stop -m fast" or
this little bit of code in win32/signal.c:

/* Console control handler will execute on a thread created
   by the OS at the time of invocation */
static BOOL WINAPI
pg_console_handler(DWORD dwCtrlType)
{
if (dwCtrlType == CTRL_C_EVENT ||
dwCtrlType == CTRL_BREAK_EVENT ||
dwCtrlType == CTRL_CLOSE_EVENT ||
dwCtrlType == CTRL_SHUTDOWN_EVENT)
{
pg_queue_signal(SIGINT);
return TRUE;
}
return FALSE;
}

Can any Windows hackers speculate on causes of this?

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] tsearch filenames unlikes special symbols and numbers

2007-09-03 Thread Magnus Hagander
On Mon, Sep 03, 2007 at 09:27:19AM -0400, Tom Lane wrote:
> Magnus Hagander <[EMAIL PROTECTED]> writes:
> > On Mon, Sep 03, 2007 at 07:47:14AM +0100, Gregory Stark wrote:
> >> Actually I think in Windows \ : and . are problems (not allowed more
> >> than one dot in dos).
> 
> > \ and : are problems.
> 
> Is : really a problem, given that the name in question will be appended
> to a known directory's path?

Yes. It won't work - the API calls will reject it.

> > . is not a problem. We don't support 16-bit windows anyway, and multiple
> > dots works fine on any system we support.
> 
> I'm not convinced that . is issue-free.  On most if not all versions of Unix,
> you are allowed to open a directory as a file and read the filenames it
> contains.  While I don't say it'd be easy to manage that through
> tsearch, there's at least a potential for discovering the filenames
> present in . and .. --- how much do we care about that?

I just meant that it's not a problem on Win32 to have a file with multiple
dots in the name. There can certainly be *other* reasons for it. I don't
really see the need to have an extra dot in the filename in this particular
case, so I'd certainly be fine with restricting this one a lot more.

//Magnus

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] tsearch filenames unlikes special symbols and numbers

2007-09-03 Thread Tom Lane
Magnus Hagander <[EMAIL PROTECTED]> writes:
> On Mon, Sep 03, 2007 at 07:47:14AM +0100, Gregory Stark wrote:
>> Actually I think in Windows \ : and . are problems (not allowed more
>> than one dot in dos).

> \ and : are problems.

Is : really a problem, given that the name in question will be appended
to a known directory's path?

> . is not a problem. We don't support 16-bit windows anyway, and multiple
> dots works fine on any system we support.

I'm not convinced that . is issue-free.  On most if not all versions of Unix,
you are allowed to open a directory as a file and read the filenames it
contains.  While I don't say it'd be easy to manage that through
tsearch, there's at least a potential for discovering the filenames
present in . and .. --- how much do we care about that?

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] helps required in postgresql and visual basic 6.0

2007-09-03 Thread Heikki Linnakangas
Vishnu Aggarwal wrote:
> I am software engineer. i have a problem in postgresql. i am using
> postgresql 8.2.4 version and i have save image in table by java program
>  my table fields as id-integer ,image -bytea
> 
> i insert image by java in prepared statement by setbinarystream() its
> running perfects.
> i can retrieve image by java application also.
> 
> but when i used this in visual basic 6.0 its gives problem

This mailing list is for discussions regarding development of PostgreSQL
server. Please post your question on a more appropriate list.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] helps required in postgresql and visual basic 6.0

2007-09-03 Thread Vishnu Aggarwal
hello sir

I am software engineer. i have a problem in postgresql. i am using
postgresql 8.2.4 version and i have save image in table by java program
 my table fields as id-integer ,image -bytea

i insert image by java in prepared statement by setbinarystream() its
running perfects.
i can retrieve image by java application also.

but when i used this in visual basic 6.0 its gives problem

   Dim Chunk() As Byte
   Open imagename For Binary Access Write As #1
   ReDim Chunk(1 To RS("FImage").ActualSize)

   Chunk() = RS("FImage").GetChunk(RS("FImage").ActualSize)
   Put 1, , Chunk()
   Close #1


its says when its comes in image field its gives  errors

operation is not allowed in this context

helps me out


regards
vishnu aggarwal


Re: [HACKERS] integrated tsearch has different results than tsearch2

2007-09-03 Thread Teodor Sigaev

1. I am not able use fulltext with latin2 encoding :( I missing note
about only utf8 dictionaries in doc).
You can use any server encoding, but dictionary's files should be in utf8 - 
dictionary will convert utf8 files into server encoding.





2. with hspell dictionaries (fresh copy from open office) I got
different and wrong results.
postgres=# select to_tsvector('cs','Příliš žlutý kůň se napil žluté
vody') @@ to_tsquery('cs','napít');
 ?column?
--
 f
(1 row)


Pls, output of:
select ts_lexize('cspell','napil');
select to_tsvector('cs','Příliš žlutý kůň se napil žluté
vody');




--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Per-function GUC settings: trickier than it looked

2007-09-03 Thread Simon Riggs
On Mon, 2007-09-03 at 04:09 -0500, Decibel! wrote:
> On Sun, Sep 02, 2007 at 12:08:00PM -0400, Tom Lane wrote:
> > I notice BTW that we have never updated the SET reference page since
> > subtransactions were introduced --- it still says only that SET LOCAL
> > is "local to the current transaction", without a word about
> > subtransactions.  So we have a documentation problem anyway.  I recall
> > that we had some discussion during the 8.0 dev cycle about whether
> > having SET LOCAL's effects end at the end of the current subtransaction
> > was really a good idea, given that subtransactions aren't the conceptual
> > model the SQL spec defines, but nothing was ever done about changing
> > the implementation.
> 
> ISTM that's the real problem; SET LOCAL wasn't fully updated/considered
> when subtransactions were added.
> 
> One way to handle this would be to have 3 different behaviors for SET:
> session-level, transaction-level, and sub-transaction level. If we had
> that, we could probably make an across-the-board call that all functions
> operate as if in their own sub-transaction, at least when it comes to
> SET.

What would be the use case for that? I can't see a single reason to do a
SET LOCAL SUBTRANSACTION or whatever you'd call it. What you suggest
sounds nicely symmetrical, but I don't think we need it in practice.

ISTM that SET LOCAL is mostly superceded by per-function parameters.
Most parameters need to be tied to code, not transactions. Of course, my
wish to use synchronous_commit *was* tied to a transaction, but not a
subtransaction.

per-function parameters are sorely needed anyhow, since with session
pools we can't easily use the username for SET parameters.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Hash index todo list item

2007-09-03 Thread Simon Riggs
On Sun, 2007-09-02 at 13:04 -0500, Kenneth Marshall wrote:
> Dear PostgreSQL Hackers:
> 
> After following the hackers mailing list for quite a while,
> I am going to start investigating what will need to be done
> to improve hash index performance. Below are the pieces of
> this project that I am currently considering:
> 
> 1. Characterize the current hash index implementation against
>the BTree index, with a focus on space utilization and
>lookup performance against a collection of test data. This
>will give a baseline performance test to evaluate the impact
>of changes. I initially do not plan to bench the hash creation
>process since my initial focus will be on lookup performance.
> 
> 2. Evaluate the performance of different hash index implementations
>and/or changes to the current implementation. My current plan is
>to keep the implementation as simple as possible and still provide
>the desired performance. Several hash index suggestions deal with
>changing the layout of the keys on a page to improve lookup
>performance, including reducing the bucket size to a fraction of
>a page or only storing the hash value on the page, instead of
>the index value itself. My goal in this phase is to produce one
>or more versions with better performance than the current BTree.
>
> 3. Look at build time and concurrency issues with the addition of
>some additional tests to the test bed. (1)
> 
> 4. Repeat as needed.
> 
> This is the rough plan. Does anyone see anything critical that
> is missing at this point? Please send me any suggestions for test
> data and various performance test ideas, since I will be working
> on that first.

Sounds good.

I'd be particularly interested in large indexes, say ~ 0.5 - 2GB. There
are likely to be various effects apparent as the indexes grow. It would
be too easy to do all the tests with smaller indexes and miss things.

Other factors are:
- volatility
- concurrency

My general experience is that hash-based indexes are better when the
range of inputs is relatively well-known, allowing a fast lookup. If
that is the only benefit of hash indexes, a flexible hashing scheme may
simply weaken the benefit-case for using them. If that's true, should
the index build process examine the key values in the data to determine
the best parameters to use? Kind of ANALYZE before build.

My current feeling is that they ought to be very good at handling
read-mostly situations such as privilege checking or UPDATE-intensive
situations such as Customer-Current-Credit tracking, when the number of
customers is large.

It might also be worth looking at lossy hash indexes, i.e. the index
stores only the block numbers. That would need to be part of the
discussion around how lossy we will allow indexes to be.

We currently have two kinds of full text index with different
concurrency use cases, so it should be acceptable to have hash indexes
have a clear benefit in one use case but a clear loss in another.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [pgsql-www] [HACKERS] \dF wrt text search

2007-09-03 Thread Decibel!
On Sun, Sep 02, 2007 at 11:18:45PM -0300, Marc G. Fournier wrote:
> Should be fixed now, running a manual run of it right now, give it about 15 
> minutes or so ...

Is there now monitoring for it as well?
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)


pgp0mRvoBjNHJ.pgp
Description: PGP signature


Re: [HACKERS] Per-function GUC settings: trickier than it looked

2007-09-03 Thread Decibel!
On Sun, Sep 02, 2007 at 12:08:00PM -0400, Tom Lane wrote:
> I notice BTW that we have never updated the SET reference page since
> subtransactions were introduced --- it still says only that SET LOCAL
> is "local to the current transaction", without a word about
> subtransactions.  So we have a documentation problem anyway.  I recall
> that we had some discussion during the 8.0 dev cycle about whether
> having SET LOCAL's effects end at the end of the current subtransaction
> was really a good idea, given that subtransactions aren't the conceptual
> model the SQL spec defines, but nothing was ever done about changing
> the implementation.

ISTM that's the real problem; SET LOCAL wasn't fully updated/considered
when subtransactions were added.

One way to handle this would be to have 3 different behaviors for SET:
session-level, transaction-level, and sub-transaction level. If we had
that, we could probably make an across-the-board call that all functions
operate as if in their own sub-transaction, at least when it comes to
SET.

Whatever we decide on, least-surprise would dictate that it's the
same whether you apply function-specific settings or not.
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)


pgpFCPXAAI5Eg.pgp
Description: PGP signature


Re: [HACKERS] integrated tsearch has different results than tsearch2

2007-09-03 Thread Oleg Bartunov

Pavel,

I can't read your posting. Can you use plain text format ?

Oleg
On Mon, 3 Sep 2007, Pavel Stehule wrote:


Hello
I am testing fulltext.
1. I am not able use fulltext with latin2 encoding :( I missing noteabout only 
utf8 dictionaries in doc).

2. with hspell dictionaries (fresh copy from open office) I gotdifferent and 
wrong results.
Original (old) result
ts=# select * from ts_debug('P??li? ?lu?ou?k? k?? se napil ?lut? vody');
ts_name| tok_type | description |   token   | dict_name  |  tsvector 
--+--+-+---+---+ 
 default_czech | word | Word| P??li?
|{cz_ispell,simple} | 'p??li?' default_czech | word | Word| 
?lu?ou?k? |{cz_ispell,simple} | '?lu?ou?k?' default_czech | word | Word 
   | k??   | {cz_ispell,simple} | 'k??' default_czech | lword| Latin 
word  | se| {cz_ispell,simple} | default_czech | lword| Latin word  
| napil |{cz_ispell,simple} | 'nap?t' default_czech | word | Word   
 | ?lut? |{cz_ispell,simple} | '?lut?' default_czech | lword| Latin 
word  | vody  |{cz_ispell,simple} | 'voda' (7 ??dek)
New results:postgres=# create Text search dictionary 
cspell(template=ispell,dictfile=czech, afffile=czech, stopwords=czech);CREATE 
TEXT SEARCH DICTIONARYpostgres=# CREATE text search configuration cs 
(copy=english);CREATE TEXT SEARCH CONFIGURATION
postgres=# alter text search configuration cs alter mapping for word,lword  
with cspell, simple;ALTER TEXT SEARCH CONFIGURATIONpostgres=# select * from 
ts_debug('cs','P??li? ?lu?ou?k? k?? se napil?lut? vody'); Alias |  Description  
|   Token   |  Dictionaries   |Lexized 
token---+---+---+-+-
 word  | Word  | P??li?| {cspell,simple} | cspell: {p??li?} blank | 
Space symbols |   | {}  | word  | Word  | ?lu?ou?k? 
| {cspell,simple} | cspell: {?lu?ou?k?} blank | Space symbols |   | {}  
| word  | Word  | k??   | {cspell,simple} | cspell: 
{k??} blank | Space symbols |   | {}  | lword | Latin word  
  | se| {cspell,simple} | cspell: {} blank | Space symbols |   
| {}  | lword | Latin word| napil | {cspell,simple} | 
simple: {napil} blank | Space symbols |   | {}  | word  | 
Word  | ?lut? | {cspell,simple} | simple: {?lut?} blank | Space 
symbols |   | {}  | lword | Latin word| vody  | 
{cspell,simple} | simple: {vody}(13 rows)
This query returned true in 8.2 and now:
postgres=# select to_tsvector('cs','P??li? ?lut? k?? se napil ?lut?vody') @@ 
to_tsquery('cs','nap?t'); ?column?-- f(1 row)
RegardsPavel Stehule



Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] [PATCH] Lazy xid assingment V2

2007-09-03 Thread Heikki Linnakangas
August Zajonc wrote:
> The thing is, the leak occurs in situation where a COMMIT hasn't
> returned to the user, so we are trying to guarantee no data-loss even
> when the user doesn't see a successful commit? That's a tall order
> obviously and hopefully people design their apps to attend to
> transaction success / failure.

No, by the time we'd delete/move to trash the files, we know that
they're no longer needed. But because deleting a relation is such a
drastic thing to do, with potential for massive data loss, we're being
extra paranoid. What if there's a bug in PostgreSQL that makes us delete
the wrong file? Or transaction wraparound happens, despite the
protections that we now have in place? Or you have bad RAM in the
server, and a bit flips at an unfortunate place? That's the kind of
situations we're worried about.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] tsearch filenames unlikes special symbols and numbers

2007-09-03 Thread Magnus Hagander
On Mon, Sep 03, 2007 at 07:47:14AM +0100, Gregory Stark wrote:
> "Tom Lane" <[EMAIL PROTECTED]> writes:
> 
> > Gregory Stark <[EMAIL PROTECTED]> writes:
> >> "Tom Lane" <[EMAIL PROTECTED]> writes:
> >>> I made it reject all but latin letters, which is the same restriction
> >>> that's in place for timezone set filenames.  That might be overly
> >>> strong, but we definitely have to forbid "." and "/" (and "\" on
> >>> Windows).  Do we want to restrict it to letters, digits, underscore?
> >>> Or does it need to be weaker than that?
> >
> >> What's the problem with "."?
> >
> > ../../../../etc/passwd
> >
> > Possibly we could allow '.' as long as we forbade /, 
> 
> Right, traditionally the only characters forbidden in filenames in Unix are /
> and nul. If we want the files to play nice in Gnome etc then we should
> restrict them to ascii since we don't know what encoding the gui expects. 
> 
> Actually I think in Windows \ : and . are problems (not allowed more than one
> dot in dos).

\ and : are problems.

. is not a problem. We don't support 16-bit windows anyway, and multiple
dots works fine on any system we support.

//Magnus

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Per-function search_path => per-function GUC settings

2007-09-03 Thread Marko Kreen
On 9/2/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Marko Kreen" <[EMAIL PROTECTED]> writes:
> > On 9/2/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> >> Seems a little verbose, but maybe we could do "SET var FROM CURRENT"
> >> or "SET var FROM SESSION"?
>
> > I'd prefer FROM SESSION then.  FROM CURRENT seems unclear.
>
> Actually, I think FROM SESSION is unclear, as it opens the question
> whether the value to be applied is the session-wide setting or the
> currently active one.  Inside a transaction that has done SET LOCAL,
> these are different things.

I did not consider that.

> I think we pretty clearly want to have it take the currently active
> setting, and I'd vote for FROM CURRENT as the best way of expressing
> that.

Ok.

-- 
marko

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [pgsql-www] [HACKERS] \dF wrt text search

2007-09-03 Thread Magnus Hagander
On Sun, Sep 02, 2007 at 09:57:16PM -0400, Tom Lane wrote:
> Robert Treat <[EMAIL PROTECTED]> writes:
> > Blah I compiled last night, using the latest snapshot in the 
> > postgresql/dev/ directory in ftp, which, as I look now, has a date listed 
> > of 
> > 2007-08-10, and looking in the source the catversion is 200702251, which is 
> > before the tsearch bits hit the tree.  So, I think my problems lie in the 
> > snapshot no longer being updated :-\   CCing www in case someone wants to 
> > fix
> > it.  
> 
> I'll bet a dollar it got broken in the master-CVS-server move.

I thought it worked off the same cvs repo as the cvsweb stuff (on svr1),
which AFAIK isn't broken (since cvsweb works). But it's not documented
anywhere, and I do recall not being able to track down exactly how it
worked, so I won't be able to proove it to the point of being able to
collect that dollar ;-)

//Magnus

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] tsearch filenames unlikes special symbols and numbers

2007-09-03 Thread Trevor Talbot
On 9/2/07, Gregory Stark <[EMAIL PROTECTED]> wrote:

> Right, traditionally the only characters forbidden in filenames in Unix are /
> and nul. If we want the files to play nice in Gnome etc then we should
> restrict them to ascii since we don't know what encoding the gui expects.
>
> Actually I think in Windows \ : and . are problems (not allowed more than one
> dot in dos).

Reserved characters in Windows filenames are < > : " / \ | ? *
DOS limitations aren't relevant on the OS versions Postgres supports.

...but I thought this was about opening existing files, not creating
them, in which case the only relevant limitation is path separators.
Any other reserved characters are going to result in no open file,
rather than a security hole.

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


[HACKERS] integrated tsearch has different results than tsearch2

2007-09-03 Thread Pavel Stehule
Hello

I am testing fulltext.

1. I am not able use fulltext with latin2 encoding :( I missing note
about only utf8 dictionaries in doc).


2. with hspell dictionaries (fresh copy from open office) I got
different and wrong results.

Original (old) result

ts=# select * from ts_debug('Příliš žluťoučký kůň se napil žluté vody');
ts_name| tok_type | description |   token   | dict_name
  |  tsvector
 --+--+-+---+
---+ 
 default_czech | word | Word| Příliš|
{cz_ispell,simple} | 'příliš'
 default_czech | word | Word| žluťoučký |
{cz_ispell,simple} | 'žluťoučký'
 default_czech | word | Word| kůň   | {cz_ispell,simple} | 'kůň'
 default_czech | lword| Latin word  | se| {cz_ispell,simple} |
 default_czech | lword| Latin word  | napil |
{cz_ispell,simple} | 'napít'
 default_czech | word | Word| žluté |
{cz_ispell,simple} | 'žlutý'
 default_czech | lword| Latin word  | vody  |
{cz_ispell,simple} | 'voda'
 (7 řádek)

New results:
postgres=# create Text search dictionary cspell(template=ispell,
dictfile=czech, afffile=czech, stopwords=czech);
CREATE TEXT SEARCH DICTIONARY
postgres=# CREATE text search configuration cs (copy=english);
CREATE TEXT SEARCH CONFIGURATION

postgres=# alter text search configuration cs alter mapping for word,
lword  with cspell, simple;
ALTER TEXT SEARCH CONFIGURATION
postgres=# select * from ts_debug('cs','Příliš žluťoučký kůň se napil
žluté vody');
 Alias |  Description  |   Token   |  Dictionaries   |Lexized token
---+---+---+-+-
 word  | Word  | Příliš| {cspell,simple} | cspell: {příliš}
 blank | Space symbols |   | {}  |
 word  | Word  | žluťoučký | {cspell,simple} | cspell: {žluťoučký}
 blank | Space symbols |   | {}  |
 word  | Word  | kůň   | {cspell,simple} | cspell: {kůň}
 blank | Space symbols |   | {}  |
 lword | Latin word| se| {cspell,simple} | cspell: {}
 blank | Space symbols |   | {}  |
 lword | Latin word| napil | {cspell,simple} | simple: {napil}
 blank | Space symbols |   | {}  |
 word  | Word  | žluté | {cspell,simple} | simple: {žluté}
 blank | Space symbols |   | {}  |
 lword | Latin word| vody  | {cspell,simple} | simple: {vody}
(13 rows)

This query returned true in 8.2 and now:

postgres=# select to_tsvector('cs','Příliš žlutý kůň se napil žluté
vody') @@ to_tsquery('cs','napít');
 ?column?
--
 f
(1 row)

Regards
Pavel Stehule

---(end of broadcast)---
TIP 6: explain analyze is your friend