Re: [HACKERS] Per-function GUC settings: trickier than it looked
Florian G. Pflug wrote: Tom Lane wrote: So, to reiterate, my idea is .) Make "SET TRANSACTION" a synonym for "SET LOCAL" at the SQL-Level. .) In pl/pgsql, "SET TRANSACTION" sets a new value that is kept after the function exits, even if the function has a matching SET-clause. .) "SET LOCAL" in pl/pgsql set a new value that is kept if the function has no matching SET-clause. If it has one, the value is restored. In any case, we emit a warning that "SET LOCAL" is going away. .) One day, make "SET LOCAL" in pl/pgsql mean "local to the surrounding BEGIN/END block". Independent of any SET-clauses the function might or might not have. I don't think it's a good idea to change SET LOCAL now and plan on changing it again later ;-). If we really want BEGIN-block-local SET capability, I'd prefer to think of some new keyword for that. But I'm not convinced it's interesting --- given the proposed behavior of function SET-clauses, attaching a SET to your function seems like it'll cover the need for restoring outer values. Hm... could we still have "SET TRANSACTION" as a synonym for "SET LOCAL"? That would blend nicely with "SET TRANSACTION ISOLATION LEVEL" and "SET TRANSACTION READ ONLY". I don't think it's a very good idea to make SET TRANSACTION an alias for SET LOCAL, because SET TRANSACTION has already got its own meaning in the SQL spec - it sets transaction modes. Although I agree with you that variables set with SET LOCAL are also attached to the transaction (by definition), I would still rather separate transaction-local GUCs from spec-defined transaction modes. As precedence, they have two separate reference pages already: http://www.postgresql.org/docs/8.1/interactive/sql-set.html http://www.postgresql.org/docs/8.1/interactive/sql-set-transaction.html [ thinking... ] Hey, wait a moment. Regarding "SET TRANSACTION READ ONLY" - This is not strictly speaking a GUC, but still, if we pretend that there are no subtransaction, that command should too propage to the outermost transaction on release, shouldn't it? ... I believe that for consistencies sake, the "set transaction read only" should have propagated to the outermost transaction on "release s1". Sounds reasonable to me. I understand SAVEPOINT/RELEASE come from the SQL standard. So does the SQL standard say anything about this? Best Regards Michael Paesold ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] tsearch filenames unlikes special symbols and numbers
2007/9/4, Tom Lane <[EMAIL PROTECTED]>: > "Ben Tilly" <[EMAIL PROTECTED]> writes: > > On 9/3/07, Tom Lane <[EMAIL PROTECTED]> wrote: > >> There seems fairly clear use-case for allowing A-Z a-z 0-9 and > >> underscore (while CVS head rejects 0-9 and underscore). > > > The problem with allowing uppercase letters is that on some > > filesystems foo and Foo are the same file, and on others they are not. > > This may lead to obscure portability problems where code worked fine > > on Unix and then fails when the database is running on Windows. > > Yeah, good point. So far it seems that a-z 0-9 and underscore cover the > real use-cases, so what say we just allow those for now? It's a lot > easier to loosen up later than tighten up ... > > regards, tom lane > It's system specific. I prefere a-z and A-Z. Clasic name for dictionaries combine lower and upper characters .. for czech cs_CZ_UTF8 etc. dictfile = cs_CZ_UTF8 ... automatic convert to cs_cz_utf8.dict dictfile = 'cs_CZ_UTF8' .. check and use cs_CZ_UTF8 Regards Pavel Stehule p.s. it's important on UNIX platforms and without any efect on windows. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Per-function GUC settings: trickier than it looked
Tom Lane wrote: So, to reiterate, my idea is .) Make "SET TRANSACTION" a synonym for "SET LOCAL" at the SQL-Level. .) In pl/pgsql, "SET TRANSACTION" sets a new value that is kept after the function exits, even if the function has a matching SET-clause. .) "SET LOCAL" in pl/pgsql set a new value that is kept if the function has no matching SET-clause. If it has one, the value is restored. In any case, we emit a warning that "SET LOCAL" is going away. .) One day, make "SET LOCAL" in pl/pgsql mean "local to the surrounding BEGIN/END block". Independent of any SET-clauses the function might or might not have. I don't think it's a good idea to change SET LOCAL now and plan on changing it again later ;-). If we really want BEGIN-block-local SET capability, I'd prefer to think of some new keyword for that. But I'm not convinced it's interesting --- given the proposed behavior of function SET-clauses, attaching a SET to your function seems like it'll cover the need for restoring outer values. Hm... could we still have "SET TRANSACTION" as a synonym for "SET LOCAL"? That would blend nicely with "SET TRANSACTION ISOLATION LEVEL" and "SET TRANSACTION READ ONLY". [ thinking... ] Hey, wait a moment. Regarding "SET TRANSACTION READ ONLY" - This is not strictly speaking a GUC, but still, if we pretend that there are no subtransaction, that command should too propage to the outermost transaction on release, shouldn't it? This is what happens currently (CVS HEAD with at least your initial function-SET-clause patch already in) regression=# begin ; BEGIN regression=# savepoint s1 ; SAVEPOINT regression=# set transaction read only ; SET regression=# release s1 ; RELEASE regression=# create table test (id int) ; CREATE TABLE regression=# commit ; COMMIT compared to: regression=# begin ; BEGIN regression=# set transaction read only ; SET regression=# create table test (id int) ; ERROR: transaction is read-only I believe that for consistencies sake, the "set transaction read only" should have propagated to the outermost transaction on "release s1". greetings, Florian Pflug ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Hash index todo list item
On 9/3/07, Tom Lane <[EMAIL PROTECTED]> wrote: > "Ben Tilly" <[EMAIL PROTECTED]> writes: > > That raises a very random thought. One of the nicer features of > > Oracle is the ability to have function-based indexes. So you could > > index, say, trim(lower(person.name)). > > > Is there any prospect of postgres aquiring that functionality? > > Uh, no, since it's already there; has been since Berkeley days ... Nice! I know of at least one DBA who is moving from Oracle to postgres who will be *very* happy to hear that. Ben ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] Updatable cursor doubt
In CVS HEAD workspace=# begin; BEGIN workspace=# declare cu cursor for select * from t1 for read only; DECLARE CURSOR workspace=# fetch cu; a --- 1 (1 row) workspace=# delete from t1 where current of cu; DELETE 1 workspace=# commit; COMMIT Is this the intended behaviour? If so should we remove the 'READ ONLY' clause from the allowable syntax? The documentation does not have 'READ ONLY' as part of the cursor syntax anymore. Rgds, Arul Shaji ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Per-function GUC settings: trickier than it looked
"Florian G. Pflug" <[EMAIL PROTECTED]> writes: > It still seems a bit strange that "SET LOCAL" is undone at function-exit, > if the function has a matching SET-clause. But we need that for backwards- > compatibility of the secure-search_path workaround, right? Yeah, I'm afraid we backed ourselves into a corner on that one. > So, to reiterate, my idea is > .) Make "SET TRANSACTION" a synonym for "SET LOCAL" at the SQL-Level. > .) In pl/pgsql, "SET TRANSACTION" sets a new value that is kept after the > function exits, even if the function has a matching SET-clause. > .) "SET LOCAL" in pl/pgsql set a new value that is kept if the function > has no matching SET-clause. If it has one, the value is restored. > In any case, we emit a warning that "SET LOCAL" is going away. > .) One day, make "SET LOCAL" in pl/pgsql mean "local to the surrounding > BEGIN/END block". Independent of any SET-clauses the function > might or might not have. I don't think it's a good idea to change SET LOCAL now and plan on changing it again later ;-). If we really want BEGIN-block-local SET capability, I'd prefer to think of some new keyword for that. But I'm not convinced it's interesting --- given the proposed behavior of function SET-clauses, attaching a SET to your function seems like it'll cover the need for restoring outer values. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Per-function GUC settings: trickier than it looked
Tom Lane wrote: "Florian G. Pflug" <[EMAIL PROTECTED]> writes: Tom Lane wrote: Clear to everyone? Any objections? That makes "SET LOCAL" completely equivalent to "SET", except when used inside a function that has a corresponding SET-clause, right? Maybe it wasn't clear :-(. They aren't equivalent because in the absence of rollback, SET's effects persist past main-transaction end; SET LOCAL's don't. That's the way they were defined originally (pre-subtransactions) and it still seems to make sense. Ah, OK - things make much more sense now. So I think *if* this is done, "SET LOCAL" should be renamed to "SET FUNCTION". This would also prevent confusion, because everyone who currently uses SET LOCAL will have to change his code anyway, since the semantics change for every use-case apart from functions with SET-clauses, which don't exist in 8.2. I'm not sure how many people have really written code that depends on the behavior of SET LOCAL rolling back at successful subtransaction end. I think we'd have heard about it if very many people had noticed, because it's not what the manual says. For the one use we've actually advocated (setting a temporary value within a function and then reverting to the old setting before exit), there isn't any visible change in behavior, since abandonment of the restored value at subtransaction end still ends up with the same result. And renaming SET LOCAL also emphasized that point that we are taking away functionality here - even if that functionality might not seem very useful. We can't break the officially advocated solution for secure search_path. However, that particular coding pattern will still work with the change I'm proposing. It's only where you *don't* manually restore the prior value that you might notice a difference. BTW, I *did* check the documentation before responding to Simon's original mail, and I *did* read it as "SET LOCAL goes away a subtransaction end". I figured that since there is no word on subtransactions in that part of the documentation, transaction will apply equally to both toplevel and subtransaction. Yeah, but you know that it's subtransactions under the hood, whereas someone who's thinking in terms of SAVEPOINT/RELEASE and BEGIN/EXCEPTION probably hasn't a clue about that. I plead guilty here ;-). That whole SAVEPOINT/RELEASE thing always seemed strange to me - I just accepted it at some point, but still translated it into something hierarchical I guess .) In pl/pgsql, that fact that "SET LOCAL" goes away after the current BEGIN/END block seems entirely logical. I don't think so ... your other side-effects such as table updates don't disappear, so why should SET's I guess because LOCAL to me implies some lexical locality - like the surrounding BEGIN/END block. I'm not necessarily averse to inventing a third version of SET, but I don't see a well-thought-out proposal here. In particular, we should be making an effort to *not* expose the concept of subtransaction at the SQL level at all, because that's not what the spec has. Thanks for your explanation - I can see your point now, after realizing why the spec has SAVEPOINT/RELEASE and *not* nested BEGIN/COMMIT blocks. So, at least on the SQL-level, I guess I agree - your new semantics fit better with the sql spec, even if they seemed quite strange to me at first sight. Though maybe we should add "SET TRANSACTION" as a synonym for "SET LOCAL"? - the former seems to convey your new semantics much better than the later. It still seems a bit strange that "SET LOCAL" is undone at function-exit, if the function has a matching SET-clause. But we need that for backwards- compatibility of the secure-search_path workaround, right? Maybe we could make "SET TRANSACTION" different from "SET LOCAL" in pl/pgsql, and warn if "SET LOCAL" is used? That would enable us either get rid of "SET LOCAL" in the long term, or to really make it local to the surrounding BEGIN/END block. So, to reiterate, my idea is .) Make "SET TRANSACTION" a synonym for "SET LOCAL" at the SQL-Level. .) In pl/pgsql, "SET TRANSACTION" sets a new value that is kept after the function exits, even if the function has a matching SET-clause. .) "SET LOCAL" in pl/pgsql set a new value that is kept if the function has no matching SET-clause. If it has one, the value is restored. In any case, we emit a warning that "SET LOCAL" is going away. .) One day, make "SET LOCAL" in pl/pgsql mean "local to the surrounding BEGIN/END block". Independent of any SET-clauses the function might or might not have. The last idea might seem to create a inconsistency between the SQL-level and pl/pgsql, but I think it does not. "SET LOCAL" is local to the surrounding BEGIN/{END|COMMIT} block in both cases - it's just that you have nested such blocks in pl/pgsql, but not in plain SQL. greetings, Florian Pflug ---(end of broadcast)--- TIP 2: Don't 'kill -9
Re: [HACKERS] Hash index todo list item
"Ben Tilly" <[EMAIL PROTECTED]> writes: > That raises a very random thought. One of the nicer features of > Oracle is the ability to have function-based indexes. So you could > index, say, trim(lower(person.name)). > Is there any prospect of postgres aquiring that functionality? Uh, no, since it's already there; has been since Berkeley days ... regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Hash index todo list item
On Mon, Sep 03, 2007 at 05:20:34PM -0700, Ben Tilly wrote: > > That raises a very random thought. One of the nicer features of > Oracle is the ability to have function-based indexes. So you could > index, say, trim(lower(person.name)). There are a *lot* of practical > situations where that comes in handy. The best workaround that I can > think of for not having that is to have a column defined to hold the > result of the function, maintain that column with a trigger, then > index that column. Which works, but is inelegant. (It also requires > storing completely redundant data.) > > Is there any prospect of postgres aquiring that functionality? > > Ben > I believe that PostgreSQL already supports functional indexes. In fact, one suggestion to address the egregiously poor performance of the current hash index was to replace it with a functional index. Regards, Ken ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Hash index todo list item
On 9/3/07, Gregory Stark <[EMAIL PROTECTED]> wrote: > > "Kenneth Marshall" <[EMAIL PROTECTED]> writes: > > > On Sun, Sep 02, 2007 at 10:41:22PM -0400, Tom Lane wrote: > >> Kenneth Marshall <[EMAIL PROTECTED]> writes: > >> > ... This is the rough plan. Does anyone see anything critical that > >> > is missing at this point? > >> > >> Sounds pretty good. Let me brain-dump one item on you: one thing that > >> hash currently has over btree is the ability to handle index items up > >> to a full page. Now, if you go with a scheme that only stores hash > >> codes and not the underlying data, you can not only handle that but > >> improve on it; > > I think that would be a big selling point for hash indexes. It would let you > index even toasted data which are larger than a page. I'm not sure whether you > can make it work for unique indexes though. But for non-unique indexes I think > it would be a solid win and mean you cover a set of use cases quite distinct > from btree indexes. > > > - Hash lookup is O(1) while btree is O(logN). > > That's not really true. There's a tradeoff between insertion time and lookup > time. In order to get O(1) lookups you need to work pretty hard to maintain > the hash table including spending a lot of time reorganizing it when you grow > it. If you don't want to spend the time on inserts then you end up with > buckets and the hash table is basically just a linear speedup to whatever > algorithm you use to scan the buckets. These facts notwithstanding, average insert performance remains O(1) if you grow the hash exponentially every time it needs to be grown. Suppose, for example, that you use a power of 2 arrangement. Then the worst case scenario, right after a split, is that all of your keys had to be inserted, all had to be moved once, half had to be moved twice, a quarter 3 times, etc. So the ratio of moves to keys is 1 + 1/2 + 1/4 + ... which is a well-known geometric series converging on 2. True, when you cross the threshold a lot of work needs to be done. Life would be simpler if you could just put up a lock while you split the hash. You can't do that for a busy transactional database though. But if you want to be clever about it, you build into your hash implementation the intelligence to be able to have 1 or 2 hash locations to search. When they are both present, all inserts go into one of them, all deletes and updates are performed against both. Then you're able to have a background job reorganize your hash while the database continues to use it. > > - What about multi-column indexes? The current implementation > > only supports 1 column. > > That seems kind of weird. It seems obvious that you mix the three hashes > together which reduces it to the solved problem. That raises a very random thought. One of the nicer features of Oracle is the ability to have function-based indexes. So you could index, say, trim(lower(person.name)). There are a *lot* of practical situations where that comes in handy. The best workaround that I can think of for not having that is to have a column defined to hold the result of the function, maintain that column with a trigger, then index that column. Which works, but is inelegant. (It also requires storing completely redundant data.) Is there any prospect of postgres aquiring that functionality? Ben ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] tsearch filenames unlikes special symbols and numbers
"Ben Tilly" <[EMAIL PROTECTED]> writes: > On 9/3/07, Tom Lane <[EMAIL PROTECTED]> wrote: >> There seems fairly clear use-case for allowing A-Z a-z 0-9 and >> underscore (while CVS head rejects 0-9 and underscore). > The problem with allowing uppercase letters is that on some > filesystems foo and Foo are the same file, and on others they are not. > This may lead to obscure portability problems where code worked fine > on Unix and then fails when the database is running on Windows. Yeah, good point. So far it seems that a-z 0-9 and underscore cover the real use-cases, so what say we just allow those for now? It's a lot easier to loosen up later than tighten up ... regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] tsearch filenames unlikes special symbols and numbers
"Ben Tilly" <[EMAIL PROTECTED]> writes: > I don't know what you're discussing well enough to know if this is > relevant, but what you just said is not always true. If there is any > way to pass arbitrary binary data into your function call, then > someone can pass in a string with nul in it. Not a problem here, because the passed-in data is considered nul-terminated already. (Sometimes, not being 8-bit-clean is an advantage...) regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Hash index todo list item
Gregory Stark <[EMAIL PROTECTED]> writes: > "Kenneth Marshall" <[EMAIL PROTECTED]> writes: >> - What about multi-column indexes? The current implementation >> only supports 1 column. > That seems kind of weird. It seems obvious that you mix the three hashes > together which reduces it to the solved problem. No, because part of the deal is that you can do lookups using only the leading index columns. At least, all the existing multicolumn index types can do that. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] tsearch filenames unlikes special symbols and numbers
On 9/3/07, Tom Lane <[EMAIL PROTECTED]> wrote: > Gregory Stark <[EMAIL PROTECTED]> writes: > > "Tom Lane" <[EMAIL PROTECTED]> writes: > >> I'm not convinced that . is issue-free. On most if not all versions of > >> Unix, > >> you are allowed to open a directory as a file and read the filenames it > >> contains. While I don't say it'd be easy to manage that through > >> tsearch, there's at least a potential for discovering the filenames > >> present in . and .. --- how much do we care about that? > > > Actually I don't think that's true any more, most file systems on most > > Unixen > > do not allow it. However it appears it's still the case for Solaris so it's > > still a good point. > > Actually, now that I've woken up a bit more, it is not a problem as > long as the tsearch code always appends some kind of file extension > to what the user gives, such as ".dict". It'll be impossible to name > "." or ".." with that addition. I don't know what you're discussing well enough to know if this is relevant, but what you just said is not always true. If there is any way to pass arbitrary binary data into your function call, then someone can pass in a string with nul in it. When that hits the OS API, your appended .dict won't be seen as part of the filename. (This is a common security oversight when calling C APIs from higher-level languages such as Perl. See http://artofhacking.com/files/phrack/phrack55/P55-07.TXT for more.) [...] Cheers, Ben ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] tsearch filenames unlikes special symbols and numbers
On 9/3/07, Tom Lane <[EMAIL PROTECTED]> wrote: > Alvaro Herrera <[EMAIL PROTECTED]> writes: > > On the other hand, this means the name has to be quoted if it would be > > quoted as an SQL identifier, right? > > Something like that. I wasn't planning on rejecting uppercase letters, > though, which would be necessary if you wanted to be strict about > matching unquoted identifiers. > > There seems fairly clear use-case for allowing A-Z a-z 0-9 and > underscore (while CVS head rejects 0-9 and underscore). There also seem > to be good arguments for disallowing / \ : on various platforms, which > leaves us with some other punctuation in question, as well as the whole > matter of non-ASCII characters. I'm not sure whether we want to touch > the idea of non-ASCII; comments? The problem with allowing uppercase letters is that on some filesystems foo and Foo are the same file, and on others they are not. This may lead to obscure portability problems where code worked fine on Unix and then fails when the database is running on Windows. The approach that I'd suggest is allow a very restricted subset as an immediate solution (say a-z and 0-9), and plan to later allow arbitrary data to be passed in, then be encoded in some way before hitting disk. (And later need not be much later - such encodings are not that hard to write.) Cheers, Ben ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Per-function GUC settings: trickier than it looked
"Florian G. Pflug" <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> Clear to everyone? Any objections? > That makes "SET LOCAL" completely equivalent to "SET", except > when used inside a function that has a corresponding SET-clause, right? Maybe it wasn't clear :-(. They aren't equivalent because in the absence of rollback, SET's effects persist past main-transaction end; SET LOCAL's don't. That's the way they were defined originally (pre-subtransactions) and it still seems to make sense. > So I think *if* this is done, "SET LOCAL" should be renamed to > "SET FUNCTION". This would also prevent confusion, because everyone > who currently uses SET LOCAL will have to change his code anyway, > since the semantics change for every use-case apart from functions > with SET-clauses, which don't exist in 8.2. I'm not sure how many people have really written code that depends on the behavior of SET LOCAL rolling back at successful subtransaction end. I think we'd have heard about it if very many people had noticed, because it's not what the manual says. For the one use we've actually advocated (setting a temporary value within a function and then reverting to the old setting before exit), there isn't any visible change in behavior, since abandonment of the restored value at subtransaction end still ends up with the same result. > And renaming SET LOCAL also emphasized that point that we are taking > away functionality here - even if that functionality might not seem > very useful. We can't break the officially advocated solution for secure search_path. However, that particular coding pattern will still work with the change I'm proposing. It's only where you *don't* manually restore the prior value that you might notice a difference. > BTW, I *did* check the documentation before responding to Simon's original > mail, and I *did* read it as "SET LOCAL goes away a subtransaction end". > I figured that since there is no word on subtransactions in that part > of the documentation, transaction will apply equally to both toplevel > and subtransaction. Yeah, but you know that it's subtransactions under the hood, whereas someone who's thinking in terms of SAVEPOINT/RELEASE and BEGIN/EXCEPTION probably hasn't a clue about that. >.) In pl/pgsql, that fact that "SET LOCAL" goes away after the current > BEGIN/END block seems entirely logical. I don't think so ... your other side-effects such as table updates don't disappear, so why should SET's? I'm not necessarily averse to inventing a third version of SET, but I don't see a well-thought-out proposal here. In particular, we should be making an effort to *not* expose the concept of subtransaction at the SQL level at all, because that's not what the spec has. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Per-function GUC settings: trickier than it looked
Tom Lane wrote: "Florian G. Pflug" <[EMAIL PROTECTED]> writes: And the rule becomes (I tend to forget things, so I like simple rules that I can remember ;-) ) "For each SET-clause, there is a pseudo-subtransaction affecting only *this* GUC". The other question is whether we want to change the behavior of SET LOCAL even in the absence of function SET-clauses. The current rule is that a LOCAL setting goes away at subtransaction commit, leading to this behavior: regression=# show regex_flavor; regex_flavor -- advanced (1 row) regression=# begin; BEGIN regression=# savepoint x; SAVEPOINT regression=# set local regex_flavor to basic; SET regression=# release x; RELEASE regression=# show regex_flavor; regex_flavor -- advanced (1 row) which makes some sense if you think of "release" as "subtransaction end", but not a lot if you think of it as forgetting a savepoint. Likewise, SET LOCAL within a plpgsql exception block goes away at successful block exit, which is not the first thing you'd expect. Neither of these behaviors are documented anywhere AFAIR; certainly the SET reference page doesn't explain 'em. I think we should probably take this opportunity to fix that, and make SET LOCAL mean "persists until end of current top-level transaction, unless rolled back earlier or within a function SET clause". So: * Plain SET takes effect immediately and persists unless rolled back or overridden by another explicit SET. In particular the value will escape out of a function that has a SET-clause for the same variable. * SET LOCAL takes effect immediately and persists until rolled back, overridden by another SET, or we exit a function that has a SET-clause for the same variable. * Rollback of a transaction or subtransaction cancels any SET or SET LOCAL within it. Otherwise, the latest un-rolled-back SET or SET LOCAL determines the active value within a transaction, and the latest un-rolled-back SET determines the value that will prevail after the transaction commits. * A function SET clause saves the entry-time value, and restores it at function exit, except when overridden by an un-rolled-back SET (but not SET LOCAL) within the function. Clear to everyone? Any objections? That makes "SET LOCAL" completely equivalent to "SET", except when used inside a function that has a corresponding SET-clause, right? So I think *if* this is done, "SET LOCAL" should be renamed to "SET FUNCTION". This would also prevent confusion, because everyone who currently uses SET LOCAL will have to change his code anyway, since the semantics change for every use-case apart from functions with SET-clauses, which don't exist in 8.2. Or am I overlooking something? And renaming SET LOCAL also emphasized that point that we are taking away functionality here - even if that functionality might not seem very useful. BTW, I *did* check the documentation before responding to Simon's original mail, and I *did* read it as "SET LOCAL goes away a subtransaction end". I figured that since there is no word on subtransactions in that part of the documentation, transaction will apply equally to both toplevel and subtransaction. It might very well be that I'm the only one who read it that way, though ;-) And I must admin that I wasn't completely sure, so I *did* try it out before I posted... I'd strong prefer "SET LOCAL" to kept it's current semantics, only that "SET LOCAL" changes will now be rolled back if the function has a matching SET-clause. For multiple reasons: .) It's useful to be able to temporarily change GUCs from a client, and being able to reset them afterwards. Using a subtransaction for this is maybe a bit wastefull, but at least it works. .) In pl/pgsql, that fact that "SET LOCAL" goes away after the current BEGIN/END block seems entirely logical. .) It doesn't take away existing functionality greetings, Florian Pflug ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Hash index todo list item
"Kenneth Marshall" <[EMAIL PROTECTED]> writes: > On Sun, Sep 02, 2007 at 10:41:22PM -0400, Tom Lane wrote: >> Kenneth Marshall <[EMAIL PROTECTED]> writes: >> > ... This is the rough plan. Does anyone see anything critical that >> > is missing at this point? >> >> Sounds pretty good. Let me brain-dump one item on you: one thing that >> hash currently has over btree is the ability to handle index items up >> to a full page. Now, if you go with a scheme that only stores hash >> codes and not the underlying data, you can not only handle that but >> improve on it; I think that would be a big selling point for hash indexes. It would let you index even toasted data which are larger than a page. I'm not sure whether you can make it work for unique indexes though. But for non-unique indexes I think it would be a solid win and mean you cover a set of use cases quite distinct from btree indexes. > - Hash lookup is O(1) while btree is O(logN). That's not really true. There's a tradeoff between insertion time and lookup time. In order to get O(1) lookups you need to work pretty hard to maintain the hash table including spending a lot of time reorganizing it when you grow it. If you don't want to spend the time on inserts then you end up with buckets and the hash table is basically just a linear speedup to whatever algorithm you use to scan the buckets. > - What about multi-column indexes? The current implementation > only supports 1 column. That seems kind of weird. It seems obvious that you mix the three hashes together which reduces it to the solved problem. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Per-function GUC settings: trickier than it looked
"Florian G. Pflug" <[EMAIL PROTECTED]> writes: > And the rule becomes (I tend to forget things, so I like simple > rules that I can remember ;-) ) "For each SET-clause, there is > a pseudo-subtransaction affecting only *this* GUC". The other question is whether we want to change the behavior of SET LOCAL even in the absence of function SET-clauses. The current rule is that a LOCAL setting goes away at subtransaction commit, leading to this behavior: regression=# show regex_flavor; regex_flavor -- advanced (1 row) regression=# begin; BEGIN regression=# savepoint x; SAVEPOINT regression=# set local regex_flavor to basic; SET regression=# release x; RELEASE regression=# show regex_flavor; regex_flavor -- advanced (1 row) which makes some sense if you think of "release" as "subtransaction end", but not a lot if you think of it as forgetting a savepoint. Likewise, SET LOCAL within a plpgsql exception block goes away at successful block exit, which is not the first thing you'd expect. Neither of these behaviors are documented anywhere AFAIR; certainly the SET reference page doesn't explain 'em. I think we should probably take this opportunity to fix that, and make SET LOCAL mean "persists until end of current top-level transaction, unless rolled back earlier or within a function SET clause". So: * Plain SET takes effect immediately and persists unless rolled back or overridden by another explicit SET. In particular the value will escape out of a function that has a SET-clause for the same variable. * SET LOCAL takes effect immediately and persists until rolled back, overridden by another SET, or we exit a function that has a SET-clause for the same variable. * Rollback of a transaction or subtransaction cancels any SET or SET LOCAL within it. Otherwise, the latest un-rolled-back SET or SET LOCAL determines the active value within a transaction, and the latest un-rolled-back SET determines the value that will prevail after the transaction commits. * A function SET clause saves the entry-time value, and restores it at function exit, except when overridden by an un-rolled-back SET (but not SET LOCAL) within the function. Clear to everyone? Any objections? As far as implementation, I think this can be made to happen by rejiggering the value stacking and unstacking rules within guc.c. I'm tempted to try to get rid of the "tentative" value slots at the same time. That's a hangover from the pre-subtransaction implementation, when we only had to remember one inactive value for the case of SET followed by SET LOCAL within a transaction. Now that we have a stack of saved values, it seems to make more sense to try to handle this case by stacking the SET value when we hit SET LOCAL at the same nesting level. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] tsearch filenames unlikes special symbols and numbers
Tom Lane wrote: > I'm not sure whether we want to touch > the idea of non-ASCII; comments? Non-ASCII filenames sounds like recipe for problems to me. We don't know what encoding the filenames are in on disk. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] tsearch filenames unlikes special symbols and numbers
Alvaro Herrera <[EMAIL PROTECTED]> writes: > On the other hand, this means the name has to be quoted if it would be > quoted as an SQL identifier, right? Something like that. I wasn't planning on rejecting uppercase letters, though, which would be necessary if you wanted to be strict about matching unquoted identifiers. There seems fairly clear use-case for allowing A-Z a-z 0-9 and underscore (while CVS head rejects 0-9 and underscore). There also seem to be good arguments for disallowing / \ : on various platforms, which leaves us with some other punctuation in question, as well as the whole matter of non-ASCII characters. I'm not sure whether we want to touch the idea of non-ASCII; comments? regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [pgsql-www] [HACKERS] \dF wrt text search
Decibel! wrote: > On Sun, Sep 02, 2007 at 11:18:45PM -0300, Marc G. Fournier wrote: >> Should be fixed now, running a manual run of it right now, give it about 15 >> minutes or so ... > > Is there now monitoring for it as well? yes Stefan ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] tsearch filenames unlikes special symbols and numbers
Tom Lane escribió: > Possibly we could allow '.' as long as we forbade /, but the other > trouble with allowing . is that it encourages people to try to specify > the filetype suffix (as indeed Oleg was doing). I'd prefer to keep the > suffixes out of the SQL object definitions, with an eye to possibly > someday migrating all the configuration data inside the database. > There's a reasonable argument for restricting the names used for these > things in the SQL definitions to be valid SQL identifiers, so that that > will work nicely... Well, if we were to use SQL identifiers, we couldn't forbade anything too much, seeing as almost anything can be used as an identifier, so long as it is properly quoted. But it seems to me like we could just pick an convenient subset which doesn't make any OS too angry about it (say, reject / \ . and :), and when we get to using actual SQL identifiers, we can enlarge the supported char set without creating any backwards-compatibility problem. On the other hand, this means the name has to be quoted if it would be quoted as an SQL identifier, right? -- Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J "Nunca confiaré en un traidor. Ni siquiera si el traidor lo he creado yo" (Barón Vladimir Harkonnen) ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] tsearch filenames unlikes special symbols and numbers
Trevor Talbot wrote: On 9/3/07, Mark Mielke <[EMAIL PROTECTED]> wrote: Tom Lane wrote: Also, says that Windows throws an error for ":" in the filename, which means we needn't. Windows doesn't fail - but it can do odd things. For example, try: C:\> echo hi >foo:bar If one then checks the directory, one finds a "foo". : is used for naming streams and attribute types in NTFS filenames. It's not very well-known functionality and tends to confuse people, but I'm not aware of any situation where it'd be a problem for read access. (Creation is not a security risk in the technical sense, but as most administrators aren't aware of alternate data streams and the shell does not expose them, it's effectively hidden data.) If any of you are familiar with MacOS HFS resource forks, NTFS basically supports an arbitrary number of named forks. A file is collection of one or more data streams, the single unnamed stream being default. On MacOS (prior) to OSX, : was used as a directory seperator (Paths looked like "My Harddisk:My Folder:Somefile"). In OSX, "/" is used, but for backwards-compatibility the Finder translates "/" in filenames to ":". So, of you do for example "touch 'my:test'" on the shell, you see "my/test" in the Finder. Thats another argument for staying away from : in filenames. greetings, Florian Pflug ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Hash index todo list item
On Mon, Sep 03, 2007 at 10:33:54AM +0100, Simon Riggs wrote: > > > > This is the rough plan. Does anyone see anything critical that > > is missing at this point? Please send me any suggestions for test > > data and various performance test ideas, since I will be working > > on that first. > > Sounds good. > > I'd be particularly interested in large indexes, say ~ 0.5 - 2GB. There > are likely to be various effects apparent as the indexes grow. It would > be too easy to do all the tests with smaller indexes and miss things. > > Other factors are: > - volatility > - concurrency > > My general experience is that hash-based indexes are better when the > range of inputs is relatively well-known, allowing a fast lookup. If > that is the only benefit of hash indexes, a flexible hashing scheme may > simply weaken the benefit-case for using them. If that's true, should > the index build process examine the key values in the data to determine > the best parameters to use? Kind of ANALYZE before build. > > My current feeling is that they ought to be very good at handling > read-mostly situations such as privilege checking or UPDATE-intensive > situations such as Customer-Current-Credit tracking, when the number of > customers is large. > > It might also be worth looking at lossy hash indexes, i.e. the index > stores only the block numbers. That would need to be part of the > discussion around how lossy we will allow indexes to be. > > We currently have two kinds of full text index with different > concurrency use cases, so it should be acceptable to have hash indexes > have a clear benefit in one use case but a clear loss in another. > > -- > Simon Riggs > 2ndQuadrant http://www.2ndQuadrant.com > Simon, Thank you for your input. I would like to include some tests with large indexes too. Do you have any ideas for a test corpus or should we try and generate the test data programatically? Many people in the literature of text retrieval use the TREC* data for at least some of their runs. I am going to check at work to see if the campus has access to the data, otherwise I will do some web crawling to generate some sample data. I have just posted a reply to Tom Lane with some further ideas for consideration in the new hash index support. Like you, I suspect that volatile data that results in many index changes may not work well with hash indexes, in general. PostgreSQL has the additional burden of needing to access both the index and the data heap. Obviously, the less I/O that is needed the better the performance is likely to be. The new HOT functionality plus clustering the table data on the hash index would effectively organize the table into the "hash buckets" which could help with reducing both the churn in the index as well as in the tables. Regards, Ken ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Per-function GUC settings: trickier than it looked
Tom Lane wrote: "Florian G. Pflug" <[EMAIL PROTECTED]> writes: At least for me, the least surprising behaviour would be to revert it too. Than the rule becomes "a function is always executed in a pseudo-subtransaction that affects only GUCs" Only if it has at least one SET clause. The overhead is too high to insist on this for every function call. In that case, I agree that only variables specified in a SET-clause should be reverted. Otherwise, adding or removing SET-clauses (e.g, because you chose a different implementation of a function that suddenly doesn't need regexps anymore) will cause quite arbitrary behavior changes. And the rule becomes (I tend to forget things, so I like simple rules that I can remember ;-) ) "For each SET-clause, there is a pseudo-subtransaction affecting only *this* GUC". greetings, Florian Pflug ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[HACKERS] Code examples
Moving to -docs On Sun, Sep 02, 2007 at 06:46:11PM -0400, Tom Lane wrote: > > Another problem I see are broken examples of dictionary and parser in > > documentation: > > http://momjian.us/main/writings/pgsql/sgml/textsearch-rule-dictionary-example.html > > http://momjian.us/main/writings/pgsql/sgml/textsearch-parser-example.html > > Yeah, I wanted to discuss that with you. Code examples in sgml docs are > a bad idea: they're impossible to use as actual templates, because of > all the weird markup changes, and there's no easy way to notice if > they're broken. It would be better to remove these from the docs and > set them up as contrib modules. Couldn't we come up with some method of specifying code examples in the docs and then having the doc build process actually run those examples and put that into the doc build? I wrote some code that does this back when I was thinking about writing a book, if anyone wants to see it. -- Decibel!, aka Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) pgpfIY6tWUOJ1.pgp Description: PGP signature
Re: [HACKERS] Hash index todo list item
On Sun, Sep 02, 2007 at 10:41:22PM -0400, Tom Lane wrote: > Kenneth Marshall <[EMAIL PROTECTED]> writes: > > ... This is the rough plan. Does anyone see anything critical that > > is missing at this point? > > Sounds pretty good. Let me brain-dump one item on you: one thing that > hash currently has over btree is the ability to handle index items up > to a full page. Now, if you go with a scheme that only stores hash > codes and not the underlying data, you can not only handle that but > improve on it; but if you reduce the bucket size and don't remove the > data, it'd be a step backward. The idea I had about dealing with that > was to only reduce the size of primary buckets --- if it's necessary to > add overflow space to a bucket, the overflow units are still full pages. > So an index tuple up to a page in size can always be accommodated by > adding an overflow page to the bucket. > > Just a thought, but AFAIR it's not in the archives anywhere. > > regards, tom lane > Tom, Thank you for the input. I agree that keeping the ability to accomodate an index tuple up to a page is size worth keeping. I think that your goal in reducing the bucket size is to improve the packing efficiency of the index. Since the on disk page size remains the same, it may be possible to use a different structure overlayed on the current bucket size and still improve the packing efficiency of the index. After some more mulling, here are some further thoughts on the improved hash table implementation: - Hash lookup is O(1) while btree is O(logN). Is there a value in optimizing the NOT case, i.e. the entry is not in the table? - Space versus performance trade-off. This may tie into cache efficiency and use of L2/L3, shared buffers, main memory. Denser layouts with a higher load facter may be a bit slower in lookups but play much nicer in a multi-user system. Look at the possibility of a lossy mapping? - Build time versus update time. How does concurrency enter into the picture regarding simultaneous updates, inserts, deletes, and lookups? - Could a hybrid structure with some type of prefix compression give a more efficient layout and possibly improve performance? - Index larger fields. Btree is limited to blocksize/3, the current hash implementation can go up to a full block. - What about multi-column indexes? The current implementation only supports 1 column. More ideas are welcome and I will add them to the list for investigation. Regards, Ken ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] tsearch filenames unlikes special symbols and numbers
On 9/3/07, Mark Mielke <[EMAIL PROTECTED]> wrote: > Tom Lane wrote: > > Also, says that Windows throws an error for ":" in the filename, > > which means we needn't. > Windows doesn't fail - but it can do odd things. For example, try: > > C:\> echo hi >foo:bar > > If one then checks the directory, one finds a "foo". : is used for naming streams and attribute types in NTFS filenames. It's not very well-known functionality and tends to confuse people, but I'm not aware of any situation where it'd be a problem for read access. (Creation is not a security risk in the technical sense, but as most administrators aren't aware of alternate data streams and the shell does not expose them, it's effectively hidden data.) If any of you are familiar with MacOS HFS resource forks, NTFS basically supports an arbitrary number of named forks. A file is collection of one or more data streams, the single unnamed stream being default. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Per-function GUC settings: trickier than it looked
"Florian G. Pflug" <[EMAIL PROTECTED]> writes: > At least for me, the least surprising behaviour would be to > revert it too. Than the rule becomes "a function is always > executed in a pseudo-subtransaction that affects only GUCs" Only if it has at least one SET clause. The overhead is too high to insist on this for every function call. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Per-function GUC settings: trickier than it looked
Tom Lane wrote: "Florian G. Pflug" <[EMAIL PROTECTED]> writes: Tom Lane wrote: So it seems that only SET LOCAL within a function with per-function GUC settings is at issue. I think that there is a pretty strong use-case for saying that if you have a per-function setting of a particular variable foo, then any "SET LOCAL foo" within the function ought to vanish at function end --- for instance a function could want to try a few different search_path settings and automatically revert to the caller's setting on exit. Agreed. > The question is what about SET LOCAL on a variable that *hasn't* been explicitly SET by the function definition. Either approach we take with it could be surprising, but probably having it revert at function end is more surprising... At least for me, the least surprising behaviour would be to revert it too. Than the rule becomes "a function is always executed in a pseudo-subtransaction that affects only GUCs" Since at least for pl/pgsql, a function body *alreay* is a BEGIN/END block - and therefore syntactically even looks like a subtransaction - this seems quite logical. And it would mean that the semantics of "SET LOCAL" won't change, just because you add an EXCEPTION clause to the function's toplevel BEGIN/END block. greetings, Florian Pflug ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] [ANN] SE-PostgreSQL 8.2.4-1.0 Released
This is announcement of the first official version of SE-PostgreSQL. SE-PostgreSQL 8.2.4-1.0 Released SE-PostgreSQL development team released "SE-PostgreSQL 8.2.4-1.0" and "The SE-PostgreSQL Security Guide (Japanese/English)". You can get these packages from the following URL: http://code.google.com/p/sepgsql/downloads/list NOTE: Any packages built for Fedora development edition (rawhide) will be distributed via Fedora mirrors soon. * SE-PostgreSQL 8.2.4-1.0 sepostgresql-8.2.4-1.0.fc7.i386.rpm sepostgresql-8.2.4-1.0.fc7.src.rpm * The base security policy for Fedora 7 selinux-policy-devel-2.6.4-38.sepgsql.fc7.noarch.rpm selinux-policy-targeted-2.6.4-38.sepgsql.fc7.noarch.rpm selinux-policy-2.6.4-38.sepgsql.fc7.noarch.rpm selinux-policy-2.6.4-38.sepgsql.fc7.src.rpm * The official documentation sepgsql_security_guide.20070903.jp.pdf sepgsql_security_guide.20070903.en.pdf See the following URL, for installation details. * SE-PostgreSQL Installation Memo (Fedora 7) http://code.google.com/p/sepgsql/wiki/install_memo_Fedora7 The features of SE-PostgreSQL - Security-Enhanced PostgreSQL (SE-PostgreSQL) is a security extension built in PostgreSQL. It enables to apply a unified security policy of SELinux to both operating system and database management system. In addition, it also provides fine-grained access control including column and row level, and mandatory access control being non-bypassable, even if privileged database users. These features enables to build a database management system into information flow control scheme integrated with operating system, and to protect our information asset from threats like manipulation or leaking. The position of this version This is the first official version of SE-PostgreSQL based on PostgreSQL 8.2.4. However, it does not have enough achievement of works compared to the original PostgreSQL. Therefore, we recommend you to have enough evaluation and verification on its introduction. The series of SE-PostgreSQL 8.2.4-1.x got into maintenance phase after the release of this version, and we don't have any plan to release new version in this series, except for bug fixes. The SE-PostgreSQL development team has a plan to develop next major version of SE-PostgreSQL based on PostgreSQL 8.3, with several new features. Acknowledgment -- The development of SE-PostgreSQL is supported by Exploratory Software Project, IPA(Information-technology Promotion Agency, Japan). -- KaiGai Kohei <[EMAIL PROTECTED]> ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Per-function GUC settings: trickier than it looked
Simon Riggs <[EMAIL PROTECTED]> writes: > ISTM that SET LOCAL is mostly superceded by per-function parameters. Mostly, but not entirely. The case where you still need SET LOCAL is where the value you want to use locally has to be computed, or where you need to change it more than once within the function. Yet in such cases it'd still be handy to let the SET-clause mechanism deal with the detail of restoring the caller's value at exit. There is also a fairly nasty backward-compatibility problem. Suppose that security definer function OldSD uses the recommended-up-to-now method for setting a secure search path, which I quote from the 8.2 manual: old_path := pg_catalog.current_setting('search_path'); PERFORM pg_catalog.set_config('search_path', 'admin, pg_temp', true); -- Do whatever secure work we came for. PERFORM pg_catalog.set_config('search_path', old_path, true); (The set_config calls are equivalent to SET LOCAL.) Also suppose that security definer function NewSD uses the fancy new function-local- SET-clause method to avoid all that tedious stuff there. Now suppose that NewSD calls OldSD. If SET LOCAL overrides SET-clauses, this happens: * NewSD saves outer search path and sets its own. * OldSD saves NewSD's search path, then sets its own with SET LOCAL. * OldSD restores NewSD's search path using SET LOCAL. * NewSD tries to restore outer search path, but silently fails because SET LOCAL takes precedence. * We exit to the caller with NewSD's search path still in effect. This scenario will surely happen in the field, and therefore I argue that we *must* not allow SET LOCAL's effects to persist beyond the exit from a surrounding function-local SET clause on the same variable. I'm not sure what conclusions that leads to for other cases, though. We don't necessarily have to be consistent between the case where SET affects a variable and the case where it doesn't. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] tsearch filenames unlikes special symbols and numbers
Tom Lane wrote: Also, says that Windows throws an error for ":" in the filename, which means we needn't. Windows doesn't fail - but it can do odd things. For example, try: C:\> echo hi >foo:bar If one then checks the directory, one finds a "foo". Depending on *which* API one uses, the rules may change around a bit - but whatever the situation, as long as you prefix it with a valid path, the ":" is not going to cause you problems. It might still be a good idea to restrict the names to be SQL identifiers (ie, alphanumerics and underscores) for future-proofing, but it wasn't clear whether anyone but me thought that was a good argument. I'm willing to make it just be no-dir-separators. I think it is a good argument. Cheers, mark -- Mark Mielke <[EMAIL PROTECTED]> ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] tsearch filenames unlikes special symbols and numbers
Tom Lane wrote: Magnus Hagander <[EMAIL PROTECTED]> writes: On Mon, Sep 03, 2007 at 07:47:14AM +0100, Gregory Stark wrote: Actually I think in Windows \ : and . are problems (not allowed more than one dot in dos). \ and : are problems. Is : really a problem, given that the name in question will be appended to a known directory's path? The file name shouldn't have a ':' in it. Accessing a path with multiple ':' in it to open a file for reading should just fail normally. So yes, there should be no problem. . is not a problem. We don't support 16-bit windows anyway, and multiple dots works fine on any system we support. I'm not convinced that . is issue-free. On most if not all versions of Unix, you are allowed to open a directory as a file and read the filenames it contains. While I don't say it'd be easy to manage that through tsearch, there's at least a potential for discovering the filenames present in . and .. --- how much do we care about that? No more than discovering the file names in any other directory without using '.' or '..'? If it matters, check to ensure it is a regular file before opening it? Cheers, mark -- Mark Mielke <[EMAIL PROTECTED]>
Re: [HACKERS] tsearch filenames unlikes special symbols and numbers
"Tom Lane" <[EMAIL PROTECTED]> writes: > It might still be a good idea to restrict the names to be SQL > identifiers (ie, alphanumerics and underscores) for future-proofing, > but it wasn't clear whether anyone but me thought that was a good > argument. I'm willing to make it just be no-dir-separators. I thought that was a good argument actually. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] tsearch filenames unlikes special symbols and numbers
Gregory Stark <[EMAIL PROTECTED]> writes: > "Tom Lane" <[EMAIL PROTECTED]> writes: >> I'm not convinced that . is issue-free. On most if not all versions of Unix, >> you are allowed to open a directory as a file and read the filenames it >> contains. While I don't say it'd be easy to manage that through >> tsearch, there's at least a potential for discovering the filenames >> present in . and .. --- how much do we care about that? > Actually I don't think that's true any more, most file systems on most Unixen > do not allow it. However it appears it's still the case for Solaris so it's > still a good point. Actually, now that I've woken up a bit more, it is not a problem as long as the tsearch code always appends some kind of file extension to what the user gives, such as ".dict". It'll be impossible to name "." or ".." with that addition. Also, Magnus says that Windows throws an error for ":" in the filename, which means we needn't. So the bottom line seems to be that rejecting directory separators is sufficient to prevent any unwanted file accesses. It might still be a good idea to restrict the names to be SQL identifiers (ie, alphanumerics and underscores) for future-proofing, but it wasn't clear whether anyone but me thought that was a good argument. I'm willing to make it just be no-dir-separators. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] tsearch filenames unlikes special symbols and numbers
"Tom Lane" <[EMAIL PROTECTED]> writes: > I'm not convinced that . is issue-free. On most if not all versions of Unix, > you are allowed to open a directory as a file and read the filenames it > contains. While I don't say it'd be easy to manage that through > tsearch, there's at least a potential for discovering the filenames > present in . and .. --- how much do we care about that? Actually I don't think that's true any more, most file systems on most Unixen do not allow it. However it appears it's still the case for Solaris so it's still a good point. I'm sure it's not true for modern versions of Linux and I thought it was false for other modern OSes -- I'm surprised it's not for Solaris even. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] FW: [ADMIN] max_connections and shared_buffers
"Scott Marlowe" <[EMAIL PROTECTED]> writes: > Not sure with Windows. I'm strictly a unix type of guy. I'm guessing > that Windows is detecting too many connections / out of memory and > shutting down the service. The whole thing is pretty strange. "received fast shutdown request" means that the postmaster got SIGINT --- a moment's look at the code proves there is no other possibility. Now what sent it SIGINT? AFAICS there are only two possible paths: "pg_ctl stop -m fast" or this little bit of code in win32/signal.c: /* Console control handler will execute on a thread created by the OS at the time of invocation */ static BOOL WINAPI pg_console_handler(DWORD dwCtrlType) { if (dwCtrlType == CTRL_C_EVENT || dwCtrlType == CTRL_BREAK_EVENT || dwCtrlType == CTRL_CLOSE_EVENT || dwCtrlType == CTRL_SHUTDOWN_EVENT) { pg_queue_signal(SIGINT); return TRUE; } return FALSE; } Can any Windows hackers speculate on causes of this? regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] tsearch filenames unlikes special symbols and numbers
On Mon, Sep 03, 2007 at 09:27:19AM -0400, Tom Lane wrote: > Magnus Hagander <[EMAIL PROTECTED]> writes: > > On Mon, Sep 03, 2007 at 07:47:14AM +0100, Gregory Stark wrote: > >> Actually I think in Windows \ : and . are problems (not allowed more > >> than one dot in dos). > > > \ and : are problems. > > Is : really a problem, given that the name in question will be appended > to a known directory's path? Yes. It won't work - the API calls will reject it. > > . is not a problem. We don't support 16-bit windows anyway, and multiple > > dots works fine on any system we support. > > I'm not convinced that . is issue-free. On most if not all versions of Unix, > you are allowed to open a directory as a file and read the filenames it > contains. While I don't say it'd be easy to manage that through > tsearch, there's at least a potential for discovering the filenames > present in . and .. --- how much do we care about that? I just meant that it's not a problem on Win32 to have a file with multiple dots in the name. There can certainly be *other* reasons for it. I don't really see the need to have an extra dot in the filename in this particular case, so I'd certainly be fine with restricting this one a lot more. //Magnus ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] tsearch filenames unlikes special symbols and numbers
Magnus Hagander <[EMAIL PROTECTED]> writes: > On Mon, Sep 03, 2007 at 07:47:14AM +0100, Gregory Stark wrote: >> Actually I think in Windows \ : and . are problems (not allowed more >> than one dot in dos). > \ and : are problems. Is : really a problem, given that the name in question will be appended to a known directory's path? > . is not a problem. We don't support 16-bit windows anyway, and multiple > dots works fine on any system we support. I'm not convinced that . is issue-free. On most if not all versions of Unix, you are allowed to open a directory as a file and read the filenames it contains. While I don't say it'd be easy to manage that through tsearch, there's at least a potential for discovering the filenames present in . and .. --- how much do we care about that? regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] helps required in postgresql and visual basic 6.0
Vishnu Aggarwal wrote: > I am software engineer. i have a problem in postgresql. i am using > postgresql 8.2.4 version and i have save image in table by java program > my table fields as id-integer ,image -bytea > > i insert image by java in prepared statement by setbinarystream() its > running perfects. > i can retrieve image by java application also. > > but when i used this in visual basic 6.0 its gives problem This mailing list is for discussions regarding development of PostgreSQL server. Please post your question on a more appropriate list. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] helps required in postgresql and visual basic 6.0
hello sir I am software engineer. i have a problem in postgresql. i am using postgresql 8.2.4 version and i have save image in table by java program my table fields as id-integer ,image -bytea i insert image by java in prepared statement by setbinarystream() its running perfects. i can retrieve image by java application also. but when i used this in visual basic 6.0 its gives problem Dim Chunk() As Byte Open imagename For Binary Access Write As #1 ReDim Chunk(1 To RS("FImage").ActualSize) Chunk() = RS("FImage").GetChunk(RS("FImage").ActualSize) Put 1, , Chunk() Close #1 its says when its comes in image field its gives errors operation is not allowed in this context helps me out regards vishnu aggarwal
Re: [HACKERS] integrated tsearch has different results than tsearch2
1. I am not able use fulltext with latin2 encoding :( I missing note about only utf8 dictionaries in doc). You can use any server encoding, but dictionary's files should be in utf8 - dictionary will convert utf8 files into server encoding. 2. with hspell dictionaries (fresh copy from open office) I got different and wrong results. postgres=# select to_tsvector('cs','Příliš žlutý kůň se napil žluté vody') @@ to_tsquery('cs','napít'); ?column? -- f (1 row) Pls, output of: select ts_lexize('cspell','napil'); select to_tsvector('cs','Příliš žlutý kůň se napil žluté vody'); -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Per-function GUC settings: trickier than it looked
On Mon, 2007-09-03 at 04:09 -0500, Decibel! wrote: > On Sun, Sep 02, 2007 at 12:08:00PM -0400, Tom Lane wrote: > > I notice BTW that we have never updated the SET reference page since > > subtransactions were introduced --- it still says only that SET LOCAL > > is "local to the current transaction", without a word about > > subtransactions. So we have a documentation problem anyway. I recall > > that we had some discussion during the 8.0 dev cycle about whether > > having SET LOCAL's effects end at the end of the current subtransaction > > was really a good idea, given that subtransactions aren't the conceptual > > model the SQL spec defines, but nothing was ever done about changing > > the implementation. > > ISTM that's the real problem; SET LOCAL wasn't fully updated/considered > when subtransactions were added. > > One way to handle this would be to have 3 different behaviors for SET: > session-level, transaction-level, and sub-transaction level. If we had > that, we could probably make an across-the-board call that all functions > operate as if in their own sub-transaction, at least when it comes to > SET. What would be the use case for that? I can't see a single reason to do a SET LOCAL SUBTRANSACTION or whatever you'd call it. What you suggest sounds nicely symmetrical, but I don't think we need it in practice. ISTM that SET LOCAL is mostly superceded by per-function parameters. Most parameters need to be tied to code, not transactions. Of course, my wish to use synchronous_commit *was* tied to a transaction, but not a subtransaction. per-function parameters are sorely needed anyhow, since with session pools we can't easily use the username for SET parameters. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Hash index todo list item
On Sun, 2007-09-02 at 13:04 -0500, Kenneth Marshall wrote: > Dear PostgreSQL Hackers: > > After following the hackers mailing list for quite a while, > I am going to start investigating what will need to be done > to improve hash index performance. Below are the pieces of > this project that I am currently considering: > > 1. Characterize the current hash index implementation against >the BTree index, with a focus on space utilization and >lookup performance against a collection of test data. This >will give a baseline performance test to evaluate the impact >of changes. I initially do not plan to bench the hash creation >process since my initial focus will be on lookup performance. > > 2. Evaluate the performance of different hash index implementations >and/or changes to the current implementation. My current plan is >to keep the implementation as simple as possible and still provide >the desired performance. Several hash index suggestions deal with >changing the layout of the keys on a page to improve lookup >performance, including reducing the bucket size to a fraction of >a page or only storing the hash value on the page, instead of >the index value itself. My goal in this phase is to produce one >or more versions with better performance than the current BTree. > > 3. Look at build time and concurrency issues with the addition of >some additional tests to the test bed. (1) > > 4. Repeat as needed. > > This is the rough plan. Does anyone see anything critical that > is missing at this point? Please send me any suggestions for test > data and various performance test ideas, since I will be working > on that first. Sounds good. I'd be particularly interested in large indexes, say ~ 0.5 - 2GB. There are likely to be various effects apparent as the indexes grow. It would be too easy to do all the tests with smaller indexes and miss things. Other factors are: - volatility - concurrency My general experience is that hash-based indexes are better when the range of inputs is relatively well-known, allowing a fast lookup. If that is the only benefit of hash indexes, a flexible hashing scheme may simply weaken the benefit-case for using them. If that's true, should the index build process examine the key values in the data to determine the best parameters to use? Kind of ANALYZE before build. My current feeling is that they ought to be very good at handling read-mostly situations such as privilege checking or UPDATE-intensive situations such as Customer-Current-Credit tracking, when the number of customers is large. It might also be worth looking at lossy hash indexes, i.e. the index stores only the block numbers. That would need to be part of the discussion around how lossy we will allow indexes to be. We currently have two kinds of full text index with different concurrency use cases, so it should be acceptable to have hash indexes have a clear benefit in one use case but a clear loss in another. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [pgsql-www] [HACKERS] \dF wrt text search
On Sun, Sep 02, 2007 at 11:18:45PM -0300, Marc G. Fournier wrote: > Should be fixed now, running a manual run of it right now, give it about 15 > minutes or so ... Is there now monitoring for it as well? -- Decibel!, aka Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) pgp0mRvoBjNHJ.pgp Description: PGP signature
Re: [HACKERS] Per-function GUC settings: trickier than it looked
On Sun, Sep 02, 2007 at 12:08:00PM -0400, Tom Lane wrote: > I notice BTW that we have never updated the SET reference page since > subtransactions were introduced --- it still says only that SET LOCAL > is "local to the current transaction", without a word about > subtransactions. So we have a documentation problem anyway. I recall > that we had some discussion during the 8.0 dev cycle about whether > having SET LOCAL's effects end at the end of the current subtransaction > was really a good idea, given that subtransactions aren't the conceptual > model the SQL spec defines, but nothing was ever done about changing > the implementation. ISTM that's the real problem; SET LOCAL wasn't fully updated/considered when subtransactions were added. One way to handle this would be to have 3 different behaviors for SET: session-level, transaction-level, and sub-transaction level. If we had that, we could probably make an across-the-board call that all functions operate as if in their own sub-transaction, at least when it comes to SET. Whatever we decide on, least-surprise would dictate that it's the same whether you apply function-specific settings or not. -- Decibel!, aka Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) pgpFCPXAAI5Eg.pgp Description: PGP signature
Re: [HACKERS] integrated tsearch has different results than tsearch2
Pavel, I can't read your posting. Can you use plain text format ? Oleg On Mon, 3 Sep 2007, Pavel Stehule wrote: Hello I am testing fulltext. 1. I am not able use fulltext with latin2 encoding :( I missing noteabout only utf8 dictionaries in doc). 2. with hspell dictionaries (fresh copy from open office) I gotdifferent and wrong results. Original (old) result ts=# select * from ts_debug('P??li? ?lu?ou?k? k?? se napil ?lut? vody'); ts_name| tok_type | description | token | dict_name | tsvector --+--+-+---+---+ default_czech | word | Word| P??li? |{cz_ispell,simple} | 'p??li?' default_czech | word | Word| ?lu?ou?k? |{cz_ispell,simple} | '?lu?ou?k?' default_czech | word | Word | k?? | {cz_ispell,simple} | 'k??' default_czech | lword| Latin word | se| {cz_ispell,simple} | default_czech | lword| Latin word | napil |{cz_ispell,simple} | 'nap?t' default_czech | word | Word | ?lut? |{cz_ispell,simple} | '?lut?' default_czech | lword| Latin word | vody |{cz_ispell,simple} | 'voda' (7 ??dek) New results:postgres=# create Text search dictionary cspell(template=ispell,dictfile=czech, afffile=czech, stopwords=czech);CREATE TEXT SEARCH DICTIONARYpostgres=# CREATE text search configuration cs (copy=english);CREATE TEXT SEARCH CONFIGURATION postgres=# alter text search configuration cs alter mapping for word,lword with cspell, simple;ALTER TEXT SEARCH CONFIGURATIONpostgres=# select * from ts_debug('cs','P??li? ?lu?ou?k? k?? se napil?lut? vody'); Alias | Description | Token | Dictionaries |Lexized token---+---+---+-+- word | Word | P??li?| {cspell,simple} | cspell: {p??li?} blank | Space symbols | | {} | word | Word | ?lu?ou?k? | {cspell,simple} | cspell: {?lu?ou?k?} blank | Space symbols | | {} | word | Word | k?? | {cspell,simple} | cspell: {k??} blank | Space symbols | | {} | lword | Latin word | se| {cspell,simple} | cspell: {} blank | Space symbols | | {} | lword | Latin word| napil | {cspell,simple} | simple: {napil} blank | Space symbols | | {} | word | Word | ?lut? | {cspell,simple} | simple: {?lut?} blank | Space symbols | | {} | lword | Latin word| vody | {cspell,simple} | simple: {vody}(13 rows) This query returned true in 8.2 and now: postgres=# select to_tsvector('cs','P??li? ?lut? k?? se napil ?lut?vody') @@ to_tsquery('cs','nap?t'); ?column?-- f(1 row) RegardsPavel Stehule Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] [PATCH] Lazy xid assingment V2
August Zajonc wrote: > The thing is, the leak occurs in situation where a COMMIT hasn't > returned to the user, so we are trying to guarantee no data-loss even > when the user doesn't see a successful commit? That's a tall order > obviously and hopefully people design their apps to attend to > transaction success / failure. No, by the time we'd delete/move to trash the files, we know that they're no longer needed. But because deleting a relation is such a drastic thing to do, with potential for massive data loss, we're being extra paranoid. What if there's a bug in PostgreSQL that makes us delete the wrong file? Or transaction wraparound happens, despite the protections that we now have in place? Or you have bad RAM in the server, and a bit flips at an unfortunate place? That's the kind of situations we're worried about. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] tsearch filenames unlikes special symbols and numbers
On Mon, Sep 03, 2007 at 07:47:14AM +0100, Gregory Stark wrote: > "Tom Lane" <[EMAIL PROTECTED]> writes: > > > Gregory Stark <[EMAIL PROTECTED]> writes: > >> "Tom Lane" <[EMAIL PROTECTED]> writes: > >>> I made it reject all but latin letters, which is the same restriction > >>> that's in place for timezone set filenames. That might be overly > >>> strong, but we definitely have to forbid "." and "/" (and "\" on > >>> Windows). Do we want to restrict it to letters, digits, underscore? > >>> Or does it need to be weaker than that? > > > >> What's the problem with "."? > > > > ../../../../etc/passwd > > > > Possibly we could allow '.' as long as we forbade /, > > Right, traditionally the only characters forbidden in filenames in Unix are / > and nul. If we want the files to play nice in Gnome etc then we should > restrict them to ascii since we don't know what encoding the gui expects. > > Actually I think in Windows \ : and . are problems (not allowed more than one > dot in dos). \ and : are problems. . is not a problem. We don't support 16-bit windows anyway, and multiple dots works fine on any system we support. //Magnus ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Per-function search_path => per-function GUC settings
On 9/2/07, Tom Lane <[EMAIL PROTECTED]> wrote: > "Marko Kreen" <[EMAIL PROTECTED]> writes: > > On 9/2/07, Tom Lane <[EMAIL PROTECTED]> wrote: > >> Seems a little verbose, but maybe we could do "SET var FROM CURRENT" > >> or "SET var FROM SESSION"? > > > I'd prefer FROM SESSION then. FROM CURRENT seems unclear. > > Actually, I think FROM SESSION is unclear, as it opens the question > whether the value to be applied is the session-wide setting or the > currently active one. Inside a transaction that has done SET LOCAL, > these are different things. I did not consider that. > I think we pretty clearly want to have it take the currently active > setting, and I'd vote for FROM CURRENT as the best way of expressing > that. Ok. -- marko ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [pgsql-www] [HACKERS] \dF wrt text search
On Sun, Sep 02, 2007 at 09:57:16PM -0400, Tom Lane wrote: > Robert Treat <[EMAIL PROTECTED]> writes: > > Blah I compiled last night, using the latest snapshot in the > > postgresql/dev/ directory in ftp, which, as I look now, has a date listed > > of > > 2007-08-10, and looking in the source the catversion is 200702251, which is > > before the tsearch bits hit the tree. So, I think my problems lie in the > > snapshot no longer being updated :-\ CCing www in case someone wants to > > fix > > it. > > I'll bet a dollar it got broken in the master-CVS-server move. I thought it worked off the same cvs repo as the cvsweb stuff (on svr1), which AFAIK isn't broken (since cvsweb works). But it's not documented anywhere, and I do recall not being able to track down exactly how it worked, so I won't be able to proove it to the point of being able to collect that dollar ;-) //Magnus ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] tsearch filenames unlikes special symbols and numbers
On 9/2/07, Gregory Stark <[EMAIL PROTECTED]> wrote: > Right, traditionally the only characters forbidden in filenames in Unix are / > and nul. If we want the files to play nice in Gnome etc then we should > restrict them to ascii since we don't know what encoding the gui expects. > > Actually I think in Windows \ : and . are problems (not allowed more than one > dot in dos). Reserved characters in Windows filenames are < > : " / \ | ? * DOS limitations aren't relevant on the OS versions Postgres supports. ...but I thought this was about opening existing files, not creating them, in which case the only relevant limitation is path separators. Any other reserved characters are going to result in no open file, rather than a security hole. ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[HACKERS] integrated tsearch has different results than tsearch2
Hello I am testing fulltext. 1. I am not able use fulltext with latin2 encoding :( I missing note about only utf8 dictionaries in doc). 2. with hspell dictionaries (fresh copy from open office) I got different and wrong results. Original (old) result ts=# select * from ts_debug('Příliš žluťoučký kůň se napil žluté vody'); ts_name| tok_type | description | token | dict_name | tsvector --+--+-+---+ ---+ default_czech | word | Word| Příliš| {cz_ispell,simple} | 'příliš' default_czech | word | Word| žluťoučký | {cz_ispell,simple} | 'žluťoučký' default_czech | word | Word| kůň | {cz_ispell,simple} | 'kůň' default_czech | lword| Latin word | se| {cz_ispell,simple} | default_czech | lword| Latin word | napil | {cz_ispell,simple} | 'napít' default_czech | word | Word| žluté | {cz_ispell,simple} | 'žlutý' default_czech | lword| Latin word | vody | {cz_ispell,simple} | 'voda' (7 řádek) New results: postgres=# create Text search dictionary cspell(template=ispell, dictfile=czech, afffile=czech, stopwords=czech); CREATE TEXT SEARCH DICTIONARY postgres=# CREATE text search configuration cs (copy=english); CREATE TEXT SEARCH CONFIGURATION postgres=# alter text search configuration cs alter mapping for word, lword with cspell, simple; ALTER TEXT SEARCH CONFIGURATION postgres=# select * from ts_debug('cs','Příliš žluťoučký kůň se napil žluté vody'); Alias | Description | Token | Dictionaries |Lexized token ---+---+---+-+- word | Word | Příliš| {cspell,simple} | cspell: {příliš} blank | Space symbols | | {} | word | Word | žluťoučký | {cspell,simple} | cspell: {žluťoučký} blank | Space symbols | | {} | word | Word | kůň | {cspell,simple} | cspell: {kůň} blank | Space symbols | | {} | lword | Latin word| se| {cspell,simple} | cspell: {} blank | Space symbols | | {} | lword | Latin word| napil | {cspell,simple} | simple: {napil} blank | Space symbols | | {} | word | Word | žluté | {cspell,simple} | simple: {žluté} blank | Space symbols | | {} | lword | Latin word| vody | {cspell,simple} | simple: {vody} (13 rows) This query returned true in 8.2 and now: postgres=# select to_tsvector('cs','Příliš žlutý kůň se napil žluté vody') @@ to_tsquery('cs','napít'); ?column? -- f (1 row) Regards Pavel Stehule ---(end of broadcast)--- TIP 6: explain analyze is your friend