Re: [HACKERS] Syntax for partitioning

2009-11-18 Thread Simon Riggs
On Wed, 2009-11-18 at 13:24 +0900, Itagaki Takahiro wrote:
 Simon Riggs si...@2ndquadrant.com wrote:
 
  Why not just wait until we have a whole patch and then apply?
 
 A whole patch can be written by many contributors instead of only
 one person, no?  I think we need to split the work for partitioning
 into several parts to encourage developing it. I just did one of
 the parts, syntax. Another patch, "Partitioning option for COPY",
 will do a good job in the field of INSERT. 

If we can agree the parts that are required, I would at least be
confident that we have understood this enough to allow one part to
proceed ahead of the others.

For partitioning the parts are these

1. Syntax for explicit partitioning
2. Internal data representations
3. Optimizations
   many and various
4. Data Routing
   a) Data routing on INSERT/COPY
   b) UPDATE handling when the UPDATE causes partition migration

If this patch puts forward a solution for (2) also, then it is
potentially worthwhile. That is the real blocking point here. Once we
have that other people will quickly fill in the later parts.

I foresee a data structure that is a sorted list of boundary-values,
cached on the parent-relation. This should be accessible to allow
bsearch of particular values during both planning and execution. Same
rules apply as btree operator classes. For multi-level hierarchies the
parent level should have the union of all sub-hierarchies. I think we
need an index on pg_inherits also.
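
For comparison, this is roughly the information that currently has to be
spelled out by hand with inheritance, CHECK constraints and
constraint_exclusion (a sketch only; table and boundary values made up):

CREATE TABLE measurement (logdate date NOT NULL, value int);

-- each child's CHECK constraint carries one pair of boundary values
CREATE TABLE measurement_2009 (
    CHECK (logdate >= DATE '2009-01-01' AND logdate < DATE '2010-01-01')
) INHERITS (measurement);

CREATE TABLE measurement_2010 (
    CHECK (logdate >= DATE '2010-01-01' AND logdate < DATE '2011-01-01')
) INHERITS (measurement);

SET constraint_exclusion = on;
EXPLAIN SELECT * FROM measurement WHERE logdate = DATE '2009-06-01';

The cached boundary-value list would carry the same bounds in sorted form, so
they can be bsearched at plan and execution time instead of being evaluated as
arbitrary check quals.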

So please do (1) and (2), not just (1) in isolation.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] Very bad FTS performance with the Polish config

2009-11-18 Thread Oleg Bartunov

Wojciech,

your polish_english and polish configurations use an ispell dictionary and are slow,
while the english configuration doesn't contain ispell. So what is your
complaint? Try adding an ispell dictionary to the english configuration and compare the
timings.
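
For example, something like this (untested; it assumes you have ispell/hunspell
files for English, here called en_us.dict and en_us.affix, installed in
tsearch_data):

CREATE TEXT SEARCH DICTIONARY english_ispell (
    TEMPLATE = ispell,
    DictFile = en_us,
    AffFile = en_us,
    StopWords = english
);

CREATE TEXT SEARCH CONFIGURATION english_ispell_cfg (COPY = pg_catalog.english);

ALTER TEXT SEARCH CONFIGURATION english_ispell_cfg
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH english_ispell, english_stem;

-- now rerun the ts_headline('english_ispell_cfg', ...) testcase and compare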

Oleg

On Wed, 18 Nov 2009, Wojciech Knapik wrote:



Hello


This has been discussed in #postgresql and posted to -performance a
couple days ago, but no solution has been found. The discussion can be
found here:
http://archives.postgresql.org/pgsql-performance/2009-11/msg00162.php

I just finished implementing a search engine for my site and found
ts_headline extremely slow when used with a Polish tsearch
configuration, while fast with English. All of it boils down to a simple
testcase, but first some background.

I tested on 8.3.1 on G5/OSX 10.5.8 and Xeon/Gentoo AMD64-2008.0
(2.6.21), then switched both installations to 8.3.8 (both packages
compiled from source, but provided by the distro - port/emerge). The
Polish dictionaries and config were created according to this article
(it's in Polish, but the code is self-explanatory):

http://www.depesz.com/index.php/2008/04/22/polish-tsearch-in-83-polski-tsearch-w-postgresie-83/

Now for the testcase:

text = 'Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad
minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip
ex ea commodo consequat. Duis aute irure dolor in reprehenderit in
voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur
sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt
mollit anim id est laborum.'

# explain analyze select ts_headline('polish', text,
plainto_tsquery('polish', 'foobar'));
QUERY PLAN

Result  (cost=0.00..0.01 rows=1 width=0) (actual time=6.407..6.470
rows=1 loops=1)
Total runtime: 6.524 ms
(2 rows)

# explain analyze select ts_headline('english', text,
plainto_tsquery('english', 'foobar'));
QUERY PLAN

Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.861..0.895
rows=1 loops=1)
Total runtime: 0.935 ms
(2 rows)

# explain analyze select ts_headline('simple', text,
plainto_tsquery('simple', 'foobar'));
QUERY PLAN

Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.627..0.660
rows=1 loops=1)
Total runtime: 0.697 ms
(2 rows)

#

As you can see, the results differ by an order of magnitude between
Polish and English. While in this simple testcase it's a non-issue, in
the real world this translates into enormous overhead.

One of the queries I ran testing my site's search function took
1870ms. When I took that query and changed all ts_headline(foo) calls to
just foo, the time dropped below 100ms. That's the difference between
something completely unacceptable and something quite useful.

I can post various details about the hardware, software and specific
queries, but the testcases speak for themselves. I'm sure you can easily
reproduce my results.

I'm putting my code into production tomorrow, since I can't wait
anymore. Hints would be very much appreciated!


cheers,
Wojciech Knapik

PS. This issue is not related to the loading time of dictionaries, or
calls to ts_headline for results that won't be displayed. A few other
details can be found here
http://pastie.textmate.org/private/hqnqfnsfsknjyjlffzmog along with
snippets of my conversations in #postgresql that led to this testcase.
Big thanks to RhodiumToad for helping me with fts for the last couple
days ;]






Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] UTF8 with BOM support in psql

2009-11-18 Thread Peter Eisentraut
On ons, 2009-11-18 at 12:52 +0900, Itagaki Takahiro wrote:
 Peter Eisentraut pete...@gmx.net wrote:
 
  Together, that should cover a lot of cases.  Not perfect, but far from
  useless.
 
 For Japanese users on Windows, the client encoding is always set to SJIS
 because of a restriction of cmd.exe. But the script file can be written
 in UTF8 with a BOM. I don't think we should depend on the client encoding.

Set by whom, how, and because of what restriction?




Re: [HACKERS] UTF8 with BOM support in psql

2009-11-18 Thread Peter Eisentraut
On tis, 2009-11-17 at 23:22 -0500, Andrew Dunstan wrote:
 Itagaki Takahiro wrote:
  I don't want users to have to check the encoding of scripts before executing
 --
  it is far from fail-safe.
 
 

 
 That's what we require in all other cases. Why should UTF8 be special?

But now we're back to the original problem.  Certain editors insert BOMs
at the beginning of the file.  And that is by any definition before the
embedded client encoding declaration.  I think the only ways to solve
this are:

1) Ignore the BOM if a client encoding declaration of UTF8 appears in a
narrowly defined location near the beginning of the file (XML and
PEP-0263 style).  For *example*, we could ignore the BOM if the file
starts with exactly the BOM followed by "\encoding UTF8" and a newline.
Would probably not work well in practice.

2) Parse two alternative versions of the file, one with the BOM ignored
and one with the BOM not ignored, until you need to make a decision.
Hilariously complicated, but would perhaps solve the problem.

3) Give up, do nothing.





Re: [HACKERS] RFC for adding typmods to functions

2009-11-18 Thread Peter Eisentraut
On tis, 2009-11-17 at 17:09 -0500, Tom Lane wrote:
 I can see the following definitional issues:

Should we be able to find the answers to those, or at least a basis of
discussion about those, in the SQL standard?  Has anyone checked?





Re: [HACKERS] Unpredictable shark slowdown after migrating to 8.4

2009-11-18 Thread Sergey Konoplev
Thank you for the hints.

 Why only those modes?  I'd search for locks with granted=false, then see
 all the other locks held by the process that's holding the conflicting
 lock with granted=true (i.e. the one you're waiting on).


Something like this?

SELECT
granted,
pid,
virtualxid,
transactionid,
virtualtransaction,
count(1) AS locks,
current_query
FROM
pg_locks AS l
LEFT JOIN pg_stat_activity AS a ON
pid = procpid
GROUP BY 1, 2, 3, 4, 5, 7
ORDER BY 1, 6 DESC;


And two more queries to do extended analysis of its results after restarting PG:

SELECT
pg_stat_activity.datname,
pg_class.relname,
pg_locks.transactionid,
pg_locks.mode,
pg_locks.granted,
pg_stat_activity.usename,
pg_stat_activity.current_query,
pg_stat_activity.query_start,
age(now(),pg_stat_activity.query_start) AS age,
pg_stat_activity.procpid
FROM
pg_stat_activity,
pg_locks
LEFT OUTER JOIN pg_class ON
pg_locks.relation = pg_class.oid
WHERE
pg_locks.pid = pg_stat_activity.procpid
ORDER BY
query_start;


SELECT * FROM pg_locks;
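
Or, closer to what you described, something like this to pair each waiting lock
with the sessions that currently hold a granted lock on the same object (an
untested sketch):

SELECT
    waiting.pid                AS waiting_pid,
    waiting.locktype,
    waiting.relation::regclass AS relation,
    waiting.transactionid,
    holder.pid                 AS holding_pid,
    holder.mode                AS held_mode,
    a.current_query            AS holder_query
FROM pg_locks AS waiting
JOIN pg_locks AS holder
    ON  holder.granted
    AND holder.pid <> waiting.pid
    AND (holder.transactionid = waiting.transactionid
         OR (holder.database = waiting.database
             AND holder.relation = waiting.relation))
LEFT JOIN pg_stat_activity AS a ON a.procpid = holder.pid
WHERE NOT waiting.granted;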


Are there other things I should do when the problem comes up again?

-- 
Regards,
Sergey Konoplev



Re: [HACKERS] Very bad FTS performance with the Polish config

2009-11-18 Thread Wojciech Knapik


Tom Lane wrote:


  I tested on 8.3.1 on G5/OSX 10.5.8 and Xeon/Gentoo AMD64-2008.0
(2.6.21), then switched both installations to 8.3.8 (both packages
compiled from source, but provided by the distro - port/emerge). The
Polish dictionaries and config were created according to this article
(it's in Polish, but the code is self-explanatory):



http://www.depesz.com/index.php/2008/04/22/polish-tsearch-in-83-polski-tsearch-w-postgresie-83/


I tried to duplicate this test, but got no further than here:

u8=# CREATE TEXT SEARCH DICTIONARY polish_ispell (
TEMPLATE = ispell,
DictFile = polish,
AffFile = polish,
StopWords = polish
);
ERROR:  syntax error
CONTEXT:  line 174 of configuration file 
/home/tgl/testversion/share/postgresql/tsearch_data/polish.affix:   L E C   
   -C,GĹEM #zalec (15a)

u8=# 


Seems there's something about the current version of the dictionary that
we don't like.  I used sjp-ispell-pl-20091117-src.tar.bz2 ...


Here are the files I used (polish.affix, polish.dict already generated):
http://wolniartysci.pl/pl.tar.gz

These should work fine. I'd be grateful if you could test and see if you 
get similar results.


cheers,
Wojciech Knapik

PS. Weird, I get the emails without a reply-to set for the list..




Re: [HACKERS] Very bad FTS performance with the Polish config

2009-11-18 Thread Wojciech Knapik



your polish_english and polish configurations use an ispell dictionary and are slow,
while the english configuration doesn't contain ispell. So what is your
complaint? Try adding an ispell dictionary to the english configuration and compare the
timings.


Oh, so this is not anomalous ? These are the expected speeds for an 
ispell dictionary ? I didn't realize that. Sorry for the bother then. It 
just seemed way too slow to be practical.


cheers,
Wojciech Knapik



Re: [HACKERS] Rejecting weak passwords

2009-11-18 Thread Albe Laurenz
Itagaki Takahiro wrote:
 Looks good. I change status of the patch to Ready for Committer.

Thanks for the help!

 BTW, it might not be work for this patch, but we also need to
 reject a too-long VALID UNTIL setting. Even if the password is
 complex, we should not use the same password for a long time.

There are some cases, e.g. application servers logging into the
database, where you cannot just let the password expire, so I
think this would at best have to be a rule with exceptions.
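
For those accounts the exception is simply an unlimited (or very distant)
expiration, e.g. (role names made up):

-- interactive users get a limited password lifetime
ALTER ROLE alice PASSWORD 'S0me.Strong.Pw' VALID UNTIL '2010-05-31';

-- application server accounts are exempted explicitly
ALTER ROLE appserver PASSWORD 'An0ther.Strong.Pw' VALID UNTIL 'infinity';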

Another thing that makes VALID UNTIL inconvenient to use is
that after expiration, logins simply fail, and the user
is never prompted to change the password.

But of course you are right, requiring a limited password
lifetime is closely related to requiring a good password.

Yours,
Laurenz Albe



Re: [HACKERS] plperl and inline functions -- first draft

2009-11-18 Thread Alexey Klyukin

On Nov 18, 2009, at 5:46 AM, Andrew Dunstan wrote:

 
 
 Joshua Tolley wrote:
 +plperl_call_data *save_call_data = current_call_data;
 +bool   oldcontext = trusted_context;
 +
 +if (SPI_connect() != SPI_OK_CONNECT)
 +elog(ERROR, "could not connect to SPI manager");
  
 ...
 +current_call_data = (plperl_call_data *)
 palloc0(sizeof(plperl_call_data));
 +current_call_data->fcinfo = fake_fcinfo;
 +current_call_data->prodesc = desc;
  
 
 I don't think this is done in the right order. If it is then this comment in 
 plperl_func_handler is wrong (as well as containing a typo):
 
   /*
* Create the call_data beforing connecting to SPI, so that it is not
* allocated in the SPI memory context
*/
 

Yes, current_call_data can't be allocated in the SPI memory context, since it's
used to extract the result after SPI_finish is called, although it doesn't lead
to problems here since no result is returned. Anyway, I'd move SPI_connect
after the current_call_data initialization.

I also noticed that no error context is set in the inline handler. I'm not sure
whether it's really useful except for the sake of consistency, but in case it is,
here is the patch:


inline_callback.diff
Description: Binary data


--
Alexey Klyukin  http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc




Re: [HACKERS] RFC for adding typmods to functions

2009-11-18 Thread Pavel Stehule
2009/11/18 Peter Eisentraut pete...@gmx.net:
 On tis, 2009-11-17 at 17:09 -0500, Tom Lane wrote:
 I can see the following definitional issues:

 Should we be able to find the answers to those, or at least a basis of
 discussion about those, in the SQL standard?  Has anyone checked?


I am not sure the SQL standard is a good inspiration in this case. Does
the SQL standard have typmod-less varchar or numeric? Does the SQL standard
have polymorphic types?

Maybe only one of these is in the standard. The question to answer is: should
functions foo(varchar(3)) and foo(varchar(10)) be able to exist at the same time?
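
Today the typmod is simply discarded from the function signature, so the two
collapse into one (quick check, written from memory):

CREATE FUNCTION foo(varchar(3))  RETURNS int AS 'SELECT 1' LANGUAGE sql;
CREATE FUNCTION foo(varchar(10)) RETURNS int AS 'SELECT 2' LANGUAGE sql;
-- the second fails: function "foo" already exists with same argument types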

Regards
Pavel






[HACKERS] byteain for new hex escaped data

2009-11-18 Thread Kris Jurka


Looking at how byteain detects whether the input it is passed is the new 
hex format escape or the old octal escape, it uses:


char   *inputText = PG_GETARG_CSTRING(0);
if (inputText[0] == '\\' && inputText[1] == 'x')

Doesn't this read off the end of inputText in the case of SELECT 
''::bytea, or is there some padding happening somewhere that makes this 
legal?


Kris Jurka




Re: [HACKERS] Summary and Plan for Hot Standby

2009-11-18 Thread Heikki Linnakangas
Tatsuo Ishii wrote:
 Please correct me if I'm wrong. Parse will result in obtaining
 RowExclusiveLock on the target table if it is parsing
 INSERT/UPDATE/DELETE. If so, is this ok in the standby?
 Any attempt to take RowExclusiveLock will fail.

 Any attempt to execute INSERT/UPDATE/DELETE will fail.

 This behaviour should be identical to read only transaction mode. If it
 is not documented as an exception, please report as a bug.
 
 Is it?
 
 It seems read only transaction mode is perfectly happy with
 RowExclusiveLock:

Hmm, that's a good point. I can't immediately see that that would cause
any trouble, but it gives me an uncomfortable feeling about the locking.
Which locks exactly need to be replayed in standby, and why? Which locks
can read-only transactions acquire?
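
For reference, plain read-only mode does not seem to stop you from taking such
a lock today; a quick way to check (assuming some table t exists):

BEGIN;
SET TRANSACTION READ ONLY;
LOCK TABLE t IN ROW EXCLUSIVE MODE;   -- accepted; read-only only blocks writes
SELECT locktype, mode, granted
  FROM pg_locks
 WHERE pid = pg_backend_pid() AND relation = 't'::regclass;
ROLLBACK;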

The doc says:
+   In recovery, transactions will not be permitted to take any table lock
+   higher than AccessShareLock. In addition, transactions may never assign
+   a TransactionId and may never write WAL.
+   Any LOCK TABLE command that runs on the standby and requests a specific
+   lock type other than AccessShareLock will be rejected.

which seems wrong, given Tatsuo-san's example. Is that paragraph only
referring to LOCK TABLE, and not other means of acquiring locks? Either
way, it needs to be clarified or fixed.

access/transam/README says:
+Further details on locking mechanics in recovery are given in comments
+with the Lock rmgr code.

but there's no explanation there either *why* the locking works as it
is. In LockAcquire(), we forbid taking locks higher than AccessShareLock
during recovery mode, but only for LOCKTAG_OBJECT locks. Why?

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] byteain for new hex escaped data

2009-11-18 Thread Peter Eisentraut
On ons, 2009-11-18 at 06:46 -0500, Kris Jurka wrote:
 Looking at how byteain detects whether the input it is passed is the new 
 hex format escape or the old octal escape, it uses:
 
  char   *inputText = PG_GETARG_CSTRING(0);
  if (inputText[0] == '\\' && inputText[1] == 'x')
 
 Doesn't this read off the end of inputText in the case of SELECT 
 ''::bytea, or is there some padding happening somewhere that makes this 
 legal?

In the case of ''::bytea, the inputText is "" as a C string, and so
inputText[0] == '\0' and the second test is not executed.




Re: [HACKERS] RFC for adding typmods to functions

2009-11-18 Thread Peter Eisentraut
On ons, 2009-11-18 at 11:46 +0100, Pavel Stehule wrote:
 I am not sure if SQL standard is good inspiration in this case.

I'm not sure either, but I think it's premature to make a conclusion
about that without having checked at all.




Re: [HACKERS] RFC for adding typmods to functions

2009-11-18 Thread Pavel Stehule
2009/11/18 Peter Eisentraut pete...@gmx.net:
 On ons, 2009-11-18 at 11:46 +0100, Pavel Stehule wrote:
 I am not sure if SQL standard is good inspiration in this case.

 I'm not sure either, but I think it's premature to make a conclusion
 about that without having checked at all.

ok, I recheck SQL/PSM part again :)

Pavel






Re: [HACKERS] UTF8 with BOM support in psql

2009-11-18 Thread Andrew Dunstan



Peter Eisentraut wrote:

But now we're back to the original problem.  Certain editors insert BOMs
at the beginning of the file.  And that is by any definition before the
embedded client encoding declaration.  I think the only ways to solve
this are:

1) Ignore the BOM if a client encoding declaration of UTF8 appears in a
narrowly defined location near the beginning of the file (XML and
PEP-0263 style).  For *example*, we could ignore the BOM if the file
starts with exactly the BOM followed by "\encoding UTF8" and a newline.
Would probably not work well in practice.

2) Parse two alternative versions of the file, one with the BOM ignored
and one with the BOM not ignored, until you need to make a decision.
Hilariously complicated, but would perhaps solve the problem.

3) Give up, do nothing.

  


4) set the client encoding before the file is read in any of the ways 
that have already been discussed and then allow psql to eat the BOM.


cheers

andrew



Re: [HACKERS] operator exclusion constraints

2009-11-18 Thread Jeff Davis
I'm in Tokyo right now, so please excuse my abbreviated reply.

On Tue, 2009-11-17 at 23:13 -0500, Robert Haas wrote:
 Forgive me if this is discussed before, but why does this store the
 strategy numbers of the relevant operators instead of the operators
 themselves?

At constraint definition time, I need to make sure that the strategy
numbers can be identified anyway, so it wouldn't save any work in
ATAddOperatorExclusionConstraint. At the time it seemed slightly more
direct to use strategy numbers in index_check_constraint, but it's
probably about the same.

 It seems like this could lead to surprising behavior if
 the user modifies the definition of the operator class.

Right now, operator classes can't be modified in any meaningful way. Am
I missing something?

 I'm wondering if we can't use the existing
 BuildIndexValueDescription() rather than the new function
 tuple_as_string().  I realize there are two tuples, but maybe it makes
 sense to just call it twice?

Are you suggesting I change the error output, or reorganize the code to
try to reuse BuildIndexValueDescription, or both?

Regards,
Jeff Davis




Re: [HACKERS] UTF8 with BOM support in psql

2009-11-18 Thread Peter Eisentraut
On ons, 2009-11-18 at 08:52 -0500, Andrew Dunstan wrote:
 4) set the client encoding before the file is read in any of the ways 
 that have already been discussed and then allow psql to eat the BOM.

This is certainly a workaround, just like piping the file through a
suitable sed expression would be, but conceptually, the client encoding
is a property of the file and should therefore be marked in the file.




Re: [HACKERS] operator exclusion constraints

2009-11-18 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 Forgive me if this is discussed before, but why does this store the
 strategy numbers of the relevant operators instead of the operators
 themselves?  It seems like this could lead to surprising behavior if
 the user modifies the definition of the operator class.

Wild guess:

  http://www.postgresql.org/docs/8.4/static/xindex.html

  34.14.2. Index Method Strategies

  The operators associated with an operator class are identified by
  strategy numbers, which serve to identify the semantics of each
  operator within the context of its operator class. For example,
  B-trees impose a strict ordering on keys, lesser to greater, and so
  operators like less than and greater than or equal to are
  interesting with respect to a B-tree. Because PostgreSQL allows the
  user to define operators, PostgreSQL cannot look at the name of an
  operator (e.g.,  or =) and tell what kind of comparison it
  is. Instead, the index method defines a set of strategies, which can
  be thought of as generalized operators. Each operator class specifies
  which actual operator corresponds to each strategy for a particular
  data type and interpretation of the index semantics.

Regards,
-- 
dim



Re: [HACKERS] operator exclusion constraints

2009-11-18 Thread Josh Berkus
All,

FWIW, I'm doing a redesign of a client's production web application
right now.  I was able, by combining OEC and the Period type from
pgfoundry, to make a set of constraints for declaratively asserting in a
sports database:

That the same player couldn't belong to two different teams at the same
time;
That the same player couldn't belong to the same team in two different
positions with overlapping time periods.

This worked as spec'd, and would be extremely useful for this real-world
app if it was ready to use in production now.
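
The constraints look roughly like this (simplified and typed from memory, so the
exact exclusion syntax may differ from the current patch; the period type is the
one on pgFoundry, and the integer columns need btree_gist):

CREATE TABLE team_roster (
    player_id  int    NOT NULL,
    team_id    int    NOT NULL,
    position   text   NOT NULL,
    during     period NOT NULL,
    -- a player can't be on two different teams at the same time
    EXCLUDE USING gist (player_id WITH =, during WITH &&),
    -- nor on the same team in two positions with overlapping periods
    EXCLUDE USING gist (player_id WITH =, team_id WITH =, during WITH &&)
);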

However, I do have an issue with the SQLSTATE returned from the OEC
violation.  Currently it returns constraint violation, which, from the
perspective of an application developer, is not useful.  OECs are, in
application terms, materially identical to UNIQUE constraints and serve
the same purpose.  As such, I'd far rather see OECs return unique key
violation instead, as any existing application error-trapping code would
handle the violation more intelligently if it did.

--Josh Berkus




Re: [HACKERS] Very bad FTS performance with the Polish config

2009-11-18 Thread Oleg Bartunov

On Wed, 18 Nov 2009, Wojciech Knapik wrote:




your polish_english and polish configurations use an ispell dictionary and are slow,
while the english configuration doesn't contain ispell. So what is your
complaint? Try adding an ispell dictionary to the english configuration and compare the
timings.


Oh, so this is not anomalous ? These are the expected speeds for an ispell 
dictionary ? I didn't realize that. Sorry for the bother then. It just seemed 
way too slow to be practical.


You can see the real timings using the ts_lexize() function for different dictionaries
(try several times to avoid the cold-start problem) instead of ts_headline(),
which involves other factors.


On my test machine I see no real difference between very simple dictionary
and french ispell, snowball dictionaries:

dev-oleg=# select ts_lexize('simple','voila');
 ts_lexize
---
 {voila}
(1 row)

Time: 0.282 ms
dev-oleg=# select ts_lexize('simple','voila');
 ts_lexize
---
 {voila}
(1 row)

Time: 0.269 ms

dev-oleg=# select ts_lexize('french_stem','voila');
 ts_lexize
---
 {voil}
(1 row)

Time: 0.187 ms

I see no big difference in ts_headline as well:

dev-oleg=# select ts_headline('english','I can do voila', 'voila'::tsquery);
  ts_headline
---
 I can do <b>voila</b>
(1 row)

Time: 0.265 ms
dev-oleg=# select ts_headline('nomaofr','I can do voila', 'voila'::tsquery);
  ts_headline
---
 I can do <b>voila</b>
(1 row)

Time: 0.299 ms

This is 8.4.1 version of PostgreSQL.

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



[HACKERS] Patch - Reference Function Parameters by Name

2009-11-18 Thread George Gensure
Attached is a patch to perform parameter reference lookups by name in
the body of functions.  I'm hesitant to put it in for the commitfest
as is, without a couple of questions posed to the group:

1. palloc needs no free?  I suppose this is a general-knowledge
question, but it seemed to be the case after I went looking for
deallocations.
2. I inserted myself more than I had expected along the road from SQL
to columnref_hook, and I'm not sure all of those lookups of parameter
names and function name are required.
3. Since it was mentioned in an earlier email that the
"function name.parameter name" syntax was desired, I went ahead and added
that, but it required another pass-through as indicated in 2.
4. I made a judgement call in terms of resolution: if the
columnref_hook for param-by-name resolution is called with a non-null
node (meaning a column was already found), we avoid being an ambiguous
reference, and prefer the column already found.

Passes all tests in make check, and I'll add some tests for this after
I get feedback for the above items.

Regards,
-George


param_names.patch
Description: Binary data



Re: [HACKERS] UTF8 with BOM support in psql

2009-11-18 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 This is certainly a workaround, just like piping the file through a
 suitable sed expression would be, but conceptually, the client encoding
 is a property of the file and should therefore be marked in the file.

In a perfect world things would be like that, but the world is
imperfect.  When only one of the available encodings even pretends
to have a marking convention, and even that one convention is broken,
imagining that you can fix it is just a recipe for making things worse.

regards, tom lane



Re: [HACKERS] Very bad FTS performance with the Polish config

2009-11-18 Thread Wojciech Knapik

Oleg Bartunov wrote:


your polish_english and polish configurations use an ispell dictionary
and are slow, while the english configuration doesn't contain ispell.
So what is your complaint? Try adding an ispell dictionary to the english
configuration and compare the timings.


Oh, so this is not anomalous ? These are the expected speeds for an
ispell dictionary ? I didn't realize that. Sorry for the bother
then. It just seemed way too slow to be practical.


You can see real timings using ts_lexize() function for different 
dictionaries (try several time to avoid cold-start problem) instead

of ts_headline(), which involves other factors.

On my test machine I see no real difference between very simple
dictionary and french ispell, snowball dictionaries:


ts_lexize seems to be just as fast for simple, polish_ispell and 
english_stem with the 'voila' argument.


polish_ispell is in fact *faster* for the lorem ipsum text repeated a
couple of times (10?). Which suggests that the issue is with ts_headline
itself.



I see no big difference in ts_headline as well:

dev-oleg=# select ts_headline('english','I can do voila', 
'voila'::tsquery);

  ts_headline
---
 I can do bvoila/b
(1 row)

Time: 0.265 ms


Yes, for 4-word texts the results are similar.
Try that with a longer text and the difference becomes more and more 
significant. For the lorem ipsum text, 'polish' is about 4 times slower, 
than 'english'. For 5 repetitions of the text, it's 6 times, for 10 
repetitions - 7.5 times...



This is 8.4.1 version of PostgreSQL.


And that was 8.3.8/OSX.

cheers,
Wojciech Knapik



Re: [HACKERS] Python 3.1 support

2009-11-18 Thread Peter Eisentraut
On sön, 2009-11-15 at 18:39 -0700, James Pye wrote:
 I can see how function modules might look like a half-step backwards from 
 function fragments at first, but the benefits of a *natural* initialization 
 section (the module body) was enough to convince me. The added value on the 
 PL developer's side was also compelling. Tracebacks were trivial to 
 implement, and there is no need to munge the function's source. It seemed 
 like a win all around...

The question is whether it helps the user, not the implementer.  As far
as I can tell, it just creates more typing for no benefit whatsoever.
Also, it's inconsistent with normal Python script files and with other
PLs.

 AFA native typing is concerned, I think the flexibility and potential it 
 offers is useful, no? Already, plpython3 provides properties on PG's datetime 
 types to access the date_part()'s of the object.
 
 OTOH, for folk who primarily use the PL to access functionality in Python 
 modules(bindings), native typing may be of no direct utility as they will 
 likely need to convert anyways. (If that's your common use-case, then the 
 absence of interest in native typing is quite understandable.)

Right, if I use PL/Python, I do it because I want to use Python.  I
don't need another PostgreSQL implementation on top of Python.  The
maintenance effort required to keep those two consistent aside.

Again, I'm only one user.  But so far I haven't seen anyone else speak
up here, and clearly accepting this for inclusion will need nontrivial
convincing.

  the pain of dealing with a second implementation.
 
 What pain are you anticipating? Maintenance?

Right.




Re: [HACKERS] Very bad FTS performance with the Polish config

2009-11-18 Thread Pavel Stehule
2009/11/18 Oleg Bartunov o...@sai.msu.su:
 On Wed, 18 Nov 2009, Wojciech Knapik wrote:


 your polish_english and polish configurations use an ispell dictionary and are slow,
 while the english configuration doesn't contain ispell. So what is your
 complaint? Try adding an ispell dictionary to the english configuration and compare the
 timings.

 Oh, so this is not anomalous ? These are the expected speeds for an ispell
 dictionary ? I didn't realize that. Sorry for the bother then. It just
 seemed way too slow to be practical.

 You can see real timings using ts_lexize() function for different
 dictionaries
 (try several time to avoid cold-start problem) instead of ts_headline(),
 which involves other factors.

 On my test machine I see no real difference between very simple dictionary
 and french ispell, snowball dictionaries:


It depends on the language (and dictionary sizes).

for czech:

postgres=# select ts_lexize('simple','vody');
 ts_lexize
---
 {vody}
(1 row)

Time: 0.785 ms

postgres=# select ts_lexize('cspell','vody');
 ts_lexize
---
 {voda}
(1 row)

Time: 1.041 ms

I'm afraid Czech and Polish are very hard languages (with long affix files).

Regards
Pavel

 dev-oleg=# select ts_lexize('simple','voila');
  ts_lexize
 ---
  {voila}
 (1 row)

 Time: 0.282 ms
 dev-oleg=# select ts_lexize('simple','voila');
  ts_lexize
 ---
  {voila}
 (1 row)

 Time: 0.269 ms

 dev-oleg=# select ts_lexize('french_stem','voila');
  ts_lexize
 ---
  {voil}
 (1 row)

 Time: 0.187 ms

 I see no big difference in ts_headline as well:

 dev-oleg=# select ts_headline('english','I can do voila', 'voila'::tsquery);
      ts_headline
 ---
  I can do <b>voila</b>
 (1 row)

 Time: 0.265 ms
 dev-oleg=# select ts_headline('nomaofr','I can do voila', 'voila'::tsquery);
      ts_headline
 ---
  I can do <b>voila</b>
 (1 row)

 Time: 0.299 ms

 This is 8.4.1 version of PostgreSQL.

        Regards,
                Oleg
 _
 Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
 Sternberg Astronomical Institute, Moscow University, Russia
 Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
 phone: +007(495)939-16-83, +007(495)939-23-83





Re: [HACKERS] Very bad FTS performance with the Polish config

2009-11-18 Thread Oleg Bartunov

On Wed, 18 Nov 2009, Wojciech Knapik wrote:


Yes, for 4-word texts the results are similar.
Try that with a longer text and the difference becomes more and more 
significant. For the lorem ipsum text, 'polish' is about 4 times slower, than 
'english'. For 5 repetitions of the text, it's 6 times, for 10 repetitions - 
7.5 times...


Again, I see nothing unclear here, since dictionaries (as specified
in the configuration) apply to ALL words in the document. The more words in
the document, the more overhead.


You can pass just part of the document to ts_headline, rather than all of it,
to get predictable performance. This is useful in any case.
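
For example (table and column names here are just for illustration):

-- run the headline over only the first 2000 characters of each match
SELECT ts_headline('polish',
                   substr(body, 1, 2000),
                   plainto_tsquery('polish', 'foobar'))
FROM documents
WHERE body_tsv @@ plainto_tsquery('polish', 'foobar');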


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] Python 3.1 support

2009-11-18 Thread Nathan Boley
 Again, I'm only one user.  But so far I haven't seen anyone else speak
 up here, and clearly accepting this for inclusion will need nontrivial
 convincing.

Well, FWIW, I am excited about better type integration.

Also, I am a little skeptical about this patch. I am sorry if this has
already been discussed, but would this mean that I need to choose
whether pl/python is built against Python 2.* or Python 3.*?

-Nathan



Re: [HACKERS] Python 3.1 support

2009-11-18 Thread Tom Lane
Nathan Boley npbo...@gmail.com writes:
 Also, I am a little skeptical about this patch. I am sorry if this has
 already been discussed, but would this mean that I need to choose
 whether pl/python is built against Python 2.* or Python 3.*?

Yes.  That's exactly what I was complaining about upthread.  I'm not
a Python user, but from what I can gather of the 2-to-3 changes,
having to choose one at package build time is going to be a disaster.

regards, tom lane



Re: [HACKERS] RFC for adding typmods to functions

2009-11-18 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 there is a constituency that cares --- mainly people who use
 client-side code that tends to fall over if it doesn't see a
 suitable maxlength attached to query result columns.
 
I suspect it will primarily be software which is dealing with large
enough result sets that reading through it all to find the maximum
width for a column in a particular request is not viable.  It's also
likely to be more of an issue with software which needs to work with
multiple database products.  Finally, it simplifies life for
application developers who want a form or report to have columns which
are wide enough to handle the data which might show up, when they
don't want the column widths to change from run to run -- as when
daily reports will be added to a binder.
 
 The first example I came across in the archives was
 http://archives.postgresql.org/pgsql-sql/2002-06/msg00235.php
 
It's not surprising that we got a post about Crystal Reports having an
issue; all of the above applies to it.
 
-Kevin



Re: [HACKERS] Python 3.1 support

2009-11-18 Thread Joshua D. Drake
On Wed, 2009-11-18 at 12:06 -0500, Tom Lane wrote:
 Nathan Boley npbo...@gmail.com writes:
  Also, I am a little skeptical about this patch. I am sorry if this has
  already been discussed, but would this mean that I need to choose
  whether pl/python is built against Python 2.* or Python 3.*?
 
 Yes.  That's exactly what I was complaining about upthread.  I'm not
 a Python user, but from what I can gather of the 2-to-3 changes,
 having to choose one at package build time is going to be a disaster.
 

Agreed. We really need to have a plpython and plpython3. Heck this would
play nicely too because we support backward compatibility but also
upward version differences.

Joshua D. Drake


   regards, tom lane
 


-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
If the world pushes look it in the eye and GRR. Then push back harder. - 
Salamander




Re: [HACKERS] Patch - Reference Function Parameters by Name

2009-11-18 Thread Tom Lane
George Gensure wer...@gmail.com writes:
 Attached is a patch to perform parameter reference lookups by name in
 the body of functions.  I'm hesitant to put it in for the commitfest
 as is, without a couple of questions posed to the group:

I looked through this very quickly.  I'm not in favor of the approach
you have chosen of hacking all the upper layers in order to pass
parameter names through them; and putting the function name into those
APIs too is right out.  What I did in plpgsql avoided that by
establishing a callback protocol and keeping all the knowledge of names
within the callback function.  SQL functions have a different call path
to the parser, so we might need to adjust things in that path; but you
definitely should not be needing to mess with plancache.c any further.

 1. palloc needs no free?  I suppose this is a general knowledge
 question, but it seemed to be the case after trying to look for
 deallocation

Depends.  If you're creating something that is meant to live about
as long as the current statement anyway, you can just leave it to be
garbage-collected when the current memory context is destroyed.
There are cases where you need to be more aggressive about pfree'ing
things to avoid large cumulative memory usage, but probably anything
that is invoking parsing doesn't really need to worry (the parse process
is going to create a whole lot more trash than you will anyway).

 4. I made a judgement call in terms of resolution: if the
 columnref_hook for param-by-name resolution is called with a non-null
 node (meaning a column was already found), we avoid being an ambiguous
 reference, and prefer the column already found.

The consensus seems to be that we should throw error for ambiguity.

regards, tom lane



Re: [HACKERS] Python 3.1 support

2009-11-18 Thread Tom Lane
Joshua D. Drake j...@commandprompt.com writes:
 On Wed, 2009-11-18 at 12:06 -0500, Tom Lane wrote:
 Yes.  That's exactly what I was complaining about upthread.  I'm not
 a Python user, but from what I can gather of the 2-to-3 changes,
 having to choose one at package build time is going to be a disaster.

 Agreed. We really need to have a plpython and plpython3.

Peter was concerned about duplicative maintenance effort, but what I
think this patch shows is that (at least for the near future) both
could be built from a single source file.  What we need is configure
and makefile support to do that.

regards, tom lane



Re: [HACKERS] Python 3.1 support

2009-11-18 Thread Joshua D. Drake
On Wed, 2009-11-18 at 12:28 -0500, Tom Lane wrote:
 Joshua D. Drake j...@commandprompt.com writes:
  On Wed, 2009-11-18 at 12:06 -0500, Tom Lane wrote:
  Yes.  That's exactly what I was complaining about upthread.  I'm not
  a Python user, but from what I can gather of the 2-to-3 changes,
  having to choose one at package build time is going to be a disaster.
 
  Agreed. We really need to have a plpython and plpython3.
 
 Peter was concerned about duplicative maintenance effort, but what I
 think this patch shows is that (at least for the near future) both
 could be built from a single source file.  What we need is configure
 and makefile support to do that.

Ahh, so we would have:

--enable-plpython2=/usr/bin/python2
--enable-plpython3=/usr/bin/python3

?

That seems reasonable if we can run both. Although I wonder if, longer
term (2.x is going to be supported for a long time), we will end up with
frustration within the single source file trying to keep things
straight.

Joshua D. Drake


 
   regards, tom lane
 


-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
If the world pushes look it in the eye and GRR. Then push back harder. - 
Salamander




Re: [HACKERS] Python 3.1 support

2009-11-18 Thread Tom Lane
Joshua D. Drake j...@commandprompt.com writes:
 On Wed, 2009-11-18 at 12:28 -0500, Tom Lane wrote:
 Peter was concerned about duplicative maintenance effort, but what I
 think this patch shows is that (at least for the near future) both
 could be built from a single source file.

 That seems reasonable if we can run both. Although I wonder if longer
 term (2.x is going to be support a long time) we will end up with
 frustration within the single source file trying to keep things
 straight.

Once it gets to the point where it's more trouble to keep them together
than not, we can split the source.  But judging from this patch, a
single source file is the ticket for the moment.

regards, tom lane



Re: [HACKERS] Timezones (in 8.5?)

2009-11-18 Thread Kevin Grittner
Andrew Gierth and...@tao11.riddles.org.uk wrote:
 
 If he meant (A), then you store the event as:
 (ts,tz) = (timestamp '2010-07-27 10:30:00',
'Chile/Santiago')
 
 If he meant (B), then you store the event as
 (tsz,tz) = (timestamp '2010-07-27 10:30:00' at time zone
 'Chile/Santiago', 'Chile/Santiago')
 
You seem to be agreeing that these problems can't be solved without
storing a time zone string in addition to the timestamp.  As I read
it, Hernán was wishing for types which include this, rather than
having to do the above dance with multiple values.
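
In other words, until such a type exists you end up rolling it yourself,
something like (a sketch; column names made up):

CREATE TABLE event (
    id         serial PRIMARY KEY,
    local_time timestamp NOT NULL,   -- wall-clock time at the venue
    tz_name    text      NOT NULL    -- e.g. 'Chile/Santiago'
);

-- recover the absolute instant using the zone rules in effect at query time
SELECT id, local_time AT TIME ZONE tz_name AS utc_instant
FROM event;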
 
-Kevin



Re: [HACKERS] Very bad FTS performance with the Polish config

2009-11-18 Thread Wojciech Knapik


Oleg Bartunov wrote:


Yes, for 4-word texts the results are similar.
Try that with a longer text and the difference becomes more and more 
significant. For the lorem ipsum text, 'polish' is about 4 times 
slower, than 'english'. For 5 repetitions of the text, it's 6 times, 
for 10 repetitions - 7.5 times...


Again, I see nothing unclear here, since dictionaries (as specified
in configuration) apply to ALL words in document. The more words in 
document, the more overhead.


You're missing the point. I'm not surprised that the function takes more 
time for larger input texts - that's obvious. The thing is, the 
computation times rise more steeply when the Polish config is used. 
Steeply enough, that the difference between the Polish and English 
configs becomes enormous in practical cases.


Now this may be expected behaviour, but since I don't know if it is, I 
posted to the mailing lists to find out. If you're saying this is ok and 
there's nothing to fix here, then there's nothing more to discuss and we 
may consider the thread closed.

If not, ts_headline deserves a closer look.

cheers,
Wojciech Knapik



Re: [HACKERS] Python 3.1 support

2009-11-18 Thread Nathan Boley
 Here's the patch to support Python =3.1 with PL/Python.  The
 compatibility code is mostly in line with the usual 2-3 C porting
 practice and is documented inline.


I took a cursory look at this patch and, while the logic seems sound
and roughly in line with the suggested python porting procedure, I'm
not quite certain what this implies for potential future patches.

For instance, if I wanted to write a type converter for bytea - the
Python 3 bytes type - would the expectation be that I ensure that it
works in Python 2? Or is an ifdef that ignores it in the case of
Python 2 OK, so we can just put a note in the docs?

Also, how far back do we want to maintain 2.x compatibility? 2.0? If I
wanted to submit a patch that makes use of the list sort method, do I
need to ensure that it can either use the cmp arguments or a key
argument?

What if I wanted to implement a set-returning function that made use
of an iterator's next() method? Would I just put ifdefs around the code,
or a preprocessor definition that defines NEXT as next() for Python
2.x and __next__() for 3.x?

I guess that my first impression is that Python broke compatibility
for a reason, and that either plpython can't evolve, or it will
quickly become impossible to maintain. That being said, I mostly buy
the maintenance arguments from the previous discussion, but if we want
to have plpython and plpython3, a bunch of defines and ifdefs does not
seem like the best way to do this.

Would a better approach be to maintain a compatibility layer? i.e.
plpython_compat.h/c
plpython2.c
plpython3.c

Then patches that apply to python3 can be applied to plpython3.c, and
any changed function can be ripped out of plpython_compat and moved
into plpython2.

I'm sorry to snipe from the sidelines like this. If we didn't expect
plpython to evolve then this patch seems like the correct approach,
but there is clearly some desire to expand plpython and following this
path seems like it will end in a much more painful split in the future
or a necessary rewrite.

If there is some consensus that this is the best approach, then I will
do a more comprehensive review.

-Nathan



Re: [HACKERS] Rejecting weak passwords

2009-11-18 Thread Tom Lane
Itagaki Takahiro itagaki.takah...@oss.ntt.co.jp writes:
 BTW, it might not be work for this patch, but we also need to
 reject a too-long VALID UNTIL setting. Even if the password is
 complex, we should not use the same password for a long time.

This is a good point --- people who have password strength policies
tend to want a limited usage period as well.  It's even conceivable
that you could have different allowed lifespans depending on how
strong the password is.  I suggest we alter the hook signature to pass
it the valuntil time along with the other parameters it's already
getting, and let the one hook enforce policies for both.

I'm reviewing the patch now, and barring objections will make this
change before committing.

regards, tom lane



Re: [HACKERS] Python 3.1 support

2009-11-18 Thread James Pye
On Nov 18, 2009, at 8:37 AM, Peter Eisentraut wrote:
 The question is whether it helps the user, not the implementer.

Sure, but do you have a patch waiting to implement tracebacks?

I'd argue the reason it's never been done is due to the way procedures are 
currently managed in PL/Python. And *without some significant refactoring*, any 
patch fully implementing tracebacks is going to be a seriously ugly hack.

What helped the implementer here would help the user.

  As far
 as I can tell, it just creates more typing for no benefit whatsoever.

def main(*args): is annoying, but not entirely lamentable...
It's explicit, as well(no need to document munging that occurs behind the 
scenes).

Also, compare the cases where you need to cache some initialized data:

if 'key' not in SD:
 ...
 SD['key'] = my_newly_initialized_data
...


With function modules, SD is not needed as you have your module globals to keep 
your locally cached data in:

...
data = my_newly_initialized_data

def main(*args):
 ...


 Also, it's inconsistent with normal Python script files

Hinges on whether normal is actually normal.
I often use the __name__ convention in script files myself:

if __name__ == '__main__':
 main(...)

That is, using that convention, the script can be import'd and used without 
executing the script functionality. (It has proven to be very handy a few 
times now)

  and with other PLs.

I don't understand why that's a significant enough interest to note.

 I don't need another PostgreSQL implementation on top of Python.

Indeed, and I do understand that. That is, I have removed some features with 
that very thought in mind. (OTOH, I consider the date_part properties on 
datetime types to be special: too likely useful.)

[tho, PostgreSQL implementation? I think I understand what you were getting 
at, but..]

 The maintenance effort required to keep those two consistent aside.

I don't think there are many consistency issues here.
What did you have in mind?

 Again, I'm only one user.  But so far I haven't seen anyone else speak up 
 here, and clearly accepting this for inclusion will need nontrivial 
 convincing.

Agreed. It would seem quite doomed.

At this point, I'm not going to try getting it into PG. (apparent futility and 
such)


Re: [HACKERS] Rejecting weak passwords

2009-11-18 Thread Tom Lane
Itagaki Takahiro itagaki.takah...@oss.ntt.co.jp writes:
 Albe Laurenz laurenz.a...@wien.gv.at wrote:
 I agree on the second point, and I changed the patch accordingly.
 Here's the latest version.

 Looks good. I change status of the patch to Ready for Committer.

Applied with some minor modifications.  Aside from the added valuntil
parameter, I changed the isencrypted parameter to an int with some
#define'd values.  It seems easily foreseeable that we'll replace the
MD5 encryption scheme someday, and it'd be good to ensure that this
API is extendable when that happens.  Also, I got rid of the bool
return value and made the hook responsible for throwing its own errors.
I don't know about you guys, but I would cheerfully kill anybody who
tried to make me use a password checker that didn't tell me anything
about why it thinks my password is too weak.  (The CrackLib API we
are using is lamentably badly designed on this score --- does it have
another call that provides a more useful error report?)  Even if you
think weak password is adequate for that class of complaints, the
single error message would certainly not do for complaints about the
valuntil date being too far away.

regards, tom lane



[HACKERS] Oversight in CREATE FUNCTION + EXPLAIN?

2009-11-18 Thread Joshua D. Drake
Hey,

So I ran across this today:

CREATE OR REPLACE FUNCTION RETURN_LOTS(INT) RETURNS SETOF INT AS 
   $$ 
  SELECT generate_series(1,$1); 
   $$ 
COST 0.5 ROWS 50 SET work_mem TO '5MB' LANGUAGE 'SQL';

postgres=# explain analyze select return_lots(1000);
QUERY PLAN   
---
 Result  (cost=0.00..0.26 rows=1 width=0) (actual time=0.059..21411.187
rows=1000 loops=1)
 Total runtime: 25951.904 ms

Shouldn't the estimated rows be 50?

Joshua D. Drake


-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
If the world pushes look it in the eye and GRR. Then push back harder. - 
Salamander




[HACKERS] Amazing performance failure with SQL function

2009-11-18 Thread Joshua D. Drake
I was just writing a syntical example and wanted to make sure it worked.
I found this:

CREATE OR REPLACE FUNCTION RETURN_LOTS(INT) RETURNS SETOF INT AS 
   $$ 
  SELECT generate_series(1,$1); 
   $$ COST 0.5 ROWS 1000 SET work_mem TO '5MB' LANGUAGE 'SQL';

postgres=# explain analyze select return_lots(1000);
  QUERY
PLAN   
---
 Result  (cost=0.00..0.26 rows=1 width=0) (actual time=0.057..21255.309
rows=1000 loops=1)
 Total runtime: 25784.077 ms
(2 rows)

O.k. slow, but no big deal right? Well:

postgres=# SET cpu_operator_cost to 0.5;
SET
postgres=# set work_mem to 5MB;
SET
postgres=# explain analyze SELECT generate_series(1,1000);
  QUERY
PLAN  
--
 Result  (cost=0.00..0.51 rows=1 width=0) (actual time=0.004..6796.389
rows=1000 loops=1)
 Total runtime: 11301.681 ms
(2 rows)

This is repeatable. I expect a little regression because we have to
compile the SQL but 14 seconds? 

postgres=# select version();

version   
-
 PostgreSQL 8.3.8 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.4.real
(Ubuntu 4.4.1-3ubuntu3) 4.4.1
(1 row)


Joshua D. Drake


-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
If the world pushes look it in the eye and GRR. Then push back harder. - 
Salamander


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Oversight in CREATE FUNCTION + EXPLAIN?

2009-11-18 Thread Tom Lane
Joshua D. Drake j...@commandprompt.com writes:
 Shouldn't the estimated rows be 50?

It is if you do select * from return_lots(1000).
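
To spell that out (a minimal illustration; cost figures elided, only the row
estimates matter here):

  -- The ROWS 50 estimate is applied when the SRF is called in FROM:
  EXPLAIN SELECT * FROM return_lots(1000);
  --   Function Scan on return_lots  (cost=... rows=50 ...)

  -- ...but not when it is called in the SELECT list, as in the original example:
  EXPLAIN SELECT return_lots(1000);
  --   Result  (cost=... rows=1 ...)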

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Amazing performance failure with SQL function

2009-11-18 Thread Tom Lane
Joshua D. Drake j...@commandprompt.com writes:
 This is repeatable. I expect a little regression because we have to
 compile the SQL but 14 seconds? 

generate_series is a quite efficient C function.  I think it's pretty
damn good that the overhead of a SQL function on top of that is only 2X.

Or were you expecting the SRF to be inlined?  If so, you need to
(a) be using 8.4, (b) mark it STABLE or IMMUTABLE.
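
In other words, something along these lines (a hedged sketch; the SET clause
from the original function is dropped here because, as far as I recall, a
function with per-call SET options can't be inlined):

  CREATE OR REPLACE FUNCTION return_lots(int) RETURNS SETOF int AS
  $$
    SELECT generate_series(1, $1);
  $$ LANGUAGE sql IMMUTABLE;

  -- On 8.4, calling it in FROM gives the planner a chance to inline the body:
  EXPLAIN ANALYZE SELECT * FROM return_lots(10000000);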

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Listen / Notify - what to do when the queue is full

2009-11-18 Thread Tom Lane
Joachim Wieland j...@mcknight.de writes:
 4) Allow readers to read uncommitted notifications as well.

The question that strikes me here is one of timing --- apparently,
readers will now have to check the queue *without* having received
a signal?  That could amount to an unpleasant amount of extra overhead
when the notify system isn't even in use.  (Users who don't care about
notify will define "unpleasant amount" as "not zero".)

I haven't read the patch, so maybe you have some cute solution to that,
but if so please explain what.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Listen / Notify - what to do when the queue is full

2009-11-18 Thread Greg Stark
On Mon, Nov 16, 2009 at 2:35 PM, Andrew Chernow a...@esilo.com wrote:

 1) drop new notifications if the queue is full (silently or with
 rollback)

 I like this one best, but not with silence of course. While it's not the
 most
 polite thing to do, this is for a super extreme edge case. I'd rather just
 throw an exception if the queue is full rather than start messing with the

 +1

So if you guys are going to insist that the notification mechanism isn't
a queueing mechanism, then I think it at least behooves you to have it
degrade gracefully into a notification mechanism and not become entirely
useless by dropping notification messages.

That is, if the queue overflows what you should do is drop the
payloads and condense all the messages for a given class into a single
notification for that class with unknown payload. That way if a
cache which wants to invalidate specific objects gets a queue overflow
condition then at least it knows it should rescan the original data
and rebuild the cache and not just serve invalid data.

I still think you're on the wrong path entirely and will end up with a
mechanism which serves neither use case very well instead of two
separate mechanisms that are properly designed for the two use cases.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Listen / Notify - what to do when the queue is full

2009-11-18 Thread Joachim Wieland
On Thu, Nov 19, 2009 at 1:48 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Joachim Wieland j...@mcknight.de writes:
 4) Allow readers to read uncommitted notifications as well.

 The question that strikes me here is one of timing --- apparently,
 readers will now have to check the queue *without* having received
 a signal?  That could amount to an unpleasant amount of extra overhead
 when the notify system isn't even in use.  (Users who don't care about
 notify will define "unpleasant amount" as "not zero".)

The sequence in CommitTransaction() is like that:

1) add notifications to queue
2) commit to clog
3) signal backends

Only those backends that listen to at least one channel are signalled;
if the notify system isn't in use, then nobody will ever be signalled
anyway.

If a backend is reading a transaction id that has not yet committed,
it will not deliver the notification. It knows that eventually it will
receive a signal from that transaction and then it first checks its
list of uncommitted notifications it has already read and then checks
the queue for more pending notifications.


Joachim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TRIGGER with WHEN clause

2009-11-18 Thread KaiGai Kohei
Itagaki-san,

I don't have any more comments on this patch, so I hope it will be reviewed
by committers and then committed.

Thanks for your good work.

Itagaki Takahiro wrote:
 KaiGai Kohei kai...@ak.jp.nec.com wrote:
 
 In addition, I could find a few matters.
 * TOAST may be necessary for pg_trigger?
 
 I added a toast relation to pg_trigger.
 DECLARE_TOAST(pg_trigger, 2336, 2337);
 
 I think having a toast relation for pg_trigger is reasonable
 because pg_trigger already has a variable field tgargs
 even if we don't have the new field tgqual from the patch.
 I'm not sure why we don't have a toast relation for pg_trigger
 because users might pass very long trigger arguments.
 
 * ROW INSERT TRIGGER on COPY FROM statement
 
 Thanks. Good catch! Fixed and regression test added.
 
 * Using system column in WHEN clause
 2) Add a notice to the user documentation not to use system columns
    in the WHEN clause, because these are assigned only after the trigger
    invocations.
 
 I'd like to only add documentation because I don't have a whole solution.
 
 System columns are not available in the <literal>WHEN</> clause
 because those values are initialized after triggers are called.
 They might return wrong values if they are used in expressions in the clause.
 
 
 Regards,
 ---
 ITAGAKI Takahiro
 NTT Open Source Software Center
 


-- 
OSS Platform Development Division, NEC
KaiGai Kohei kai...@ak.jp.nec.com
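
For readers not following the patch thread: the feature under review adds a
WHEN condition to CREATE TRIGGER. A rough sketch of the syntax being
discussed, with illustrative table and function names only -- see the patch
for the authoritative grammar:

  CREATE TABLE accounts (id int PRIMARY KEY, balance numeric);

  CREATE FUNCTION log_balance_change() RETURNS trigger AS $$
  BEGIN
      RAISE NOTICE 'balance changed for %', NEW.id;
      RETURN NEW;
  END;
  $$ LANGUAGE plpgsql;

  -- Fire the row trigger only when the condition holds.  Per the note
  -- above, system columns (ctid, xmin, ...) must not be referenced in WHEN.
  CREATE TRIGGER balance_changed
      AFTER UPDATE ON accounts
      FOR EACH ROW
      WHEN (OLD.balance IS DISTINCT FROM NEW.balance)
      EXECUTE PROCEDURE log_balance_change();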

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Not safe to send CSV data message

2009-11-18 Thread Tom Lane
So I went to investigate bug #5196: turned on log_destination = csvlog
etc, and restarted the postmaster.  I got this on stderr:

2009-11-18 20:08:52.104 EST : : LOG:  loaded library passwordcheck
Not safe to send CSV data

The first line is a consequence of having still got
shared_preload_libraries set from testing the new contrib/passwordcheck
module.  However, what I'm on the warpath about right now is the second
line.  That message is useless, not to mention not conformant to our
style guidelines.  And the fact that it comes out at all suggests that
the csvlog startup logic is rather broken.  Comments?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Python 3.1 support

2009-11-18 Thread James Pye
On Nov 18, 2009, at 1:36 PM, James Pye wrote:
 At this point, I'm not going to try getting it into PG. (apparent futility 
 and such)

ugh, on second thought, I think I've written a bit too much code to stop now. 
I'm going to get plpython3 as far as I can and submit it to the next commitfest.
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Listen / Notify - what to do when the queue is full

2009-11-18 Thread Andrew Chernow


That is, if the queue overflows what you should do is drop the
payloads and condense all the messages for a given class into a single
notification for that class with unknown payload. That way if a
cache which wants to invalidate specific objects gets a queue overflow
condition then at least it knows it should rescan the original data
and rebuild the cache and not just serve invalid data.



That's far more complicated than throwing an error and it discards user payload 
information.  Let the error indicate a rescan is needed.


--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Listen / Notify - what to do when the queue is full

2009-11-18 Thread Josh Berkus
On 11/16/09 3:19 AM, Joachim Wieland wrote:
  1) drop new notifications if the queue is full (silently or with rollback)
  2) block until readers catch up (what if the backend that tries to write the
 notifications actually is the lazy reader that everybody is waiting for 
 to
 proceed?)
  3) invent a new signal reason and send SIGUSR1 to the lazy readers, they
 need to interrupt whatever they are doing and copy the
 notifications into their
 own address space (without delivering the notifications since they are in 
 a
 transaction at that moment).

(4) drop *old* notifications if the queue is full.

Since everyone has made the point that LISTEN is not meant to be a full
queueing system, I have no problem dropping notifications LRU-style.  If
we've run out of room, the oldest notifications should go first; we
probably don't care about them anyway.

We should probably also log the fact that we ran out of room, so that
the DBA knows that they have a design issue.  For volume reasons, I
don't think we want to log every dropped message.

Alternately, it would be great to have a configuration option which
would allow the DBA to choose any of 3 behaviors via GUC:

drop-oldest (as above)
drop-largest (if we run out of room, drop the largest payloads first to
save space)
error (if we run out of room, error and rollback)

--Josh Berkus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Listen / Notify - what to do when the queue is full

2009-11-18 Thread Andrew Chernow



We should probably also log the fact that we ran out of room, so that
the DBA knows that they have a design issue.


Can't they just bump allowed memory and avoid a redesign?


Alternately, it would be great to have a configuration option which
would allow the DBA to choose any of 3 behaviors via GUC:

drop-oldest (as above)
drop-largest (if we run out of room, drop the largest payloads first to
save space)
error (if we run out of room, error and rollback)



I mentioned this up thread.  I completely agree that overflow behavior should be 
tunable.


--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Review: psql-wrap-formatting/Unicode UTF-8 table formatting for psql text output

2009-11-18 Thread gabrielle
Overview:
Patch to make data output that includes newlines & wrapped lines
consistent with the headers for that data.
Link: https://commitfest.postgresql.org/action/patch_view?id=220

Submission review:
* is in context diff
* applies cleanly to current HEAD
* includes its own test .sql file to verify new output
* includes appropriate doc patches

Usability review:
* the patch does indeed do what it says
* maintains backward compatibility, should the end-user desire it
* included tests perform as advertised

Coding review:
* looks good to me
* fits in with the surrounding code
* no compiler warnings, no crashes
* complete & understandable docs

Personally, I think this rocks.  It really improves the readability of
query output.

gabrielle
--
on behalf of the PDXPUG Patch Review Team (Dan, Mark, John, Brad, & me)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Listen / Notify - what to do when the queue is full

2009-11-18 Thread Tom Lane
Joachim Wieland j...@mcknight.de writes:
 The sequence in CommitTransaction() is like that:

 1) add notifications to queue
 2) commit to clog
 3) signal backends

 Only those backends are signalled that listen to at least one channel,
 if the notify system isn't in use, then nobody will ever be signalled
 anyway.

 If a backend is reading a transaction id that has not yet committed,
 it will not deliver the notification.

But you were saying that this patch would enable sending more data than
would fit in the queue.  How will that happen if the other backends
don't look at the queue until you signal them?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Not safe to send CSV data message

2009-11-18 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 So the logger there has been doing CSV logging for quite a while without 
 memory ballooning.

I was able to generate a noticeable leak by cranking log_rotation_size
way down ... it's about 1K per size rotation event.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Listen / Notify - what to do when the queue is full

2009-11-18 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 (4) drop *old* notifications if the queue is full.

 Since everyone has made the point that LISTEN is not meant to be a full
 queueing system, I have no problem dropping notifications LRU-style.

NO, NO, NO, a thousand times no!

That turns NOTIFY into an unreliable signaling system, and if I haven't
made this perfectly clear yet, any such change will be committed over my
dead body.

If we are unable to insert a new message into the queue, the correct
recourse is to fail the transaction that is trying to insert the *new*
message.  Not to drop messages from already-committed transactions.
Failing the current transaction still leaves things in a consistent
state, ie, you don't get messages from aborted transactions but that's
okay because they didn't change the database state.

I think Greg has a legitimate concern about whether this redesign
reduces the usefulness of NOTIFY for existing use-cases, though.
Formerly, since pg_listener would effectively coalesce notifies
across multiple sending transactions instead of only one, it was
impossible to overflow the queue, unless maybe you managed to
bloat pg_listener to the point of being out of disk space, and
even that was pretty hard.  There will now be a nonzero chance
of transactions failing at commit because of queue full.  If the
chance is large this will be an issue.  (Is it sane to wait for
the queue to be drained?)

BTW, did we discuss the issue of 2PC transactions versus notify?
The current behavior of 2PC with notify is pretty cheesy and will
become more so if we make this change --- you aren't really
guaranteed that the notify will happen, even though the prepared
transaction did commit.  I think it might be better to disallow
NOTIFY inside a prepared xact.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Listen / Notify - what to do when the queue is full

2009-11-18 Thread Tom Lane
Andrew Chernow a...@esilo.com writes:
 I mentioned this up thread.  I completely agree that overflow behavior should 
 be 
 tunable.

There is only one correct overflow behavior.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ProcessUtility_hook

2009-11-18 Thread Euler Taveira de Oliveira
Itagaki Takahiro escreveu:
 Here is a patch to add ProcessUtility_hook to handle all DDL
 commands in user modules. (ProcessUtility_hook_20091021.patch)
 It adds a global variable 'ProcessUtility_hook' and
 export the original function as 'standard_ProcessUtility'.
 
I've reviewed your patch. It applies cleanly and compiles without warnings. It
lacks documentation. Could you update it?

The functionality is divided into two parts. The first part is a hook in the
utility module. The idea is to capture the commands that don't pass through
the executor. I'm afraid that that hook will be used only for capturing non-DML
queries. If so, why don't we hack tcop/postgres.c and grab those queries
from the same point where we log statements? This feature is similar to the
executor one, but in the executor case we use the plan for other things. Other
uses are (i) using that hook to filter out some non-DML commands and (ii)
developing a non-DML replication mechanism for trigger-based replication
solutions.

The second part is to use that hook to capture non-DML commands for the
pg_stat_statements module. Do we need to have rows = 0 for non-DML commands?
Maybe NULL could be an appropriate value. The PREPARE command stopped
counting the number of rows. Should we count the rows in the EXECUTE command
or in the PREPARE command? The other command that doesn't count properly is
COPY. Could you fix that? I'm concerned about storing commands that expose
passwords, like CREATE ROLE foo PASSWORD 'secret'; I know that the queries
are only shown to superusers, but maybe we should add this information to
the documentation or provide a mechanism to exclude those commands.

I don't know if it is worth the trouble adding some code to capture VACUUM and
ANALYZE commands called inside autovacuum.
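
For context, the usage pattern a module author would follow with the proposed
hook is roughly the sketch below. The parameter list is assumed from the patch
description and from the existing executor hooks, so treat it as illustrative
rather than as the patch's actual signature:

  /* Hedged sketch of a module chaining into the proposed ProcessUtility_hook. */
  #include "postgres.h"
  #include "fmgr.h"
  #include "nodes/params.h"
  #include "tcop/dest.h"
  #include "tcop/utility.h"         /* assumed home of the hook declaration */

  PG_MODULE_MAGIC;

  void _PG_init(void);

  static ProcessUtility_hook_type prev_ProcessUtility = NULL;
  static long utility_statements = 0;

  static void
  my_ProcessUtility(Node *parsetree, const char *queryString,
                    ParamListInfo params, bool isTopLevel,
                    DestReceiver *dest, char *completionTag)
  {
      utility_statements++;         /* count (or otherwise inspect) DDL here */

      if (prev_ProcessUtility)
          prev_ProcessUtility(parsetree, queryString, params,
                              isTopLevel, dest, completionTag);
      else
          standard_ProcessUtility(parsetree, queryString, params,
                                  isTopLevel, dest, completionTag);
  }

  void
  _PG_init(void)
  {
      prev_ProcessUtility = ProcessUtility_hook;
      ProcessUtility_hook = my_ProcessUtility;
  }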


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Listen / Notify - what to do when the queue is full

2009-11-18 Thread Andrew Chernow

Tom Lane wrote:

Andrew Chernow a...@esilo.com writes:
I mentioned this up thread.  I completely agree that overflow behavior should be 
tunable.


There is only one correct overflow behavior.



I count three.

1. wait
2. error
3. skip

#1 and #2 are very similar to a file system.  If FS buffers are full on write, 
it makes you wait.  In non-blocking mode, it throws an EAGAIN error.  IMHO those 
two behaviors are totally acceptable for handling notify overflow.  #3 is pretty 
weak but I *think* there are uses for it.


--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Not safe to send CSV data message

2009-11-18 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Tom Lane wrote:
 And the fact that it comes out at all suggests that
 the csvlog startup logic is rather broken.  Comments?

 Not sure why you say that. This can only happen very early in the 
 startup process before the postmaster has had a chance to finish setting 
 up the syslogger process and dup the pipes. As soon as that happens 
 redirection_done is set to true and this message is no longer possible.

Well, in that case the code is operating as designed and the bleating is
simply inappropriate.  What I was wondering was whether we should try to
launch the syslogger before we do process_shared_preload_libraries().
But now that I think about it, I think that ordering was intentional
on the grounds that some types of monitoring plugins might want to be
live in all postmaster children including the syslogger.  In any case
there will certainly always be *some* postmaster messages that could
be emitted after setting the log_destination GUC and before launching
the syslogger child.  If the designed behavior is that we dump to
stderr during that interval, we should just do it, without the useless
and confusing bleating.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Listen / Notify - what to do when the queue is full

2009-11-18 Thread Tom Lane
Andrew Chernow a...@esilo.com writes:
 Tom Lane wrote:
 There is only one correct overflow behavior.

 I count three.

Waiting till you can insert is reasonable (especially if we have some
behavior that nudges other backends to empty the queue).  If by "skip"
you mean losing the notify but still committing, that's incorrect.
There is no room for debate about that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Not safe to send CSV data message

2009-11-18 Thread Andrew Dunstan



Tom Lane wrote:

In any case
there will certainly always be *some* postmaster messages that could
be emitted after setting the log_destination GUC and before launching
the syslogger child.  If the designed behavior is that we dump to
stderr during that interval, we should just do it, without the useless
and confusing bleating.


  


I'm fine with that. I don't remember whether I put that in or whether it 
came from the original patch author(s). Either way, I assume the reason 
was to explain why the message appeared on stderr rather than the 
CSVlog. Now that we have a couple of years of experience with CSVlog, I agree 
it's not needed (if it were we'd probably have had more complaints like 
yours anyway).


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Listen / Notify - what to do when the queue is full

2009-11-18 Thread Andrew Chernow

Tom Lane wrote:

Andrew Chernow a...@esilo.com writes:

Tom Lane wrote:

There is only one correct overflow behavior.



I count three.


Waiting till you can insert is reasonable (especially if we have some
behavior that nudges other backends to empty the queue).  If by skip
you mean losing the notify but still committing, that's incorrect.
There is no room for debate about that.


Yeah, like I said, "skip" felt weak.

In regards to waiting, what would happen if other backends couldn't help empty 
the queue because they too are clogged?  ISTM that any attempt to flush to other 
non-disk queues is doomed to possible overflows as well.  Then what? 
Personally, I would just wait until room became available or the transaction was 
canceled.  We could get fancy and tack a timeout value onto the wait.


--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Listen / Notify - what to do when the queue is full

2009-11-18 Thread Tom Lane
Andrew Chernow a...@esilo.com writes:
 Personally, I would just wait until room became available or the transaction 
 was 
 canceled.

Works for me, as long as there's a CHECK_FOR_INTERRUPTS in there to
allow a cancel to happen.  The current patch seems to have a lot of
pointless logging and no CHECK_FOR_INTERRUPTS ;-)
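
Something as simple as the following would do, I think (a sketch only --
asyncQueueHasSpace() is a hypothetical helper standing in for whatever the
patch uses to test for free space):

  /* Block until there is room in the notify queue, staying cancellable. */
  #include "postgres.h"
  #include "miscadmin.h"            /* CHECK_FOR_INTERRUPTS(), pg_usleep() */

  extern bool asyncQueueHasSpace(void);     /* hypothetical */

  static void
  WaitForQueueSpace(void)
  {
      while (!asyncQueueHasSpace())
      {
          CHECK_FOR_INTERRUPTS();   /* lets a statement cancel get through */
          pg_usleep(100000L);       /* sleep 100 ms before rechecking */
      }
  }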

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Timezones (in 8.5?)

2009-11-18 Thread Andrew Gierth
 Kevin == Kevin Grittner kevin.gritt...@wicourts.gov writes:

  If he meant (A), then you store the event as:
  (ts,tz) = (timestamp '2010-07-27 10:30:00',
  'Chile/Santiago')
 
  If he meant (B), then you store the event as
  (tsz,tz) = (timestamp '2010-07-27 10:30:00' at time zone
  'Chile/Santiago', 'Chile/Santiago')
 
 Kevin You seem to be agreeing that these problems can't be solved
 Kevin without storing a time zone string in addition to the
 Kevin timestamp.  As I read it, Hernán was wishing for types which
 Kevin include this, rather than having to do the above dance with
 Kevin multiple values.

Right, but including more data in a single type is the wrong approach,
since it complicates the semantics and interferes with normalization.
For example, if you have a type T which incorporates a timestamp and a
timezone, what semantics does the T = T operator have? What semantics
apply if the definitions of timezones change? What if you're storing
times of events at specific places; in that case you want to associate
the timezone with the _place_ not the event (so that if the timezone
rules change, moving the place from one timezone to another, you only
have to change the place, not all the events that refer to it).
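
To make the alternatives concrete (table and column names are illustrative
only):

  -- (A): store the civil (wall-clock) time plus the zone name.
  CREATE TABLE event_a (
      ev_time  timestamp,          -- timestamp without time zone
      ev_zone  text                -- e.g. 'Chile/Santiago'
  );

  -- (B): store the absolute instant plus the zone name.
  CREATE TABLE event_b (
      ev_time  timestamptz,
      ev_zone  text
  );
  INSERT INTO event_b VALUES
      (timestamp '2010-07-27 10:30:00' AT TIME ZONE 'Chile/Santiago',
       'Chile/Santiago');

  -- Attaching the zone to the place rather than to the event:
  CREATE TABLE place (
      place_id  serial PRIMARY KEY,
      tz_name   text NOT NULL      -- update this if tz rule changes move the place
  );
  CREATE TABLE event_c (
      place_id  int REFERENCES place,
      ev_time   timestamp          -- local wall-clock time at that place
  );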

-- 
Andrew (irc:RhodiumToad)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Very bad FTS performance with the Polish config

2009-11-18 Thread Sushant Sinha
ts_headline calls the ts_lexize equivalent to break up the text. Of course
there is an algorithm to process the tokens and generate the headline. I would
be really surprised if the algorithm to generate the headline were somehow
dependent on the language (it only processes the tokens). So Oleg is right
when he says ts_lexize is the thing to check.

I will try to replicate what you are trying to do, but in the meantime can
you run the same ts_headline call under psql multiple times and paste the
results.
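
Something along these lines would do (the polish_ispell dictionary name
follows the cited depesz article and is an assumption on my part):

  \timing on

  -- Compare the per-token dictionary cost directly:
  SELECT ts_lexize('english_stem', 'consequat');
  SELECT ts_lexize('polish_ispell', 'consequat');

  -- And repeat the headline call itself a few times per configuration:
  SELECT ts_headline('english', 'Lorem ipsum dolor sit amet, consectetur ...',
                     plainto_tsquery('english', 'dolor'));
  SELECT ts_headline('polish',  'Lorem ipsum dolor sit amet, consectetur ...',
                     plainto_tsquery('polish', 'dolor'));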

-Sushant.

2009/11/19 Wojciech Knapik webmas...@wolniartysci.pl


 Oleg Bartunov wrote:

  Yes, for 4-word texts the results are similar.
 Try that with a longer text and the difference becomes more and more
 significant. For the lorem ipsum text, 'polish' is about 4 times slower,
 than 'english'. For 5 repetitions of the text, it's 6 times, for 10
 repetitions - 7.5 times...


 Again, I see nothing unclear here, since dictionaries (as specified
 in configuration) apply to ALL words in document. The more words in
 document, the more overhead.


 You're missing the point. I'm not surprised that the function takes more
 time for larger input texts - that's obvious. The thing is, the computation
 times rise more steeply when the Polish config is used. Steeply enough, that
 the difference between the Polish and English configs becomes enormous in
 practical cases.

 Now this may be expected behaviour, but since I don't know if it is, I
 posted to the mailing lists to find out. If you're saying this is ok and
 there's nothing to fix here, then there's nothing more to discuss and we may
 consider the thread closed.
 If not, ts_headline deserves a closer look.

 cheers,
 Wojciech Knapik


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] Not safe to send CSV data message

2009-11-18 Thread Andrew Dunstan



Tom Lane wrote:

So I went to investigate bug #5196: turned on log_destination = csvlog
etc, and restarted the postmaster.  I got this on stderr:

2009-11-18 20:08:52.104 EST : : LOG:  loaded library passwordcheck
Not safe to send CSV data

The first line is a consequence of having still got
shared_preload_libraries set from testing the new contrib/passwordcheck
module.  However, what I'm on the warpath about right now is the second
line.  That message is useless, not to mention not conformant to our
style guidelines.  


(donning armor)

Possibly.


And the fact that it comes out at all suggests that
the csvlog startup logic is rather broken.  Comments?

  


Not sure why you say that. This can only happen very early in the 
startup process before the postmaster has had a chance to finish setting 
up the syslogger process and dup the pipes. As soon as that happens 
redirection_done is set to true and this message is no longer possible.


My memory is fading a bit about this now, but ISTR we debated the logic 
some when I was working to get this committed around 2 1/2 years ago. 
It's quite possible we (or I) got it wrong, but I'm not sure what we 
should be doing if the logger isn't ready yet.


BTW, re that bug, I recently turned on CSV logging under 8.3 for a 
client running Suse 10.2 64bit, with fairly heavy usage. Here is the 
current state of the non-client processes:


postgres  7080  0.0  0.4 12913224 267216 ? SOct28   2:58 
/usr/bin/postmaster -D /pg_database/pg_data1/pg
postgres  7087  0.1  0.0  99824 60696 ?Ss   Oct28  30:32 
postgres: logger process  
postgres  7092  0.0 16.3 12919564 10791472 ?   Ss   Oct28   7:41 
postgres: writer process  
postgres  7093  0.0  0.0 12919256 10432 ?  Ss   Oct28   0:29 
postgres: wal writer process  
postgres  7094  0.0  0.0 12920892 3544 ?   Ss   Oct28   0:00 
postgres: autovacuum launcher process 
postgres  7095  0.0  0.0  34396  1904 ?Ss   Oct28   0:05 
postgres: archiver process   last was 0001020500A3
postgres  7096  0.0  0.0  36156  2916 ?Ss   Oct28  10:26 
postgres: stats collector process  

So the logger there has been doing CSV logging for quite a while without 
memory ballooning.  (The writer process, on the other hand, has eaten up 
a boatload of memory, by the look of it)  


cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] xpath_table equivalent

2009-11-18 Thread Scott Bailey

Andrew Dunstan wrote:




I've been reading over the documentation to find an alternative to 
the deprecated xpath_table functionality. I think it may be a 
possibility but I'm not seeing a clear alternative.


Thanks,

Chris Graner


The standard is XMLTABLE and is implemented by both db2 and oracle but 
is on our list of unimplemented features. I would love to see this 
implemented in Postgres. I recall it coming up here before. But I 
don't think it went beyond discussing which xquery library we could use.





Yes, Chris spoke to me about this last night and emailed me an example 
of what he needs today, and I've spent a couple of hours thinking 
about it. Not having a nice way of getting a recordset out of a piece of 
XML is actually quite a gap in our API.


The trouble is that XMLTABLE is a horrible grammatical mess, ISTM, and I 
don't much like the way xpath_table() works either. Passing a table name 
as text into a function is rather ugly.


I think we could do with a much simpler, albeit non-standard, API. 
Something like:


   xpathtable(source xml, rootnodes text, leaves variadic  text[]) 
returns setof record


But unless I'm mistaken we'd need the proposed LATERAL extension to make 
it iterate nicely over a table. Then we could possibly do something like:


   select x.bar, x.blurfl
   from
   foo f,
   lateral 
xpathtable(f.xmlfield,'//foo','bar','baz[1]/blurfl','@is-some-property')

   as x(bar int, blurfl text, xmlprop bool)
   where f.otherfield or x.xmlprop;

cheers

andrew


I agree that the syntax of XMLTABLE is odd. But not demonstrably worse 
than xpathtable. If we are going to exert effort on it, why not do it in 
a standards compliant way? Otherwise I'd suggest a stop gap of just 
adding some support functions to make it easier to extract a scalar 
value from a node. Something like what I did here.


http://scottrbailey.wordpress.com/2009/06/19/xml-parsing-postgres/
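
For illustration, the built-in xpath() function already gets part of the way:
it returns an array of xml nodes, and a scalar can be pulled out of the first
match like this (made-up document):

  SELECT (xpath('/foo/bar/text()', '<foo><bar>42</bar></foo>'::xml))[1]::text;
  -- returns '42'; cast further (::int, ::numeric, ...) as needed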

The nice thing about XMLTABLE is that it adds XQuery support. I think 
the majority of XQuery engines are written in Java. XQilla is 
C++. I'm not sure if our licensing is compatible, but I would love 
the irony of using Berkeley DB XML (formerly Sleepycat) now that it's 
owned by Oracle.


Scott

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Summary and Plan for Hot Standby

2009-11-18 Thread Simon Riggs
On Wed, 2009-11-18 at 14:51 +0200, Heikki Linnakangas wrote:
 Tatsuo Ishii wrote:
  Please correct me if I'm wrong. Parse will result in obtaining
  RowExclusiveLock on the target table if it is parsing
  INSERT/UPDATE/DELETE. If so, is this ok in the standby?
  Any attempt to take RowExclusiveLock will fail.
 
  Any attempt to execute INSERT/UPDATE/DELETE will fail.
 
  This behaviour should be identical to read only transaction mode. If it
  is not documented as an exception, please report as a bug.
  
  Is it?
  
  It seems read only transaction mode is perfectly happy with
  RowExclusiveLock:
 
 Hmm, that's a good point. I can't immediately see that that would cause
 any trouble, but it gives me an uncomfortable feeling about the locking.
 Which locks exactly need to be replayed in standby, and why? Which locks
 can read-only transactions acquire?
 
 The doc says:
 +   In recovery, transactions will not be permitted to take any table lock
 +   higher than AccessShareLock. In addition, transactions may never assign
 +   a TransactionId and may never write WAL.
 +   Any LOCK TABLE command that runs on the standby and requests a specific
 +   lock type other than AccessShareLock will be rejected.
 
 which seems wrong, given Tatsuo-san's example. Is that paragraph only
 referring to LOCK TABLE, and not other means of acquiring locks? Either
 way, it needs to be clarified or fixed.
 
 access/transam/README says:
 +Further details on locking mechanics in recovery are given in comments
 +with the Lock rmgr code.
 
 but there's no explanation there either *why* the locking works as it
 is. In LockAcquire(), we forbid taking locks higher than AccessShareLock
 during recovery mode, but only for LOCKTAG_OBJECT locks. Why?

Recovery does *not* take the same locks as the original statements on
the master took. For example, the WAL record for an INSERT just makes
its changes without acquiring locks. This is OK as long as we only allow
read-only users to acquire AccessShareLocks. If we allowed higher locks
we might need to do deadlock detection, which would add more complexity.

The above restrictions are limited to LOCKTAG_OBJECT so that advisory
locks work as advertised. So advisory locks can take both shared and
exclusive locks. This never conflicts with recovery because advisory
locks are not WAL logged.
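
Concretely, the intended behaviour on the standby, per the documentation
quoted above, would be roughly this (table name illustrative; not actual
output):

  -- Read-only queries and AccessShareLock are allowed during recovery:
  SELECT count(*) FROM accounts;
  LOCK TABLE accounts IN ACCESS SHARE MODE;

  -- Any stronger explicit table lock is rejected:
  LOCK TABLE accounts IN ROW EXCLUSIVE MODE;   -- expected to fail on the standby

  -- Advisory locks are not WAL-logged, so both flavours keep working:
  SELECT pg_advisory_lock(42);
  SELECT pg_advisory_unlock(42);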

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rejecting weak passwords

2009-11-18 Thread Albe Laurenz
Tom Lane wrote:
 Applied with some minor modifications.  Aside from the added valuntil
 parameter, I changed the isencrypted parameter to an int with some
 #define'd values.  It seems easily foreseeable that we'll replace the
 MD5 encryption scheme someday, and it'd be good to ensure that this
 API is extendable when that happens.  Also, I got rid of the bool
 return value and made the hook responsible for throwing its 
 own errors.
 I don't know about you guys, but I would cheerfully kill anybody who
 tried to make me use a password checker that didn't tell me anything
 about why it thinks my password is too weak.  (The CrackLib API we
 are using is lamentably badly designed on this score --- does it have
 another call that provides a more useful error report?)  Even if you
 think weak password is adequate for that class of complaints, the
 single error message would certainly not do for complaints about the
 valuntil date being too far away.

Thank you.

I agree on all points.
I did not know that contrib modules get translated too, else I would
have thrown the error messages there.

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers