Re: [HACKERS] Role privileges in PostgreSQL.

2007-05-25 Thread Tom Lane
"Akmal Akmalhojaev" <[EMAIL PROTECTED]> writes:
> I have the following question: where role privileges on working with tables,
> data bases and so on are stored in PostgreSQL (In what system catalogs?)?

Privileges are attached to the target objects, eg pg_class.relacl for
relations.  Look for columns of type aclitem[] in the catalog descriptions:
http://developer.postgresql.org/pgdocs/postgres/catalogs.html

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Autovacuum versus rolled-back transactions

2007-05-25 Thread Tom Lane
The pgstats subsystem does not correctly account for the effects of
failed transactions.  Note the live/dead tuple counts in this example:

regression=# create table foo (f1 int);
CREATE TABLE
regression=# insert into foo select x from generate_series(1,1000) x;
INSERT 0 1000
-- wait a second for stats to catch up
regression=# select * from pg_stat_all_tables where relname = 'foo';
 relid  | schemaname | relname | seq_scan | seq_tup_read | idx_scan | 
idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_live_tup | n_dead_tup | 
last_vacuum | last_autovacuum | last_analyze | last_autoanalyze 
++-+--+--+--+---+---+---+---+++-+-+--+--
 496849 | public | foo |0 |0 |  |   
|  1000 | 0 | 0 |   1000 |  0 | 
| |  | 
(1 row)

regression=# begin;
BEGIN
regression=# insert into foo select x from generate_series(1,1000) x;
INSERT 0 1000
regression=# rollback;
ROLLBACK
-- wait a second for stats to catch up
regression=# select * from pg_stat_all_tables where relname = 'foo';
 relid  | schemaname | relname | seq_scan | seq_tup_read | idx_scan | 
idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_live_tup | n_dead_tup | 
last_vacuum | last_autovacuum | last_analyze | last_autoanalyze 
++-+--+--+--+---+---+---+---+++-+-+--+--
 496849 | public | foo |0 |0 |  |   
|  2000 | 0 | 0 |   2000 |  0 | 
| |  | 
(1 row)

This means that a table could easily be full of dead tuples from failed
transactions, and yet autovacuum won't do a thing because it doesn't
know there are any.  Perhaps this explains some of the reports we've
heard of tables bloating despite having autovac on.

It seems to me this is a "must fix" if we expect people to rely on
autovacuum for real in 8.3.

I think it's fairly obvious how n_live_tup and n_dead_tup ought to
change in response to a failed xact, but maybe not so obvious for the
other counters.  I suggest that the scan/fetch counters (seq_scan,
seq_tup_read, idx_scan, idx_tup_fetch) as well as all the block I/O
counters should increment the same for committed and failed xacts,
since they are meant to count work done regardless of whether the work
was in vain.  I am much less sure how we want n_tup_ins, n_tup_upd,
n_tup_del to act though.  Should they be advanced "as normal" by a
failed xact?  That's what the code is doing now, and if you think they
are counters for work done, it's not so unreasonable.

It may boil down to whether we would like the identity
n_live_tup = n_tup_ins - n_tup_del
to continue to hold, or the similar one for n_dead_tup.  The problem
basically is that pgstats is computing n_live_tup and n_dead_tup
using those identities rather than by tracking what really happens.
I don't think we can have those identities if failed xacts update the
counts "normally".  Is it worth having separate counters for the numbers
of failed inserts/updates?  (Failed deletes perhaps need not be counted,
since they change nothing.)  Or we could change the backends so that the
reported n_tup_ins/del/upd are made to still produce the right live/dead
tup counts according to the identities, but then those counts would not
reflect work done.  Another alternative is for transactions to tally
the number of live and dead tuples they create, with understanding of
rollbacks, and send those to the stats collector independently of the
action counters.

I don't think I want to add separate failed-insert/update counters,
because that will bloat the stats reporting file, which is uncomfortably
large already when you have lots of tables.  The separate-tally method
would avoid that, at the price of more stats UDP traffic.

I'm kind of leaning to the separate-tally method and abandoning the
assumption that the identities hold.  I'm not wedded to the idea
though.  Any thoughts?

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


[HACKERS] Role privileges in PostgreSQL.

2007-05-25 Thread Akmal Akmalhojaev

Hello.

I have the following question: where role privileges on working with tables,
data bases and so on are stored in PostgreSQL (In what system catalogs?)?

Thanks.
Akmal.


Re: [HACKERS] Reviewing temp_tablespaces GUC patch

2007-05-25 Thread Jaime Casanova

On 5/25/07, Bernd Helmle <[EMAIL PROTECTED]> wrote:

--On Freitag, Mai 25, 2007 00:02:06 + Jaime Casanova
<[EMAIL PROTECTED]> wrote:
>>
>
> sounds good. can we see the new patch?

Attached tablespace.c.diff shows my current changes to use an OID  lookup
list.

>


+   if (source >= PGC_S_INTERACTIVE && IsTransactionState())
+   {
+   /*
+* Verify that all the names are valid tablespace names
+* We do not check for USAGE rights should we?
+*/
+   Oid cur_tblspc = get_tablespace_oid(curname);
+   if (cur_tblspc == InvalidOid)
+   {
+   ereport((source == PGC_S_TEST) ? NOTICE : ERROR,
+   (errcode(ERRCODE_UNDEFINED_OBJECT),
+errmsg("tablespace \"%s\" does not exist", curname)));
+   }
+   else
+   {
+   /*
+* Append new OID to temporary list. We can't
+* use the lookup table directly, because there could
+* be an ereport() in subsequent loops.
+*/
+   oidlist = lappend_oid(oidlist, cur_tblspc);
+   }
+   }


the list of oid's is only filled when you execute
SET temp_tablespaces = 'somelist'

but if you use the GUC in postgresql.conf at startup then not, so the
temp_tablespaces are not used even if they are setted

can you do that outside
+   if (source >= PGC_S_INTERACTIVE && IsTransactionState())

--
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."
  Richard Cook

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Reviewing temp_tablespaces GUC patch

2007-05-25 Thread Jaime Casanova

On 5/25/07, Tom Lane <[EMAIL PROTECTED]> wrote:

Bernd Helmle <[EMAIL PROTECTED]> writes:
> --On Freitag, Mai 25, 2007 10:49:29 + Jaime Casanova
> <[EMAIL PROTECTED]> wrote:
>> No, because the RemovePgTempFiles() call in PostmasterMain() will
>> remove all tmp files at startup.

I believe we do not call RemovePgTempFiles during a crash recovery
cycle; this is intentional on the theory that the temp files might contain
useful debugging clues.


ah, i forgot that


 So there is a potential problem there.
Not sure how important it really is though --- neither crashes nor
tablespace drops ought to be so common that we need a really nice
solution.



the only semi-sane solution i can think of, is to have a superuser
only function that acts as a wrapper for RemovePgTempFiles(), but
still exists a chance for shoot yourself on the foot...

--
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."
  Richard Cook

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Reviewing temp_tablespaces GUC patch

2007-05-25 Thread Tom Lane
Bernd Helmle <[EMAIL PROTECTED]> writes:
> --On Freitag, Mai 25, 2007 10:49:29 + Jaime Casanova 
> <[EMAIL PROTECTED]> wrote:
>> No, because the RemovePgTempFiles() call in PostmasterMain() will
>> remove all tmp files at startup.

> Hmm isn't RemovePgTempFiles() called on postmaster startup only? What will 
> happen if a temp tablespace is out of disk space, and the backend leaves 
> all previously created temp files there? Under these assumption we'll need 
> to restart the postmaster to get a clean tablespace ready to drop...

Theoretically, a backend will always remove its temp files during
transaction abort, so the only case that is really of concern is a
backend crashing before it can get around to doing that.  However, I
believe we do not call RemovePgTempFiles during a crash recovery cycle;
this is intentional on the theory that the temp files might contain
useful debugging clues.  So there is a potential problem there.
Not sure how important it really is though --- neither crashes nor
tablespace drops ought to be so common that we need a really nice
solution.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Reviewing temp_tablespaces GUC patch

2007-05-25 Thread Bernd Helmle
--On Freitag, Mai 25, 2007 00:02:06 + Jaime Casanova 
<[EMAIL PROTECTED]> wrote:




the original patch is from Albert Cervera =)


Ah, missed that, thanks ;)






sounds good. can we see the new patch?


Attached tablespace.c.diff shows my current changes to use an OID  lookup 
list.




the reason for those messages is that the tablespace can get full or
can be dropped before use, so we throw the message for the dba to take
actions. if no one thinks is a good idea the message can be removed.



I could imagine that this could irritate DBA's (at least, that is what 
happened to me during testing). It's okay that someone could drop a 
tablespace concurrently to other transactions, but i have concerns that 
with temp_tablespaces this could happen during _queries_. Do queries 
delete/recreate temp files during execution, maybe within sorts so that the 
used temp tablespace looks empty for a certain period of time?



The silent
mechanism to drop a tablespace during temporary usage makes me a little
bit uncomfortable about its robustness.



maybe using the list you put in TopMemoryContext we can deny the
ability to drop the tablespace until it's removed from the list of
temp tablespaces.


That would mean we have to share this information between backends. This 
looks complicated since every user could have its own temp_tablespaces 
GUC



--
 Thanks

   BerndIndex: tablespace.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/commands/tablespace.c,v
retrieving revision 1.45
diff -c -B -r1.45 tablespace.c
*** tablespace.c	22 Mar 2007 19:51:44 -	1.45
--- tablespace.c	25 May 2007 15:27:23 -
***
*** 63,73 
  #include "utils/fmgroids.h"
  #include "utils/guc.h"
  #include "utils/lsyscache.h"
  
  
! /* GUC variable */
  char	   *default_tablespace = NULL;
  
  
  static bool remove_tablespace_directories(Oid tablespaceoid, bool redo);
  static void set_short_version(const char *path);
--- 63,80 
  #include "utils/fmgroids.h"
  #include "utils/guc.h"
  #include "utils/lsyscache.h"
+ #include "utils/memutils.h"
  
  
! /* GUC variables */
  char	   *default_tablespace = NULL;
+ char   *temp_tablespaces = NULL;
  
+ static int	   next_temp_tablespace;
+ static int	   num_temp_tablespaces;
+ 
+ /* OID list of current temp tablespaces */
+ static List *tmp_tblspc_lookup = NIL;
  
  static bool remove_tablespace_directories(Oid tablespaceoid, bool redo);
  static void set_short_version(const char *path);
***
*** 935,940 
--- 942,1080 
  	return result;
  }
  
+ /*
+  * Routines for handling the GUC variable 'temp_tablespaces'.
+  */
+ 
+ /* assign_hook: validate new temp_tablespaces, do extra actions as needed */
+ const char *
+ assign_temp_tablespaces(const char *newval, bool doit, GucSource source)
+ {
+ 	char	   *rawname;
+ 	ListCell   *l;
+ 	MemoryContext cur_cntxt;
+ 	List	   *namelist = NIL;
+ 	List   *oidlist = NIL;
+ 
+ 	/* Need a modifiable copy of string */
+ 	rawname = pstrdup(newval);
+ 
+ 	/* Parse string into list of identifiers */
+ 	if (!SplitIdentifierString(rawname, ',', &namelist))
+ 	{
+ 		/* syntax error in name list */
+ 		pfree(rawname);
+ 		list_free(namelist);
+ 		return NULL;
+ 	}
+
+ 	num_temp_tablespaces = 0;
+
+ 	foreach(l, namelist)
+ 	{
+ 		char	   *curname = (char *) lfirst(l);
+ 		if (curname[0] == '\0')
+ 			continue;
+ 
+ 		/*
+ 		 * If we aren't inside a transaction, we cannot do database access so
+ 		 * cannot verify the individual names.	Must accept the list on faith.
+ 		 */
+ 		if (source >= PGC_S_INTERACTIVE && IsTransactionState())
+ 		{
+ 			/*
+ 			 * Verify that all the names are valid tablespace names 
+ 			 * We do not check for USAGE rights should we?
+ 			 */
+ 			Oid cur_tblspc = get_tablespace_oid(curname);
+ 			if (cur_tblspc == InvalidOid)
+ 			{
+ ereport((source == PGC_S_TEST) ? NOTICE : ERROR,
+ 		(errcode(ERRCODE_UNDEFINED_OBJECT),
+ 		 errmsg("tablespace \"%s\" does not exist", curname)));
+ 			}
+ 			else
+ 			{
+ /* 
+  * Append new OID to temporary list. We can't
+  * use the lookup table directly, because there could
+  * be an ereport() in subsequent loops. 
+  */
+ oidlist = lappend_oid(oidlist, cur_tblspc);
+ 			}
+ 		}
+ 		num_temp_tablespaces++;
+ 	}
+ 
+ 	/*
+ 	 * Select the first tablespace to use
+ 	 */
+ 	Assert(num_temp_tablespaces >= 0);
+ 	if (num_temp_tablespaces != 0)
+ 		next_temp_tablespace = MyProcPid % num_temp_tablespaces;
+ 
+ 	/* Looks good for now, free any old lookup table and copy new OID
+ 	   list to our lookup table in permanent storage */
+ 	cur_cntxt = MemoryContextSwitchTo(TopMemoryContext);
+
+ 	if (tmp_tblspc_lookup != NIL)
+ 	{
+ 		list_free(tmp_tblspc_lookup);
+ 	}
+ 
+ 	tmp_tblspc_lookup = list_copy(oidlist);
+ 	MemoryContextSwitchTo(cur_cntxt);
+
+ 	pfree(rawname);
+ 	list_free(namelist);
+ 	return newval;
+ }
+ 
+ /*
+  * GetTempTablespace 

Re: [HACKERS] Reviewing temp_tablespaces GUC patch

2007-05-25 Thread Bernd Helmle
--On Freitag, Mai 25, 2007 10:49:29 + Jaime Casanova 
<[EMAIL PROTECTED]> wrote:



No, because the RemovePgTempFiles() call in PostmasterMain() will
remove all tmp files at startup.


Hmm isn't RemovePgTempFiles() called on postmaster startup only? What will 
happen if a temp tablespace is out of disk space, and the backend leaves 
all previously created temp files there? Under these assumption we'll need 
to restart the postmaster to get a clean tablespace ready to drop...


--
 Thanks

   Bernd

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] like/ilike improvements

2007-05-25 Thread Tom Lane
Andrew Dunstan <[EMAIL PROTECTED]> writes:
>   do { (t)++; (tlen)--}  while ((*(t) & 0xC0) == 0x80 && tlen > 0)

The while *must* test those two conditions in the other order.
(Don't laugh --- we've had reproducible bugs before in which the backend
dumped core because of running off the end of memory due to this type
of mistake.)

> In fact, I'm wondering if that might make the other UTF8 stuff redundant 
> - the whole point of what we're doing is to avoid expensive calls to 
> NextChar;

+1 I think.  This test will be approximately the same expense as what
the outer loop would otherwise be (tlen > 0 and *t != firstpat), and
doing it this way removes an entire layer of intellectual complexity.
Even though the code is hardly different, we are no longer dealing in
misaligned pointers anywhere in the match algorithm.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] like/ilike improvements

2007-05-25 Thread Tom Lane
"Zeugswetter Andreas ADI SD" <[EMAIL PROTECTED]> writes:
>> You have to be on a first byte before you can meaningfully 
>> apply NextChar, and you have to use NextChar or else you 
>> don't count characters correctly (eg "__" must match 2 chars 
>> not 2 bytes).

> Well, for utf8 NextChar could advance to the next char even if the
> current byte
> position is in the middle of a multibyte char (skip over all 10xx). 

No doubt the macro could be made to work that way, but would it result
in correct matching behavior?  I doubt it --- you just matched an "_"
to half a character, or some such.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] like/ilike improvements

2007-05-25 Thread Andrew Dunstan



[EMAIL PROTECTED] wrote:



Is it worth the effort to pre-process the pattern?

For example:

%% -> %
  


This is already done, required by spec.


%_ -> _%

If applied recursively, this would automatically cover:

%_%  -> _%
_%_  -> __%

The 'benefit' would be that the pattern matching code would not
need an inner if statement?
  


I doubt it's worth the trouble.


Also - I didn't see a response to my query with regard treating UTF-8
as a two pass match. First pass treating it as bytes. If the first pass
matches, the second pass doing a full analysis. In the case of low
selectivity, this will be a win, as the primary filter would be the
full speed byte-based matching.
  


All matching will now be done byte-wise. CHAREQ is dead.

Advancing will also be done byte-wise except for:
 . where text matching is against _ for UTF8
 . where text matching is against % or _ for other multi-byte charsets.

So two passes doesn't sound like much of a win.

I had also asked why the focus would be on high selectivity. Why would
the primary filter criteria for a properly designed select statement by
a like with high selectivity? The only time I have ever used like is
when I expect low selectivity. Is there a reasonable case I am missing?


  


I think you'd need to show something close to a Pareto improvement: 
nobody worse off and some people better off. If you can do that then 
send in a patch.


However, I'm trying to minimise special case processing for UTF8, not 
create a whole new code path for it. The less special cases we have the 
easier it will be to maintain.



cheers

andrew

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] like/ilike improvements

2007-05-25 Thread Andrew Dunstan



Zeugswetter Andreas ADI SD wrote:


You have to be on a first byte before you can meaningfully 
apply NextChar, and you have to use NextChar or else you 
don't count characters correctly (eg "__" must match 2 chars 
not 2 bytes).



Well, for utf8 NextChar could advance to the next char even if the
current byte
position is in the middle of a multibyte char (skip over all 10xx). 



  


It doesn't matter - we are satisfied that it won't happen. However, this 
might well be a useful optimisation of NextChar() for the UTF8 case as 
something like


 do { (t)++; (tlen)--}  while ((*(t) & 0xC0) == 0x80 && tlen > 0)

In fact, I'm wondering if that might make the other UTF8 stuff redundant 
- the whole point of what we're doing is to avoid expensive calls to 
NextChar;


cheers

andrew

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Reviewing temp_tablespaces GUC patch

2007-05-25 Thread Jaime Casanova

On 5/25/07, Tom Lane <[EMAIL PROTECTED]> wrote:

"Jaime Casanova" <[EMAIL PROTECTED]> writes:
> On 5/24/07, Alvaro Herrera <[EMAIL PROTECTED]> wrote:
>> What happens if you create a cursor that stores something (sort
>> intermediate results?) in a temp tablespace, FETCH some from it, then
>> someone else drops the tablespace and FETCH some more?

> you can't drop a tablespace that is not empty.

So a temp file left over by a crashed backend would indefinitely prevent
dropping the tablespace, until someone manually cleaned it up?



No, because the RemovePgTempFiles() call in PostmasterMain() will
remove all tmp files at startup.

--
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."
  Richard Cook

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] like/ilike improvements

2007-05-25 Thread Zeugswetter Andreas ADI SD

> > However, I have just about convinced myself that we don't need 
> > IsFirstByte for matching "_" for UTF8, either preceded by "%" or
not, 
> > as it should always be true. Can anyone come up with a counter
example?
> 
> You have to be on a first byte before you can meaningfully 
> apply NextChar, and you have to use NextChar or else you 
> don't count characters correctly (eg "__" must match 2 chars 
> not 2 bytes).

Well, for utf8 NextChar could advance to the next char even if the
current byte
position is in the middle of a multibyte char (skip over all 10xx). 

(Assuming utf16 surrogate pairs are not encoded as 2 x 3bytes, which is
not valid utf8 anyway)   

Andreas

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Why not keeping positions in GIN?

2007-05-25 Thread Guillaume Smet

On 5/25/07, Hitoshi Harada <[EMAIL PROTECTED]> wrote:

It's not only about Japanese. When you search "phrase" for text in English,
the same logic above will be needed. I don't research about tsearch2 but is
there any problem?? Also, in some case int-array inverted index needs the
entry positions as well, I guess. Obtaining positions with posting lists is
"general" enough for GIN, isn't it?

Is there any future plan around it?


We talked of this with Oleg and Teodor when I worked on GIN for
pg_trgm. I know there is a long term plan to solve this issue (and
especially improve ranking in full text search).

I'm not sure the position is general enough.  What I'd like to have is
the ability to add metadata. For example, in the case of pg_trgm, I'd
like to have the length of the original string as it's a strong factor
in similarity calculation. Currently, we get a lot of results which
are rechecked after the first index pass: it's not very efficient.

--
Guillaume

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate