Re: [HACKERS] patch to implement ECPG side tracing / tracking ...

2010-01-13 Thread Hans-Juergen Schoenig

Michael Meskes wrote:

Hans,

Since Jaime already asked for a use case, just a few small comments from
me.

  

@@ -4,6 +4,7 @@
 #include "postgres_fe.h"

 #include <ctype.h>
+#include <inttypes.h>



This is not portable. You don't want to include this header.

Did I see this right that you use the statement cache for auto-prepared
statements even if the statement is not auto-prepared? Some statements are not
profiled; how did you decide which ones to profile?


There is no test case.

Before looking into it in detail I think we should first figure out if this
feature really has a benefit.

Michael
  


hello ...

the use cases for this thing are quite simple: we are currently porting 
hundreds (!) of complex Informix terminal applications to PostgreSQL. 
these are basically terminal applications used to perform certain 
tasks. given the vast amount of code, we simply cannot change a single 
program - if we had to dig into the actual application code, we would 
be dead before actually starting (the business logic is a nightmare). so, 
to get around the problem we are basically adding all extensions to ECPG 
we need to make this work. this is why we did all this SQLDA stuff and 
so on you have seen recently.


the current problems are a bit more delicate: we have this vast number 
of programs and some of them perform better than Informix and some 
simply don't. Informix has some sort of explain mode (I forgot the 
exact name) which allows you to see which query is executed how by the 
system. effectively, you can use it to performance-tune your precompiler 
application. in PostgreSQL it is currently a little hard to tell from the 
log what is executed how often by which application and at which speed. 
so, we came up with the idea of adding a flag to the precompiler 
which essentially keeps stats for us and displays them on exit (they 
could also be sent to a file without anybody noticing). this would give us 
excellent data to start with and it would make checking the database 
part of the application easy.
why for prepared queries: we found out that Informix is heavily using 
prepared queries internally. we already fixed something in this area 
(patch sent some time ago) and we were finally able to catch up with 
Informix performance-wise in this area (mostly cursor work). before this 
auto_prepare fix, we were sometimes 2-3 times slower than Informix. 
saving on network time did the job. now we are left with many many 
programs performing somewhat strangely and we need to check for every 
program why. a decent summary on exit would be gold here.


it seems we will also come up with a server-side extension soon which 
basically compares and logs planner / executor starts the way we do it 
for stored procedures now (thanks to martin pilhak). we simply need it 
so that we can figure out which of our XXX programs did what. testing 
one after the other is not so easy; some of them depend on each other.


to make it short: it is impossible to port hundreds of applications to 
PostgreSQL without having the chance to trace what the precompiler is 
doing, how often, in which program, via which connection. it is simply 
impossible. so, we really and desperately need this patch in.


many thanks,

hans


--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Provide rowcount for utility SELECTs

2009-12-28 Thread Hans-Juergen Schoenig

hello ...

just as a background info: this will have some positive side effects on 
embedded C programs which should be portable.

informix, for instance, will also return a row count on those commands.

   regards,

  hans



Pavel Stehule wrote:

2009/12/28 Boszormenyi Zoltan z...@cybertec.at:
  

Hi,

attached is a small patch that makes it possible for clients
to receive row count for SELECT ... INTO ... and CREATE TABLE ... AS ...

Comments?




good idea

+1

Pavel

  

Best regards,
Zoltán Böszörményi

--
Bible has answers for everything. Proof:
But let your communication be, Yea, yea; Nay, nay: for whatsoever is more
than these cometh of evil. (Matthew 5:37) - basics of digital technology.
May your kingdom come - superficial description of plate tectonics

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers





  



--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] next CommitFest

2009-11-09 Thread Hans-Juergen Schoenig -- PostgreSQL

*snip*


One pretty major fly in the ointment is that neither Hot Standby nor
Streaming Replication has been committed or shows much sign of being
about to be committed.  I think this is bad.  These are big features
that figure to have some bugs and break some things.  If they're not
committed in time for alpha3, then there won't be any significant
testing of these prior to alpha4/beta1, at the earliest.  I think
that's likely to lead to either (1) a very long beta period followed
by a late release or (2) a buggy release.  I feel like Simon Riggs and
Fujii Masao really pulled out all the stops to get these ready in time
for the September CommitFest, and while I'm not in a hurry to break
the world, I think the sooner these can hit the tree, the better off
we'll be in terms of releasing 8.5.

Just my $0.02,

  


absolutely, we should commit this.
we did some testing and things look stable.
also, people would most likely want to build code on top of it to be 
ready for 8.5 (support scripts, etc.). this is important in order to 
create some acceptance in user land.

this stuff seems mature and very well thought out.

just my $0.02 ...

   regards,

   hans-jürgen schönig

--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] draft RFC: concept for partial, wal-based replication

2009-10-30 Thread Hans-Juergen Schoenig -- PostgreSQL

hello ...

as my day has worked out quite nicely up to now i thought to f... it up 
and post a new concept which has been requested by a customer. the goal 
is to collect some feedback, ideas and so on (not to be mixed up with 
flames).
we have funding for this and we are trying to sort out how to do it the 
best way. comments are welcome ...

note, this is a first draft i want to refine based on some comments.
here we go ...


Partial WAL Replication for PostgreSQL:
----------------------------------------

As of now the PostgreSQL community has provided patches and functionalities
which allow full WAL-based replication as well as hot standby. To extend this
functionality and to make PostgreSQL even more suitable for enterprise
computing than it is today, we have the commitment of a sponsor to fund partial
replication for PostgreSQL 8.5 / 8.6.

This is the first draft of a proposal to make partial WAL-based replication work
and to provide an additional set of fancy features to the community which has
been waiting for real in-core replication for a decade or more.


Why partial replication?
------------------------

In some cases people have master servers which contain enormous amounts of data
(XX TB or so). If more than just one replica of this data is needed it might
happen that different slaves are used for different purposes. This implies that
not all data will be used by all machines.
An example: Consider a server at a phone company collecting phone calls, billing
data, and maybe network routing data. Data is used by different departments and
one machine is not enough to serve all three departments. With the new
functionality proposed here we could make 3 replicas each holding just a group
of tables for specific tasks, thus allowing people to buy cheaper hardware for
slaves and use more machines instead.


Current status:
---------------

Hot standby and streaming replication have been a huge leap forward for the
community and what is proposed here will be an extension to those patches and
functionalities. This concept is NOT aimed to replace anything - it is mainly an
addon.


Nodes and replication filters:
------------------------------

As of 8.4 standby systems are set up by creating an archive_command along with a
base backup. Although it is easy to do, some users still reported some
difficulties due to a total misunderstanding of PITR.

The idea is to add a functionality to add slaves like this:

CREATE REPLICA node_name
   CONNECT FROM SLAVE 'connect_string'
   TRANSFER COMMAND 'command'
   [ USING replication_filter ];

'command' would be any shell script copying data from the local master to the
new database node called node_name. Replication filters can be used to make X
replicas contain the same tables. Filter sets can be created like this:

CREATE REPLICATION FILTER filter_name
   [ EMPTY | FULL ] [ INCLUDE | EXCLUDE CHANGES ];

Replication filters can be modified ...

ALTER REPLICATION FILTER filter_name RENAME TO new_filtername;
ALTER REPLICATION FILTER filter_name
   { ADD | REMOVE } { TABLE | INDEX | SEQUENCE } object;

Filter sets can be dropped like this ...

DROP REPLICATION FILTER filter_name;

Internally CREATE REPLICA would initiate a base backup to the new slave server
just like we would do it manually otherwise. The server would automatically use
the user defined 'command' to copy one file after the other to the slave box.
The idea is basically stolen from archive_command and friends. At this stage we
either copy the entire instance as we would do it with a normal base backup or
just what is needed (defined by the replication filter). Users would
automatically only copy data to a slave which is really needed there and which
matches their filter config. If the copy is done, we can register the new node
inside a system table and commit the transaction. Also, we can automatically
create a useful recovery.conf setup - we know how to connect from the slave to
the master (we can use 'CONNECT FROM SLAVE [ USING ]' to write a proper
recovery.conf file).

Tables can easily be added or removed from a replication filter with ALTER
REPLICATION FILTER.
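
To make the phone company example from above concrete, here is a quick sketch
using the proposed syntax (the node name, the filter contents and the transfer
command are made up for illustration):

CREATE REPLICATION FILTER billing_filter EMPTY INCLUDE CHANGES;
ALTER REPLICATION FILTER billing_filter ADD TABLE billing.t_call;
ALTER REPLICATION FILTER billing_filter ADD TABLE billing.t_invoice;
ALTER REPLICATION FILTER billing_filter ADD SEQUENCE billing.t_invoice_id_seq;

CREATE REPLICA billing_node
   CONNECT FROM SLAVE 'host=master.example.com user=replication'
   TRANSFER COMMAND '/usr/local/bin/copy_to_billing_node.sh'
   USING billing_filter;

The other two departments would get their own filters and replicas in the same
way.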

Replicas can be removed easily:

DROP REPLICA node_name;

Why SQL to add a node? We are convinced that this is the simplest and most
intuitive way of doing things. We believe it gives users a real feeling of
simplicity. The current way of doing base backups should stay in place as it
is - it has proven to be nice for countless tasks. However, it is not suitable
for managing 10 or more replicas easily, especially not when they are not full
blown copies of the master.


Technical ideas:
----------------

System tables:

We suggest always replicating the entire system catalog. It would be a total
disaster to try some other implementation. The same applies for other tables - we
always replicate entire tables; no WHERE-clauses allowed when it comes to

Re: [HACKERS] contrib/plantuner - enable PostgreSQL planner hints

2009-10-12 Thread Hans-Juergen Schoenig -- PostgreSQL

hi there ...

for this work i will include you in my evening prayers for at least one 
week.
i know there has been a lot of discussion about this but what you just 
posted is excellent and, more important, USEFUL to many people.


i had something else in mind recently as well: virtual indexes. it would 
help people to decide whether an index would make sense if it actually 
existed. in some cases this would make sense as well, as many 
datasets are just too big to try out whether an index helps.


if there was a vote whether this should be in contrib or in core: +999 
from me ...


   many thanks,

  hans


Oleg Bartunov wrote:

Hi there,

this is an announcement of our new contribution module for PostgreSQL 
- Plantuner - enable planner hints

(http://www.sai.msu.su/~megera/wiki/plantuner).

Example:

=# LOAD 'plantuner';
=# create table test(id int);
=# create index id_idx on test(id);
=# create index id_idx2 on test(id);
=# \d test
      Table "public.test"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer |
Indexes:
    "id_idx" btree (id)
    "id_idx2" btree (id)
=# explain select id from test where id=1;
                               QUERY PLAN
-------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=4.34..15.03 rows=12 width=4)
   Recheck Cond: (id = 1)
   ->  Bitmap Index Scan on id_idx2  (cost=0.00..4.34 rows=12 width=0)
         Index Cond: (id = 1)
(4 rows)
=# set enable_seqscan=off;
=# set plantuner.forbid_index='id_idx2';
=# explain select id from test where id=1;
                              QUERY PLAN
------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=4.34..15.03 rows=12 width=4)
   Recheck Cond: (id = 1)
   ->  Bitmap Index Scan on id_idx  (cost=0.00..4.34 rows=12 width=0)
         Index Cond: (id = 1)
(4 rows)
=# set plantuner.forbid_index='id_idx2,id_idx';
=# explain select id from test where id=1;
                        QUERY PLAN
----------------------------------------------------------
 Seq Scan on test  (cost=100.00..140.00 rows=12 width=4)
   Filter: (id = 1)
(2 rows)



Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83




--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SELECT ... FOR UPDATE [WAIT integer | NOWAIT] for 8.5

2009-09-23 Thread Hans-Juergen Schoenig -- PostgreSQL

Tom Lane wrote:

Josh Berkus j...@agliodbs.com writes:
  

Jeff,


Will statement_timeout not suffice for that use case?
  


  

Well, currently statement_timeout doesn't affect waiting for locks.



Sure it does.

  

And as a DBA, I don't think I'd want the same timeout for executing
queries as for waiting for a lock.



this is exactly the point: it is simply an additional use case.
while statement_timeout is perfect to kick out queries which take too 
long, a lock_timeout serves a totally different purpose because you will 
get a totally different error message. imagine some old 4GL terminal 
application: in this case you will hardly ever reach a statement_timeout 
because you will simply want to wait until things appear on your screen. 
however, you definitely don't want to wait forever if somebody keeps 
working on some product which is in stock and never finishes.


btw, this old terminal application i was talking about is exactly the 
use case we had - this is why this patch has been made.
we are porting roughly 2500 terminal applications from informix to 
postgresql. we are talking about entire factory production lines and so 
on here (the ECPG patches posted recently are for the same project, btw.).
there are countless use cases where you want to know whether you are 
locked out or whether you are just taking too long - the message is 
totally different. the goal of the patch is to have a mechanism to make 
sure that you don't starve to death.


as far as syntax is concerned: there are good reasons for WAIT and good 
reasons for a GUC.
while the WAIT syntax is clearly a very precise instruction for a 
very certain place in a program, a GUC is a more overall policy. i don't 
see a reason why we should not have both anyway.
a GUC has the charm that it can be assigned to roles, procedures, etc. 
nicely; a WAIT clause has the charm of being incredibly precise. i can 
see good arguments for both.
the code itself is pretty simple - it needs no effort to keep up to 
date and it does not harm anything else - it is pretty isolated.
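
just to make the two variants concrete, a small sketch (the WAIT clause and 
the GUC only reflect the proposed behaviour - NOWAIT exists today; the table 
name and the GUC name are made up):

-- per statement, proposed syntax: give up waiting after 5 seconds
SELECT * FROM t_stock WHERE product_id = 42 FOR UPDATE WAIT 5;

-- what exists today: fail immediately instead of waiting
SELECT * FROM t_stock WHERE product_id = 42 FOR UPDATE NOWAIT;

-- policy style, assuming a GUC called lock_timeout (name not settled)
SET lock_timeout = 5000;
UPDATE t_stock SET on_stock = on_stock - 1 WHERE product_id = 42;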


   many thanks,

  hans

--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SELECT ... FOR UPDATE [WAIT integer | NOWAIT] for 8.5

2009-09-23 Thread Hans-Juergen Schoenig -- PostgreSQL

Jeff Janes wrote:

Will statement_timeout not suffice for that use case?


we tried to get around it without actually touching the core but we 
really need this functionality.
patching the core here is not the primary desire we have. it is all 
about modeling some functionality which was truly missing.


   many thanks,

  hans

--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] happy birthday Tom Lane ...

2009-09-18 Thread Hans-Juergen Schoenig -- PostgreSQL

Tom,

On behalf of the entire PostgreSQL team here in Austria I want to wish 
you a happy birthday.

We hope that you will be a vital part of PostgreSQL for many years to come.

  Best regards,

  Hans-Jürgen Schönig + team


--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] combined indexes with Gist - planner issues?

2009-08-31 Thread Hans-Juergen Schoenig -- PostgreSQL

hello everybody,

we are seriously fighting with some planner issue which seems to be 
slightly obscure to us.

we have a table which is nicely indexed (several GB in size).
i am using btree_gist operator classes to use a combined index including 
an FTI expression along with a number:


db=# \d product.t_product
                                  Table "product.t_product"
    Column     |     Type      |                           Modifiers
---------------+---------------+-----------------------------------------------------------------
 id            | bigint        | not null default nextval('product.t_product_id_seq'::regclass)
 shop_id       | integer       |
 art_number    | text          |
 title         | text          |
 description   | text          |
 display_price | numeric(10,4) |
Indexes:
    "t_product_pkey" PRIMARY KEY, btree (id)
    "idx_test" gist (display_price, to_tsvector('german'::regconfig, (title || ' '::text) || description))
    "idx_test2" gist (to_tsvector('german'::regconfig, (title || ' '::text) || description), display_price)



what we basically expected here is that Postgres will scan the table 
using the index to give us the cheapest products containing the words we 
are looking for.
i am totally surprised to see that we have to fetch all products given 
the words, sort and then do the limit.
this totally kills performance because some words simply show up 
millions of times. this totally kills everything.


the plans look like this:

db=# explain analyze SELECT art_number, title
         FROM product.t_product
         WHERE to_tsvector('german'::regconfig, (title || ' '::text) || description) @@ plainto_tsquery('harddisk')
         ORDER BY display_price
         LIMIT 10;
                                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=108340.08..108340.10 rows=10 width=54) (actual time=1328.900..1328.909 rows=10 loops=1)
   ->  Sort  (cost=108340.08..108422.48 rows=32961 width=54) (actual time=1328.899..1328.905 rows=10 loops=1)
         Sort Key: display_price
         Sort Method:  top-N heapsort  Memory: 18kB
         ->  Bitmap Heap Scan on t_product  (cost=2716.62..107627.80 rows=32961 width=54) (actual time=1052.706..1328.772 rows=55 loops=1)
               Recheck Cond: (to_tsvector('german'::regconfig, ((title || ' '::text) || description)) @@ plainto_tsquery('harddisk'::text))
               ->  Bitmap Index Scan on idx_test2  (cost=0.00..2708.38 rows=32961 width=0) (actual time=1052.576..1052.576 rows=55 loops=1)
                     Index Cond: (to_tsvector('german'::regconfig, ((title || ' '::text) || description)) @@ plainto_tsquery('harddisk'::text))
 Total runtime: 1328.942 ms
(9 rows)


runtime increases badly as soon as the words become more common ...


db=# explain analyze SELECT art_number, title
         FROM product.t_product
         WHERE to_tsvector('german'::regconfig, (title || ' '::text) || description) @@ plainto_tsquery('spiel')
         ORDER BY display_price
         LIMIT 10;
                                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=108340.08..108340.10 rows=10 width=54) (actual time=33489.675..33489.682 rows=10 loops=1)
   ->  Sort  (cost=108340.08..108422.48 rows=32961 width=54) (actual time=33489.675..33489.675 rows=10 loops=1)
         Sort Key: display_price
         Sort Method:  top-N heapsort  Memory: 18kB
         ->  Bitmap Heap Scan on t_product  (cost=2716.62..107627.80 rows=32961 width=54) (actual time=774.923..33408.522 rows=56047 loops=1)
               Recheck Cond: (to_tsvector('german'::regconfig, ((title || ' '::text) || description)) @@ plainto_tsquery('spiel'::text))
               ->  Bitmap Index Scan on idx_test2  (cost=0.00..2708.38 rows=32961 width=0) (actual time=759.078..759.078 rows=56047 loops=1)
                     Index Cond: (to_tsvector('german'::regconfig, ((title || ' '::text) || description)) @@ plainto_tsquery('spiel'::text))
 Total runtime: 33489.906 ms
(9 rows)

i am wondering why postgres is not able to use a combined index here?
is this some obscure thing related to gist, a logical problem or a 
planner deficiency?


ideas are welcome.

   many thanks,

  hans



--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] combined indexes with Gist - planner issues?

2009-08-31 Thread Hans-Juergen Schoenig -- PostgreSQL

Tom Lane wrote:

Hans-Juergen Schoenig -- PostgreSQL postg...@cybertec.at writes:
  
what we basically expected here is that Postgres will scan the table 
using the index to give us the cheapest products containing the words we 
are looking for.
i am totally surprised to see that we have to fetch all products given 
the words, sort and then do the limit.



I don't know why you'd find that surprising.  GIST indexes have no
support for ordering.

regards, tom lane

  


ok, i thought it would be something gist-specific i was not aware of.
the golden question now is: i am looking for the cheapest products given 
a certain text in an insane amount of data.
how to do it? other quals which could narrow down the amount of data 
would not help.


i cannot see an option with regular weapons ...
maybe you have an idea how to fix core to make it work? maybe there is a 
mechanism we need.

we really have to make this work - no matter what it takes.
we are willing to put effort into that.

   many thanks,

  hans

--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bison crashes postgresql

2009-08-31 Thread Hans-Juergen Schoenig -- PostgreSQL

Andrew Dunstan wrote:



Werner Echezuria wrote:

Hi, I have a code in which I translate some code from sqlf to sql, but
when it comes to yy_parse the server crashes, I have no idea why,
because it works fine in other situations.
  


I don't understand why you're doing what you're doing this way. 
Wouldn't it be better to patch the main postgres parser and make your 
functionality first class rather than having it run via an SQL string 
and a function that calls a secondary parser?


cheers

andrew



yes, this is the thing i had in mind as well.
what is your ultimate goal?

   many thanks,

  hans


--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] combined indexes with Gist - planner issues?

2009-08-31 Thread Hans-Juergen Schoenig -- PostgreSQL

hello ...

we did some experiments with such a table.
the problem is that we want to allow arbitrary combinations of words, 
which can be modeled perfectly with FTI - with a lookup table you would 
instantly end up with a self join over 5 relations or so, which is again bad.


there are too many common words to consider handling this partly with gist 
and partly with a btree.


is there any option to adapt gist in a way that a combined index would 
make sense here?
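
for reference, a rough sketch of the trigger-maintained lookup table heikki 
suggests below (the table, trigger and naive word extraction are only 
assumptions - a real version would have to use the same stemming as the gist 
index):

CREATE TABLE t_product_word (
    id             bigint,
    word           text,
    display_price  numeric(10,4)
);

CREATE INDEX idx_word_price ON t_product_word (word, display_price);

CREATE OR REPLACE FUNCTION product_word_sync() RETURNS trigger AS $$
BEGIN
    -- rebuild the word list for this product
    DELETE FROM t_product_word WHERE id = NEW.id;
    INSERT INTO t_product_word (id, word, display_price)
        SELECT DISTINCT NEW.id, w, NEW.display_price
          FROM regexp_split_to_table(
                   lower(coalesce(NEW.title, '') || ' ' || coalesce(NEW.description, '')),
                   E'\\W+') AS w
         WHERE w <> '';
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_product_word
    AFTER INSERT OR UPDATE ON product.t_product
    FOR EACH ROW EXECUTE PROCEDURE product_word_sync();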


   many thanks,

  hans




Heikki Linnakangas wrote:

Hans-Juergen Schoenig -- PostgreSQL wrote:
  

my knowledge of how gist works internally is not too extensive. any
kickstart idea would be appreciated.



If there's not too many of those common words, you can create a simple
partial b-tree index for each, and handle the less common words with the
gist index you have (you can drop the display_price column from the index).

Another idea:

Create a table containing one row for each word in each product:

CREATE TABLE t_product_word (id bigint, word text, display_price
numeric(10,4));

with triggers to keep it up-to-date. You can then create a regular two
column b-tree index on that:

CREATE INDEX idx_word_price ON t_product_word (word, display_price);

And query with:

SELECT p.art_number, p.title
   FROM t_product p INNER JOIN t_product_word pw ON p.id = pw.id
   WHERE pw.word = 'harddisk'
ORDER BY pw.display_price DESC LIMIT 10;

The t_product_word table will be huge, but with a few gigabytes of data
it should still be manageable.

  



--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] tsvector extraction patch

2009-07-03 Thread Hans-Juergen Schoenig -- PostgreSQL

hello,

this patch has not made it through yesterday, so i am trying to send it 
again.

i made a small patch which i found useful for my personal tasks.
it would be nice to see this in 8.5. if not core then maybe contrib.
it transforms a tsvector to table format which is really nice for text 
processing and comparison.


test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure 
this is a good patch'));

  lex   | rank
--------+------
 good   |    8
 patch  |    9
 pretti |    3
 sure   |    4
(4 rows)

  many thanks,

 hans

--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tsvector extraction patch

2009-07-03 Thread Hans-Juergen Schoenig -- PostgreSQL

Hans-Juergen Schoenig -- PostgreSQL wrote:

hello,

this patch has not made it through yesterday, so i am trying to send 
it again.

i made a small patch which i found useful for my personal tasks.
it would be nice to see this in 8.5. if not core then maybe contrib.
it transforms a tsvector to table format which is really nice for text 
processing and comparison.


test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty 
sure this is a good patch'));

  lex   | rank
--------+------
 good   |    8
 patch  |    9
 pretti |    3
 sure   |    4
(4 rows)

  many thanks,

 hans




--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de

diff -dcrpN postgresql-8.4.0.old/contrib/Makefile postgresql-8.4.0/contrib/Makefile
*** postgresql-8.4.0.old/contrib/Makefile	2009-03-26 00:20:01.0 +0100
--- postgresql-8.4.0/contrib/Makefile	2009-06-29 11:03:04.0 +0200
*** WANTED_DIRS = \
*** 39,44 
--- 39,45 
  		tablefunc	\
  		test_parser	\
  		tsearch2	\
+ 		tsvcontent	\
  		vacuumlo
  
  ifeq ($(with_openssl),yes)
diff -dcrpN postgresql-8.4.0.old/contrib/tsvcontent/Makefile postgresql-8.4.0/contrib/tsvcontent/Makefile
*** postgresql-8.4.0.old/contrib/tsvcontent/Makefile	1970-01-01 01:00:00.0 +0100
--- postgresql-8.4.0/contrib/tsvcontent/Makefile	2009-06-29 11:20:21.0 +0200
***
*** 0 
--- 1,19 
+ # $PostgreSQL: pgsql/contrib/tablefunc/Makefile,v 1.9 2007/11/10 23:59:51 momjian Exp $
+ 
+ MODULES = tsvcontent
+ DATA_built = tsvcontent.sql
+ DATA = uninstall_tsvcontent.sql
+ 
+ 
+ SHLIB_LINK += $(filter -lm, $(LIBS))
+ 
+ ifdef USE_PGXS
+ PG_CONFIG = pg_config
+ PGXS := $(shell $(PG_CONFIG) --pgxs)
+ include $(PGXS)
+ else
+ subdir = contrib/tsvcontent
+ top_builddir = ../..
+ include $(top_builddir)/src/Makefile.global
+ include $(top_srcdir)/contrib/contrib-global.mk
+ endif
diff -dcrpN postgresql-8.4.0.old/contrib/tsvcontent/tsvcontent.c postgresql-8.4.0/contrib/tsvcontent/tsvcontent.c
*** postgresql-8.4.0.old/contrib/tsvcontent/tsvcontent.c	1970-01-01 01:00:00.0 +0100
--- postgresql-8.4.0/contrib/tsvcontent/tsvcontent.c	2009-06-29 11:18:35.0 +0200
***
*** 0 
--- 1,169 
+ #include "postgres.h"
+ 
+ #include "fmgr.h"
+ #include "funcapi.h"
+ #include "miscadmin.h"
+ #include "executor/spi.h"
+ #include "lib/stringinfo.h"
+ #include "nodes/nodes.h"
+ #include "utils/builtins.h"
+ #include "utils/lsyscache.h"
+ #include "utils/syscache.h"
+ #include "utils/memutils.h"
+ #include "tsearch/ts_type.h"
+ #include "tsearch/ts_utils.h"
+ #include "catalog/pg_type.h"
+ 
+ #include "tsvcontent.h"
+ 
+ PG_MODULE_MAGIC;
+ 
+ PG_FUNCTION_INFO_V1(tsvcontent);
+ 
+ Datum
+ tsvcontent(PG_FUNCTION_ARGS)
+ {
+ 	FuncCallContext 	*funcctx;
+ 	TupleDesc		ret_tupdesc;
+ 	AttInMetadata		*attinmeta;
+ 	int			call_cntr;
+ 	int			max_calls;
+ 	ts_to_txt_fctx		*fctx;
+ 	Datum			result[2];
+ 	bool			isnull[2] = { false, false };
+ 	MemoryContext 		oldcontext;
+ 
+ 	/* input value containing the TS vector */
+ 	TSVector	in = PG_GETARG_TSVECTOR(0);
+ 
+ 	/* stuff done only on the first call of the function */
+ 	if (SRF_IS_FIRSTCALL())
+ 	{
+ 		TupleDesc	tupdesc;
+ 		int		i, j;
+ 		char		*wepv_base;
+ 
+ 		/* create a function context for cross-call persistence */
+ 		funcctx = SRF_FIRSTCALL_INIT();
+ 
+ 		/*
+ 		 * switch to memory context appropriate for multiple function calls
+ 		 */
+ 		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+ 
+ 		switch (get_call_result_type(fcinfo, NULL, &tupdesc))
+ 		{
+ 			case TYPEFUNC_COMPOSITE:
+ 				/* success */
+ 				break;
+ 			case TYPEFUNC_RECORD:
+ 				/* failed to determine actual type of RECORD */
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ 						errmsg("function returning record called in context "
+ 							"that cannot accept type record")));
+ 				break;
+ 			default:
+ 				/* result type isn't composite */
+ 				elog(ERROR, "return type must be a row type");
+ 				break;
+ 		}
+ 
+ 		/* make sure we have a persistent copy of the tupdesc */
+ 		tupdesc = CreateTupleDescCopy(tupdesc);
+ 
+ 		/*
+ 		 * Generate attribute metadata needed later to produce tuples from raw
+ 		 * C strings
+ 		 */
+ 		attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ 		funcctx->attinmeta = attinmeta;
+ 
+ 		/* allocate memory */
+ 		fctx = (ts_to_txt_fctx *) palloc(sizeof(ts_to_txt_fctx));
+ 
+ 		wepv_base = (char *) in + offsetof(TSVectorData, entries) + in->size * sizeof(WordEntry);
+ 
+ 		fctx->n_tsvt = 0;
+ 		for (i = 0; i < in->size; i++)
+ 		{
+ 			if (in->entries[i].haspos)
+ 			{
+ 				WordEntryPosVector *wepv = (WordEntryPosVector *)
+ 					(wepv_base + in->entries[i].pos + SHORTALIGN(in->entries[i].len));
+ 
+ 				fctx->n_tsvt += wepv->npos;
+ 			}
+ 			else
+ 				fctx->n_tsvt++;
+ 		}
+ 
+ 		fctx->tsvt = palloc(fctx->n_tsvt * sizeof(tsvec_tuple));
+ 
+ 		for (i = 0, j = 0; i < in->size; i++)
+ 		{
+ 			int pos = in->entries[i].pos

[HACKERS] tsvector extraction patch

2009-07-02 Thread Hans-Juergen Schoenig -- PostgreSQL

hello,

i made a small patch which i found useful for my personal tasks.
it would be nice to see this in 8.5. if not core then maybe contrib.
it transforms a tsvector to table format which is really nice for text 
processing and comparison.


test=# SELECT * FROM tsvcontent(to_tsvector('english', 'i am pretty sure 
this is a good patch'));

  lex   | rank
--------+------
 good   |    8
 patch  |    9
 pretti |    3
 sure   |    4
(4 rows)

   many thanks,

  hans

--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de

diff -dcrpN postgresql-8.4.0.old/contrib/Makefile postgresql-8.4.0/contrib/Makefile
*** postgresql-8.4.0.old/contrib/Makefile	2009-03-26 00:20:01.0 +0100
--- postgresql-8.4.0/contrib/Makefile	2009-06-29 11:03:04.0 +0200
*** WANTED_DIRS = \
*** 39,44 
--- 39,45 
  		tablefunc	\
  		test_parser	\
  		tsearch2	\
+ 		tsvcontent	\
  		vacuumlo
  
  ifeq ($(with_openssl),yes)
diff -dcrpN postgresql-8.4.0.old/contrib/tsvcontent/Makefile postgresql-8.4.0/contrib/tsvcontent/Makefile
*** postgresql-8.4.0.old/contrib/tsvcontent/Makefile	1970-01-01 01:00:00.0 +0100
--- postgresql-8.4.0/contrib/tsvcontent/Makefile	2009-06-29 11:20:21.0 +0200
***
*** 0 
--- 1,19 
+ # $PostgreSQL: pgsql/contrib/tablefunc/Makefile,v 1.9 2007/11/10 23:59:51 momjian Exp $
+ 
+ MODULES = tsvcontent
+ DATA_built = tsvcontent.sql
+ DATA = uninstall_tsvcontent.sql
+ 
+ 
+ SHLIB_LINK += $(filter -lm, $(LIBS))
+ 
+ ifdef USE_PGXS
+ PG_CONFIG = pg_config
+ PGXS := $(shell $(PG_CONFIG) --pgxs)
+ include $(PGXS)
+ else
+ subdir = contrib/tsvcontent
+ top_builddir = ../..
+ include $(top_builddir)/src/Makefile.global
+ include $(top_srcdir)/contrib/contrib-global.mk
+ endif
diff -dcrpN postgresql-8.4.0.old/contrib/tsvcontent/tsvcontent.c postgresql-8.4.0/contrib/tsvcontent/tsvcontent.c
*** postgresql-8.4.0.old/contrib/tsvcontent/tsvcontent.c	1970-01-01 01:00:00.0 +0100
--- postgresql-8.4.0/contrib/tsvcontent/tsvcontent.c	2009-06-29 11:18:35.0 +0200
***
*** 0 
--- 1,169 
+ #include "postgres.h"
+ 
+ #include "fmgr.h"
+ #include "funcapi.h"
+ #include "miscadmin.h"
+ #include "executor/spi.h"
+ #include "lib/stringinfo.h"
+ #include "nodes/nodes.h"
+ #include "utils/builtins.h"
+ #include "utils/lsyscache.h"
+ #include "utils/syscache.h"
+ #include "utils/memutils.h"
+ #include "tsearch/ts_type.h"
+ #include "tsearch/ts_utils.h"
+ #include "catalog/pg_type.h"
+ 
+ #include "tsvcontent.h"
+ 
+ PG_MODULE_MAGIC;
+ 
+ PG_FUNCTION_INFO_V1(tsvcontent);
+ 
+ Datum
+ tsvcontent(PG_FUNCTION_ARGS)
+ {
+ 	FuncCallContext 	*funcctx;
+ 	TupleDesc		ret_tupdesc;
+ 	AttInMetadata		*attinmeta;
+ 	int			call_cntr;
+ 	int			max_calls;
+ 	ts_to_txt_fctx		*fctx;
+ 	Datum			result[2];
+ 	bool			isnull[2] = { false, false };
+ 	MemoryContext 		oldcontext;
+ 
+ 	/* input value containing the TS vector */
+ 	TSVector	in = PG_GETARG_TSVECTOR(0);
+ 
+ 	/* stuff done only on the first call of the function */
+ 	if (SRF_IS_FIRSTCALL())
+ 	{
+ 		TupleDesc	tupdesc;
+ 		int		i, j;
+ 		char		*wepv_base;
+ 
+ 		/* create a function context for cross-call persistence */
+ 		funcctx = SRF_FIRSTCALL_INIT();
+ 
+ 		/*
+ 		 * switch to memory context appropriate for multiple function calls
+ 		 */
+ 		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+ 
+ 		switch (get_call_result_type(fcinfo, NULL, &tupdesc))
+ 		{
+ 			case TYPEFUNC_COMPOSITE:
+ 				/* success */
+ 				break;
+ 			case TYPEFUNC_RECORD:
+ 				/* failed to determine actual type of RECORD */
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ 						errmsg("function returning record called in context "
+ 							"that cannot accept type record")));
+ 				break;
+ 			default:
+ 				/* result type isn't composite */
+ 				elog(ERROR, "return type must be a row type");
+ 				break;
+ 		}
+ 
+ 		/* make sure we have a persistent copy of the tupdesc */
+ 		tupdesc = CreateTupleDescCopy(tupdesc);
+ 
+ 		/*
+ 		 * Generate attribute metadata needed later to produce tuples from raw
+ 		 * C strings
+ 		 */
+ 		attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ 		funcctx->attinmeta = attinmeta;
+ 
+ 		/* allocate memory */
+ 		fctx = (ts_to_txt_fctx *) palloc(sizeof(ts_to_txt_fctx));
+ 
+ 		wepv_base = (char *) in + offsetof(TSVectorData, entries) + in->size * sizeof(WordEntry);
+ 
+ 		fctx->n_tsvt = 0;
+ 		for (i = 0; i < in->size; i++)
+ 		{
+ 			if (in->entries[i].haspos)
+ 			{
+ 				WordEntryPosVector *wepv = (WordEntryPosVector *)
+ 					(wepv_base + in->entries[i].pos + SHORTALIGN(in->entries[i].len));
+ 
+ 				fctx->n_tsvt += wepv->npos;
+ 			}
+ 			else
+ 				fctx->n_tsvt++;
+ 		}
+ 
+ 		fctx->tsvt = palloc(fctx->n_tsvt * sizeof(tsvec_tuple));
+ 
+ 		for (i = 0, j = 0; i < in->size; i++)
+ 		{
+ 			int pos = in->entries[i].pos;
+ 			int len = in->entries[i].len;
+ 
+ 			if (in->entries[i].haspos)
+ 			{
+ 				WordEntryPosVector *wepv = (WordEntryPosVector *)
+ 

Re: [HACKERS] SELECT ... FOR UPDATE [WAIT integer | NOWAIT] for 8.5

2009-05-13 Thread Hans-Juergen Schoenig

hello everybody,

from my side the goal of this discussion is to extract a consensus so 
that we can go ahead and implement this issue for 8.5.
our customer here needs a solution to this problem and we have to come 
up with something which can then make it into PostgreSQL core.

how shall we proceed with the decision finding process here?
i am fine with a GUC and with a grammar extension - i just need a 
decision which stays unchanged.


comments and votes are welcome.

   many thanks,

  hans

--
Cybertec Schönig & Schönig GmbH
Professional PostgreSQL Consulting, Support, Training
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] SELECT ... FOR UPDATE [WAIT integer | NOWAIT] for 8.5

2009-05-11 Thread Hans-Juergen Schoenig

hello everybody,

i would like to propose an extension to our SELECT FOR UPDATE mechanism.
especially in web applications it can be extremely useful to have the 
chance to stop waiting for a lock after a given timeframe.

i would like to add this functionality to PostgreSQL 8.5.

the oracle syntax is quite clear and easy to use here:

http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/statements_10002.htm#i2126016

informix should behave pretty much the same way.
are there any arguments from hackers' side against this feature?

   many thanks,

  hans

--
Cybertec Schönig & Schönig GmbH
Professional PostgreSQL Consulting, Support, Training
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SELECT ... FOR UPDATE [WAIT integer | NOWAIT] for 8.5

2009-05-11 Thread Hans-Juergen Schoenig

hello greg,

the thing with statement_timeout is a little bit of an issue.
you could do:
   SET statement_timeout TO ...;
   SELECT FOR UPDATE ...
   SET statement_timeout TO default;

this practically means 3 commands.
the killer argument, however, is that the lock might very well happen 
way after the statement has started.

imagine something like that (theoretical example):

   SELECT ...
  FROM
  WHERE x > ( SELECT some_very_long_thing)
   FOR UPDATE ...;

some operation could run for ages without ever taking a single, relevant 
lock here.

so, you don't really get the same thing with statement_timeout.

   regards,

  hans




Greg Stark wrote:
Can't you to this today with statement_timeout? Surely you do want to 
rollback the whole transaction or at least the subtransaction if you 
have error handling.







--
Cybertec Schönig & Schönig GmbH
Professional PostgreSQL Consulting, Support, Training
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SELECT ... FOR UPDATE [WAIT integer | NOWAIT] for 8.5

2009-05-11 Thread Hans-Juergen Schoenig




I tend to think there should be protocol level support for options 
like this but that would require buy-in from the interface writers.





how would you do it?
if you support it on the protocol level, you still need a way to allow 
the user to tell you how ...

i would see WAIT for DELETE, UPDATE and SELECT FOR UPDATE.
did you have more in mind?




the killer argument, however, is that the lock might very well happen 
way after the statement has started.


Sure. But Isn't the statement_timeout behaviour what an application 
writer would actually want? Why would he care how long some sub-part 
of the statement took? Isn't an application -you used the example of a 
web app - really concerned with its response time?





no, for a simple reason: in this case you would depend way too much on 
other tasks. some other reads which just pump up the load or some 
nightly cronjobs would give you timeouts which are not necessarily 
related to locking. we really want to protect ourselves against some LOCK 
TABLE IN ACCESS EXCLUSIVE MODE - i am not looking for a solution which 
kills queries after some time (we have that already). i want to protect 
myself against locking issues.
this feature is basically supported by most big vendors (informix, 
oracle, just to name a few). i am proposing this because i have needed 
it for a long time already and in this case it is also needed for a 
migration project.


   hans



--
Cybertec Schönig & Schönig GmbH
Professional PostgreSQL Consulting, Support, Training
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] about hacking postgresql

2009-04-05 Thread Hans-Juergen Schoenig

abdelhak benmohamed wrote:


hello,

here more of details

I have a set of transactions. Naturally, the transactions execute 
concurrently, but I would like to give every transaction a priority, 
so that the transaction with the higher priority executes first.

I thought, as a first step, to change the transaction syntax as follows:

Start transaction (priority)

But I do not know where exactly to make the change in gram.y.
I tried to change TransactionStmt, but I always receive the message: 
syntax error at or near "("

Thank you a lot

 





hello ...

if you pump them in through the same database connection you basically 
do the same thing.
if you want transactions to lock each other out, take a look at advisory 
locks.
implementing this for start transaction seems pretty useless to me, 
however.
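
just to illustrate what i mean, advisory locks work roughly like this (the 
key 10001 is arbitrary):

-- session 1: serialize access to some logical resource
SELECT pg_advisory_lock(10001);
-- ... do the prioritized work ...
SELECT pg_advisory_unlock(10001);

-- session 2: try to grab the same lock without blocking
SELECT pg_try_advisory_lock(10001);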


   best regards,

  hans

--
Cybertec Schönig & Schönig GmbH
Professional PostgreSQL Consulting, Support, Training
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 8.4 release planning

2009-01-26 Thread Hans-Juergen Schoenig

Josh Berkus wrote:

All,

So, some feedback to make this decision more difficult:

Users: care about HS more than anything else in the world.  I'm 
convinced that if we took a straw poll, 80% of our users would be in 
favor of waiting for HS.  This one feature will make more of a 
difference in the number of PG users than any feature since the 
Windows port.  Maybe more.


on the other hand:

We held back version 7.4 four months for Windows, before it became 
apparent that there was at least a year more work to do.  That was a 
mistake, and in many ways HS seems like a similar case.




I can only confirm what Josh is saying here.
We would also assume that 80% have been waiting for Simon's work for 
years. In fact, I have been dealing fulltime with PostgreSQL since 1999 
and it has been a missing feature since then.
Now that we are so close to fixing this issue for so many people out 
there, we should give it all the attention we have and support Simon + 
team wherever we can.
I think Simon has responded to all questions in almost realtime. We 
should take that into consideration.
Also, Simon is focusing on a very open development model - this naturally 
means a lot of mailing list traffic. Isn't this what this project is all 
about?


I am in favor of giving this patch a useful timeframe for completion.
If people decide to give this patch a chance, we will definitely agree 
on putting some significant manpower in here as well.

We are not the only ones who want to see that in.
We already see people saying that they delay migrations because they are 
hoping for readable slaves to go in.
Also, in the past 10 years I have been tortured with "when can we have 
replication" each and every day ...
I am fed up :). i cannot hear it anymore ("... but MySQL has 
replication" *aargh*).


   best regards,

  hans

--
--
Cybertec Schönig & Schönig GmbH
Professional PostgreSQL Consulting, Support, Training
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A smaller default postgresql.conf

2008-08-19 Thread Hans-Juergen Schoenig

Peter Eisentraut wrote:
I seem to recall that there was general support for installing a smaller 
default postgresql.conf file with only, say, a dozen parameters mentioned for 
initial tuning.  The complete file can stay as a sample.  Any objections to 
that?  (Let's not discuss quite yet exactly which parameters are the chosen 
ones.)


  


i think this would make sense as long as this small file tells users 
where to find the full story.

generally i would say that this would be a step in the right direction.

alternatively we could use some sort of #include mechanism to split the 
most important settings from the not so important ones.


   hans


--
Cybertec Schönig & Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] any psql static binary for iphone ?

2008-08-18 Thread Hans-Juergen Schoenig

Peter Eisentraut wrote:

Am Sunday, 17. August 2008 schrieb Oleg Bartunov:
  

is there psql static binary, which I can use on my iphone (version 1) ?



I have no idea, but just as a thought, using phpPgAdmin might be a good 
workaround.


  


postgres seems to compile nicely on the iphone.
compilation stops at gram.c however :) the file is just too big to 
compile with 96MB of RAM :).

first the screen turns black and then it reboots.
so far i have not seen how i can add a swap file to the iphone and i was 
too lazy to cross compile *g*.

but until gram.c - no warnings, no errors *g*.

   regards,

  hans

--
Cybertec Schönig & Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] any psql static binary for iphone ?

2008-08-18 Thread Hans-Juergen Schoenig

postgres seems to compile nicely on the iphone.

compilation stops at gram.c however :) the file is just too big to compile
with 96MB of RAM :).
first the screen turns black and then it reboots.
so far i have not seen how i can add a swap file to the iphone and i was too
lazy to cross compile *g*.
but until gram.c - no warnings, no errors *g*.



iirc you don't have to compile gram.c for psql?

merlin
  


no, not for psql ...
i wanted to give pgbench a try.
just plain curiosity.

   hans

--
Cybertec Schönig & Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Status of DISTINCT-by-hashing work

2008-08-06 Thread Hans-Juergen Schoenig

Tom Lane wrote:

I've pretty much finished the project I got a bee in my bonnet about
last week, which is to teach SELECT DISTINCT how to (optionally) use
hashing for grouping in the same way that GROUP BY has been able to do
for awhile.

There are still two places in the system that hard-wire the use of
sorting for duplicate elimination:

* Set operations (UNION/INTERSECT/EXCEPT)

* Aggregate functions with DISTINCT

I'm thinking of trying to fix set operations before I leave this topic,
but I'm not sure it's worth the trouble to change DISTINCT aggregates.
They'd be a lot more work (since there's no executor infrastructure
in place that could be used) and the return on investment seems low.

Comments?

regards, tom lane

  


i feel exactly the same way.
DISTINCT has been a place people wanted to see fixed for a while but set 
operations are nothing I would really worry about.

what we have now is absolutely fine.

given the list of more important issues, i'd vote for something else.

   best regards,

  hans

--
Cybertec Schönig & Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: New relkind (was Re: [HACKERS] Exposing quals)

2008-07-08 Thread Hans-Juergen Schoenig

Simon Riggs wrote:

On Mon, 2008-07-07 at 16:26 -0700, David Fetter wrote:
  

On Mon, Jul 07, 2008 at 06:46:29PM -0400, Andrew Dunstan wrote:


For the record, I agree with Jan's suggestion of passing a pointer
to the parse tree, and offline gave David a suggestion verbally as
to how this could be handled for PL/PerlU.

I don't think we should be tied too closely to a string
representation, although possibly the first and simplest callback
function would simply stringify the quals.
  

As I understand Jan's plan, the idea is to create a new relkind with
an exit to user code at leaf nodes in the plan tree.  This would
require an API design for both user C code and for each PL to use, but
would then allow PostgreSQL's optimizer to work on JOINs, etc.

Jan, have I got that right so far?  Do you have something in the way
of a rough patch, docs, etc. for this?



It sounds like we can make it happen as text for other DBMS and as plan
nodes for PostgreSQL, which is the best solution all round.

Personally not too worried which way we do this - as long as we do it
for 8.4 :-) It's obviously happening in the background, so I'll leave it
alone.

  


I think the concept involving the plan tree is gold. Hannu Krosing 
mentioned some idea like that recently as well.
If the function had the chance to tell the planner how it is going to 
operate (e.g. produces sorted output, etc.) it would be perfect.
The golden thing here would be if we could teach a function whether it 
is STREAMABLE | NOT STREAMABLE. streamable would make sure that we 
don't have to materialize the output of a set returning function. this 
would allow google-like analysis in postgresql easily by allowing us to 
fetch any amount of data from any data source.
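
Just to make the idea explicit - purely hypothetical syntax, nothing like this 
exists yet:

-- hypothetical: declare that the SRF's output can be consumed row by row
-- instead of being materialized first
CREATE FUNCTION fetch_remote_data(text)
    RETURNS SETOF record
    LANGUAGE C
    AS 'MODULE_PATHNAME', 'fetch_remote_data'
    STREAMABLE;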


   best regards,

  hans


--
Cybertec Schönig & Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] CONNECT BY and WITH ...

2008-07-07 Thread Hans-Juergen Schoenig

good morning everybody,

i know that this is really a hot potato on the mailing list but i think 
it is useful to discuss this issue.
in the past few months we have been working with a customer to improve 
evgen's CONNECT BY patch.
as we have a nice and promising WITH RECURSIVE patch, the original 
CONNECT BY code is pretty obsolete.


however, in the past few weeks I have been asked more than once if it 
is possible to take the current WITH patch and add parser support for 
CONNECT BY to it, so that people have the choice whether to go with CONNECT 
BY syntax (to be Oracle compliant, which is important) or ANSI SQL 
compliant syntax (which is important as well).


how are the feelings towards an improvement like that?
it would ease the pain of many people for sure.

   best regards,

  hans

--
Cybertec Schönig & Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH RECURSIVE updated to CVS TIP

2008-07-05 Thread Hans-Juergen Schoenig

hello david,

i did some quick testing with this wonderful patch.
it seems there are some flaws in there still:

test=# explain select count(*)
test-# from ( WITH RECURSIVE t(n) AS ( SELECT 1 UNION ALL SELECT DISTINCT n+1 FROM t )
test(# SELECT * FROM t WHERE n < 50) as t
test-# WHERE n < 100;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
! \q

this one will kill the planner :(
removing the (totally stupid) distinct avoids the core dump.


i found one more issue;

-- broken: wrong result
test=# select count(*) from ( WITH RECURSIVE t(n) AS (
SELECT 1 UNION ALL SELECT n + 1 FROM t)
SELECT * FROM t WHERE n < 50) as t WHERE n < (
select count(*) from ( WITH RECURSIVE t(n) AS (
SELECT 1 UNION ALL SELECT n + 1 FROM t )
SELECT * FROM t WHERE n < 50) as t WHERE n < 100) ;
 count
---
 1
(1 row)

if i am not totally wrong, this should give us a different result.

i am looking forward to see this patch in core :).
it is simply wonderful ...

many thanks,

hans






On Jul 3, 2008, at 1:11 AM, David Fetter wrote:


Folks,

Please find patch enclosed, including some documentation.

Can we see about getting this in this commitfest?

Cheers,
David.
--
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

recursive_query-7.patch.bz2

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers




--
Cybertec Schönig & Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com



[HACKERS] interval madness ...

2008-06-28 Thread Hans-Juergen Schoenig

hello everybody ...

i am sitting here in a training and i am wondering about the  
following issue ...


test=# select now() + '3 years 2 decades 4000 seconds 9  
minutes'::interval;

   ?column?
---
 2031-06-28 11:58:35.052423+02
(1 row)

test=# select now() + '3 years 2 decades 4000 seconds 9 minutes 1  
century'::interval;

   ?column?
---
 2131-06-28 11:59:01.635835+01
(1 row)


why do i get a different timezone just because of adding one more  
century?

i cannot see an obvious reason.

many thanks,

hans

--
Cybertec Schönig & Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com



Re: [HACKERS] interval madness ...

2008-06-28 Thread Hans-Juergen Schoenig


On Jun 28, 2008, at 11:39 AM, Gregory Stark wrote:


Hans-Juergen Schoenig [EMAIL PROTECTED] writes:

why do i get a different timezone just because of adding one more   
century?

i cannot see an obvious reason.


What version of Postgres and what setting of TZ?

--
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers



i am on OS X.

test=# select now();
 now
--
 2008-06-28 11:42:58.59121+02
(1 row)

test=# select version();
                                                               version
--------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 8.3.0 on i386-apple-darwin8.11.1, compiled by GCC i686-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc. build 5370)
(1 row)


many thanks,

hans





--
Cybertec Schönig  Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com



[HACKERS] DROP ROLE dependency tracking ...

2008-05-25 Thread Hans-Juergen Schoenig

good morning,

some days ago i have fallen over an issue which feels more or less like 
a bug. consider:



test=# create role xy LOGIN;
CREATE ROLE

test=# grant connect on database test to xy;
GRANT

test=# drop role xy;
ERROR:  role xy cannot be dropped because some objects depend on it
DETAIL:  access to database test

this is a totally fresh instance --- all i did was create a db called test.
failing would make sense if i were the owner of an object, but in fact i
don't own anything.
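
for the record, the workaround i am using right now (sketch): revoking the
grant first makes the DROP go through.

test=# revoke connect on database test from xy;
REVOKE
test=# drop role xy;
DROP ROLE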


test=# SELECT version();

version 
--
PostgreSQL 8.4devel on x86_64-unknown-linux-gnu, compiled by GCC gcc 
(GCC) 4.2.3 (Ubuntu 4.2.3-2ubuntu7)

(1 row)

is this a known issue?

   many thanks,

  hans


--
Cybertec Schönig  Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCHES] WITH RECURSIVE patch V0.1

2008-05-25 Thread Hans-Juergen Schoenig

Gregory Stark wrote:

Joshua D. Drake [EMAIL PROTECTED] writes:

  

Couldn't we just have it pay attention to the existing
max_stack_depth?
  

Recursive query does not consume stack. The server enters an infinite
loop without consuming stack. Stack-depth error does not happen.


We could have a separate guc variable which limits the maximum number of
levels of recursive iterations. That might be a useful feature for DBAs that
want to limit their users from issuing an infinite query.
  

statement_timeout :)



Good point.

Though it occurs to me that if you set FETCH_COUNT in psql (or do the
equivalent in your code ) statement_timeout becomes much less useful.

  


i don't think statement_timeout is a good idea at all.
it is not deterministic. depending on the load on the server some 
queries will execute while others fail.

a separate GUC is needed.

   best regards,

  hans



--
Cybertec Schönig  Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] DROP ROLE dependency tracking ...

2008-05-25 Thread Hans-Juergen Schoenig

Alvaro Herrera wrote:

Hans-Juergen Schoenig wrote:

  

test=# create role xy LOGIN;
CREATE ROLE

test=# grant connect on database test to xy;
GRANT

test=# drop role xy;
ERROR:  role xy cannot be dropped because some objects depend on it
DETAIL:  access to database test

this is a totally fresh instance --- all i did was create a db called test.
failing would make sense if i were the owner of an object, but in fact i
don't own anything.



But in some cases you might not want to lose the information associated
to the grants you've done.  That's why we have REASSIGN OWNED.

Hmm, but then there's no way to lose it, even if you do want that.  DROP
OWNED does not touch grants (which is per design), but we don't have
DROP ROLE CASCADE.  So maybe there's a functionality gap here ...

  


when thinking of REASSIGNED OWNED people tend to think about tables 
rather than about CONNECT rights.
i would suggest to make DROP ROLE just kill the role unless there is a 
real object depending on it.

i would not see a permission to be an object. what do you think?

   many thanks,

  hans



--
Cybertec Schönig  Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] XIDs and big boxes again ...

2008-05-12 Thread Hans-Juergen Schoenig

Joshua D. Drake wrote:

Hans-Juergen Schoenig wrote:


regards, tom lane
  



overhead is not an issue here - if i lose 10 or 15% i am totally fine 
as long as i can reduce vacuum overhead to an absolute minimum.

overhead will vary with row sizes anyway - this is not the point.


I am not buying this argument. If you have a 5TB database, I am going 
to assume you put it on enterprise class hardware. Enterprise class 
hardware can handle the I/O required to appropriately run vacuum.


We have a customer that is constantly running 5 autovacuum workers on 
only 28 spindles. We are in the process of upgrading them to 50 
spindles at which point I will likely try 10 autovacuum workers.





i forgot to mention - i am on 8.1 here.
so, VACUUM is not so smart yet.

my changes are pretty much random I/O - so the tuple header does not
contribute to a lot more I/O as i have to read entire blocks anyway.

this is why i said - it is not that kind of an issue.

and no, updating is not a 5 min task ...

   hans

--
Cybertec Schönig  Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] XIDs and big boxes again ...

2008-05-11 Thread Hans-Juergen Schoenig

hello everybody,

i know that we have discussed this issue already. my view of the problem 
has changed in the past couple of weeks, however. maybe other people had 
similar experiences.
i have been working on a special purpose application which basically 
looks like that:


   - 150.000 tables (for several reasons heavily constraint excluded): 
small changes made once in a while

   - XX medium sized tables which are heavily changed.
   - size:  5 TB

my DB is facing around 600 million transactions a month. 85% of those contain
at least some small modification so I cannot save on XIDs.
my problem is that I cannot VACUUM FREEZE my 150k tables where most of 
the data is as I have a couple of thousand transactions a day modifying 
this data.
but, i also have troubles to prevent myself from transaction wraparound 
as it is pretty boring to vacuum that much data under heavy load - with 
some useful vacuum delay it just takes too long.

i basically have to vacuum the entire database too often to get spare XIDs.
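
this is what i use to watch how close we are to wraparound (sketch; as far as
i know this works on 8.1 as well):

SELECT datname, age(datfrozenxid) FROM pg_database ORDER BY 2 DESC;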

i suggest to introduce a --with-long-xids flag which would give me 62 / 
64 bit XIDs per vacuum on the entire database.

this should be fairly easy to implement.
i am not too concerned about the size of the tuple header here - if we 
waste 500 gb of storage here i am totally fine.


any chances to get a properly written fix like that in?
maybe somebody else has similar problems? hannu krosing maybe? :-P

   hans

--
Cybertec Schönig  Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] XIDs and big boxes again ...

2008-05-11 Thread Hans-Juergen Schoenig

Tom Lane wrote:

Gregory Stark [EMAIL PROTECTED] writes:
  

... Keep in mind you're proposing to make everything run 3% slower instead of
using that 3% i/o bandwidth headroom to run vacuum outside the critical path.



I think that's actually understating the problem.  Assuming this is a
64-bit machine (which it had better be, if you want XID to be 64 bits...)
then the effective increase in tuple header size is not just 12 bytes
but 16 bytes, due to alignment padding.  Greg's 3% overhead number is
only on-target if your average row width is presently about 530 bytes.
It could easily be a whole lot less than that, and the overhead
proportionally higher.

regards, tom lane
  



overhead is not an issue here - if i lose 10 or 15% i am totally fine as 
long as i can reduce vacuum overhead to an absolute minimum.

overhead will vary with row sizes anyway - this is not the point.

the point is that you don't want to potentially vacuum a table when only 
a handful of records has been changed.


   many thanks,

  hans

--
Cybertec Schönig  Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding pipelining support to set returning functions

2008-04-11 Thread Hans-Juergen Schoenig

Hannu Krosing wrote:

A question to all pg hackers

Is anybody working on adding pipelining to set returning functions.

How much effort would it take ?

Where should I start digging ?
  


i asked myself basically the same question some time ago.
pipelining seems fairly impossible unless we ban joins on those 
plugins completely.
i think this should be fine for your case (no need to join PL/proxy
partitions) - what we want here is to re-unify data and send it through
centralized BI.




BACKGROUND:

AFAICS , currently set returning functions materialise their results
before returning, as seen by this simple test:

hannu=# select * from generate_series(1,10) limit 2;
 generate_series 
-

   1
   2
(2 rows)

Time: 1.183 ms


hannu=# select * from generate_series(1,1000) limit 2;
 generate_series 
-

   1
   2
(2 rows)

Time: 3795.032 ms

being able to pipeline (generate results as needed) would enable several
interesting techniques, especially if combined with pl/proxy or any
other functions which stream external data.

Applications and design patterns like http://telegraph.cs.berkeley.edu/
or http://labs.google.com/papers/mapreduce.html would suddenly become
very easy to implement.

-
Hannu

  


currently things like nodeSeqscan do SeqNext and so on - one record is
passed on to the next level.

why not have a nodePlugin or so doing the same?
or maybe some additional calling convention for streaming functions...

e.g.:
CREATE STREAMING FUNCTION xy() RETURNS NEXT RECORD AS $$
   return exactly one record to keep going
   return NULL to mark end of table
$$ LANGUAGE 'any';

so - for those functions no ...
   WHILE ...
  RETURN NEXT

but just one tuple per call ...
this would pretty much do it for this case.
i would not even call this a special case - whenever there is a LOT of 
data, this could make sense.


   best regards,

  hans

--
Cybertec Schönig  Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Fwd: Re: [PATCHES] 64-bit CommandIds]

2008-03-21 Thread Hans-Juergen Schoenig


Decibel! [EMAIL PROTECTED] writes:

If we're going to make this a ./configure option, ISTM we should  
do  the same
with XID size as well. I know there are high-velocity  databases  
that could use

that.


Keep in mind we just changed things so that read-only transactions  
don't
consume xids. That means you would have to be actually modifying 2- 
billion

records before wrap-around becomes an issue.

If you're modifying 2-billion records that quickly presumably  
you're going to

have other pressing reasons to run vacuum aside from xid freezing...

Also, consider that you're suggesting increasing the per-tuple  
overhead from

24 bytes to, if my arithmetic is right, 40 bytes.

So really you would need, say, a system with enough i/o bandwidth  
to handle
2-billion updates or inserts per day and with enough spare i/o  
bandwidth that
another 16-bytes on every one of those updates is ok, but without  
the ability

to run vacuum.

Also, we still have hope that the visibility map info will make  
running vacuum

even less of an imposition.

All that said I don't really see much reason not to make it an  
option. I just

don't think anyone really needs it. In 5-10 years though...




Doing this for XIDs is pretty useless these days.
It is only targeted for command ids which are consumed heavily by
stored procedure languages.
It happens once in a while that a complex business logic procedure
runs out of command ids inside a transaction.

the idea is to give users a chance to avoid that.
touching XIDs does not make sense to me at all.

many thanks,

hans



--
Cybertec Schönig  Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] Ad Hoc Indexes

2008-02-18 Thread Hans-Juergen Schoenig


On Feb 18, 2008, at 9:09 PM, Peter Eisentraut wrote:


Justin wrote:

Now for my question: it does not appear PostgreSQL has an Ad Hoc
Indexes ability where the query planner will create an in-memory
index based on the Select, Update, Insert or Delete commands.


How is that supposed to work?  In order to create an index you  
would need to
visit all the rows in the table.  If you do that, you could just as  
well

answer the query off a sequential scan.




this is not quite true.
this kind of indexing makes sense if you visit the same data over and  
over again. WITH-queries would be an example for that and self joins  
could benefit from this feature too.


the question however is: why not create normal indexes straight away?
i am not sure if the benefit of ad-hoc indexes justifies the additional
complexity in the code ...


best regards,

hans


--
Cybertec Schönig  Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] Doubt in IndexScanDescData

2008-02-17 Thread Hans-Juergen Schoenig


On Feb 17, 2008, at 4:33 PM, Suresh wrote:


[ include/access/relscan.h ]

In  IndexScanDescData,  whats the purpose of having two Relation  
variables.


typedef struct IndexScanDescData
{
    Relation    heapRelation;   /* heap relation descriptor, or NULL */
    Relation    indexRelation;  /* index relation descriptor */

    ...
} IndexScanDescData;




The index does not contain the entire tuple. If you index column A  
the index will not contain values in column B of the same table.
Thus, if you find a record in the index one of the things which have  
to be done is to get the record from disk to check visibility and  
other columns.


best regards,

hans-juergen schoenig



--
Cybertec Schönig  Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] Doubt in IndexScanDescData

2008-02-17 Thread Hans-Juergen Schoenig

take a look at that ...

http://www.postgresql.org/docs/8.3/static/indexam.html

this might clear up the problem.
here is an example making clear what happens:

select phone_number from phonebook where name = 'xy';

index is asked to find the right place in the heap to retrieve the data.
this is what happens during an index scan.
i suggest stepping through this process with a debugger to see what is
going on.
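
a minimal sketch of the same thing with EXPLAIN (table and index names are
made up for illustration):

CREATE TABLE phonebook (name text, phone_number text);
CREATE INDEX phonebook_name_idx ON phonebook (name);
-- load some data, then ANALYZE ...
EXPLAIN SELECT phone_number FROM phonebook WHERE name = 'xy';
--  Index Scan using phonebook_name_idx on phonebook ...

the index scan delivers heap positions for matching 'name' entries; the heap
(heapRelation) is then visited to fetch phone_number and to check visibility.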


hans



On Feb 17, 2008, at 5:13 PM, Suresh wrote:


Hans-Juergen Schoenig [EMAIL PROTECTED] wrote:

On Feb 17, 2008, at 4:33 PM, Suresh wrote:


[ include/access/relscan.h ]

In  IndexScanDescData,  whats the purpose of having two Relation  
variables.


typedef struct IndexScanDescData
{
    Relation    heapRelation;   /* heap relation descriptor, or NULL */
    Relation    indexRelation;  /* index relation descriptor */

    ...
} IndexScanDescData;




The index does not contain the entire tuple. If you index column A  
the index will not contain values in column B of the same table.
Thus, if you find a record in the index one of the things which  
have to be done is to get the record from disk to check visibility  
and other columns.


Yes, that's correct. But I still don't get it. To get the record from the
disk on a match, we need Relation data. But what's the purpose of having
two separate Relation variables?


Does it mean that heapRelation will contain only info about that
particular column of the table and indexRelation will have info
about the whole tuple of the relation?



best regards,
hans-juergen schoenig



--
Cybertec Schönig  Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at








--
Cybertec Schönig  Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable

2008-01-28 Thread Hans-Juergen Schoenig


On Jan 28, 2008, at 6:14 PM, Simon Riggs wrote:


On Sun, 2008-01-27 at 21:04 -0500, Tom Lane wrote:

[ redirecting thread to -hackers ]

Neil Conway [EMAIL PROTECTED] writes:

On Sun, 2008-01-27 at 21:54 +, Gregory Stark wrote:

I liked the synchronized_sequential_scans idea myself.



I think that's a bit too long. How about synchronized_scans, or
synchronized_seqscans?


We have enable_seqscan already, so that last choice seems to fit in.


If we're going to have a GUC, we may as well make it as useful as
possible.

Currently we set synch scan on when the table is larger than 25% of
shared_buffers. So increasing shared_buffers can actually turn this
feature off.

Rather than having a boolean GUC, we should have a number and make the
parameter synchronised_scan_threshold. This would then be the  
size of
a table above which we would perform synch scans. If its set to -1,  
then
this would be the same as off in all cases. The default value  
would be

25% of shared_buffers. (Think we can only do that at initdb time
currently).

If we do that, its clearly different from the enable_* parameters, so
the name is easier to decide ;-)



+1
This is in fact a lot more flexible and transparent.
It gives us a lot more control over the process and it is easy to  
explain / understand.
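
a sketch of how this could look in postgresql.conf (hypothetical syntax, just
following Simon's description above):

#synchronised_scan_threshold = -1      # off in all cases
synchronised_scan_threshold = 32MB     # sync scans for tables larger than this
                                       # (default: 25% of shared_buffers)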


best regards,

hans

--
Cybertec Schönig  Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] autonomous transactions

2008-01-28 Thread Hans-Juergen Schoenig


On Jan 25, 2008, at 7:27 AM, Decibel! wrote:


On Wed, Jan 23, 2008 at 05:50:02PM -0500, Tom Lane wrote:

Simon Riggs [EMAIL PROTECTED] writes:

From looking at how Oracle does them, autonomous transactions are
completely independent of the transaction that originates them  
-- they
take a new database snapshot. This means that uncommitted  
changes in the
originating transaction are not visible to the autonomous  
transaction.



Oh! Recursion depth would need to be tested for as well. Nasty.


Seems like the cloning-a-session idea would be a possible  
implementation

path for these too.


Oracle has a feature where you can effectively save a session and  
return
to it. For example, if filling out a multi-page web form, you could  
save

state in the database between those calls. I'm assuming that they use
that capability for their autonomous transactions; save the current
session to the stack, clone it, run the autonomous transaction, then
restore the saved one.



If you want to use it for webforms you cannot just put it on the
stack - you would have to put it in shared memory because you don't know if
you will ever get the same database connection back from the pool.
personally i like marko's idea. if a snapshot was identified by a key
it would be perfect. we could present the saved snapshots as a nice
superuser-readable system view (similar to what we do for 2PC)
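
for comparison, the 2PC view i have in mind here is:

SELECT transaction, gid, prepared, owner FROM pg_prepared_xacts;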


the only thing i would do is to give those snapshots some sort of  
timeout (configurable). otherwise we will get countless VACUUM  
related reports.

this sounds like a very cool feature - definitely useful.

many thanks,

hans

--
Cybertec Schönig  Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] Default index tablespace

2008-01-25 Thread Hans-Juergen Schoenig


On Jan 25, 2008, at 1:01 PM, Peter Eisentraut wrote:

What about a feature to set a default tablespace just for indexes?   
I have
been told that this was originally proposed when tablespaces where  
designed,
but did not end up being implemented.  Does anyone recall the  
details?  I

have had people ask me about this feature.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/



This has already been suggested a while ago.
It is definitely useful.
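
what people do today (sketch; the names are made up): name the tablespace
explicitly per index, or set default_tablespace before creating the indexes.

CREATE INDEX t_data_id_idx ON t_data (id) TABLESPACE fast_disks;

SET default_tablespace = 'fast_disks';
CREATE INDEX t_data_created_idx ON t_data (created);

a separate default just for indexes would remove exactly this manual step.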

regards,

hans


--
Cybertec Schönig  Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] Declarative partitioning grammar

2008-01-15 Thread Hans-Juergen Schoenig





sure, but this can become really tedious for 1024 partitions,


Well, managing 1024 partitions manually is a tedious job, no matter  
what grammar you take: You'll have to deal with 1024 different  
partition names.


What do you need so many partitions for?




imagine a structure which is partitioned by day. if you keep data for  
3 years (which is perfectly reasonable) you already have 1000  
partitions.
some applications produce so much data that splitting it into days is  
perfectly reasonable.
if you get 30 GB a day making monthly tables is not too funny
anymore ...

just think of CREATE INDEX or VACUUM ...

having so many tables is not funny but it can be the only reasonable  
choice.
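
a sketch of one such daily partition (the names are made up):

CREATE TABLE t_data_2008_01_15 (
    CHECK (log_date >= DATE '2008-01-15' AND log_date < DATE '2008-01-16')
) INHERITS (t_data);
CREATE INDEX t_data_2008_01_15_idx ON t_data_2008_01_15 (log_date);

with constraint_exclusion = on, queries filtering on log_date only touch the
partitions whose CHECK constraints can match - but somebody still has to
create and manage a thousand of these.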


best regards,

hans

--
Cybertec Schönig  Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] Proposal: Select ... AS OF Savepoint

2007-11-02 Thread Hans-Juergen Schoenig


I think Simon Riggs is already working on that idea. This one is  
fairly easy to implement. I think these are some of the features  
only a time-stamp based database can implement. I think database  
standards were formed during the time, when the data consistency  
was provided with Lock based mechanisms. And moreover i have  
already committed on the indexes with snapshot and i am still  
waiting for its approval from hackers. If that does go through,  
then i need to work on the reverse mapping hash tables, which is  
really a long task. So i may not be able to take  up  time-travel now.



if i remember my last talk with Simon correctly the idea is to have  
timetravel across transactions.
having this feature inside a transaction will not make it into CVS as  
it is basically of no practical use.
i would suggest to put some effort into making it work across  
transactions. just saving the snapshot is not enough
here - there are a couple of other things which have to be taken into  
consideration (transaction wraparound, etc.)


if you want to work on timetravel my team and i can provide some  
assistance as we wanted to help in this area anyway.


best regards,

hans


--
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] Clarification on a Time travel feature

2007-10-31 Thread Hans-Juergen Schoenig

hello ...

i guess there is no formal proposal yet but there are some ideas  
around and some major challenges have been discussed already.

i think simon riggs was planning to work on it in the future.
the basic idea here is to have the option to create a snapshot  
which then stays in the database. the main challenge is that PostgreSQL
should not keep all versions of a row since the snapshot, but VACUUM
should be able to clean out all rows which are not seen by any  
snapshot or any ongoing transaction.

this should be a quite fancy solution which is quite space efficient.

internally we had the idea of tweaking VACUUM a little:

VACUUM BEFORE timestamp;
and ...
SET current_snapshot TO '2007-10-10 ...';

this would allow queries to use any snapshot after the timestamp
defined by VACUUM (if data is around).

the downside here: you might potentially eat up more space.
flashback data should be read only, of course.

best regards,

hans



On Oct 31, 2007, at 11:31 AM, Gokulakannan Somasundaram wrote:


Hi,
   I went through the mailing list and couldn't get answer to the  
question.


a) Is there a proposal in place for going back in time within a  
transaction?




--
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Groups.
(www.alliedgroups.com)




--
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




[HACKERS] support for hyper-long transactions ...

2007-10-16 Thread Hans-Juergen Schoenig
at the moment i am working on an application which is supposed to run  
extremely large transactions (a lot of server side stored procedure  
stuff which can hardly be split into small transactions for  
visibility reasons).
so, from time to time it happens that i exceed my CommandCounter
(> 2.000.000.000 statements inside the same transaction).


my idea is: how about adding a configure option to compile postgres  
with a 64 bit command counter. this would allow larger transactions  
for special purpose applications while it would not have an impact on  
normal applications.
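
a sketch of the kind of workload which eats CommandIds (the table name is made up):

CREATE OR REPLACE FUNCTION burn_commandids(n integer) RETURNS void AS $$
BEGIN
    FOR i IN 1..n LOOP
        -- every statement executed here consumes one CommandId
        -- of the surrounding transaction
        INSERT INTO t_log VALUES (i);
    END LOOP;
    RETURN;
END;
$$ LANGUAGE plpgsql;

-- SELECT burn_commandids(2100000000);   -- would eventually hit the 32 bit limit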


comments are welcome ...

best regards,

hans


--
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] support for hyper-long transactions ...

2007-10-16 Thread Hans-Juergen Schoenig


AFAICS, maximum number of command ids is actually 2^32-1, or over 4
billion. Are you sure you bumped into that limit and not something  
else?

What's the error message you're getting?

What version of Postgres is this? PG 8.3 will have another related  
limit

on the number of combocids you can have.



it is clearly caused by the CommandCounter - it is indicated by the  
error message. i don't have the message on my notebook here but this  
is exactly what is going on if we run out of statements.


i would not see atomic as a problem here as we can support it for 64  
bit boxes only.
i would vote for some autoconf flag which is off by default to make  
sure that other applications don't waste space here.


to answer the question you had before:
it is an application going through some enormous amount of raw data  
and trying to do some filtering, analysis and preaggregation (which  
is not an issue here). the thing is that filtering and analysis are  
quite complex and have to be done on a per entry level (a lot of  
conditional lookups, if statements, custom aggregated, status changed  
and so on). if you are forced to do this fancy logic for 1xx mio  
records you can easily run out of commands.


many thanks,

hans



--
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] correct behavior of ANALYZE ...

2007-08-30 Thread Hans-Juergen Schoenig

hi tom ...

the idea behind this is to disable the stats on a certain column  
entirely.
this would give me more control over the plan. in this special case
data is changing so frequently that the default values are way
better than trying to keep the real stats up to date.
in case of default values i know what the DB does given a certain
where clause - this is beyond my control when stats drop in.
i guess there are corner cases where no stats on certain fields can  
definitely help to make plans a little bit more stable.


many thanks,

hans



On Aug 29, 2007, at 6:44 PM, Tom Lane wrote:


Hans-Juergen Schoenig [EMAIL PROTECTED] writes:

i came across some interesting behavior of pg_stats and i am not sure
if this is something we should treat the way we do it.


Setting target zero means expend no work on this column.  In my book
that includes not doing anything to any pre-existing pg_stats entry.
What you propose would defeat the ability to analyze an unchanging
column once and then make ANALYZE skip over it henceforth.

regards, tom lane

---(end of  
broadcast)---

TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq




--
Cybertec Geschwinde  Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




[HACKERS] correct behavior of ANALYZE ...

2007-08-29 Thread Hans-Juergen Schoenig
i came across some interesting behavior of pg_stats and i am not sure  
if this is something we should treat the way we do it.

consider:

test_hans=# select * from pg_stats where attname = 'id' and tablename  
= 't_testhugo';
 schemaname        | public
 tablename         | t_testhugo
 attname           | id
 null_frac         | 0
 avg_width         | 8
 n_distinct        | 15
 most_common_vals  | {18,17,16,19,20,15}
 most_common_freqs | {0.20,0.186333,0.155333,0.148667,0.095,0.090}
 histogram_bounds  | {11,13,14,14,14,21,21,22,25}
 correlation       | 0.557774
(1 row)


test_hans=# alter TABLE t_testhugo alter column id set statistics 2;
ALTER TABLE
test_hans=# ANALYZE t_testhugo ;
ANALYZE

test_hans=# select * from pg_stats where attname = 'id' and tablename  
= 't_testhugo';
 schemaname        | public
 tablename         | t_testhugo
 attname           | id
 null_frac         | 0
 avg_width         | 8
 n_distinct        | 12
 most_common_vals  | {18,17}
 most_common_freqs | {0.21,0.19}
 histogram_bounds  | {12,19,23}
 correlation       | 0.597255
(1 row)


test_hans=# alter TABLE t_testhugo alter column id set statistics 0;
ALTER TABLE

i expected the histogram to be gone here and the stats to be disabled.
instead, we keep the old histogram here.



test_hans=# ANALYZE t_testhugo ;
ANALYZE
test_hans=# select * from pg_stats where attname = 'id' and tablename  
= 't_testhugo';
 schemaname        | public
 tablename         | t_testhugo
 attname           | id
 null_frac         | 0
 avg_width         | 8
 n_distinct        | 12
 most_common_vals  | {18,17}
 most_common_freqs | {0.21,0.19}
 histogram_bounds  | {12,19,23}
 correlation       | 0.597255
(1 row)

is that what we expect?
if no, i will go and fix it ...
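
(a manual workaround i am considering in the meantime - sketch, superuser only:

DELETE FROM pg_statistic
 WHERE starelid = 't_testhugo'::regclass
   AND staattnum = (SELECT attnum FROM pg_attribute
                    WHERE attrelid = 't_testhugo'::regclass
                      AND attname = 'id');

this simply throws away the stale entry by hand.)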

hans



--
Cybertec Geschwinde  Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




[HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Hans-Juergen Schoenig

hello everybody,

one of our customers wants to store the code of interpreted
procedures (PL/pgSQL, PL/Perl) and so on in an encrypted way.
so the idea we had is to add one more column to pg_proc telling us
whether prosrc is encrypted or not. people could then choose whether
to encrypt code there or not (speed of decryption can be an issue).


should not be hard to implement ...

what do people think about this feature?

many thanks,

hans


--
Cybertec Geschwinde  Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Hans-Juergen Schoenig
the idea is basically to hide code - many companies want that and
ask for it again and again.


i would suggest keys to reside in $PGDATA. we do this for SSL and so
on already.


initdb could create such keys so that they are unique to every  
database instance.
decrypting could be avoided as much as possible; basically we should
just decrypt on first call and when the code changes.


for pg_dump i would suggest two options:
a.) pass the keys to dump in a decrypted way
b.) dump in encrypted way.

i would think that this is a quite valuable features. would be nice  
to have it.
maybe we can agree on a nice mechanism here which will be implemented  
then.
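
just to illustrate what i mean by source level encryption - a sketch using
pgcrypto (this is only the idea, not the proposed implementation):

SELECT pgp_sym_decrypt(
           pgp_sym_encrypt('BEGIN RETURN 1; END;', 'key-read-from-pgdata'),
           'key-read-from-pgdata');

the real thing would of course do the decryption inside the backend before
handing prosrc to the call handler.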


hans



On Aug 9, 2007, at 3:57 PM, Andrew Dunstan wrote:




Hans-Juergen Schoenig wrote:

hello everybody,

one of our customers wants to store the code of interpreted
procedures (PL/pgSQL, PL/Perl) and so on in an encrypted way.
so the idea we had is to add one more column to pg_proc telling us
whether prosrc is encrypted or not. people could then choose
whether to encrypt code there or not (speed of decryption can be an
issue).


should not be hard to implement ...

what do people think about this feature?




Perhaps you could give us a justification for it. Are you intending  
to have stored procs contain security sensitive information? Or is  
this an attempt to hide closed source code from prying eyes? Where  
would the encryption keys be stored? And how would it work with  
pg_dump?


This doesn't sound very well thought out, frankly.

cheers

andrew

---(end of  
broadcast)---

TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate




--
Cybertec Geschwinde  Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Hans-Juergen Schoenig


On Aug 9, 2007, at 4:47 PM, korry.douglas wrote:



the idea is basically to hide codes - many companies want that and  
ask for it again and again.

Hide code from who (or is that whom?)?



the code should be hidden from the guy who is actually executing the  
function.

so:
some user is doing: select func();

the backend loads the keys from PGDATA, decrypts the code and executes
it. as a normal user cannot look into the backend, the code is safe.

the keys are only visible to the sysadmins but not at SQL level.




The PL compiler(s) will need to decrypt the code.



no, the backend will pass the decrypted codes to the call handler.
there is no need for Perl, Python or so to be aware of this issue.


If a compiler can decrypt it, then anyone can decrypt it (because  
the compilers are open-source).
And the problem is that any user that can run a function must be  
able to compile that function, and therefore, any user that can run  
a function must have the decryption key for that function.  So, I'm  
not sure you've secured the source code from any user that can run  
the function.



why that? the backend is doing the job. the user does not pass the  
keys. it is a database internal thing. the only idea is to make sure  
that pg_proc does not contain user readable code.





Of course, if your goal is to hide the code from someone snooping  
through the pg_proc relation (on disk), then encryption will  
certainly help (provided the key is properly protected).





no, somebody who has access to the filesystem is not relevant.
just think of shipping some database inside some router or inside  
some other stuff. the vendor just wants to make sure that other  
people don't fully understand the magic going on.


hans


--
Cybertec Geschwinde  Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Hans-Juergen Schoenig


On Aug 9, 2007, at 4:34 PM, Peter Eisentraut wrote:


Am Donnerstag, 9. August 2007 16:09 schrieb Hans-Juergen Schoenig:

the idea is basically to hide codes - many companies want that and
ask for it again and again.


If you want to design a security feature, you need to offer a  
threat and risk

analysis, not just the whining of customers.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/




well, the complete analysis is easy - the solution is not.
currently we have basically no option to reduce access to the system  
tables. this would be hard anyway as we need those tables for  
basically all kinds of operations.
the problem here is that vendors of appliances don't want people to
spider their code. this is a fact - it is not the idea of open
source to do so, but bloody reality. in addition to that, people are
not willing to code everything in C just to hide it.
so, there has to be a concept to achieve this for stored procedures
somehow.
i am afraid source level encryption is the easiest and most
understandable thing to do.


so, better ideas are welcome.

hans



--
Cybertec Geschwinde  Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Hans-Juergen Schoenig


On Aug 9, 2007, at 5:03 PM, Greg Smith wrote:


On Thu, 9 Aug 2007, Andrew Dunstan wrote:

There are also some fairly impressive code obfuscators about, that  
your clients might find useful.


All they really need is to find a sufficiently clever PL/Perl  
programmer.



we should make this a PL/Brainfuck implementation then ;)

hans


--
Cybertec Geschwinde  Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Hans-Juergen Schoenig


The basic problem is this: if you have to decrypt the code in order  
to give it to a compiler (PL/pgSQL, Java, Perl, ...) then there is  
a point in time where the source code is in plaintext form - it  
would be trivial to add an fprintf( stderr, %s, plainTextForm )  
to the PL handler to steal the code.


  -- Korry



if somebody is allowed to recompile on the box, GRANT and REVOKE are
not much use either.

you did not get the point, i think.
we are shipping appliances - users can only use SQL; no compilers and  
no SSH logins allowed ...


hans


--
Cybertec Geschwinde  Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Hans-Juergen Schoenig



Make a loadable PL plpgsqlsec or something like that that's just
a thin wrapper around the plpgsql call handler, and all it does is
decrypt the source text.



perfect idea, simple perfect.
i did not consider that yet.
i was hoping for some enlightenment like that.

would be a nice module for contrib or pgfoundry ...
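
a hypothetical sketch of how such a wrapper could be hooked in (the call
handler itself would be a small C module which decrypts prosrc and hands the
plaintext to the plpgsql handler; the names are made up):

CREATE FUNCTION plpgsqlsec_call_handler() RETURNS language_handler
    AS 'plpgsqlsec', 'plpgsqlsec_call_handler' LANGUAGE C;
CREATE LANGUAGE plpgsqlsec HANDLER plpgsqlsec_call_handler;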

many thanks,

hans


--
Cybertec Geschwinde  Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] 2PC-induced lockup

2007-07-10 Thread Hans-Juergen Schoenig

is it good to allow locks on system tables at all?
i am not so sure. i have seen some disasters in the past with that. just
consider somebody placing an ACCESS EXCLUSIVE LOCK on a system table. it
is basically a denial of service.


best regards,

hans



On Jul 10, 2007, at 3:14 PM, Peter Eisentraut wrote:


The following command sequence appears to lock up the database system:

BEGIN;
LOCK pg_authid;
PREPARE TRANSACTION 'foo';
\q

After that you can't connect anymore, even in single-user mode.   
The only way
I could find is to clear out the pg_twophase directory, but I'm not  
sure

whether it is safe to do that.

Should this be prevented somehow, and is there a better recovery path?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of  
broadcast)---

TIP 5: don't forget to increase your free space map settings




--
Cybertec Geschwinde  Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




[HACKERS] Threaded Python on BSD ...

2007-07-09 Thread Hans-Juergen Schoenig

hello all ...

does anybody remember why threaded python is not allowed on some  
flavors of BSD?

i was surprised to read this in the configure script ...

# threaded python is not supported on bsd's
echo "$as_me:$LINENO: checking whether Python is compiled with thread support" >&5
echo $ECHO_N "checking whether Python is compiled with thread support... $ECHO_C" >&6
pythreads=`${PYTHON} -c "import sys; print int('thread' in sys.builtin_module_names)"`

if test "$pythreads" = "1"; then
  echo "$as_me:$LINENO: result: yes" >&5
echo "${ECHO_T}yes" >&6
  case $host_os in
  openbsd*|freebsd*)
    { { echo "$as_me:$LINENO: error: threaded Python not supported on this platform" >&5
echo "$as_me: error: threaded Python not supported on this platform" >&2;}
   { (exit 1); exit 1; }; }


is there an issue with BSD itself or is it just a matter of linking  
the backend against pthreads?
the problem is that this is a bit of a showstopper for skytools on  
BSD ...


many thanks,

hans


--
Cybertec Geschwinde  Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] Threaded Python on BSD ...

2007-07-09 Thread Hans-Juergen Schoenig


On Jul 9, 2007, at 12:24 PM, Marko Kreen wrote:


On 7/9/07, Hans-Juergen Schoenig [EMAIL PROTECTED] wrote:
does anybody remember why threaded python is not allowed on some  
flavors of BSD?


AFAIR the problem is they use separate libc for threaded things,
and main postgres is (and will be) linked with non-threaded libc.

--
marko



ok, so some linking tweaks should be enough to make this work.
this is doable (to make BSD fundamentalists happy here). i was just  
thinking of some BSD compliance thing which would be worse ...


many thanks,

hans


--
Cybertec Geschwinde  Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] Hi, I wanto joinin the developer group of postgresql

2007-04-27 Thread Hans-Juergen Schoenig
shieldy wrote:
 Hi, I want to join the developer group of postgresql.
 But I just do not know how to take the first step. I installed
 postgresql and also got the postgresql code. After that, I also
 installed cygwin on my computer (as my os is windows xp), but now
 I wonder what my next step is, as I have extended some aspects of the
 postgresql spatial data. Can you give me some suggestions on how
 I should go on? Thank you!

hello ...

usually the best thing to do is to take a look at the official todo list
of PostgreSQL and focus on one area of the database code. everything
at once is usually too complex.
just pick the area you are most interested in and start with small patches.

once you are familiar with the process here you can extend your scope.

best regards,

   hans



-- 
Cybertec Geschwinde  Schönig GmbH
Schöngrabern 134; A-2020 Hollabrunn
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


[HACKERS] max_locks_per_transactions ...

2007-02-01 Thread Hans-Juergen Schoenig
Right now max_locks_per_transaction defines the average number of locks
taken by a transaction. thus, shared memory is limited to
max_locks_per_transaction * (max_connections + max_prepared_transactions).
this is basically perfect. however, recently we have seen a couple of
people having trouble with this. partitioned tables are becoming more
and more popular, so it is very likely that a single transaction can eat
up a great deal of shared memory. some people having a lot of data
create daily tables. if done for 3 years, we already need 1000 locks per
inheritance structure.


i wonder if it would make sense to split max_locks_per_transaction into 
two variables: max_locks (global size) and max_transaction_locks (local 
size). if set properly this would prevent good short running 
transactions from running out of shared memory when some evil long 
running transactions start to suck up shared memory.
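
a sketch of what i have in mind for postgresql.conf (hypothetical parameters,
the names are up for discussion):

max_locks             = 64000    # global size of the shared lock table
max_transaction_locks = 8000     # per-transaction upper limit; a transaction
                                 # exceeding this fails instead of starving others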


if people find this useful we would gladly implement this new feature for
8.3.


   many thanks,

  hans

--
Cybertec Geschwinde  Schönig GmbH
Schöngrabern 134; A-2020 Hollabrunn
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] max_locks_per_transactions ...

2007-02-01 Thread Hans-Juergen Schoenig

Simon Riggs wrote:

On Thu, 2007-02-01 at 09:15 +0100, Hans-Juergen Schoenig wrote:
  
Right now max_locks_per_transactions defines the average number of locks 
taken by a transaction. thus, shared memory is limited to 
max_locks_per_transaction * (max_connections + max_prepared_transactions).
this is basically perfect. however, recently we have seen a couple of 
people having trouble with this. partitioned tables are becoming more 
and more popular so it is very likely that a single transaction can eat 
up a great deal of shared memory. some people having a lot of data 
create daily tables. if done for 3 years we already lost 1000 locks per 
inheritance-structure.


i wonder if it would make sense to split max_locks_per_transaction into 
two variables: max_locks (global size) and max_transaction_locks (local 
size). if set properly this would prevent good short running 
transactions from running out of shared memory when some evil long 
running transactions start to suck up shared memory.



Do partitioned tables use a lock even when they are removed from the
plan as a result of constraint_exclusion? I thought not. So you have
lots of concurrent multi-partition scans.

  


maybe i was a bit imprecise before - let me clarify.
the application we are talking about contains > 1 TB of data. the main
table (about 90% of the data) is partitioned into about 3.700 subtables.
for this kind of application this makes perfect sense as subsets of data 
(= subtable) change frequently.


two types of queries are executed by the system:

   - short OLTP operations adding data to the huge tables
   - a large set of analysis stuff which tortures the database with 
more complex queries.


the main issue is that to a large extend those analysis queries have to 
run concurrently.
the thing now is: if there are many concurrent operations which need
this partitioned structure the number of locks grows quite fast (in
this case +3700 locks per transaction).
so, it can happen that we run out of shared memory inside some OLTP 
transaction just because too many background processes are sucking up 
shared memory.
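
this is what we currently use to see who is eating the lock table (sketch):

SELECT pid, count(*) AS locks_held
  FROM pg_locks
 GROUP BY pid
 ORDER BY locks_held DESC;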


of course it would be simple to bump up max_locks_per_transaction - this is
not the point.
the idea is rather: max_locks_per_transaction is a somewhat obscure way
of putting things. many people are simply misled. most people assume
that this is indeed a per-transaction limit and then they are surprised
when a transaction which hardly needs locks fails.


i would suggest replacing the existing parameter by something else:

   - a switch to define the global size of the lock pool (e.g. max_locks)
   - a switch which defines the upper limit for the current backend / 
transaction


we could make a transaction fail which takes too many locks.
the advantage would be that the transaction causes the problem and not 
some other innocent small operation.


   best regards,

  hans


--
Cybertec Geschwinde  Schönig GmbH
Schöngrabern 134; A-2020 Hollabrunn
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[HACKERS] Core dump in PL/pgSQL ...

2006-12-19 Thread Hans-Juergen Schoenig

one of our customers here found a bug in PL/pgSQL.
this is how you can create this one:

CREATE OR REPLACE FUNCTION public.make_victim_history () RETURNS  
trigger AS $body$ DECLARE


schemarec RECORD;
exec_schemaselect text;
curs2 refcursor;

BEGIN

  exec_schemaselect := 'SELECT nspname FROM pg_class c JOIN  
pg_namespace n ON n.oid = c.relnamespace WHERE c.oid = ' || TG_RELID;


  OPEN curs2 FOR EXECUTE exec_schemaselect;
  FETCH curs2 INTO schemarec;
  CLOSE curs2;

  RAISE NOTICE 'schemarecord: %',schemarec.nspname;

  RAISE NOTICE 'begin new block';
BEGIN
RAISE NOTICE 'insert now';
EXECUTE 'insert into public_history.victim SELECT * from  
public.victim where id=1;';


EXCEPTION
WHEN OTHERS THEN
 -- do nothing
END;

RETURN NEW;
END;
$body$
LANGUAGE 'plpgsql' VOLATILE CALLED ON NULL INPUT SECURITY INVOKER;


--TABLE ERSTELLEN
CREATE TABLE public.victim (
  id BIGINT,
  name TEXT,
  created TIMESTAMP WITHOUT TIME ZONE,
  create_user BIGINT,
  changed TIMESTAMP WITHOUT TIME ZONE,
  change_user BIGINT,
  state SMALLINT
) WITHOUT OIDS;

INSERT INTO victim VALUES (1, 'hans', now(), 2, now(), 3, 4);

-- TRIGGER ERSTELLEN
CREATE TRIGGER victim_tr BEFORE UPDATE OR DELETE ON  
public.victim FOR EACH ROW EXECUTE PROCEDURE  
public.make_victim_history();


-- BAD BAD STATEMENT
UPDATE public.victim SET changed=NOW(), change_user = 1;


a quick fix is to prevent the language from freeing the tuple twice -  
this should safely prevent the core dump here.

we still have to make sure that the tuple is freed properly. stay tuned.
here is the patch ...


hans



diff -rc postgresql-8.2.0-orig/src/backend/executor/spi.c postgresql-8.2.0/src/backend/executor/spi.c
*** postgresql-8.2.0-orig/src/backend/executor/spi.c    Tue Nov 21 23:35:29 2006
--- postgresql-8.2.0/src/backend/executor/spi.c Tue Dec 19 15:04:42 2006
***************
*** 264,270 ****
        /* free Executor memory the same as _SPI_end_call would do */
        MemoryContextResetAndDeleteChildren(_SPI_current->execCxt);
        /* throw away any partially created tuple-table */
!       SPI_freetuptable(_SPI_current->tuptable);
        _SPI_current->tuptable = NULL;
    }
  }
--- 264,270 ----
        /* free Executor memory the same as _SPI_end_call would do */
        MemoryContextResetAndDeleteChildren(_SPI_current->execCxt);
        /* throw away any partially created tuple-table */
!       //SPI_freetuptable(_SPI_current->tuptable);
        _SPI_current->tuptable = NULL;
    }
  }




--
Cybertec Geschwinde  Schönig GmbH
Schöngrabern 134; A-2020 Hollabrunn
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] Core dump in PL/pgSQL ...

2006-12-19 Thread Hans-Juergen Schoenig

oh sorry, i think i missed that one ...
many thanks,

hans



On Dec 19, 2006, at 3:42 PM, Stefan Kaltenbrunner wrote:


Hans-Juergen Schoenig wrote:

[...]

a quick fix is to prevent the language from freeing the tuple  
twice - this should safely prevent the core dump here.
we still have to make sure that the tuple if freed properly. stay  
tuned.

here is the patch ...


this seems to be already fixed with:

http://archives.postgresql.org/pgsql-committers/2006-12/msg00063.php


Stefan

---(end of  
broadcast)---

TIP 6: explain analyze is your friend




--
Cybertec Geschwinde  Schönig GmbH
Schöngrabern 134; A-2020 Hollabrunn
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




[HACKERS] Planner estimates and cast operations ,...

2006-09-04 Thread Hans-Juergen Schoenig
i am looking at some corner case which might also cause troubles for  
other people.

consider the following:

	SELECT some_timestamp::date FROM very_large_table GROUP BY  
some_timestamp::date


my very_large_table is around 1 billion entries.
the problem is: the planner has a problem here as it is taking the
(correct) estimates for timestamp. this avoids a HashAggregate
because the dataset seems too large for work_mem.
what the planner cannot know is that the number of days is quite  
limited (in my case around 1000 different values).

i wonder how to teach the planner to take the cast into consideration.

at the moment the planner uses the per column statistics - it cannot  
know that the cast might change the number of different values.

how about the following?

Command: CREATE CAST
Description: define a new cast
Syntax:
CREATE CAST (sourcetype AS targettype)
[USING SELECTIVITY number | funcname(argtypes)]
WITH FUNCTION funcname (argtypes)
[ AS ASSIGNMENT | AS IMPLICIT ]

if it was possible to assign a constant or some function to the cast  
i think we could make the example used above work. by default no  
costs are changed. if somebody is doing some fancy query it would be
possible to tweak GROUP BY planning by assigning some cleverly
written function or a constant to the scenery.


a constant would be useful in terms of casts to boolean or so.
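
purely hypothetical usage of the syntax proposed above (the function name is
made up):

CREATE CAST (timestamp AS date)
    USING SELECTIVITY 1000
    WITH FUNCTION timestamp_to_date(timestamp)
    AS ASSIGNMENT;

the constant would tell the planner to assume roughly 1000 distinct values
after the cast, regardless of the per-column statistics of the source type.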

does anybody have an idea which could help solving this issue?

best regards,

hans


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Planner estimates and cast operations ,...

2006-09-04 Thread Hans-Juergen Schoenig

hi tom ...

i thought about creating an index on the expression but the problem
is that this is hardly feasible.
in 8.0 (what i have here) this would block the table and i would run
out of disk space as well. this is a 600 gb beast :(


what about the planner approach?
this would solve the problem for some other issues as well. an index  
might not be flexible enough :(.


many thanks,

hans


On Sep 4, 2006, at 4:57 PM, Tom Lane wrote:


Hans-Juergen Schoenig [EMAIL PROTECTED] writes:

consider the following:



SELECT some_timestamp::date FROM very_large_table GROUP BY
some_timestamp::date



my very_large_table is around 1billion entries.
the problem is: the planner has a problem here as it is taking the
(correct) estimates for timestamp. this avoids a HashAggregate
because the dataset seems to large for work_mem.
what the planner cannot know is that the number of days is quite
limited (in my case around 1000 different values).
i wonder how to teach the planner to take the cast into  
consideration.


Create an index on that expression.

regression=# create table foo(x) as select x * '864 sec'::interval  
+ now()::timestamp from generate_series(1,1) x;

SELECT
regression=# analyze foo;
ANALYZE
regression=# explain select x::date from foo group by x::date;
  QUERY PLAN
---
 HashAggregate  (cost=205.00..330.00 rows=1 width=8)
   ->  Seq Scan on foo  (cost=0.00..180.00 rows=1 width=8)
(2 rows)

regression=# create index fooi on foo((x::date));
CREATE INDEX
regression=# analyze foo;
ANALYZE
regression=# explain select x::date from foo group by x::date;
  QUERY PLAN
---
 HashAggregate  (cost=205.00..206.26 rows=101 width=8)
   ->  Seq Scan on foo  (cost=0.00..180.00 rows=1 width=8)
(2 rows)

regression=#

I had to cheat a little bit here: I tried to do this example with a
timestamptz column, and the index creation failed because  
timestamptz to

date isn't immutable (it depends on TimeZone).  If yours is too, you
could perhaps do something involving AT TIME ZONE to generate an
immutable conversion to date.

It would perhaps make sense to provide a way to cue ANALYZE to compute
stats on expressions that aren't actually being indexed, but I see no
good reason to limit our attention to cast expressions.

regards, tom lane



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Planner estimates and cast operations ,...

2006-09-04 Thread Hans-Juergen Schoenig


On Sep 4, 2006, at 7:04 PM, Bruno Wolff III wrote:


On Mon, Sep 04, 2006 at 17:19:37 +0200,
  Hans-Juergen Schoenig [EMAIL PROTECTED] wrote:


i thought about creating an index on the expression but the problem
is that this is hardly feasable.
in 8.0 (what i have here) this would block the table and i would run


That may be hard to deal with.




it is ...
but the problem is not primarily that i have some problem with a
certain query. this can be solved somehow. i am thinking
about GROUP BY and estimates in general here ...

just wondering if there is a chance to improve ...


out of disk space as well. this is a 600 gb biest :(


I wouldn't expect this to be a problem. If you have 10^9 rows, I  
would expect
the index to be less than 10% of you current size. If you are so  
close to
your disk space limit that that is a problem, you have a problem in  
any case.





the index itself is not too large but when building it up it is  
written several times. it is not funny when dealing with so much  
data ...





what about the planner approach?
this would solve the problem for some other issues as well. an index
might not be flexible enough :(.


If you disable sorting you might be able to get it to switch plans.  Lying
about the amount of work memory so that the planner thinks the hash
will fit in memory despite its misguessing the number of buckets might also
help.



setting work_mem to 2gb does not help here ;)
i set it to the max value on 8.0 - this was my first try too.
the problem is: there is no magic switch to mislead the planner a
little without hacking the system stats (which is not what people
should do, i would say ;) ).
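
for reference, the kind of session-level nudge discussed above would look roughly like this (the values are illustrative only and, as said, it did not change the plan in my case):

set enable_sort = off;     -- discourage the sort + GroupAggregate plan
set work_mem = 2097151;    -- in kilobytes, roughly the 2 gb maximum on 8.0
explain select some_timestamp::date
from very_large_table
group by some_timestamp::date;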


my question is: is adding hooks for selectivity a feasible way of
dealing with things like that?


hans







Re: [HACKERS] [PATCHES] COPY view

2006-08-22 Thread Hans-Juergen Schoenig

Tom Lane wrote:

Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
  

Bruce Momjian wrote:


Well, the patch was submitted in time, and it is a desired feature.  If
we want to hold it for 8.3 due to lack of time, we can, but I don't
think we can decide now that it must wait.
  


  

well I thought the agreed approach to that was allowing COPY from
arbitrary expressions without the need to go through the extra CREATE
VIEW step?



Exactly.  This is not the feature that was agreed to.  Just because we
have a patch for it doesn't mean that we have to put it in.  If we do
put it in, we'll be stuck carrying that feature forever, even after
someone gets around to doing it right.

regards, tom lane
  



It has been implemented as COPY FROM / TO a view because people wanted it 
done that way.
My original proposal was in favour of arbitrary SELECTs (just as 
proposed by the TODO list), but this was rejected. So we did it that way 
(and had to explain to the customer why views are better). Now everybody wants 
the original SELECT that was proposed.


I can understand if things are not committed because of bad code quality 
or whatever, but to be honest: it is more or less frustrating if things 
are done differently because of a community wish and are then rejected 
because they are not done the original way ...


The patch was submitted months ago and now we are short of time. I 
think everybody on the list is doing a superb job, but after 6 years I 
still have no idea how patches are treated ;).


   best regards,

  hans


--
Cybertec Geschwinde & Schönig GmbH
Schöngrabern 134; A-2020 Hollabrunn
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




[HACKERS] Costs estimates for (inline SQL) functions ...

2006-08-20 Thread Hans-Juergen Schoenig

As my last mail did not seem to go through, here is one more try ...
While looking at some fairly complex SQL stuff I came across an 
interesting issue which is a bit surprising to me:



CREATE OR REPLACE FUNCTION xy() RETURNS SETOF record AS $$
  SELECT relname::text, relpages::int4
  FROM pg_class;
$$ LANGUAGE SQL IMMUTABLE;

explain SELECT * FROM xy() AS (relname text, relpages int4);


[EMAIL PROTECTED]:/tmp$ psql test < check.sql
CREATE FUNCTION
                          QUERY PLAN
--------------------------------------------------------------
 Function Scan on xy  (cost=0.00..12.50 rows=1000 width=36)
(1 row)
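
For comparison (just a sketch; the costs obviously depend on the local catalog), planning the underlying query directly shows the estimate one would expect if inlining happened:

explain SELECT relname::text, relpages::int4 FROM pg_class;

If the function were inlined, the Function Scan above should reflect that plan instead of showing the generic 1000-row default used for set-returning functions.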

As far as I remember, inlining of SQL functions has been implemented in the 
planner around 7.4.

This should also be true according to ...

http://conferences.oreillynet.com/cs/os2003/view/e_sess/4372

My theory is that the function would be inlined and that we should therefore 
see different costs here (those of the underlying query).

Does anybody happen to know more about this issue?

  Many thanks,

 hans

--
Cybertec Geschwinde & Schönig GmbH
Schöngrabern 134; A-2020 Hollabrunn
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at




Re: [HACKERS] new feature: LDAP database name resolution

2006-02-28 Thread Hans-Juergen Schoenig


On 21 Feb 2006, at 10:42, Martijn van Oosterhout wrote:


On Tue, Feb 21, 2006 at 10:02:58AM +0100, Albe Laurenz wrote:
Thank you also for drawing my attention to pg_service.conf - I have not
been aware of it.
There are two 'shortcomings':
- It still means that you have to change the config file on every
client.


Well yes. However, you could generate the config file automatically
from another source, either LDAP or something else.



this is definitely the best way of doing it. in fact some folks out
there use similar configurations to manage large scale systems
efficiently.
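
just to make that concrete, a minimal pg_service.conf sketch (the service name and parameters are invented for illustration):

[accounting]
host=db1.example.com
port=5432
dbname=accounting

clients then connect with a connection string containing service=accounting (or with PGSERVICE set in the environment), so only the centrally generated file has to change.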


best regards,

hans-jürgen schönig

--
cybertec geschwinde & schönig gmbh
schöngrabern 134, a-2020 hollabrunn
www.cybertec.at




Re: [HACKERS] generalizing the planner knobs

2005-12-05 Thread Hans-Juergen Schoenig


On Dec 5, 2005, at 4:17 AM, Tom Lane wrote:


Greg Stark [EMAIL PROTECTED] writes:

Plan stability is also an important feature, especially for OLTP
systems which have hard real-time requirements. OLTP systems typically
don't care about getting the best plan for a query, only a plan that
is good enough.

Good enough means it can keep up with the rate of incoming requests; it
doesn't matter whether it keeps up with 10% headroom or 20% headroom.
But if one incoming query, even one in a thousand, takes 1000% of the
time available, then the entire system risks falling down.


Is it worth pointing out that using the same plan all the time is *no*
recipe for guaranteeing response time?  There is no such thing as a plan
that is good for every case --- outlying data values can make a
usually-good plan blow out your performance guarantee anyway.  Disabling
the planner is just a recipe for ensuring that that will happen, IMHO.

regards, tom lane




I think I know what Greg is trying to say: in this case plan
stability does not mean that the plan has to be completely fixed -
usually it is all about indexing. People start with an empty, perfectly
analyzed database and data is added. However, some day some cron job
doing ANALYZE or whatever fails, and the system slows down or even
breaks down because data is added to some table which is still
seq-scanned. This is what usually happens and what leads to support
cases.


Adding hints to some comments or to the statement itself is not a
good solution either. This is why I proposed a table or some flag
telling the planner what to favour (= always use a certain index). So
the basic idea is not to toggle index usage in general but to have the
chance to do it on a per-index basis. I guess this would not be too
complex to implement, and it solves 90% of all problems without having
to hide information inside comments (which is no good at all).
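
For contrast, the only comparable knob that exists today is global and session-wide, e.g.:

set enable_seqscan = off;   -- affects every table touched in this session

whereas the per-index flag proposed above would scope such a preference to a single object.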


best regards,

hans






Re: [HACKERS] generalizing the planner knobs

2005-12-03 Thread Hans-Juergen Schoenig
wouldn't it be more flexible to define a multiplicator or some sort of bool flag on a per object level?
oracle hints are a total overkill and i agree with tom that usually people will abuse this feature.
if we had a per object flag the actual planner hint can be decoupled from the actual query (i don't think putting a hint inside a query is the most clever thing).
changing a flag would be as simple as running UPDATE on some system table (a hypothetical sketch follows below). this should not be too intrusive as well.

	best regards,

		hans

On Dec 1, 2005, at 7:45 PM, Jonah H. Harris wrote:

Tom,

Don't get me wrong, I agree with you completely.  I would rather put effort into enhancing the planner than in developing work-arounds.  In 99% of all cases the planner works correctly, but I know people who actually have to disable planning options (mergejoin) in production applications because they get bad plans.  The "bad" plans are not really bad in terms of what the planner knows about the query, just in areas where the planner doesn't look at other things.  I also agree that a significant amount of work would be required to add run-time hints which would be better spent enhancing the system as a whole.  My only suggestion was that it would be better than Part 1 of Neil's statement.  Somehow I missed the end mention of multipliers which I agree requires less effort.

On 12/1/05, Tom Lane [EMAIL PROTECTED] wrote:

"Jonah H. Harris" [EMAIL PROTECTED] writes:
In the last couple weeks I too have been thinking about planner hints.  Assuming I have read your post correctly, the issue I see with this idea is that, in most cases, there won't be much of a difference between adding an arbitrary cost value to each type of node and disabling it completely.  Also, by fiddling with an arbitrary cost the user may introduce a lot of variation into the planner which may actually result in worse query plans.

Which is pretty much exactly the problem with "planner hints", too.  I've resisted that suggestion in the past and will continue to do so, because hints are accidents waiting to happen.  Even if the hint is right today for your current Postgres version and current data distribution, it's likely not to be right further down the road --- but once the hint is embedded in your application, how often are you going to revisit it?  As an example, a hint forcing the planner to use an indexscan with a particular index might have been a great idea in PG 8.0 and a lousy idea in 8.1, because it would prevent substitution of a possibly-far-better bitmap indexscan.

The enable_foo switches are debug aids, not something you are expected to fool with for production purposes, and the same would be true of Neil's suggested multipliers.  While I don't feel any strong need for variable multipliers, they'd be a small enough incremental amount of work that the suggestion doesn't require a lot of supporting argument.  Adding a planner hint facility would be several orders of magnitude more work, and it would be taking the system in a design direction that I think is fundamentally misguided.

regards, tom lane
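
Purely to make the per-object idea above concrete, a hypothetical sketch (the column relplannerbias does not exist in any PostgreSQL version; it is invented here only for illustration):

-- hypothetical catalog column, invented only to illustrate the proposal
UPDATE pg_class SET relplannerbias = 2.0 WHERE relname = 'some_index';

The point is merely that such a flag lives next to the object and can be changed without touching any application query.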

Re: R: [HACKERS] feature proposal ...

2005-09-22 Thread Hans-Juergen Schoenig
absolutely - the main advantage of the syntax tweak is that you can  
add parameters more easily.


best regards,

hans



On 22 Sep 2005, at 21:25, Jim C. Nasby wrote:


On Wed, Sep 21, 2005 at 11:31:42AM -0400, Tom Lane wrote:

Hans-Jürgen Schönig [EMAIL PROTECTED] writes:



Paolo Magnoli wrote:


Can't you just use a view?





no because a view is not a heap ...



I think Paolo's idea is much better than munging the syntax of COPY,
though.  Fixing COPY so that you *could* copy from a view would provide
all the desired functionality without any syntactic warts.



While I'm all for COPY from views, I think I'd rather have the syntactic
warts than code warts. ISTM that

CREATE TEMP VIEW some_name AS SELECT * FROM table WHERE ...;
COPY some_name TO stdout;

is much uglier than

COPY SELECT * FROM table WHERE ... TO stdout;
--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf     cell: 512-569-9461






Re: [HACKERS] random system table corruption ...

2005-09-15 Thread Hans-Juergen Schoenig

alvaro,

what concerns me here: this is a sun system and the problem happened
during normal operation.
there should not have been any recovery related operation. something
which is also interesting: there are two corrupted pages in the file
(page numbers 22 and 26).

strange thing :(.

thanks a lot,

hans


On 11 Sep 2005, at 20:01, Alvaro Herrera wrote:


On Sun, Sep 11, 2005 at 01:12:34PM +0200, Hans-Jürgen Schönig wrote:


in the past we have faced a couple of problems with corrupted system
tables. this seems to be a version independent problem which comes up
on -hackers from time to time.
i have checked a broken file and i have seen that the corrupted page
has actually been zeroed out.



IIRC the XFS filesystem zeroes out pages that it recovers from the
journal but did not have a fsync on them (AFAIK XFS journals only
metadata, so page creation but not the content itself).  I don't think
this would be applicable to your case, because we do fsync modified
files on checkpoint, and rewrite them completely from WAL images after
that.  But I thought I'd mention it.

--
Alvaro Herrera -- Valdivia, Chile          Architect, www.EnterpriseDB.com

Just treat us the way you want to be treated + some extra allowance
 for ignorance.   (Michael Brusser)




