[GENERAL] Can we not give tyrannical pedants control of #postgresql?

2017-01-20 Thread Julian Paul

Hi all!
It's really a bad sign when a user who is intolerant of minor off-topic
conversations spanning no more than a couple of line buffers is given
operator status. Witnessing a user get kicked for asking for my location
was well beyond reasonable, considering the channel was rather idle at the
time. Not only that, conversations in the past have risen to much more
heated levels with only rare instances of such drastic measures.
And they've been great!
For years I have considered and advocated #postgresql as the best channel
on freenode; I'm not sure I can hold that view given the facts witnessed
recently.
This particular user and a few others seem more inclined towards bullying
behaviour and quick, drastic overreactions than the stable, calm tolerance
that was appreciated in the past (I haven't been perfect myself!).
I won't name names; however, I've rarely seen anyone invoke +o privileges
in the past, and rarely has such a user shown so immature a perspective on
databases and such intolerance of the perspectives of others who aren't
regularly feeding the buffer.
I think a little rational consideration should be given to what kind of
people join #postgresql in the first place, rather than being so quick to
kick over minor infractions. That leaves kicks/bans for bots, off-topic
spammers and severely abusive behaviour.


Freenode also has other options besides a kick (+k), such as +q (quiet).

To be honest, I think the operator status of such individuals should be
revoked. It is also unfortunate that I won't be able to share the database
design, server instances and personal real-world experience of individuals
such as myself with new users on IRC.


I hope particular stereotypes aren't being proven here, but it appears
#postgresql encourages a particular tier and makes its rigid hierarchy
known. I owe a lot to #postgresql, but not to these particular users.
Perhaps I've been idle for too long and the channel has changed for the
worse; that's not my fault. I leave it to the community to sort out.


All the best, Julian.




Re: [GENERAL] Lock contention in TransactionIdIsInProgress()

2015-11-09 Thread Julian v. Bock
Hi

>>>>> "JJ" == Jeff Janes  writes:

JJ> This is a known issue and is fixed for 9.6 in commit
JJ> 8a7d0701814a4e.

thanks for the quick reply. Is there a chance this will get into the 9.5
branch as well?

Regards,
Julian v. Bock

-- 
Julian v. Bock   Projektleitung Software-Entwicklung
OpenIT GmbH  Tel +49 211 239 577-0
In der Steele 33a-41 Fax +49 211 239 577-10
D-40599 Düsseldorf   http://www.openit.de

HRB 38815 Amtsgericht Düsseldorf USt-Id DE 812951861
Geschäftsführer: Oliver Haakert, Maurice Kemmann




[GENERAL] Lock contention in TransactionIdIsInProgress()

2015-11-05 Thread Julian v. Bock
Hi

I have been investigating a problem with one of our production databases
where every few weeks the database suddenly came to a grinding
halt. When this happened, queries needed much more time to complete and
just opening a local db connection with psql could take up to 30
seconds.

The PostgreSQL version is 9.2.13.

The database has a central table (about 60GB including indexes) with
records which can be searched via a web interface. The table is a
materialized view that is updated via several custom triggers on other
tables. When data is changed, all relevant records are DELETEd and then
re-INSERTed from a view.
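
For anyone unfamiliar with that kind of setup, this is roughly the shape of
such a trigger-maintained table - a minimal sketch with hypothetical names
(search_mat, search_view, items), not the actual triggers:

CREATE OR REPLACE FUNCTION refresh_search_rows() RETURNS trigger AS $$
BEGIN
    -- throw away the affected rows and rebuild them from the view
    DELETE FROM search_mat WHERE item_id = NEW.item_id;
    INSERT INTO search_mat
        SELECT * FROM search_view WHERE item_id = NEW.item_id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER refresh_search_rows_trg
    AFTER INSERT OR UPDATE ON items
    FOR EACH ROW EXECUTE PROCEDURE refresh_search_rows();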

The table is replicated via slony1 (2.1.4). All search queries are
executed on a slave db. The databases run on Xeon servers with 24 cores
and 96GB of ram.

I managed to reproduce the problem using the following steps:

- Perform an update on the master server which causes about 50,000 rows
  to be repeatedly deleted and inserted in a single transaction. For
  testing I did this 20 times for about a million deletes and inserts. (*)

- Slony starts to replicate the individual deletes and inserts to the
  slave db.

- After this has been running for a few minutes, I run a particular
  query(**) on the slave repeatedly in up to 20 psql connections. For
  each query a new connection is opened.

- After a few seconds the backend processes go up to almost 100% CPU
  usage and take a long time to complete. When I run psql on the console
  it takes anywhere from 0s (almost immediately) to tens of seconds to
  get to the prompt. (The average time gets worse the longer the
  replication is running.)

After doing some debugging it seems that the backend processes are
spending most of their time in spinlocks in TransactionIdIsInProgress()
trying to get a lock on ProcArrayLock. This function is called more
often (when processing searches) the longer the replication transaction
is running.

TransactionIdIsInProgress() conveniently has some debug counters that are
enabled by #defining XIDCACHE_DEBUG. When the backend processes handling
the searches finish their transactions, the xc_by_main_xid counter is
about the same for each search. The longer the slony replication
transaction has been running, the higher the number gets.

Serious slowdowns start at about 100,000 and I had it get up to more
than 2,000,000 (not sure whether that corresponds to the number of
deleted / inserted tuples).


Is this a known problem?

Unfortunately, I have not yet managed to create a self-contained
example. Just copying the table to another database on the slave and
deleting / inserting the tuples without slony did not reproduce the
issue. I am not sure whether this is due to reduced bloat in the table
after pg_dump / pg_restore, or whether it is relevant that slony deletes
and inserts the rows individually (or something entirely different).


(*) On the production server a poor interaction between triggers and the
way the application updates data caused the rows to be deleted /
inserted many times. Fortunately, we could work around this so there
should be no immediate danger that this comes up on the live db again.

(**) The search query I used to reproduce the problem does not look
particularly nasty and returns about 260 rows in a few hundred
milliseconds under normal circumstances. It does no full table scan but
uses a few indexes.

The returned rows are among the rows that get deleted and inserted
repeatedly.


Regards,
Julian v. Bock

-- 
Julian v. Bock   Projektleitung Software-Entwicklung
OpenIT GmbH  Tel +49 211 239 577-0
In der Steele 33a-41 Fax +49 211 239 577-10
D-40599 Düsseldorf   http://www.openit.de

HRB 38815 Amtsgericht Düsseldorf USt-Id DE 812951861
Geschäftsführer: Oliver Haakert, Maurice Kemmann




Re: [GENERAL] Largest PG database known to man!

2013-10-01 Thread Julian
On 02/10/13 07:49, Mark Jones wrote:
> Hi all,
>
> We are currently working with a customer who is looking at a database
> of between 200-400 TB! They are after any confirmation of PG working
> at this size or anywhere near it.
> Anyone out there worked on anything like this size in PG please? If
> so, can you let me know more details etc..  
> 
>
Wow, that's awesome - but you know the difference between 200TB and 400TB
is quite significant (100%)? As in a whole bunch of cash
significant... unless we are talking GB.
But is that it? This isn't really fair - is this a test?
Jules




Re: [GENERAL] Partitioning V schema

2013-09-20 Thread Julian

On 21/09/13 02:51, Gregory Haase wrote:


> I would look towards how PostGis handles the Tiger census data for
> guidance. It's a similar, massive data set.
>
> Greg Haase


I'm not sure why it wouldn't handle it fine.
The question is: at what point do third-party "imported" datasets,
required for local lookup, need their own dedicated solution (database
and/or server)?
Especially when we are talking about large amounts of data that come
with their own unique global identifiers. Something like ISO 3166, being
small, would do well with its own database - but it is not large enough
to require a dedicated server.


This is the question I put to the OP; I would be interested to know.

Regards,
Julian




Re: [GENERAL] Partitioning V schema

2013-09-19 Thread Julian
Hi Dave,
How many rows of data are we talking here and how much information? (GiB)

Are you able to provide the table definition? (Can normalisation
partition off some of this data?)

Have you considered dedicated options for the lookup data, tuning the
database appropriately and keeping that single table?

With Postgres we have schemas, so they can provide some convenience and
design options.

So look at it in terms of how your queries will look.

SELECT schema.table.column FROM schema.table;

vs

SELECT schema_table.column FROM schema_table;

Not much in it?

However, I tend to go with partitions when they need to be generated on
demand, dynamically and automatically (which probably isn't the case
here). Schemas have other uses: they provide a level of security (GRANT)
and are useful in design when partitioning off blocks of related datasets
completely.
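
To make that concrete, a rough sketch of both shapes (all names are made
up):

-- one schema per city, with access control at the schema level
CREATE SCHEMA city_london;
CREATE TABLE city_london.houses (house_id serial PRIMARY KEY, address text);
GRANT USAGE ON SCHEMA city_london TO london_readers;  -- hypothetical role

-- versus one child table ("partition") per city under a single parent
CREATE TABLE houses (house_id serial, city text, address text);
CREATE TABLE houses_london (CHECK (city = 'london')) INHERITS (houses);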

Regards,
Julian


On 19/09/13 17:02, Dave Potts wrote:
> Hi List
>
> I am looking for some general advice about the best was of splitting 
> a large data table,I have  2 different choices, partitioning or
> different schemas.
>
> The data table refers to the number of houses that can be include in a
> city, as such there are large number of records.
>
>
> I am wondering if decided to partition the table if the update
> speed/access might be faster that just declaring a different schema
> per city.
>
> Under the partition the data table would appear to be smaller, so I
> should get an increase in speed, but the database still have to do
> some sort of indexing.
>
> If I used different schemas, it resolves data protection issues, but
> doing a backup might become a nightmare
>
> In general which is the fast access method?
>
> regards
>
>
> Dave.
>
>
>
>
>





Re: [GENERAL] Best practice on inherited tables

2013-05-19 Thread Julian
On 19/05/13 13:02, Chris Travers wrote:
> 
> I actually think that bringing some object-oriented principles into
> database design can result in some very useful things, provided that one
> remembers that applications are modelling behavior while databases are
> modelling information (and so the two behave quite differently).  The OO
> principles are helpful particularly when looking at derived data.
> 

INHERIT is a feature that immediately gets interest from an OOP
perspective; however, its usefulness, IMO, is naught. I agree that
applications model behaviour (do this), but databases don't model
information: their core purpose is to store information in such a way
that the information may be modelled however you see fit, using a query
language (SQL).

Storing data according to how you wish to receive it is not realizing the
true power of an RDBMS like Postgres.

The principles of OOP are brought into database design at your own risk.
INHERITS, in practice, is merely an alternative to already accepted
relational database principles and actually goes *against* the
relational model. Postgres has a lot of features. So even without the
current caveats, I warn against using INHERIT when I see it mentioned.
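
For comparison, the plain relational shape usually reached for instead of
INHERITS is a one-to-one extension table - a sketch with hypothetical
names:

CREATE TABLE locations (
    location_id serial PRIMARY KEY,
    name        text NOT NULL
);

CREATE TABLE workplaces (
    location_id integer PRIMARY KEY REFERENCES locations,
    workers     integer NOT NULL
);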

Regards,
Julian.




Re: [GENERAL] Best practice on inherited tables

2013-05-18 Thread Julian
On 17/05/13 22:46, Frank Lanitz wrote:
> Hi folkes,
> 
> I'm looking for a nice way to build this scenario:
> I've got a lot of locations with some special types. For example I've
> got workplaces, places like real laboratories and virtual places like
> maybe parcel service. For each of the different types I need to store
> some common attributes as well as some special ones. Having OOP in mind
> I came to the point of inherit tables. so I've create something like
> that (just a minimal example):
> 
> CREATE TABLE locations(
>   id SERIAL PRIMARY KEY,
>   name varchar(50)
> );
> CREATE TABLE workplaces(
>   workers integer
> ) INHERITS (locations);
> 
Hi,
Wouldn't you prefer something like locations, workers and worker_locations
tables instead?
INHERITS is a feature, but you have to be careful and be clear about when
it is actually useful. It can bring up debate about OOP practices being
transferred over into general database design, which I quash with:

database design != application design

How you store data should be irrelevant to application interfaces
(APIs). People do it anyway, and some frameworks encourage it.
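
To illustrate the layout suggested above (a sketch only; the column names
are guesses, not taken from your schema):

CREATE TABLE locations (
    location_id serial PRIMARY KEY,
    name        varchar(50)
);

CREATE TABLE workers (
    worker_id serial PRIMARY KEY,
    name      text
);

CREATE TABLE worker_locations (
    worker_id   integer REFERENCES workers,
    location_id integer REFERENCES locations,
    PRIMARY KEY (worker_id, location_id)
);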


Regards,
Julian.




Re: [GENERAL] How to convert numbers into words in postgresql

2013-05-14 Thread Julian
On 15/05/13 08:27, Merlin Moncure wrote:
> On Tue, May 14, 2013 at 5:24 PM, Adrian Klaver  
> wrote:
>> On 05/14/2013 03:17 PM, John R Pierce wrote:
>>>
>>> On 5/14/2013 2:57 PM, Alvaro Herrera wrote:

 Jashaswee escribió:
>
>> i want to convert numbers into words in postgresql.is there any
> query for it?
>> if yes please reply soon

 Try cash_words()
>>>
>>>
>>>
>>> um, where is that documented?  I've searched the various 'functions'
>>> pages, and not seen it, nor is it in the manual Index.
>>
>>
>>
>> production=> \df cash_words
>>  List of functions
>>Schema   |Name| Result data type | Argument data types |  Type
>> ++--+-+
>>  pg_catalog | cash_words | text | money   | normal
>>
>> Seems to only work with money type.
> 
> from the source:
> /* cash_words()
>  * This converts a int4 as well but to a representation using words
>  * Obviously way North American centric - sorry
>  */
> Datum
> cash_words(PG_FUNCTION_ARGS)
> {
> 
> so, it comes down to the money type is one of the more bizarre things
> left from postgres past and we try not to advertise too loudly I
> suppose.
> 
> merlin
> 
> 
What the?
I currently use a Python number-to-words module (n2w.py - if the OP is
interested in it, let me know). I've been using this for years and never
knew about the cash_words() function; what other secret functions does
Postgres have? :)
Now there is a use for the money type - though perhaps this function
could be upgraded to not need it?
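
For the record, a numeric can be fed to cash_words() with an explicit cast
(the wording of the output depends on the locale/currency settings), e.g.:

SELECT cash_words(12.34::numeric::money);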

jules.




Re: [GENERAL] Design advice requested

2013-05-09 Thread Julian
On 09/05/13 17:42, Johann Spies wrote:
> Hallo Julian,
>
> Thanks for your reply.
>
> Firstly, don't worry too much about speed in the design phase,
> there may
> be differences of opinion here, but mine is that even with database
> design the first fundamental layer is the relationship model.
>
>
> It is good to hear but when a simple query requesting 20 records takes
> nearly 7 minutes to complete, it becomes nearly unusable.

Hi, can you reply to the list?
This is a performance question now; you might want to start a new thread
on your performance issues.
Have you utilized indexes? (Refer to the Hash Cond:)

Hash Cond: ((rauthor.rart_id)::text = (ritem.ut)::text)

Look at "actual time=start..finish" on the planner process blocks, and
also at the finish time of the preceding block.

Buffers: shared hit=104662 read=4745067, temp read=1006508 written=1006446

Give the specs of your hardware. More experienced people will be able
to advise on getting the best out of this query.
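
If those join columns aren't already indexed, something along these lines
might be worth trying. The table and column names are taken from the
quoted plan, and the ::text casts suggest the key types may not match, so
expression indexes are shown as well (a sketch, not a tested fix):

CREATE INDEX ON rauthor (rart_id);
CREATE INDEX ON ritem (ut);

-- or, if the cast to text in the Hash Cond is unavoidable:
CREATE INDEX ON rauthor ((rart_id::text));
CREATE INDEX ON ritem ((ut::text));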

Once you are reasonably happy with your schema you might want
to consider partitioning the larger datasets. You can also cache (temp)
blocks of data how you see appropriate (snapshots).

http://www.postgresql.org/docs/9.2/static/ddl-partitioning.html

Regards.
Julian.





Re: [GENERAL] Does it make sense to break a large query into separate functions?

2013-05-08 Thread Julian
On 08/05/13 23:40, Seref Arikan wrote:
> Greetings, 
> 
> I have a domain specific language implementation that ends up as sql
> after an automatic transformation/code generation phase. 
> The complexity of the generated sql queries can be significant, with
> lots of self joins and many other criteria injected into sql. 
> 
> I've just noticed that some parts of the autogenerated queries can be
> functions on their own. Would moving these parts to their own functions
> help the query planner? 
> 
> There are any two tables and all queries use them, so I thought the
> query planner may be able to do better if I let it deal with smaller
> queries rather than one quite large one. I don't know how the query
> planner works when there are calls to other functions from a function
> though.
> 
> Changing the code generator would take time, so I decided to ask for
> more educated input than mine.
> 
> Best regards
> Seref
> 
Hi Seref,
The code-generated SQL queries aren't giving you much to work with (or
much of a choice). However, I suspect it's doing its best with this data
structure (relationship model). I could be wrong.

But that might be where the problem is.

Regards,
Julian.




Re: [GENERAL] Design advice requested

2013-05-08 Thread Julian
On 08/05/13 21:21, Johann Spies wrote:

> Basically my request is for advice on how to make this database as
> fast as possible with as few instances of duplicated data while
> providing both for the updates on level 0 and value added editing on
> level 1.
> 
> Regards
> Johann
Hi Johann.

Firstly, don't worry too much about speed in the design phase. There may
be differences of opinion here, but mine is that even with database
design the first fundamental layer is the relationship model. That is,
regardless of how the raw data is presented to you (CSV, raw text, other
relationship models or ideas), the same back-to-basics problem must be
solved: what is the most effective and efficient way of storing this
data that will allow for database flexibility and scalability (future
adaptation of new relationship models)?

Secondly, assuming the CSV and other raw data is in a flat (fat) table
format (containing columns of duplicate data), it's your job to determine
how to break it down into separate sections (tables) of data and how
they relate to each other (normalization): one-to-many, many-to-many,
etc. There are also other things to consider (e.g. data history,
revisions), but those are the basics.

Thirdly, it's the queries and the relationships they reveal (joins)
between sections of data (tables) that make the data presentable, and you
can always utilize caches later for blocks of data within the database
itself (temp tables, MVs, etc.).
TIP: whether it's temp tables, views, or materialized views, it's a good
idea to be consistent with the name, e.g. "some_view". This provides a
level of abstraction and is handy in the design phase.

It doesn't matter if you are dealing with petabytes of data.

That's all I can suggest without actually looking at a sample of the data
(problem) you are dealing with. It's a matter of breaking it down into
logical steps and having some fun.

Regards,
Julian.




Re: [GENERAL] OK to put temp tablespace on volatile storage or to omit it from backups?

2013-04-30 Thread Julian Glass
On 01/05/13 12:36, Yang Zhang wrote:
> On Tue, Apr 30, 2013 at 7:21 PM, Julian Glass  
> wrote:
>> On 01/05/13 09:55, Yang Zhang wrote:
>>> I would intuit that it's fine, but I just want to make sure there are
>>> no gotchas from a recovery point of view:
>>>
>>> If I were to lose my temp tablespace upon system crash, would this
>>> prevent proper crash recovery?
>>>
>>> Also, if I were to omit the temp tablespace from the base backup,
>>> would that prevent proper backup recovery?
>>>
>>> Thanks.
>>>
>>>
>> Yes I find this interesting. I wonder if you can snapshot the tablespace
>> filesystem upon initiation, then mount the snapshot backup before
>> restarting the server.
>>
>> Worth testing.
> This strikes me as the sort of thing that is dangerous to attempt to
> validate using strictly black-box testing, esp. by someone such as
> myself who has little knowledge of PG internals - just because it
> works for certain test cases of mine doesn't yield generalizable
> guarantees.
System disk failure is pretty serious, but I'm not sure that shutting down
the server would be required.
I have no experience yet with a tablespace failure on temporary objects
(tablespaces + MVs are a future interest).
Is there a reason why dropping the temp objects, including the
tablespace, and resetting temp_tablespaces to the default would not work?
It's not clear how you utilize this temp tablespace. (I'm assuming you are
using temp_tablespaces and not defining it within the table definition.)
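
Concretely, the sequence I have in mind looks something like this (the
path and names are placeholders):

CREATE TABLESPACE temp_space LOCATION '/mnt/scratch/pg_temp';
SET temp_tablespaces = 'temp_space';   -- or set it in postgresql.conf

-- backing out again:
RESET temp_tablespaces;                -- fall back to the default
DROP TABLESPACE temp_space;            -- only succeeds once it is empty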
Regards,
Julian
(Sorry about missing the ML previously)





Re: [GENERAL] how to create materialized view in postgresql 8.3

2013-04-10 Thread Julian
On 10/04/13 23:33, Vincent Veyron wrote:
> Le lundi 08 avril 2013 à 09:36 -0700, Mike Christensen a écrit :
>> This is the number one requested feature on Uservoice:
>>
>>
>> http://postgresql.uservoice.com/forums/21853-general/suggestions/247548-materialized-views
>>
>>
> 
> I find this rather surprising, considering the fact that a properly
> tuned database will return queries over millions of rows and multiple
> joins in milliseconds, given the proper hardware.
> 
> I can see how a datawharehouse with terrabytes of data can make use of
> them, but that is hardly a common situation. It seems to me many of
> these people clamouring for materialized views would be better off
> simply using a proprer data structure (and taking a few SQL courses).
> 
> Am I misguided?
> 

There are database and application systems (literally everywhere on the
web, IMO) where people think that throwing extra hardware at a problem
will solve what proper caching solutions would achieve with no upgrades
at all.

IMO, for most things "web", data is retrieved more often than it is set
or modified.

An MV caching a query result will always perform better than the query
itself (a VIEW), and MVs and tablespaces seem to be made for each other.

As for proper data structures: for whatever reason (migrating, upgrading,
maintaining), really bad query code exists (including mine).

Jules.




Re: [GENERAL] Using varchar primary keys.

2013-04-02 Thread Julian
On 03/04/13 06:37, Merlin Moncure wrote:
> On Tue, Apr 2, 2013 at 2:01 PM, Samantha Atkins  wrote:
>> Natural keys are in user data space.  Thus they are not guaranteed invariant 
>> and therefore cannot serve as persistent identity.

I can't find Samantha's original post. I agree, but done right they can be
(persistent and unique).

> That is true, but irrelevant in most real world cases.  Also, nothing
> is keeping you from using an extra marker if/when you need to provide
> an invariant lookup.
> 
>> Also natural keys have the considerable defect of being of different type 
>> and arity per logical entity type.   This means that very nice things like 
>> dynamic relationships and persistent collections of references to multiple 
>> types of things cannot be built.  It also increases ORM  and data cache 
>> complexity.
> 
> OO evangelism.
> 
>> These are considerable weaknesses.

The Google cache is filled with "foo vs bar" debates, i.e. natural vs
surrogate, and it's amazing what you get; even "surrogate key" on
Wikipedia comes with an "its sources remain unclear because it lacks
inline citations" disclaimer.
I consider it pretty much a non-debate.

>> You cannot guess a uuid from say uuid.uuid4().  If you think you can I look 
>> forward to seeing your proof.
> 
> I never claimed that.  I said that postgresql random() can be guessed,
> which it can, since it's based on lrand48.
> 
> merlin
> 
> 
Trying to get back on topic with the original post:

I have the ISO country code table, and I have no problem with the PK
being the char(2) country code.
These natural codes/keys are thought out, debated and eventually decided
on by other people.
But I also don't have a problem with adding an integer (serial) column
and making that my PK, although that PK might be pretty useless to the
rest of the world.
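
For example (a sketch; the real ISO table has more columns):

-- natural key: the ISO 3166-1 alpha-2 code itself
CREATE TABLE country (
    code char(2) PRIMARY KEY,
    name text NOT NULL
);

-- or, alternatively (not both at once), with a surrogate key added:
CREATE TABLE country (
    country_id serial PRIMARY KEY,
    code       char(2) NOT NULL UNIQUE,
    name       text NOT NULL
);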

So that's that: having to really think it over is probably a good sign
that you should stick to a surrogate unless you are really sure.
(Again, I don't advocate ON UPDATE CASCADE as a solution should you
change your mind.)

As to the whole natural vs surrogate/synthetic key debate: as I
mentioned in an earlier post, I use them both. The question is when a
key is worthy of being a PK.

1)cust_id=123 (surrogate: PK)
vs
2)cust_id=1231 (natural: checkbit such as barcode data etc)
vs
3)cust_id=

For me, 2) is more likely to be a PK than 3), but it is entirely
possible that neither would be a PK.

Global/Universal unique identifiers, useful with replication and
database partitioning (in my instance)

1)cust_id=

vs

2)cust_id=--

1) will work, but 128 bits is a lot of random data that could otherwise be
useful to the app.
2) is not as universally unique, but if that were ever a problem I could
also wrap it in an encoded binary with a small change to the app and no
change to the database - and now it resembles something truly random.

The difference is that 2) is more useful and contains "routing" data.
These are all natural and exposed to the world. But the question still
remains: are they worthy of being a PK within the database of origin?
So far my answer is "doesn't have to be", but everyone else might think
it is; they may even make it a PK.

Regards,
Jules.




Re: [GENERAL] Using varchar primary keys.

2013-04-01 Thread Julian
On 02/04/13 06:35, jesusthefrog wrote:
> On the topic of 'natural' versus 'synthetic' primary keys, I am
> generally in the camp that an extra ID field won't cost you too much,
> and while one may not need it for a simple table (i.e. id, name) one
> might add any number of columns later, and you'll be glad to have it.

Definitely - a lookup table can be extended to hold all kinds of
statistics and translations on a "tag". I didn't look too deeply into
the actual usage (not sure if any real examples were given). But ON
UPDATE CASCADE is not a feature I would recommend in this context
(referencing a PK), if at all.

> I am, however, against using sequences (or serial integers in Postgres)
> for reasons of scaling and replication across multiple copies of a
> database running on different servers.
> 
> My preferred method is to give every table an ID column of UUID type and
> generate a UUID using the uuid-ossp contrib module. This also prevents
> someone not familiar with the database design from using an ID somewhere
> they should not (as is possible with natural PKs) or treating the ID as
> an integer, not an identifier (as is all too common with serial integers).
> 

It was Instagram's own implementation, but it mentions Twitter's
"Snowflake". An interesting read, and it still only requires a bigint:
http://goo.gl/gRlXC
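
A rough sketch of that style of ID generation (the sequence name, shard
number and epoch are placeholders; see the link for the real thing):

CREATE SEQUENCE global_id_seq;

CREATE OR REPLACE FUNCTION next_global_id(OUT result bigint) AS $$
DECLARE
    our_epoch  bigint := 1314220021721;  -- custom epoch in milliseconds
    seq_id     bigint;
    now_millis bigint;
    shard_id   int    := 5;              -- this node's shard number
BEGIN
    SELECT nextval('global_id_seq') % 1024 INTO seq_id;
    SELECT floor(extract(epoch FROM clock_timestamp()) * 1000) INTO now_millis;
    -- 41 bits of time, 13 bits of shard, 10 bits of sequence
    result := (now_millis - our_epoch) << 23;
    result := result | (shard_id << 10);
    result := result | (seq_id);
END;
$$ LANGUAGE plpgsql;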

IMO a UUID is convenient but large, and for all that data it doesn't hold
anything useful. I have been tempted, though.

Jules.




Re: [GENERAL] Using varchar primary keys.

2013-03-31 Thread Julian
On 01/04/13 12:19, Modulok wrote:
> On 3/31/13, Tim Uckun  wrote:
>> Consider the following scenario.
>>
>> I have a typical tagging structure. There is a table called tags, there is
>> a table called taggings. The taggings table acts as a many to many join
>> table between the taggers and the tags.
>>
>> The tags table has two fields id and tag. id is a serial data type.
>>
>> The taggings has a tag_id field along with the context, tagger_id etc.
>>
>> I don't think there is even one query in this scenario which does not join
>> the taggings table to the tags table so I am wondering if there is any
>> advantage at all of having that id field in the tags table. Why shouldn't I
>> just put the tag itself as the primary key?  The tags are quite short so if
>> pg is smart about it I would guess they would take up less space than an
>> integer in some cases.
>>
>> I guess the same scenario happens in many of my lookup tables. I have all
>> these tables with just two fields in them.  id field and then some varchar
>> field.  Then every time I want to display that record I join with five
>> other tables so I can look up all the associated lookup tables and display
>> the text value to the user.
>>
>> Given that you can still set foreign keys and cascade commands to adjust
>> child records either way the tables would be properly normalized so I am
>> wondering what I am gaining by using these serial ID fields.
> 
> You can. Arguably, if the natural key is relatively short and consists of only
> one column E.F. Codd would probably have encourage you to use it. I'd leave 
> the
> serial column for convenience of identifying individual records on the command
> line, especially if the varchar can store characters that are not easily typed
> in a console.
> 
> As for the primary key being a 'natural key' vs. a 'synthetic' one,
> that's a huge debate akin to emacs vs. vi. Depending on your situation, 
> there's
> nothing wrong with using natural keys. It is discussed at length and
> in some cases encouraged here::
> 
> http://database-programmer.blogspot.com/2008/01/database-skills-sane-approach-to.html
> 
> 
> -Modulok-
> 
> 
I disagree with almost everything in that blog post:

Shouldn't the first question be to figure out why even a bigserial
(int8) would not suffice as a unique identifier for row data? There are
cases where it won't - what are they?
(There's a great blog post by a Twitter developer on the development of
their unique identifier (PK) - I couldn't find it.)

Natural keys have a purpose, but when do they exist in the database?

For example, a surrogate key that is exposed to the world may get wrapped
in a checkbit algorithm or perhaps even an HMAC-encoded key to validate
the genuineness of incoming requests before an expensive database lookup
is initiated.

So is there a need to store the natural keys generated in the
application? They might be useful to external parties, though they may
even call them "foos_pks".

What about the natural keys of an external source? Should those be stored
in a lookup table along with the integer-based surrogate key?
Maybe I'll call it "bars_pks".

As to the original question, my response may seem off-topic, but
exposing PKs to the world has some pitfalls (not all of them to do with
security), and character-based keys have pitfalls compared to
surrogate keys that stay out of the displayed data.

CREATE TABLE tags (tag text PRIMARY KEY);
CREATE TABLE child_tag (parent_tag text REFERENCES tags);

INSERT INTO tags VALUES ('fu');
INSERT INTO child_tag VALUES ('fu');

--Spelling error.

UPDATE tags SET tag = 'foo' WHERE tag = 'fu';

This will fail unless the foreign key is declared with ON UPDATE CASCADE.

Some things to think about.

Regards,
Jules.





Re: [GENERAL] Money casting too liberal?

2013-03-31 Thread Julian
On 31/03/13 21:57, Gavan Schneider wrote:
> On 30/3/13 at 12:58 AM, D'Arcy J.M. Cain wrote:
> Basically if MONEY is to be a useful tool it should really handle money
> matters in a way that makes accountants happy. If it can't do that then
> nobody is going to bother using it for serious work since NUMERIC and
> INTEGER will do the job just as well without the surprises.
> 
> Regards
> Gavan Schneider

Hi Gavan.
I most certainly do not store my GL values like that. I'm sure it's all
about making the accountant "happy", but if he looked at any of the
accounting databases I've worked on he would just be "confused" (he's way
old school). He gets the "conventions" on printouts and a highlighter. :)
Anyhow, money could perhaps inspire a generic integer-based data type
suitable for monetary values, with a strict number literal for input and
output, e.g.:

# SELECT '12.345'::decint2(3);
 decint2
---------
  12.345

For example, something like decint8(6) could still handle trillions with
6 decimal places, and exchange rates could fit well within a decint4(x).
I think that would be useful. Admittedly I'm just thinking on the fly
here, so I will appreciate people correcting me or commenting.

Aside from that, it would appear that people are just happy with and
recommend numeric for the time being.

Just some thoughts. Regards.
Jules.




Re: [GENERAL] Money casting too liberal?

2013-03-30 Thread Julian
On 29/03/13 23:32, Gavan Schneider wrote:
> Some people wrote:
> 
>> ... Hmm... This should optionally apply to time.
>> ... for anything that really matters, I'll work with UTC.
>>
> Is there a Godwin's law 
> equivalent for when our conversations end up with timezones getting
> mentioned? :)
> 
> Regards
> Gavan Schneider
> 
> 
>
Interesting discussion.

The comparison with timezones ends when it comes to exchange rates:
the rate at the time of the transaction has to be stored (somewhere),
associated with the base value. Timezones are rather more fixed.

I went with numeric over money a long time ago (after numerous
discussions in #postgresql).

As per the docs on NUMERIC:
"It is especially recommended for storing monetary amounts and other
quantities where exactness is required"

"However, arithmetic on numeric values is very slow compared to the
integer types"

With a current WIP, I'm starting to think that numeric is probably
overkill for storing monetary values as well (are we going to go much
beyond 6 decimal places? And that's just for storing the rates...).

I'm now considering just using integers. All the formatting of the input
and display for output is done in the front end - just don't make a
mistake there.
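
For what it's worth, the integer approach looks something like this
(hypothetical table; the scale convention is the front end's problem):

CREATE TABLE ledger_entries (
    entry_id     bigserial PRIMARY KEY,
    amount_cents bigint NOT NULL        -- 1999 means 19.99
);

-- formatting on the database side, if ever needed:
SELECT to_char(amount_cents / 100.0, 'FM999999990.00') FROM ledger_entries;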

Boring story...

An accountant came to visit us and pulled out his calculator and started
doing some simple math; he did not use the decimal point once - he always
knew where it was. Although I would bet he would have trouble with
division.

Regards,
Jules.




Re: [GENERAL] table logging

2012-10-28 Thread Julian

On 29/10/12 06:25, Jeff Janes wrote:

> I am looking for some very simple table logging.  I am not trying to
> do auditing in a hostile environment, just simple logging.
>
> I found two candidates, tablelog from pgfoundry, and
> http://wiki.postgresql.org/wiki/Audit_trigger_91plus
>
> The first has the advantage of being simple, but hasn't been
> maintained in 5 years which is a little worrisome.
>
> The latter is very flexible, but is rather complex as you have to fish
> out the data with queries against hstore object, instead of tables
> with the same structure as the tables being logged.  On the other hand
> it does deal with columns being added to the underlying tables more
> gracefully.
>
> Are there widely used/supported options other than these two, and of
> course rolling my own?
>
> Cheers,
>
> Jeff



Hi Jeff, it really does depend on the requirements of your app and what
you want to log.
I have used tablelog and also rolled my own; both work fine.
There's also the idea of "revision" logging (such as a web page
you wish to review or revert back to).
These are "insert" tables: there are never any actual "DELETE"s or
"UPDATE"s, just "INSERT"s, so there is no log table per se.

After a brief look at the Audit Trigger, it seems like a simple solution
to table auditing; perhaps you can hack at that to not use hstore and log
what you want?

I usually avoid full-blown "audits" on a table if I can, and go with the
"revision" style of logging (such as change of address, email, etc.) if
it's required.
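
A minimal sketch of that style (hypothetical names):

CREATE TABLE customer_address_rev (
    rev_id      bigserial PRIMARY KEY,
    customer_id integer NOT NULL,
    address     text NOT NULL,
    valid_from  timestamptz NOT NULL DEFAULT now()
);

-- the "current" address is simply the latest revision per customer
SELECT DISTINCT ON (customer_id) customer_id, address
FROM customer_address_rev
ORDER BY customer_id, valid_from DESC;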

This solution can lead to bloat if only the data in one column differs
from the original. But bloat is also hard to avoid if one letter in a
large web page (stored in one column) differs from the original, and
overcoming that can lead to complexities.

So this all depends on the requirements of your app.

It is fair to say that all kinds of logging lead to bloat that you'll
have to deal with later (partitioning, purging, etc.).

Julian.





Re: [GENERAL] How to don't update sequence on rollback of a transaction

2012-08-03 Thread Julian

Hi,
If you want guaranteed "consecutive" sequential numbering you have to
implement your own solution. I was brought to task about this by a number
of people (accountants). So it's not a good idea to use a sequence for
things like invoices, receipts and other such accounting objects (not only
monetary ones), unless that is somehow acceptable in your region. You can
pretty much duplicate the functionality of sequences with normal tables,
with the benefit of them being transaction safe.

Be sure you are using such a scheme only where it's absolutely required.
For everything else, a sequence does what it's intended to do.
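
A minimal version of such a table-based counter (names are made up; note
that the row lock taken by the UPDATE serialises concurrent callers, which
is the price of guaranteed gapless numbering):

CREATE TABLE invoice_counter (
    counter_id int PRIMARY KEY,
    last_value bigint NOT NULL
);
INSERT INTO invoice_counter VALUES (1, 0);

-- inside the same transaction that creates the invoice:
UPDATE invoice_counter
   SET last_value = last_value + 1
 WHERE counter_id = 1
RETURNING last_value;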

Regards,
Julian

P.S. I have heard of people using a sequence in an AFTER trigger to
generate consecutive numbering with some success. But anything could happen.


On 08/03/12 17:56, Frank Lanitz wrote:

> On 02.08.2012 17:15, Andrew Hastie wrote:
>
>> Hi Frank,
>>
>> I believe this is by design. See the bottom of the documentation on
>> sequences where it states:
>>
>> "*Important:* To avoid blocking concurrent transactions that obtain
>> numbers from the same sequence, a |nextval| operation is never rolled
>> back; that is, once a value has been fetched it is considered used, even
>> if the transaction that did the |nextval| later aborts. This means that
>> aborted transactions might leave unused "holes" in the sequence of
>> assigned values. |setval| operations are never rolled back, either."
>>
>> http://www.postgresql.org/docs/9.1/static/functions-sequence.html
>>
>> If you really want to reset the sequence, I think you would have to call
>> SELECT SETVAL(.) at the point you request the roll-back.
>
> Yepp. Somehow I missed that part of the documentation. I don't think
> setval will do the trick I want to perform, but Craig's idea looks very
> good.
>
> Thanks for the feedback!
>
> Cheers,
> Frank







Re: [GENERAL] Debian upgrade with PostgreSQL

2012-07-30 Thread Julian

Hi Patrick,

   First, read this as a general guide for Debian:
   http://www.debian.org/releases/stable/i386/release-notes/ch-upgrading.html


There were a few gotchas with our upgrade that made us stay on lenny and
postgres 8.3 for now. One was our old client application not being able
to deal with the new bytea output format properly (we are sticking with
8.3 until our new client app is ready).

Refer to this:
http://www.postgresql.org/docs/9.0/static/release-9-0.html#AEN100218
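
If the bytea issue is the only blocker for you, note that 9.0+ has a
compatibility setting that restores the old output format ('mydb' below
is a placeholder):

ALTER DATABASE mydb SET bytea_output = 'escape';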

Regards,
Julian.

Read through the release notes before you start.
On 07/30/12 20:57, Patrick Ernst wrote:

> Hello,
>
> We are running PostgreSQL 8.3 on a Debian Lenny system. Since the
> system is outdated, we want to update to Debian Squeeze.
>
> Based on your knowledge and experience are there any special
> preparations for PostgreSQL we have to take care of? Or can you give
> us some tips how we should approach the system upgrade so that
> everything runs smoothly?
>
> Regards,
>
> Patrick






Re: [GENERAL] Stange "duplicate key value violates unique constraint" after "delete" at ON UPDATE trigger

2012-01-27 Thread Julian v. Bock
Hi

>>>>> "DK" == Dmitry Koterov  writes:

DK> create table a(i integer);
DK> CREATE UNIQUE INDEX a_idx ON a USING btree (i);
DK> CREATE FUNCTION a_tr() RETURNS trigger AS
DK> $body$
DK> BEGIN
DK> DELETE FROM a WHERE i = NEW.i;
DK> RETURN NEW;
DK> END;
DK> $body$
DK> LANGUAGE 'plpgsql';
DK> CREATE TRIGGER a_tr BEFORE INSERT ON a FOR EACH ROW EXECUTE PROCEDURE
DK> a_tr();

The DELETE doesn't see the row the other transaction inserted and
doesn't delete anything (and doesn't block). This happens later when the
row is inserted and the index is updated.

You can try the insert and catch the unique violation in a loop (see
http://www.postgresql.org/docs/9.1/static/plpgsql-control-structures.html)
although that won't work with a BEFORE trigger.
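
The pattern from the linked docs looks roughly like this (hypothetical
key/value table, not your schema):

CREATE TABLE db (key integer PRIMARY KEY, data text);

CREATE OR REPLACE FUNCTION merge_db(k integer, v text) RETURNS void AS $$
BEGIN
    LOOP
        -- try the update first
        UPDATE db SET data = v WHERE key = k;
        IF found THEN
            RETURN;
        END IF;
        -- not there: try to insert, and retry if another session beat us to it
        BEGIN
            INSERT INTO db (key, data) VALUES (k, v);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- loop back and try the UPDATE again
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;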

Regards,
Julian

-- 
Julian v. Bock   Projektleitung Software-Entwicklung
OpenIT GmbH  Tel +49 211 239 577-0
In der Steele 33a-41 Fax +49 211 239 577-10
D-40599 Düsseldorf   http://www.openit.de

HRB 38815 Amtsgericht Düsseldorf USt-Id DE 812951861
Geschäftsführer: Oliver Haakert, Maurice Kemmann





[GENERAL] Restoring default privileges on a table

2011-12-28 Thread Julian Mehnle
Hi all,

I just ran into a long-standing issue with the Pg privileges system.

It seems there is no way to restore a table's default privileges to what
they were directly after creating the table.

`REVOKE ALL ON table …` revokes not only privileges that were explicitly
granted using GRANT but also all regular privileges that were in effect
due to default privileges, so the owner can't SELECT from their own table
anymore.  So this is not an option (although I *thought* it was and just
hosed my production system for an hour until I realized what had
happened).
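
(For anyone who lands in the same spot: the owner can be made usable again
by re-granting explicitly, e.g.

    GRANT ALL ON foo TO table_owner;   -- placeholder owner role

but that leaves explicit grants in place rather than restoring the default
state, which is exactly the gap described in the rest of this mail.)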

Note that this is different from the "default privileges" managed through
`ALTER DEFAULT PRIVILEGES`.  Tom Lane implemented a way to get rid of such
sets of modified default privileges back in April 2010:

http://archives.postgresql.org/pgsql-hackers/2010-04/msg00139.php

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=60bd2b1941c6211e973770e69cfa33367cc5db48

What *I'm* talking about here is this:

http://archives.postgresql.org/pgsql-sql/2010-06/msg00042.php

Whereas on Pg 8.3 it was at least possible to discern default privileges
(null, shown as an empty string) from an empty set of privileges (shown
as {}) in \z output, on Pg 9.0+ both are shown as an empty string, which
I think is particularly bad.  Could \z and \dp be changed to show
"default" rather than an empty string when default privileges are in
effect for an object?  Or, conversely, could we show "none" if there's
an empty set of privileges for an object?

Also, I would like to propose a new command for applying the current
default privileges as defined (or undefined) through `ALTER DEFAULT
PRIVILEGES` to one or more database objects:

=> APPLY DEFAULT PRIVILEGES TO TABLE foo;
=> APPLY DEFAULT PRIVILEGES TO ALL TABLES IN SCHEMA bar;

Alternatively, this could be made a variant on GRANT:

=> GRANT DEFAULT PRIVILEGES ON TABLE foo;
=> GRANT DEFAULT PRIVILEGES ON ALL TABLES IN SCHEMA bar;

But that might be misleading since it could effectively *remove*
privileges as well.

The important thing is: there should be a way to restore default
privileges on a database object.

What do you think?

-Julian




[GENERAL] PostgreSQL performance

2010-08-30 Thread Valoo, Julian
Hi

Are there any performance tuning tools for PostgreSQL, besides EXPLAIN?
Any system management views to find out missing indexes or indexes that
are not being used?

Thanks

Julian Valoo

SQL Database Administrator

Corporate and Transactional Banking IT

BankCity

e-mail julian.va...@fnb.co.za

www.fnb.co.za www.shine2010.co.za

 

 


To read FirstRand Bank's Disclaimer for this email click on the following 
address or copy into your Internet browser: 
https://www.fnb.co.za/disclaimer.html 

If you are unable to access the Disclaimer, send a blank e-mail to
firstrandbankdisclai...@fnb.co.za and we will send you a copy of the Disclaimer.

Re: [GENERAL] Efficient Way to Merge Two Large Tables

2010-07-13 Thread Julian Mehnle
Joshua Rubin wrote:

> I have two tables each with nearly 300M rows. There is a 1:1
> relationship between the two tables and they are almost always joined
> together in queries. The first table has many columns, the second has
> a foreign key to the primary key of the first table and one more
> column. It is expected that for every row in table1, there is a
> corresponding row in table2. We would like to just add the one column
> to the first table and drop the second table to allow us to index this
> extra column.
>
> This query would work after adding the column to the first table:
> UPDATE table1 SET new_column = table2.new_column FROM table2 WHERE
> table1.row_id = table2.row_id;
>
> However, this will take much too long, I have not successfully
> completed this on our staging server after running it for 3+ days.

Can you get the query plan (EXPLAIN) of the update query?  My guess is the 
join cost scales superlinearly.

You might be able to chop this up into smaller UPDATEs by limiting the 
rows to be updated in each round by the primary key.

E.g.:

  UPDATE table1 SET new_column = table2.new_column FROM table2
WHERE
  table1.row_id = table2.row_id and
  table1.row_id >= 0e6 and table1.row_id < 1e6 and
  table2.row_id >= 0e6 and table2.row_id < 1e6;

for a moving row_id window.

This has helped me in the past with a similar scenario (where both tables 
were partitioned by the PK, but it would presumably still work in the 
unpartitioned case).

-Julian




Re: [GENERAL] "--variable foo=bar" vs. "\set foo quux" in ~/.psqlrc

2010-05-15 Thread Julian Mehnle
Tom Lane wrote:

> Julian Mehnle  writes:
> > Can anyone confirm that --variable command-line options are evaluated
> > before .psqlrc is read and executed?  If so, does anyone know the
> > rationale for that?  It seems counterintuitive to me, as it makes
> > overriding variables from the command-line impossible.
>
> Seems entirely intuitive to me, as we can certainly not process .psqlrc
> before processing the command line options (one of them might be -X for
> instance).

Well, OK, that may be true for -X and certain others, but not necessarily 
for --variable.

> Furthermore, changing it would likely break existing applications.

Fair enough.  Any comments on my other points and questions, Tom?

-Julian




[GENERAL] "--variable foo=bar" vs. "\set foo quux" in ~/.psqlrc

2010-05-14 Thread Julian Mehnle
Hi all,

I'm trying to add the database host name to my psql prompts.  The obvious
solution is to add %M or %m to the PROMPT{1,2} variables in ~/.psqlrc.

However I have to work with a few databases that can be reached only
through SSH tunnels, for which I use aliases like this:

  alias dbfoo='ssh -q -f -o "ExitOnForwardFailure yes" -L 
15432:server-beyond-tunnel:5432
   gateway-host sleep 5; psql -h localhost -p 15432 -U dbuser foodb'

So the problem with the above prompt solution is that this will show a
hostname of "localhost" because that's what psql connects to.

Now I've tried using the following in .psqlrc:

  \set host :HOST
  \set PROMPT1 '%:host::%/%x%R%# '
  \set PROMPT2 '%:host::%/%x%R%# '

and then overriding the "host" variable from the command line using
"--variable host=server-beyond-tunnel".  Unfortunately the --variable
option doesn't seem to have any effect, presumably because it's evaluated
*before* .psqlrc is read and executed.

Can anyone confirm that --variable command-line options are evaluated
before .psqlrc is read and executed?  If so, does anyone know the
rationale for that?  It seems counterintuitive to me, as it makes
overriding variables from the command-line impossible.

If there is consensus that evaluating --variable options *after* .psqlrc
was read and executed is an acceptable change, or that a --variable-late
option should be introduced, I might come up with a patch.

TIA,

-Julian




Re: [GENERAL] regexp_matches() quantified-capturing-parentheses oddity

2009-12-08 Thread Julian Mehnle
Tom, thanks for your reply.

I wrote:

>   wisu-dev=# SELECT regexp_matches('quux@foo@bar.zip', '([@.]|[^@.]+)+', 'g');
>    {p}
>
>   wisu-dev=# SELECT regexp_matches('quux@foo@bar.zip', '([@.]|[^@.]+){1,2}', 'g');
>    {@}
>    {@}
>    {.}
>    {p}
>
>   wisu-dev=# SELECT regexp_matches('quux@foo@bar.zip', '([@.]|[^@.]+){1,3}', 'g');
>    {foo}
>    {.}
>    {p}
>
> What's going on here??

FWIW, I have a vague idea of what these do:

They match greedily, i.e., exactly as many instances of the subexpression
as maximally allowed by the quantifier (and no less!), backtracking and
regrouping word characters if necessary to get that many instances,
and it always returns only the last of each tuple of sub-expressions,
repeating until the string is exhausted.

E.g., I think:

'([@.]|[^@.]+){1,2}' matches (quux @ foo @ bar . zi p) and returns every
second of those: @ @ . p

'([@.]|[^@.]+){1,3}' matches (quux @ foo @ bar . z i p) and returns every
third of those: foo . p

'([@.]|[^@.]+)+' matches (q u u x @ f o o @ b a r . z i p) and returns
every 16th of those: p

I see that Perl behaves similarly, except for the part about always
matching exactly as many instances of the subexpression as *maximally*
allowed by the quantifier, backtracking if necessary to make that work.
That last part is very, very weird.

Tom Lane wrote:

> These might be a bug, but the behavior doesn't seem to me that it'd be
> terribly well defined in any case. The function should be pulling the 
> match to the parenthesized subexpression, but here that subexpression
> has got multiple matches --- which one would you expect to get?

I had *hoped* regexp_matches('quux@foo@bar.zip', '([@.]|[^@.]+)') (without
'g') would return all the subexpression matches as a *single* array in a
*single* row.

However, now that I've checked the Perl regexp engine's behavior, I would
at least expect it to work just like Perl, i.e., allow fewer matches at
the end, without tracking back and regrouping:

  $ perl -le 'print(join(" ", "quux\@foo\@bar.zip" =~ m/([@.]|[^@.]+){1,2}/g))'
  @ @ . zip

  $ perl -le 'print(join(" ", "quux\@foo\@bar.zip" =~ m/([@.]|[^@.]+){1,3}/g))'
  foo . zip

> Instead of (foo)+ I'd try
>   ((foo+))if you want all the matches
>   (foo)(foo)* if you want the first one
>   (?:foo)*(foo)   if you want the last one

I would use the ((foo+)) form, but of course it doesn't return all of the
subexpression matches as separate elements, which was the point of my
exercise.

For what it's worth, I'm now using a "FOR ... IN SELECT regexp_matches(...)
LOOP" construct in a custom plpgsql function.

-Julian




[GENERAL] regexp_matches() quantified-capturing-parentheses oddity

2009-12-08 Thread Julian Mehnle
Hi all,

  wisu-dev=# SELECT regexp_matches('quux@foo@bar.zip', '([@.]|[^@.]+)', 'g');
   {quux}
   {@}
   {foo}
   {@}
   {bar}
   {.}
   {zip}

So far, so good.  However, can someone please explain the following to me?

  wisu-dev=# SELECT regexp_matches('quux@foo@bar.zip', '([@.]|[^@.]+)+', 'g');
   {p}

  wisu-dev=# SELECT regexp_matches('quux@foo@bar.zip', '([@.]|[^@.]+){1,2}', 'g');
   {@}
   {@}
   {.}
   {p}

  wisu-dev=# SELECT regexp_matches('quux@foo@bar.zip', '([@.]|[^@.]+){1,3}', 'g');
   {foo}
   {.}
   {p}

What's going on here??

Regards,

-Julian Mehnle




Re: [GENERAL] support for embedded db and a clustered index?

2008-10-25 Thread Julian Bui
Hi all,

Does pgsql provide support for a clustered index? By that I mean: can I
specify that the database inserts records in the order of a particular
attribute? Many of my queries are time-range based and my row inserts are
done chronologically, so it would be beneficial to have them sorted by
when they were inserted.

Also, can pgsql be embedded in an application?  By that I mean it is easily
deployable with an application.  The end user is unaware that a database
even exists and the end user does not need to install pgsql.  I do NOT mean
a db that runs on an embedded device (although I am hoping pgsql has a small
memory footprint) and I also do NOT mean to ask if pgsql supports embedded
sql.

Any help would be greatly appreciated.

Thanks,
Julian


[GENERAL] Casting composite types

2008-07-26 Thread Julian Scarfe

Using 8.1

# create table foo (a integer, b integer);
# create table baz (b integer, c integer);

# insert into foo values (8,9);
# insert into baz values (9,1);

# select * from foo;
a | b
---+---
8 | 9
(1 row)

# select * from baz;
b | c
---+---
9 | 1
(1 row)

# create view foobaz as select foo.*, baz.c from foo join baz using (b);

# select * from foobaz;
a | b | c
---+---+---
8 | 9 | 1
(1 row)

So far so good.

I have many functions that take the composite type foo, and therefore wish 
to be able to cast a foobaz into a foo, by taking only columns in foo (i.e. 
a and b).  But of course there's no cast defined:


# select foobaz::foo from foobaz;
ERROR:  cannot cast type foobaz to foo

# select foo(foobaz) from foobaz;
ERROR:  function foo(foobaz) does not exist
HINT:  No function matches the given name and argument types. You may need 
to add explicit type casts.


Ideally, I'd love to avoid writing a separate function for each foo,baz pair 
of types as I have many of each. In any case, I want to avoid specifying the 
columns of foo in the code of foo(foobaz) so that the function doesn't break 
when I alter the foo table.  The best I've got for the latter is:


CREATE OR REPLACE FUNCTION foo(foobaz) RETURNS foo AS $$
my ($foobaz) = @_;
my $foo = {};
# fetch one row of foo just to discover its column names
my $row = spi_fetchrow(spi_query("SELECT * FROM foo LIMIT 1"));
# copy across only the columns that exist in foo
for (keys %$row) { $foo->{$_} = $foobaz->{$_} }
return $foo;
$$ LANGUAGE plperlu;

which feels very cumbersome, but works, provided foo is not empty.

# select foo(foobaz) from foobaz;
 foo
---
(8,9)
(1 row)

Am I missing an obvious trick or syntax here for such an 'autocast'?  Or 
have I just been corrupted by Perl to take types too lightly?


Thanks

Julian 





[GENERAL] last analyze time in 8.1?

2007-06-07 Thread Julian Scarfe
Is there any way of getting at the last time a table was analyzed (by 
autovacuum) in 8.1 or is that only recorded (in pg_stat_*_tables) since 8.2?
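
(For reference: from 8.2 onward the timestamps are exposed in pg_stat_user_tables; nothing equivalent is recorded in 8.1. A minimal sketch, with a hypothetical table name:)

  SELECT relname, last_analyze, last_autoanalyze
  FROM pg_stat_user_tables
  WHERE relname = 'my_table';   -- hypothetical name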


TIA

Julian 




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] Timezones -- what comes out does not go in?

2006-08-16 Thread Julian Scarfe

From: "Tom Lane" <[EMAIL PROTECTED]>


This works in CVS HEAD, thanks to some hard work by Joachim Wieland.


One of these days I'll find an issue *before* you folks have patched it. :-)

Thanks

Julian 




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


[GENERAL] Timezones -- what comes out does not go in?

2006-08-16 Thread Julian Scarfe

The timezone on our boxes is set to Etc/UCT (possibly a distro default).

(psql 8.1.4)

=> select to_char(current_timestamp, 'HH24:MI:SS TZ');
  to_char
--
09:05:48 UCT
(1 row)

=>  select '09:05:48 UCT'::time;
ERROR:  invalid input syntax for type time: "09:05:48 UCT"

UTC works, of course:

=> select '09:05:48 UTC'::time;
  time
--
09:05:48
(1 row)


Is this behaviour expected? Desirable?

Thanks
Julian


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[GENERAL] INSERT OR UPDATE

2006-01-02 Thread Julian Scarfe
I'm keeping config information for an application in a series of related 
tables.  I'd like a command that INSERTs data if it's new, or UPDATEs it if 
the key is duplicated.


Copying the config info from one database to another virgin installation is 
easy, of course. I can just use pg_dump on the original db1 to create the 
DDL and run it into the new database db2.


The problem comes when I've made changes in db1 that I want to copy to db2, 
but db2 may itself have additional entries not present in db1.  I'd like to 
update the entries in db2 that have equivalent entries in db1 with their db1 
values, without destroying the rest of the data in db2.


Here's an example:

db1 is created as follows:

create table a (
a_id text PRIMARY KEY,
param_b text,
param_c text
);

create table d (
d_id text PRIMARY KEY,
a_id text references a(a_id) ON UPDATE CASCADE,
param_e text,
param_f text
);

insert into a values ('a1', 'b1', 'c1');
insert into a values ('a2', 'b2', 'c2');

insert into d values ('d1', 'a1', 'e1', 'f1');
insert into d values ('d2', 'a2', 'e2', 'f2');
insert into d values ('d3', 'a1', 'e3', 'f3');

The databases have identical schema:
$ pg_dump -s db1 | psql db2

and provided db2 is empty, I can just copy the contents across:
$ pg_dump -ad db1 > db1.config
$ psql db2 < db1.config

So far so good.

But now db2 gets some additional entries:
db2=# insert into a values ('a3', 'b3', 'c3');
db2=# insert into d values ('d4', 'a1', 'e4', 'f4');
db2=# insert into d values ('d5', 'a2', 'e5', 'f5');

Meanwhile, I make some config changes in db1:
db1=# update a set param_b = 'b1 new' where a_id = 'a1';
db1=# update d set param_e = 'e1 new' where d_id = 'd1';
db1=# update d set param_f = 'f2 new' where d_id = 'd2';

So:
$ pg_dump -ad db1 > db1.config
and db1.config now contains
INSERT INTO a VALUES ('a2', 'b2', 'c2');
INSERT INTO a VALUES ('a1', 'b1 new', 'c1');
INSERT INTO d VALUES ('d3', 'a1', 'e3', 'f3');
INSERT INTO d VALUES ('d1', 'a1', 'e1 new', 'f1');
INSERT INTO d VALUES ('d2', 'a2', 'e2', 'f2 new');

I want to update the data in db2 to reflect the values in db1. I can't 
truncate the tables in db2 because I'd lose the additional entries (a3, d4, 
d5).


But of course when I do...
$ psql db2 < db1.config
... I get ...
ERROR:  duplicate key violates unique constraint "a_pkey"
ERROR:  duplicate key violates unique constraint "d_pkey"
...and the a1, d1, d2 rows are not updated to match db1.

What I'd really like is to be able to do:

INSERT OR UPDATE INTO a VALUES ('a2', 'b2', 'c2');
INSERT OR UPDATE INTO a VALUES ('a1', 'b1 new', 'c1');
INSERT OR UPDATE INTO d VALUES ('d3', 'a1', 'e3', 'f3');
INSERT OR UPDATE INTO d VALUES ('d1', 'a1', 'e1 new', 'f1');
INSERT OR UPDATE INTO d VALUES ('d2', 'a2', 'e2', 'f2 new');

so that the rest of the row is treated as an UPDATE if the primary key is a 
duplicate.
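
(For anyone reading this in the archive: PostgreSQL 9.5 and later provide exactly this as INSERT ... ON CONFLICT DO UPDATE. A sketch against the tables defined above; EXCLUDED refers to the row that was proposed for insertion:)

  INSERT INTO a VALUES ('a1', 'b1 new', 'c1')
  ON CONFLICT (a_id) DO UPDATE
      SET param_b = EXCLUDED.param_b,
          param_c = EXCLUDED.param_c;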


Of course I can write something at the application level to examine each row 
and take appropriate action. But it feels like this may be a commonly 
encountered problem for which there may be a database-level solution.  Am I 
missing something obvious?


Thanks

Julian Scarfe






---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] PROBLEM: Function does not exist

2005-06-22 Thread Julian Legeny
Hello Michael,

   yes, you are right. The solution is to cast it to smallint.

   Thank you for your advice,
   with best regards,

   Julian Legeny

Thursday, June 23, 2005, 2:48:17 AM, you wrote:

MF> On Thu, Jun 23, 2005 at 12:36:50AM +0200, Julian Legeny wrote:
>>
>> CREATE OR REPLACE FUNCTION insert_bf_filter(int4, int4, "varchar",
>> "varchar", int4, int2, int2, int4, int2, int4)

MF> [-snip-]

>> select INTGR, TMSTP from INSERT_BF_FILTER (53, 1354, 'test_filter_name',
>> 'test_filter_description', 1, 0, 0, 10, 0, 100)
>> 
>> And it returns me following error:
>> ERROR:  function insert_bf_filter(integer, integer, "unknown", "unknown",
>> integer, integer, integer, integer, integer, integer) does not exist
>> HINT:  No function matches the given name and argument types. You may need 
>> to add explicit type casts.

MF> Follow the hint: use explicit type casts for the int2 arguments.  Try this:

MF> select INTGR, TMSTP
MF> from INSERT_BF_FILTER(53, 1354, 'test_filter_name', 
'test_filter_description',
MF>   1, 0::int2, 0::int2, 10, 0::int2, 100);



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[GENERAL] PROBLEM: Function does not exist

2005-06-22 Thread Julian Legeny
Hello,

   I'm using postgres 8.0.3 under win xp. I have a problem calling a function.

   I have tables:
   1. BF_DOMAIN
   2. BF_USER
   3. BF_FILTER (this table uses FK from the previous two tables)

   and a function INSERT_BF_FILTER that inserts data into the table BF_FILTER. But
when I call this function, it is not found because of incorrect parameter types
("unknown" instead of varchar).

  Definition of tables and function:

CREATE TABLE bf_domain
(
  id serial NOT NULL,
  name varchar(100) NOT NULL,
  description varchar(1024),
  creation_date timestamp NOT NULL,
  modification_date timestamp NOT NULL,
  CONSTRAINT bf_dom_pk PRIMARY KEY (id),
  CONSTRAINT bf_dom_uq UNIQUE (name)
) 
WITH OIDS;



CREATE TABLE bf_user
(
  id serial NOT NULL,
  domain_id int4 NOT NULL,
  first_name varchar(50) NOT NULL,
  last_name varchar(50) NOT NULL,
  phone varchar(20) NOT NULL,
  fax varchar(20) NOT NULL,
  address varchar(1024) NOT NULL,
  email varchar(50) NOT NULL,
  login_name varchar(30) NOT NULL,
  "password" varchar(50) NOT NULL,
  login_enabled int2 NOT NULL DEFAULT 0,
  guest_access_enabled int2 NOT NULL DEFAULT 0,
  super_user int2 NOT NULL DEFAULT 0,
  internal_user int2 NOT NULL DEFAULT 0,
  creation_date timestamp NOT NULL,
  modification_date timestamp NOT NULL,
  CONSTRAINT bf_usr_pk PRIMARY KEY (id),
  CONSTRAINT bf_usr_fk FOREIGN KEY (domain_id) REFERENCES bastatest.bf_domain 
(id) ON UPDATE NO ACTION ON DELETE CASCADE,
  CONSTRAINT bf_usr_email_uq UNIQUE (email),
  CONSTRAINT bf_usr_login_uq UNIQUE (login_name)
) 
WITH OIDS;



CREATE TABLE bf_filter
(
  id serial NOT NULL,
  domain_id int4 NOT NULL,
  user_id int4 NOT NULL,
  name varchar(50) NOT NULL,
  description varchar(1024),
  data_type int4 NOT NULL,
  shared int2 NOT NULL DEFAULT 0,
  match_all int2 NOT NULL DEFAULT 0,
  page_size int4 NOT NULL DEFAULT 10,
  order_desc int2 NOT NULL DEFAULT 0,
  order_column int4,
  creation_date timestamp NOT NULL,
  modification_date timestamp NOT NULL,
  CONSTRAINT bf_fltr_pk PRIMARY KEY (id),
  CONSTRAINT bf_fltr_dom_fk FOREIGN KEY (domain_id) REFERENCES 
bastatest.bf_domain (id) ON UPDATE NO ACTION ON DELETE CASCADE,
  CONSTRAINT bf_fltr_usr_fk FOREIGN KEY (user_id) REFERENCES bastatest.bf_user 
(id) ON UPDATE NO ACTION ON DELETE CASCADE
) 
WITH OIDS;



CREATE TYPE type_int_timestamp AS (
   intgr int4,
   tmstp timestamp
);



CREATE OR REPLACE FUNCTION insert_bf_filter(int4, int4, "varchar", "varchar", 
int4, int2, int2, int4, int2, int4)
  RETURNS type_int_timestamp AS
$BODY$ 
DECLARE 
   IN_DOMAIN_ID ALIAS FOR $1; 
   IN_USER_ID ALIAS FOR $2; 
   IN_NAME ALIAS FOR $3; 
   IN_DESCRIPTION ALIAS FOR $4; 
   IN_DATA_TYPE ALIAS FOR $5; 
   IN_SHARED ALIAS FOR $6; 
   IN_MATCH_ALL ALIAS FOR $7; 
   IN_PAGE_SIZE ALIAS FOR $8; 
   IN_ORDER_DESC ALIAS FOR $9; 
   IN_ORDER_COLUMN ALIAS FOR $10; 
   out_key INTEGER; 
   out_timestamp TIMESTAMP; 
   output_result type_int_timestamp; 
BEGIN 
   SELECT INTO out_timestamp now(); 
   INSERT INTO BF_FILTER(DOMAIN_ID, USER_ID, NAME,
  DESCRIPTION, DATA_TYPE, SHARED, MATCH_ALL, PAGE_SIZE, 
  ORDER_DESC, ORDER_COLUMN, CREATION_DATE, MODIFICATION_DATE)
   VALUES (IN_DOMAIN_ID, IN_USER_ID, IN_NAME, IN_DESCRIPTION, IN_DATA_TYPE,
   IN_SHARED, IN_MATCH_ALL, IN_PAGE_SIZE, IN_ORDER_DESC, 
   IN_ORDER_COLUMN, out_timestamp, out_timestamp); 
   out_key := CURRVAL('bf_filter_id_seq'); 
   output_result.intgr := out_key; 
   output_result.tmstp := out_timestamp; 
   RETURN output_result; 
END 
$BODY$
  LANGUAGE 'plpgsql' VOLATILE;




I call the following SQL command from pgAdmin:
select INTGR, TMSTP from INSERT_BF_FILTER (53, 1354, 'test_filter_name', 
'test_filter_description', 1, 0, 0, 10, 0, 100)

And it returns me following error:
ERROR:  function insert_bf_filter(integer, integer, "unknown", "unknown", 
integer, integer, integer, integer, integer, integer) does not exist
HINT:  No function matches the given name and argument types. You may need to 
add explicit type casts.


Can you advise me what the problem might be?

-- 

Thank you for your answer,
best regards,

Julian

mailto:[EMAIL PROTECTED]


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [GENERAL] ORDER BY options (how to order data as AAA, aaa, BBB, bbb, ... ZZZ, zzz)

2005-05-10 Thread Julian Legeny
Hello,

   that's what I was looking for.

   Thanks to all for the advice,
   with best regards,

   Julian Legeny


Tuesday, May 10, 2005, 12:14:38 PM, you wrote:

RS> SELECT * FROM MY_TABLE ORDER BY lower(NAME), NAME

RS> The second NAME is to ensure that AAA comes before aaa, otherwise the order 
is undetermined.

RS> ---(end of broadcast)---
RS> TIP 2: you can get off all lists at once with the unregister command
RS> (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[GENERAL] ORDER BY options (how to order data as AAA, aaa, BBB, bbb, ... ZZZ, zzz)

2005-05-10 Thread Julian Legeny
Hello,

   I have the following problem:

I have table MY_TABLE with following records:

   NAME
---
   ccc
   CCC
   AAA
   aaa
   bbb
   BBB

When I use a default select that sorts all data by NAME:

SELECT * FROM MY_TABLE ORDER BY NAME;

result is following:

   NAME
---
   AAA
   BBB
   CCC
   aaa
   bbb
   ccc
  

But I would like to sort all data as follows:

   NAME
---
   AAA
   aaa
   BBB
   bbb
   CCC
   ccc


How can I write the SQL command (or set up ORDER BY options) to select that?

   
   Thank you in advance for answer,
   best regards,

   Julian Legeny

mailto:[EMAIL PROTECTED]


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [GENERAL] 'prepare' is not quite schema-safe

2005-05-02 Thread Julian Mehnle
Vlad [EMAIL PROTECTED] wrote:
> ok, since there is no guarantee that a server-side prepared query is
> still active, perhaps the postgresql interface library provides a way to
> check whether a previously prepared query is still alive prior to running
> exec, so that the dbd::pg driver can make sure it's still there, right
> before executing?
>
> If there is no such function (and I can't find it), then it will be
> hard for a driver to make things working right with server-side
> prepared queries!

You can always use fully qualified class (table) names in your prepared
queries, i.e. explicitly specify the schema name.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] SQL Question

2005-04-16 Thread Julian Scarfe
From: "Alex" <[EMAIL PROTECTED]>
- How can i select only the newest record for each ProdId ?
100| 2005-04-01
200| 2005-04-01
DISTINCT ON was made for this, and on the similar tables I have it performs 
rather more efficiently than using a subquery.

select distinct on (ProdId) ProdId , LastUpdate
from produpdate
order by ProdId , LastUpdate desc;
- How can i select to retrieve the last 2 dates in record
100| 2005-04-01 | 2005-03-01
200| 2005-04-01 | 2005-03-01
To get the previous one, my first thought is something like:
select distinct on (ProdId) ProdId , LastUpdate
from produpdate p1
where LastUpdate <> (
   select max(LastUpdate ) from produpdate p2 where p2.ProdId = p1.ProdId
)
order by ProdId , LastUpdate desc ;
but there may be a much more efficient way of getting the nth result in 
general.
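
(For the general nth-newest case, window functions, available from 8.4, are another option; a sketch assuming the same produpdate(ProdId, LastUpdate) table:)

  SELECT ProdId, LastUpdate
    FROM (SELECT ProdId, LastUpdate,
                 row_number() OVER (PARTITION BY ProdId
                                    ORDER BY LastUpdate DESC) AS rn
            FROM produpdate) sub
   WHERE rn <= 2
   ORDER BY ProdId, LastUpdate DESC;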

Julian Scarfe 


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [GENERAL] Index use with left join

2005-04-09 Thread Julian Scarfe
"Julian Scarfe" <[EMAIL PROTECTED]> writes:
Does the planner "realise" that
the intersection, Query 6, will still return 150 rows, or does it assume
independence of the filters in some way and estimate
20,000*(150/20,000)*(396/20,000)?
From: "Tom Lane" <[EMAIL PROTECTED]>
It assumes independence of the conditions --- which is why having two
of them reduced the rowcount estimate so much.  There are some limited
cases in which it can recognize redundant conditions, but offhand I
think that only works for scalar inequalities (like "x < 5 AND x < 6").
Even that's smarter than I dared hope for!
Any general suggestions for workarounds?
Not much, other than trying to avoid redundant conditions.
Did you look into the state of the PostGIS work on geometric statistics?
No, though PostGIS is clearly the way forward for my needs in the 
medium/long term.

PostGIS stores bounding boxes for its geometric features.  The operators 
like && and @ work as intersect and containment for the bounding boxes, 
while Intersects() and Contains() use more exact but presumably 
computationally expensive functions.  I don't yet know how these, GiST 
indexes and the planner get along together.  But I imagine the issue I've 
come across is one of the, if not the, most important one in spatially 
enabled databases.

Thanks again
Julian

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] Index use with left join

2005-04-08 Thread Julian Scarfe
From: "Tom Lane" <[EMAIL PROTECTED]>
The problem is that it's underestimating the number of rows pulled from
the n table (1 vs actual 150), which makes a simple nestloop join look
like the way to go.  That error comes from the fact that we don't really
have any statistical estimation for geometric conditions :-(.  Some of
the PostGIS hackers have been working on such, I believe, but I'm not
sure how far they've gotten.
Thanks Tom.
I can see the poor estimation of the geometric filter is a significant 
factor.

Reviewing it I wondered if the coarse/fine nature of the filters was also an 
issue.  Query 4, with the filter on the index, selects 396 of the ~20,000 
rows of the table (having estimated 22).  Query 5, with the filter requiring 
a Seq scan, selects 150 of the ~20,000 rows of the table (having estimated 
5), all of which were returned by Query 4.  Does the planner "realise" that 
the intersection, Query 6, will still return 150 rows, or does it assume 
independence of the filters in some way and estimate 
20,000*(150/20,000)*(396/20,000)?

I guess I can test that by trying it with a non-geometric column and a btree 
index:

7) Filtered requiring a sequential scan
explain analyze
select  n.ref, n.code, a.ident, a.name
   from n left outer join a on (a.ident = n.code)
   where code ~~ 'EGT%';
   QUERY PLAN
---
Merge Left Join  (cost=10361.53..10441.90 rows=419 width=45) (actual 
time=731.175..732.615 rows=248 loops=1)
  Merge Cond: ("outer".code = "inner"."?column3?")
  ->  Sort  (cost=9078.36..9078.49 rows=53 width=20) (actual 
time=547.154..547.300 rows=248 loops=1)
Sort Key: n.code
->  Seq Scan on n  (cost=0.00..9076.84 rows=53 width=20) (actual 
time=260.558..543.587 rows=248 loops=1)
  Filter: (code ~~ 'EGT%'::text)
  ->  Sort  (cost=1283.17..1308.44 rows=10105 width=25) (actual 
time=180.359..181.149 rows=1292 loops=1)
Sort Key: (a.ident)::text
->  Seq Scan on a  (cost=0.00..611.05 rows=10105 width=25) (actual 
time=0.125..83.844 rows=10105 loops=1)
Total runtime: 735.613 ms

8) Filtered using the index, but returning a subset of the 419 rows from 
Query 7

explain analyze
select  n.ref, n.code, a.ident, a.name
   from n left outer join a on (a.ident = n.code)
   where code = 'EGTT';
QUERY PLAN

Merge Left Join  (cost=1283.17..1572.15 rows=411 width=45) (actual 
time=451.609..510.507 rows=226 loops=1)
  Merge Cond: ("outer".code = "inner"."?column3?")
  ->  Index Scan using n_pkey on n  (cost=0.00..208.82 rows=52 width=20) 
(actual time=17.301..73.840 rows=226 loops=1)
Index Cond: (code = 'EGTT'::text)
  ->  Sort  (cost=1283.17..1308.44 rows=10105 width=25) (actual 
time=430.231..431.032 rows=1279 loops=1)
Sort Key: (a.ident)::text
->  Seq Scan on a  (cost=0.00..611.05 rows=10105 width=25) (actual 
time=5.743..321.326 rows=10105 loops=1)
Total runtime: 514.103 ms

9) Filtered on both
explain analyze
select  n.ref, n.code, a.ident, a.name
   from n left outer join a on (a.ident = n.code)
   where code ~~ 'EGT%' and code = 'EGTT';
   QUERY PLAN
--
Nested Loop Left Join  (cost=0.00..971.58 rows=8 width=45) (actual 
time=53.634..1.285 rows=226 loops=1)
  Join Filter: (("inner".ident)::text = "outer".code)
  ->  Index Scan using n_pkey on n  (cost=0.00..208.95 rows=1 width=20) 
(actual time=0.288..21.137 rows=226 loops=1)
Index Cond: (code = 'EGTT'::text)
Filter: (code ~~ 'EGT%'::text)
  ->  Seq Scan on a  (cost=0.00..611.05 rows=10105 width=25) (actual 
time=0.008..18.741 rows=10105 loops=226)
Total runtime: 12223.328 ms

Similar problem.  Of course Query 9 is concocted and unrealistic, but it is 
representative of the coarse/fine filter problem, where I select a set of 
rows using an approximate filter (e.g. bounding box for the geometrical 
case) with an index and then use a second, exact but computationally 
expensive filter to keep only those rows that I really want.

Any general suggestions for workarounds?
Julian 


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


[GENERAL] Index use with left join

2005-04-08 Thread Julian Scarfe
757..728.847 rows=150 loops=1)
Sort Key: n.code
->  Seq Scan on n  (cost=0.00..9087.60 rows=5 width=20) (actual 
time=314.466..726.479 rows=150 loops=1)
  Filter: (box(q_node, q_node) @ 
'(0.017,0.908),(-0.032,0.873)'::box)
  ->  Sort  (cost=1283.17..1308.44 rows=10105 width=25) (actual 
time=180.078..180.927 rows=1391 loops=1)
Sort Key: (a.ident)::text
->  Seq Scan on a  (cost=0.00..611.05 rows=10105 width=25) (actual 
time=0.170..83.442 rows=10105 loops=1)
Total runtime: 917.066 ms

Again, so far, nothing obviously unusual.

6) Now I combine the filters in 4 & 5 (as I did from 1 & 2 to get 3, which 
performed in a similar time to 1)

explain analyze
select  n.ref, n.code, a.ident, a.name
   from n left outer join a on (a.ident = n.code)
   where bbox && box (point (-0.032, 0.873), point (0.017, 0.908))
   and box (q_node, q_node)
   @ box (point (-0.032, 0.873), point (0.017, 0.908))
QUERY PLAN
-
Nested Loop Left Join  (cost=0.00..851.06 rows=8 width=45) (actual 
time=11.662..7919.946 rows=150 loops=1)
  Join Filter: (("inner".ident)::text = "outer".code)
  ->  Index Scan using n_bbox on n  (cost=0.00..88.44 rows=1 width=20) 
(actual time=0.107..10.256 rows=150 loops=1)
Index Cond: (bbox && '(0.017,0.908),(-0.032,0.873)'::box)
Filter: (box(q_node, q_node) @ '(0.017,0.908),(-0.032,0.873)'::box)
  ->  Seq Scan on a  (cost=0.00..611.05 rows=10105 width=25) (actual 
time=0.006..18.044 rows=10105 loops=150)
Total runtime: 7920.684 ms

Whoa!  Instead of a performance similar to query 4, it chooses a different 
strategy, and takes 40 times as long. (Both tables just analyzed.)

By brute force:
set enable_nestloop to off;
explain analyze
select  n.ref, n.code, a.ident, a.name
   from n left outer join a on (a.ident = n.code)
   where bbox && box (point (-0.032, 0.873), point (0.017, 0.908))
   and box (q_node, q_node)
   @ box (point (-0.032, 0.873), point (0.017, 0.908))
   QUERY PLAN
--
Merge Left Join  (cost=1371.62..1447.50 rows=8 width=45) (actual 
time=177.273..179.341 rows=150 loops=1)
  Merge Cond: ("outer".code = "inner"."?column3?")
  ->  Sort  (cost=88.45..88.45 rows=1 width=20) (actual time=8.452..8.538 
rows=150 loops=1)
Sort Key: n.code
->  Index Scan using n_bbox on n  (cost=0.00..88.44 rows=1 
width=20) (actual time=0.109..7.031 rows=150 loops=1)
  Index Cond: (bbox && '(0.017,0.908),(-0.032,0.873)'::box)
  Filter: (box(q_node, q_node) @ 
'(0.017,0.908),(-0.032,0.873)'::box)
  ->  Sort  (cost=1283.17..1308.44 rows=10105 width=25) (actual 
time=165.520..166.348 rows=1391 loops=1)
Sort Key: (a.ident)::text
->  Seq Scan on a  (cost=0.00..611.05 rows=10105 width=25) (actual 
time=0.042..69.560 rows=10105 loops=1)
Total runtime: 182.275 ms

What's happening here, please? How am I misleading the planner? Is it 
because the index is rtree?

Yes, I should consider PostGIS for spatial stuff, but I've got what I've got 
:-).

TIA
Julian Scarfe 


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [GENERAL] truncate/create slowness

2005-03-31 Thread Julian Scarfe
It's possible you could get out of this by vacuum full and then reindex
each catalog, but it might be easier to dump and reload the database ...
I've got a similar issue, but caused by neglect rather than anything to do 
with pg_autovacuum.

Do you have any rules of thumb for deciding when a pg_dumpall/restore is 
likely to be faster than a vacuum full?  Or perhaps more straightforwardly, 
how would you expect the time required for a vacuum full to scale with pages 
used and rows in the table?

Thanks
Julian Scarfe 


---(end of broadcast)---
TIP 8: explain analyze is your friend


[GENERAL] Clustering system tables

2005-03-27 Thread Julian Scarfe
I've got a database (7.4)  whose system tables have been long neglected. 
Instead of the 100 or so pages I'd expect for 4000 rows after VACUUM, I've 
got 24,000 pages and a mere 1.4 million unused item pointers.

If it were an ordinary table, I'd CLUSTER it, as from experience it would be 
vastly quicker than a VACUUM FULL.  But CLUSTER responds with:

ERROR:  "pg_attribute" is a system catalog
Is VACUUM FULL my only option to compact the table?
Julian Scarfe 


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [GENERAL] postgresql 8.0 advantages

2005-02-27 Thread Julian Scarfe
From: "Tom Lane" <[EMAIL PROTECTED]>
There are some speed improvements in 8.0 too, though that wasn't the
main focus of the release cycle.  For instance, Lonni Friedman's nearby
thread reports on a case where 8.0 consistently finds a much better
query plan for a complex query than prior releases did, because we fixed
some issues in the GEQO planner module.
Could you give an example or two of the sorts of queries for which 
performance is improved under 8.0 compared with 7.4, please Tom?

Thanks
Julian Scarfe 


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [GENERAL] pgpool 2.5b2 released

2005-02-06 Thread Julian Scarfe
> After takin a swig o' Arrakan spice grog, [EMAIL PROTECTED] ("Julian
Scarfe") belched out:
> > So all I'm looking for is a way for pgpool to shout if it detects a
> > failure.  That could initiate the investigation of the other
> > criteria required for failover.
>
> _There_ lies the one change that is needed.  Given that, some outside
> 'oracle' can be used to decide if it's appropriate to do a FAILOVER.
>
> It's quite important for this not to be deeply embedded in pgpool...

I think we have a consensus on that.  So what's the most sensible mechanism
for the "shout".  Since I posted my original question, I realized that
pgpool notes a failure of either master or slave in its log.  Would we want
something more proactive?

Julian



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [GENERAL] pgpool 2.5b2 released

2005-02-04 Thread Julian Scarfe
From: "Joshua D. Drake" <[EMAIL PROTECTED]>

> >Since pgpool has this capability, how about including a hook that allows
a
> >script to be run when pgpool detects a problem with the master?  That
would
> >allow action to be taken to investigate further and, if required,
switchover
> >or failover and promote the slave to master.
> >
> >
> Because it is not the place of a connection management software
> to failover. Failover should happen only if:
>
> 1. You have a specific set of criteria that is matched via a
network/system
> management software.
>
> 2. You do it yourself.

I'm not suggesting that it's the place of pgpool to *force* a failover.  I
am suggesting that one of the criteria that is likely to be useful is the
inability to connect to the master, and that's something that pgpool,
apparently, detects.  It seems unnecessary to use completely different
failure-detection mechanisms for the purpose of failover to those used for
the connection management.

So all I'm looking for is a way for pgpool to shout if it detects a failure.
That could initiate the investigation of the other criteria required for
failover.

> The last thing in the world you need is to fail over to a slave because
> somebody accidently tripped over a network cord.

In our application, that's *exactly* what we need.  We have a database that
receives data in a fairly continuous stream.  If the datastream cannot be
written to the database, the database becomes worse than useless quite
rapidly.  We need the ability to switchover or failover to another node as
master as soon as possible, to allow the datastream to be written to the
other node.  We'll rebuild the "failed" master later, if necessary.  But if
the failover doesn't happen promptly, we might as well rebuild the whole
cluster.

Julian Scarfe



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] pgpool 2.5b2 released

2005-02-03 Thread Julian Scarfe
> > Does it attempt any interaction with Slony when it detects a failure of
the
> > master?  It would seem a pity to have pgpool watching the pair to detect
> > failure but having to have a separate watcher process to tell Slony to
> > failover.
>
> If pgpool detects PostgreSQL failure, Slony should detect it as well, no?

Others are much better place to answer this, but my understanding is that
Slony does not do so:

http://gborg.postgresql.org/project/slony1/genpage.php?howto_overview
"What Slony-I is not:

Slony-I is not a network management system.  Slony-I does not have any
functionality within it to detect a node failure, or automatically promote a
node to a master or other data origin.  Slony-I is not multi-master; it's
not
a connection broker, and it doesn't make you coffee and toast in the
morning."

> pgpool 2.5 has the capabilty to perform periodical health checking to
> PostgreSQL.

Since pgpool has this capability, how about including a hook that allows a
script to be run when pgpool detects a problem with the master?  That would
allow action to be taken to investigate further and, if required, switchover
or failover and promote the slave to master.

Julian Scarfe



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] pgpool 2.5b2 released

2005-02-02 Thread Julian Scarfe
From: "Tatsuo Ishii" <[EMAIL PROTECTED]>


> Pgpool 2.5b2 supports "master slave mode" which can cope with
> master/slave replication softwares such as Slony-I. In this mode
> pgpool sends non SELECT queries to master only. SELECTs are load
> balanced by pgpool.

Sounds good!

Does it attempt any interaction with Slony when it detects a failure of the
master?  It would seem a pity to have pgpool watching the pair to detect
failure but having to have a separate watcher process to tell Slony to
failover.

Julian Scarfe



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


[GENERAL] Select after insert to the unique column

2004-12-11 Thread Julian Legeny
Hello,

   I have a following table with unique column:

   CREATE TABLE UNIQUE_COLUMN_TEST (
 TEST_ID INTEGER,
 CONSTRAINT TEST_ID_UQ UNIQUE (TEST_ID)
   )


   Then I insert a few records into the table, and then I try to insert a
   duplicate record. The expected error occurs:

   org.postgresql.util.PSQLException:
   ERROR: duplicate key violates unique constraint "test_id_uq"


   Then I want to run the command
  select count(*) from UNIQUE_COLUMN_TEST
   because I want to know how many records were already inserted before it
   failed.

   But when I try to run that SELECT COUNT(*), an error occurs
   again:
   
   org.postgresql.util.PSQLException:
   ERROR: current transaction is aborted, commands ignored until end of 
transaction block

   How can I solve this?
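
(One common approach, available since 8.0 so it applies here, is to protect the risky insert with a savepoint, so that a unique violation does not abort the whole transaction. A minimal SQL sketch, assuming the table starts empty; from JDBC the same pattern is available through Connection.setSavepoint() and Connection.rollback(Savepoint):)

  BEGIN;
  INSERT INTO UNIQUE_COLUMN_TEST (TEST_ID) VALUES (100);   -- succeeds
  SAVEPOINT before_dup;
  INSERT INTO UNIQUE_COLUMN_TEST (TEST_ID) VALUES (100);   -- unique violation
  ROLLBACK TO SAVEPOINT before_dup;   -- cancels only the failed insert
  SELECT count(*) FROM UNIQUE_COLUMN_TEST;   -- works, returns 1
  COMMIT;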

   Thank you in advance,
   with best regards,

   Julian Legeny

   
   All this work is done within a single transaction, and here is the
   code:
  
   // insert value
   m_transaction.begin();
   
   try
   {
  Connection connection = null;

  try
  {
 // try to insert 5 correct records
 for (iCounter = 1; iCounter < 6; iCounter++)
 {
insertStatement = m_connection.prepareStatement(
"insert into UNIQUE_COLUMN_TEST (TEST_ID) values (?)");
insertStatement.setInt(1, 100 * iCounter);
   
insertStatement.executeUpdate();
  }
  // insert duplicate value into unique column
  try
  {
 insertStatement = m_connection.prepareStatement(
 "insert into UNIQUE_COLUMN_TEST (TEST_ID) values (?)");
 insertStatement.setInt(1, 100);
   
 insertStatement.executeUpdate();
   }
   catch (SQLException sqlExc)
   {
  try
  {
 // THIS EXCEPTION IS EXPECTED
 // now try to find out how many records were
 // already inserted before it failed
 selectStatement = m_connection.prepareStatement(
 "select count(*) from UNIQUE_COLUMN_TEST");

 // !!! AT THE FOLLOWING LINE IT FAILED AGAIN !!!
 // cause: current transaction is aborted, commands
 //ignored until end of transaction block
 rsResults = selectStatement.executeQuery();
 
 if (rsResults.next())
 {
assertEquals("Incorrect number of selected items",
 5, rsResults.getInt(1));
 }
   }
   catch (SQLException sqlExc1)
   {
  throw new SQLException();
   }
   finally
   {
  rsResults.close();
   }
}
 }
 finally
 {
DatabaseUtils.closeStatement(insertStatement);
DatabaseUtils.closeStatement(selectStatement);
 }
 m_transaction.commit();
  }
  catch (Throwable throwable)
  {
 m_transaction.rollback();
 throw throwable;
  }

...


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [GENERAL] Index bloat in 7.2

2004-12-06 Thread Julian Scarfe
On Mon, Dec 06, 2004 at 08:48:04AM -, Julian Scarfe wrote:
b) Only a dump-restore major version upgrade (which we'll do next 
time we
can take the system out for long enough) will avoid the issue.
On 6 Dec 2004, at 16:18, Alvaro Herrera wrote:
"Long enough" could be a minutes or seconds issue if you use Slony-I,
I've heard ...  (Of course you'd still need to fix your apps, which may
take somewhat longer than that.)
A good point Alvaro, but I don't think Slony-I is compatible with 7.2, 
which is the version I'm starting from.  For upgrades from 7.3 onwards, 
it's certainly a route worth considering.

Julian
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] Index bloat in 7.2

2004-12-06 Thread Julian Scarfe
From: "Christopher Browne" <[EMAIL PROTECTED]>

> The "empty pages not reclaimed" problem is something that did indeed
> get fixed in the post-7.2 days.  I _think_ it was 7.4, but it might
> have been 7.3.

> In short, 7.4.x is indeed a good resolution to your issue.

From: "Tom Lane" <[EMAIL PROTECTED]>
>
> That's exactly what I'd expect ...

Thanks both.

So it sounds like:

a) the issue is controllable with a regular (and in our case, just
occasional) reindex without any long term negative consequences

b) Only a dump-restore major version upgrade (which we'll do next time we
can take the system out for long enough) will avoid the issue.

Julian



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[GENERAL] Index bloat in 7.2

2004-12-05 Thread Julian Scarfe
I've got a box running 7.2.1 (yes, I know :-() in which an index for a
rapidly turning over (and regularly vacuumed) table is growing steadily in
size.  The index in question is on a timestamp field that is just set to
now() on the entry of the row, to enable the query that clears out old data
to an archive to run efficiently.  Reindexing shrinks it back to a
reasonable size. Other indexes reach an equilibrium size and stay there. The
behaviour is fine on a system running 7.4.x: the index stays at a sensible
number of pages.

Is this likely to be related to a known issue with 7.2 that got fixed, or
have I got potentially more serious problems?

Thanks

Julian Scarfe



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [GENERAL] How can I change a cast from explicit only to implicit?

2004-11-25 Thread Julian Scarfe
From: "Tom Lane" <[EMAIL PROTECTED]>

> > So how can I force a built-in cast to become implicit?
>
> If you're intent on doing that, you can change its entry in pg_cast.
> But I think you'd be far better off to fix your application.  Implicit
> casts across type categories have a habit of kicking in when you least
> expected it, causing the backend to adopt surprising and unpleasant
> interpretations of straightforward-looking queries.  If you check the
> pgsql-bugs archives you will find some of the examples that prompted us
> to change this cast to non-implicit...

It's a fair point, Tom.  Having developed the code with 7.2, we are looking
for the simplest way to recreate backward compatible behaviour in 7.4.  Sure
enough, we'll flag the code for revision next time it gets an overhaul, but
a single change to the database would be preferable in the short term.   I
think the current default of these casts being explicit only is sensible.

Thanks

Julian



---(end of broadcast)---
TIP 8: explain analyze is your friend


[GENERAL] How can I change a cast from explicit only to implicit?

2004-11-25 Thread Julian Scarfe
In 7.2.x

template1=# select point('1'::text, '2'::text);
 point
---
 (1,2)
(1 row)

but in 7.4.x

template1=# select point('1'::text, '2'::text);
ERROR:  function point(text, text) does not exist
HINT:  No function matches the given name and argument types. You may need
to add explicit type casts.

                      List of casts
 Source type |   Target type    | Function | Implicit?
-------------+------------------+----------+-----------
 ...
 text        | double precision | float8   | no

OK, so to make the cast work without explicit casts in the SQL, I need the
text to float8 (or another suitable numeric type) cast to be implicit.  But:

template1=# create cast (text as float8) with function float8(text) as implicit;
ERROR:  cast from type text to type float8 already exists

template1=# drop cast (text as float8);
ERROR:  cannot drop cast from text to double precision because it is
required by the database system

So how can I force a built-in cast to become implicit?
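
(As the reply in this thread notes, Tom's suggestion is to change the cast's entry in pg_cast. A rough sketch of what that looks like, to be run as a superuser and entirely at your own risk, given the caveats about implicit cross-category casts:)

  UPDATE pg_cast
     SET castcontext = 'i'    -- 'e' = explicit only, 'a' = assignment, 'i' = implicit
   WHERE castsource = 'text'::regtype
     AND casttarget = 'float8'::regtype;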

Thanks

Julian Scarfe



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[GENERAL] VACUUM ANALYZE question - PostgreSQL performance tests

2004-11-25 Thread Julian Legeny
Hello,

   I have a question about VACUUM ANALYZE. I have been trying to run Postgres
performance tests for selecting a large number of records from the DB.
   First I inserted 30,000 records into one table. After this insert I
executed a VACUUM ANALYZE query.
   I have a test that retrieves all data from a table page by page (20
records per page). That means I'm executing 1500 selects in a loop to
retrieve each page, and I'm also recording the time duration of some of
these selects.

   THE PROBLEM IS that when I start to retrieve records, the performance
is poor. But when I manually execute the VACUUM ANALYZE query one more
time (from a DB client, while the pages are being retrieved), the
performance is much better.

   Is it also necessary to call VACUUM ANALYZE to get better performance
for the select query?
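
(In short: the statistics collected before the bulk insert no longer describe the table, so the planner misestimates. Re-running ANALYZE, or VACUUM ANALYZE, after the load is the usual fix; a minimal sketch using the table from the test:)

  ANALYZE BF_USER;          -- refresh planner statistics only
  VACUUM ANALYZE BF_USER;   -- additionally reclaims dead row versions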


   Thank you for your answer,
   with best regards,

   Julian Legeny




   Here I attach log reports for 30.000 records:
   =

  Here it is possible to see that the time needed to execute the final query is
approximately the same for each retrieved page, and the performance is not very
good. I applied VACUUM ANALYZE while the test was running, and from page 1000
onward the performance is better by about 2/3.


   a.) First I insert 30.000 records into the DB.

   b.) Then I retrieve page by page all records from the DB.

 I'm executing 2 commands:
 
 1. A command that retrieves the number of all items that I want to
retrieve page by page:

select count(*) from BF_USER
   where BF_USER.DOMAIN_ID=19 and BF_USER.ID NOT IN(280)

- in the log file it is possible to see the time duration of this
  select - it is reported as "Duration for executing count
  statement".

 2. The final query, retrieving the particular records specified with
 the LIMIT clause.

select BF_USER.LOGIN_NAME,
   BF_USER.EMAIL,BF_USER.ID,
   BF_USER.MODIFICATION_DATE,
   BF_USER.SUPER_USER,
   BF_USER.GUEST_ACCESS_ENABLED
from BF_USER
where BF_USER.DOMAIN_ID=19 and
  BF_USER.ID NOT IN(280)
order by BF_USER.LOGIN_NAME asc
limit 20 offset 0


   First I execute the "select COUNT(*) ..." query to retrieve the number
   of all items that I will retrieve, and then, when I know this number,
   I can retrieve the specified records (using LIMIT for this).


-

INFO: Total duration to create 3 data objects was 1:46.453 which is 281 
items/sec

INFO: Duration for executing count statement for page 1 (at position 1) = 171 ms
INFO: Duration for executing final query = 641 ms
INFO: Total duration = 828 ms

INFO: Duration for executing count statement for page 2 (at position 21) = 156 
ms
INFO: Duration for executing final query = 641 ms
INFO: Total duration = 797 ms

INFO: Duration for executing count statement for page 3 (at position 41) = 140 
ms
INFO: Duration for executing final query = 625 ms
INFO: Total duration = 765 ms

INFO: Duration for executing count statement for page 4 (at position 61) = 141 
ms
INFO: Duration for executing final query = 640 ms
INFO: Total duration = 797 ms

INFO: Duration for executing count statement for page 5 (at position 81) = 141 
ms
INFO: Duration for executing final query = 625 ms
INFO: Total duration = 766 ms

INFO: Duration for executing count statement for page 100 (at position 1981) = 
141 ms
INFO: Duration for executing final query = 625 ms
INFO: Total duration = 766 ms

INFO: Duration for executing count statement for page 101 (at position 2001) = 
141 ms
INFO: Duration for executing final query = 625 ms
INFO: Total duration = 766 ms

INFO: Duration for executing count statement for page 102 (at position 2021) = 
140 ms
INFO: Duration for executing final query = 594 ms
INFO: Total duration = 750 ms

INFO: Duration for executing count statement for page 103 (at position 2041) = 
140 ms
INFO: Duration for executing final query = 625 ms
INFO: Total duration = 765 ms

INFO: Duration for executing count statement for page 104 (at position 2061) = 
141 ms
INFO: Duration for executing final query = 609 ms
INFO: Total duration = 750 ms

INFO: Duration for executing count statement for page 105 (at position 2081) = 
141 ms
INFO: Duration for executing final query = 625 ms
INFO: Total duration = 766 ms

INFO: Duration for executing count statement for page 200 (at position 3981) = 
125 ms
INFO: Duration for executing final query = 641 ms
INFO: Total duration = 766 ms

INFO: Duration for executing count statement for page 201 (at position 4001) = 
140 ms
INFO: Duration for executing final query = 641 ms
INFO: Total duration = 781 ms

INFO: Duration for executing count statement for page 202 (at position 4021) = 
141 ms
INFO: Duration for executing final query = 609 ms
INFO: Total duration = 750 ms

INFO: Duration for executi

Re: [GENERAL] Best practices for migrating a development database

2004-09-13 Thread Julian North
Hi,

We manage a number of high-volume databases that require 24/7 uptime (pretty
much) and deal with this problem a lot.

The solution we employ is that once a database is in production the only 
way to alter the database is using a change script that deals with any data 
migration issues as well as altering the schema.

The practice is for a developer to make changes as they see fit in dev 
using a simple change script. Once the changes are complete (and the 
application coding work has been done) the schema change is passed to a
member of the dba team to code up properly. 

It is this change script (which usually includes a section that can reverse
the change) that is then run against the qa database. The application is then
tested at this stage before a production release is performed.

Depending on the nature of the change, the applications and schema changes
are all then released together at a defined time.

This is generally a fairly easy process to manage. One of the useful things
we do is to include an insert at the top of each schema change script into a
standard schema table. This way we can easily see which schema changes
have been run against each instance of each db.

All of the schema change scripts are managed in source control.
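
(A minimal sketch of the tracking-table idea described above; the names are illustrative, not the actual scripts used:)

  CREATE TABLE schema_change_log (
      change_id   text      PRIMARY KEY,       -- e.g. script or ticket number
      description text      NOT NULL,
      applied_at  timestamp NOT NULL DEFAULT now()
  );

  -- first statement of every change script:
  INSERT INTO schema_change_log (change_id, description)
  VALUES ('2004-09-001', 'add loyalty_tier column to customer');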


J.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Csaba Nagy
Sent: 13 September 2004 09:13
To: Thomas F.O'Connell
Cc: Collin Peters; Postgres general mailing list
Subject: Re: [GENERAL] Best practices for migrating a development
database


Beside version controlled schema files we have a guy who writes
migration scripts based on the old schema and the new (development)
schema (frozen e.g. by branching in CVS).
Usually there are 3 steps involved:
 - a pre-migration script, which prepares the data base for the new
schema, by adding the new structures needed for the data migration;
 - a data migration script, which moves around data between the old and
the new structures;
 - a finalization script, which removes the old structures not needed
anymore;

I think there's no way to make any of these steps automatically computed
as a diff between the old and new schemas...
We usually do it anyway so that after step 1 was executed, both the old
version of the application and the new version can work at the same
time, and the new version will only use the data migrated by step 2, but
I suppose our application is not very typical (we have lots of distinct
customers which live in the same data base but have distinct data).
This also means we try to do minimal changes to the data base and we try
to only have additions, no modifications, this makes migration easier.

HTH,
Csaba.


On Sat, 2004-09-11 at 09:29, Thomas F.O'Connell wrote:
> One thing I used to do (and I won't necessarily claim it as a best 
> practice) was to maintain my entire data model (tables, functions, 
> indexes, sequences) as SQL (plus postgres extensions) CREATE statements 
> in text files that were version controlled (via CVS). I had an entire 
> set of utilities that could modify the existing database as necessary 
> to treat the SQL files as authoritative. For anything new, the create 
> statements sufficed, but for modifications, some objects had to be 
> regenerated. When it was time to release, we would export the textual 
> SQL schema to the production server, make the necessary updates using 
> my utilities, and then restart services.
> 
> Since I'm deploying postgres in new environments now, and I left these 
> utilities behind at another job (where they're still in use), I've been 
> thinking more about the concept of schema version control. But I'm 
> similarly interested in any concepts of best practices in this area.
> 
> -tfo
> 
> On Sep 10, 2004, at 1:55 PM, Collin Peters wrote:
> 
> > I have searched the Internet... but haven't found much relating to 
> > this.
> >
> > I am wondering on what the best practices are for migrating a 
> > developmemnt database to a release database.  Here is the simplest 
> > example of my situation (real world would be more complex).
> >
> > Say you have two versions of your application.  A release version and 
> > a development version.  After a month of developing you are ready to 
> > release a new version.  There have been many changes to the 
> > development database that are not in the release database.  However, 
> > the release database contains all your real information (customers, 
> > etc...).  What is the best practice for migrating the development 
> > database to the release database?
> >
> > I have thought of the following situations:
> > -Simply track all the changes you made to the development database and 
> > make the same changes to the release database
> > -Back up the release database... overwrite it with the development 
> > database... then copy all your real data back into the release 
> > database (this last step is probably quite difficult)
> > -Perhaps some combination of the two
> >
> > Does anybody ha

Re: [GENERAL] tablespaces a priority for 7.5?

2004-01-22 Thread Julian North
speaking as someone currently migrating enterprise stuff to postgres

point-in-time is definitely the biggest issue.

this is the main thing we are having to look at reproducing using some form of 
replication to an alternative server.



-Original Message-
From: Rick Gigger [mailto:[EMAIL PROTECTED]
Sent: 22 January 2004 19:09
To: Cott Lang; [EMAIL PROTECTED]
Subject: Re: [GENERAL] tablespaces a priority for 7.5?



> On Wed, 2004-01-21 at 09:05, Brian Maguire wrote:
>
> > In my opinion, it really is a critical feature to support and administer
> > enterprise databases.  All the major databases currently support this
> > and it is a compelling enough reason drive big users from away from
> > using postgres for their enterprise/large databases.  It really is a
> > database administrator's feature.
>
> It seems to me that the lack of point-in-time recovery is a much bigger
> roadblock against big users. :(

 This is certainly my feeling.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match




---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [GENERAL] Modelling Oracle Sequences

2003-11-21 Thread Julian North
Hi Pavel,

Thanks for that.

I hadn't realised there was support for proper sequences.

Usual asking before fully researching.

Thanks,


Julian.


-Original Message-
From: Pavel Stehule [mailto:[EMAIL PROTECTED]
Sent: 21 November 2003 09:42
To: Julian North
Cc: '[EMAIL PROTECTED]'
Subject: Re: [GENERAL] Modelling Oracle Sequences


Hello

why can't you use a normal sequence? There isn't any difference between Oracle
and PostgreSQL here.

http://www.postgresql.org/docs/7.3/static/sql-createsequence.html
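
(A minimal sketch of what the page describes; the names are illustrative:)

  CREATE SEQUENCE order_id_seq;
  SELECT nextval('order_id_seq');   -- atomic, no explicit row locking needed

  -- or let a column default draw from it on every insert:
  CREATE TABLE orders (
      id        integer NOT NULL DEFAULT nextval('order_id_seq'),
      placed_at timestamp
  );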


regards
Pavel



On Fri, 21 Nov 2003, Julian North wrote:

> Hi,
> 
> I am in the process of porting a number of large MSSQL databases to
> Postgres.
> 
> The current id generation model in SQL server is an oracle style sequence
> generator i wrote using a pretty simple stored proc and table structure.
> 
> I now need to do the same thing in postgres.
> 
> In MSSQL I was able to open an UPDATE cursor, which places a lock on the
row
> in the sequence table. I then update, release and return.
> 
> Has anyone done something similar or have any pointers on how to do this
> safely, securely without an update cursor?
> 
> Any info appreciated.
> 
> Cheers,
> 
> 
> 
> Julian.
> 
> 
> 
> 
> ---(end of broadcast)---
> TIP 9: the planner will ignore your desire to choose an index scan if your
>   joining column's datatypes do not match
> 





---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[GENERAL] Modelling Oracle Sequences

2003-11-21 Thread Julian North
Hi,

I am in the process of porting a number of large MSSQL databases to
Postgres.

The current id generation model in SQL server is an oracle style sequence
generator i wrote using a pretty simple stored proc and table structure.

I now need to do the same thing in postgres.

In MSSQL I was able to open an UPDATE cursor, which places a lock on the row
in the sequence table. I then update, release and return.

Has anyone done something similar or have any pointers on how to do this
safely, securely without an update cursor?

Any info appreciated.

Cheers,



Julian.




---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [GENERAL] Simulating an outer join

2000-01-13 Thread Julian Scarfe

Bruce Momjian wrote:

> I am just writing the EXISTS section from my book.  I don't think it
> matters what fields are returned from an EXISTS subquery.  If I am
> wrong, someone please let me know.

Celko also writes (in his chapter on EXISTS in "SQL for Smarties"):

"In general the SELECT * option should perform better than the actual column. 
It lets the query optimizer decide which column to use.  If a column has an
index on it, then simply seeing a pointer to the index is enough to determine
that something exists."

Obviously you're in a much better position than me to judge whether that's the
case in pgsql!  But it might be worth a test.

Julian Scarfe





Re: [GENERAL] Simulating an outer join

2000-01-12 Thread Julian Scarfe

> Bruce Momjian wrote:
> >
> > I have been thinking about how to simulate an outer join.  It seems the
> > best way is to do:
> >
> > SELECT tab1.col1, tab2.col3
> > FROM tab1, tab2
> > WHERE tab1.col1 = tab2.col2
> > UNION ALL
> > SELECT tab1.col1, NULL
> > FROM tab1
> > WHERE tab1.col1 NOT IN (SELECT tab2.col2 FROM tab2)
> >
> > Comments?  I know someone was asking about this recently.

Mike Mascari wrote:
> 
> I wouldn't use IN ;-)
> 
> SELECT table1.key, table2.value
> FROM table1, table2
> WHERE table1.key = table2.key
> UNION ALL
> SELECT table1.key, NULL
> FROM table1 WHERE NOT EXISTS
> (SELECT table2.key FROM table2 WHERE table1.key = table2.key);

FWIW, that's exactly Joe Celko's SQL-89 workaround for OUTER JOINs in 'SQL for
Smarties'. Well in fact he uses (SELECT * FROM table2 WHERE table1.key =
table2.key) as the subquery, but I presume that's an insignificant difference.

Julian Scarfe





[GENERAL] Making points into paths

2000-01-12 Thread Julian Scarfe

I'd like to take a set of points and link them into a path.  But I can't see a
single operator/function that creates a path from points! ;-(

It seems like a fairly fundamental operation, unlike some of Postgres's
delightfully rich set of geometric datatypes, operators and functions. It
doesn't look too hard to write an external function that appends a point to a
path, but am I missing something obvious?
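
(There is still no built-in aggregate that turns points directly into a path, but the path input syntax offers a text-based workaround in later releases; string_agg needs 9.0 or newer. A sketch assuming a hypothetical table pts(seq int, pt point):)

  -- '[ ... ]' is the literal form of an open path
  SELECT ('[' || string_agg(pt::text, ',' ORDER BY seq) || ']')::path AS p
  FROM pts;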

Thanks

Julian Scarfe