Re: [GENERAL] Database design questions

2008-06-18 Thread David
Hi list.

I'm closing this thread, and will re-post as separate questions.

I agree with Jorge that smaller mails will be easier to read.

David.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Database design questions

2008-06-18 Thread David
On Wed, Jun 18, 2008 at 12:25 PM, Jorge Godoy <[EMAIL PROTECTED]> wrote:
> On Wednesday 18 June 2008 05:43:25 David wrote:
>> * Should I split this into separate threads instead of 1 thread for
>> all my questions?
>
> I would submit all of the questions in separate messages.  It is tiresome to
> read everything, you'll loose a lot of context after one or two messages
> levels or reply and people won't read the email because of its size.
>

Thanks for your reply.

Should I post a 'please disregard this thread' message to this thread,
and start some new threads instead?

David.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Database design questions

2008-06-18 Thread Jorge Godoy
On Wednesday 18 June 2008 05:43:25 David wrote:
> Hi list.
>
> There are some database design-related issues I've pondered about for some
> time.
>
> But first:
>
> * Is this the correct list to ask these questions on?
>
> * Should I split this into separate threads instead of 1 thread for
> all my questions?

I would submit all of the questions in separate messages.  It is tiresome to 
read everything, you'll loose a lot of context after one or two messages 
levels or reply and people won't read the email because of its size.

Regards,
-- 
Jorge Godoy  <[EMAIL PROTECTED]>




signature.asc
Description: This is a digitally signed message part.


[GENERAL] Database design questions

2008-06-18 Thread David
Hi list.

There are some database design-related issues I've pondered about for some time.

But first:

* Is this the correct list to ask these questions on?

* Should I split this into separate threads instead of 1 thread for
all my questions?

Assuming there isn't a problem, here are my questions:

==

Question 1: Storing app defaults.

If you have a table like this:

table1
 - id
 - field1
 - field2
 - field3

table2
 - id
 - table1_id
 - field1
 - field2
 - field3

table1 & table2 are setup as 1-to-many.

If I want to start providing user-customizable defaults to the
database (ie, we don't want apps to update database schema), is it ok
database design to add a table2 record, with a NULL table1_id field?

In other words, if table1 has no matching table2 record, then the app
will use the table2 record with a NULL table1_id field to get
defaults.

This looks messy however. Is there a better way to do it?

A few other ways I can think of:

1) Have an extra table1 record (with string fields containing
'DEFAULT'), against which the extra table2 record is linked.

2) Have a new table, just for defaults, like this:

table2_defaults
 - field1
 - field2
 - field3

Which is the cleanest way? Is there another method I should use instead?

==

Question 2: Backwards-compatible field addition

If you have an existing table, and apps which use it, then how do you
add new fields to the table (for new apps), but which might affect
existing apps negatively?

eg: I start with a table like this:

table1
 - id
 - field1
 - field2
 - field3

Later, I want to add a use case, where there is new behaviour, if a
new field is set in the table, like this:

table1
 - id
 - field1
 - field2
 - field3
 - field4 - NEW - if unset, do old behaviour. if set, do something else

The problem is, that existing apps (besides your new app) won't know
about field4, so they will keep using the old behaviour for new
records (where field4 is set), which you don't want.

The most obvious thing to do is to update all apps using table1, so
they also check the value of field4.

Is there another, more backwards-compatible way to add field4 for the
new behaviour, without having to update all the apps?

A few things I can think of:

1) table1 becomes a view of an updated table, with a 'WHERE field4 IS
NULL' clause.

Problem with this is that some RDBMS (Postgresql specifically) don't
let you run update statements on views.

2) Apps use stored procedures for all database access.

Maybe ok for new apps, not so much for existing apps which use regular SQL.

3) All apps use the same library for accessing database

Then you update the library and all apps automagically know about the
extra field. Again, maybe ok for new apps, not so much for existing
apps.

4) Make a new table (copy of the old one), with the extra field.

Then your app checks both tables, or just the new one if applicable.

This can work, but you may end up with a lot of app-specific tables,
where the main difference between the tables is extra columns, and
which apps use the tables.

5) Have a 'db version' column in the table. Older apps only operate on
records at or before the version the programmer knew about at the
time.

This can work, but it seems like a very non-standard, hackish way of
designing database tables. Also it's a pain for all apps to have to
hardcode a db version number.

6) Find a clever way to use table inheritance

I haven't thought it through, but here are some docs I've read on the subject:

http://www.postgresql.org/docs/8.1/static/ddl-inherit.html

Any other ideas?

==

Question 3: Temporal databases

http://en.wikipedia.org/wiki/Temporal_database

I haven't used them before, but I like the idea of never
deleting/updating records so you have a complete history (a bit like
source code version control).

How well do temporal databases work? Do RDBMS (ie Postgresql) need
add-ons to make it effective, or can you just add extra temporal
columns to all your tables and add them to your app queries? Does this
increase app complexity and increase server load a lot?

Are there Python libraries which simplify this? (eg: add-ons for
Elixir or SQLAlchemy).

Or should apps all implement their own 'temporal data access' module,
which transparently uses the current date & time until queried for
historical data?

==

Question 4: Data synchronization

2 cases I'm interested in:

1) Migrating data from one database to another

2) Distributing data over many databases, and later merging

In what ways can you design tables to easier facilitate the above cases?

I am aware of multi-master replication software, as described here:

http://en.wikipedia.org/wiki/Multi-master_replication

For this question, I'm more interested in schema design, so that a
home-brewed database synchronization can perform synchronization.

I have some experience with this. One of my previous projects was to
reliably migrate data from one database to another, where the 2
databases h

Re: [GENERAL] database design questions

2006-04-05 Thread hubert depesz lubaczewski
On 4/4/06, Ottavio Campana <[EMAIL PROTECTED]> wrote:
hubert depesz lubaczewski wrote:> 2) do you think it's possible in  a plpgsql procedure select the name of> a table into a variable and use that variable in the query?> possible, but not really good way. read about 'execute' in plpgsql.
why isn't it good?I mean, from my point of view is like a function accepting a pointer. Inmany languages it is used.when coding in plpgsql you have to use dynamic queries to use this kind of thing. this - by itself - is not bad. but it has certain performance penalty when compared to standard queries.
i think reading plpgsql's manual about "execute" is the best one can do about it :)best regardsdepesz


Re: [GENERAL] database design questions

2006-04-04 Thread Ottavio Campana
hubert depesz lubaczewski wrote:
> 2) do you think it's possible in  a plpgsql procedure select the name of
> a table into a variable and use that variable in the query?
> possible, but not really good way. read about 'execute' in plpgsql.

why isn't it good?

I mean, from my point of view is like a function accepting a pointer. In
many languages it is used.

-- 
Non c'è più forza nella normalità, c'è solo monotonia.



signature.asc
Description: OpenPGP digital signature


Re: [GENERAL] database design questions

2006-04-04 Thread hubert depesz lubaczewski
On 4/3/06, Ottavio Campana <[EMAIL PROTECTED]> wrote:
1) The database I'm going to develop is  a big list with a catalog ofitems and I  want to  store subsets  of this  list representing  theavailable items in several places.My idea  is to create the  big table with  all the elements and  then to
create another  table, where each  row holds a pair  (id_item, id_place)and  thanks to  this create  several views,  joining the  two tablesand selecting the rows with a give id_place.Do you think it's too heavy? Is there a simpler way to do it?
sorry but i dont understand your description. could you make a small example of this layout? 
2) do you think it's possible in  a plpgsql procedure select the name ofa table into a variable and use that variable in the query?possible, but not really good way. read about 'execute' in plpgsql.
 3) faq 4.11.1 sayshow can I do it with a INT8 instead of a INT4?
use bigserial instead of serial depesz


Re: [GENERAL] database design questions

2006-04-04 Thread Ottavio Campana
Alban Hertroys wrote:
> Ottavio Campana wrote:
> 
>>>   CREATE TABLE person (
>>>   id   SERIAL,
>>>   name TEXT
>>>   );
> 
> 
>> how can I do it with a INT8 instead of a INT4?
> 
> 
> Do you really expect that sequence to reach over 2 billion? Otherwise
> I'd stick with the SERIAL, nothing wrong with that unless you're selling
> electrons seperately or something like that (hmm... how much are they? I
> sure could use a few extra).

I agree wih you, but I think that in the feature the could be more than
2 billions. I don't want to alter in the future the database

-- 
Non c'è più forza nella normalità, c'è solo monotonia.



signature.asc
Description: OpenPGP digital signature


Re: [GENERAL] database design questions

2006-04-03 Thread Don Y

Alban Hertroys wrote:

Ottavio Campana wrote:

  CREATE TABLE person (
  id   SERIAL,
  name TEXT
  );



how can I do it with a INT8 instead of a INT4?


Do you really expect that sequence to reach over 2 billion? Otherwise 
I'd stick with the SERIAL, nothing wrong with that unless you're selling 


Depends on what the dynamics of his design are.  I.e. if he frequently
creates and deletes "people", then he can consume a lot of "id-space"
even though there aren't many real people in the database itself.

Personally, I doubt this is the case.  But, can see other applications
where this could be true.  Since you can't reuse old id's, it's easier
to just use a bigger size datum than having to worry how the database
will react when/if you run out of them!

electrons seperately or something like that (hmm... how much are they? I 
sure could use a few extra).


Three for a quark...  keep the charge!  [sic] :>

--don


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] database design questions

2006-04-03 Thread Alex Turner
create table person (id serial8,name text);AlexOn 4/3/06, Alban Hertroys <[EMAIL PROTECTED]
> wrote:Ottavio Campana wrote:>>   CREATE TABLE person (>>   id   SERIAL,
>>   name TEXT>>   );> how can I do it with a INT8 instead of a INT4?Do you really expect that sequence to reach over 2 billion? OtherwiseI'd stick with the SERIAL, nothing wrong with that unless you're selling
electrons seperately or something like that (hmm... how much are they? Isure could use a few extra).--Alban Hertroys[EMAIL PROTECTED]magproductions 
b.v.T: ++31(0)534346874F: ++31(0)534346876M:I: www.magproductions.nlA: Postbus 4167500 AK Enschede// Integrate Your World //---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate   subscribe-nomail command to [EMAIL PROTECTED] so that your   message can get through to the mailing list cleanly



Re: [GENERAL] database design questions

2006-04-03 Thread Alban Hertroys

Ottavio Campana wrote:

  CREATE TABLE person (
  id   SERIAL,
  name TEXT
  );



how can I do it with a INT8 instead of a INT4?


Do you really expect that sequence to reach over 2 billion? Otherwise 
I'd stick with the SERIAL, nothing wrong with that unless you're selling 
electrons seperately or something like that (hmm... how much are they? I 
sure could use a few extra).


--
Alban Hertroys
[EMAIL PROTECTED]

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] database design questions

2006-04-03 Thread Keary Suska
on 4/3/06 7:38 AM, [EMAIL PROTECTED] purportedly said:

> 1) The database I'm going to develop is  a big list with a catalog of
> items and I  want to  store subsets  of this  list representing  the
> available items in several places.
> 
> My idea  is to create the  big table with  all the elements and  then to
> create another  table, where each  row holds a pair  (id_item, id_place)
> and  thanks to  this create  several views,  joining the  two tables
> and selecting the rows with a give id_place.
> 
> Do you think it's too heavy? Is there a simpler way to do it?

On the surface, perhaps. Depending on your implementation details, you may
be adding unnecessary overhead. No one can really say since we don't know
what you are trying to accomplish.

> 2) do you think it's possible in  a plpgsql procedure select the name of
> a table into a variable and use that variable in the query?
> 
> I mean, can I do something like
> 
> SELECT INTO table_name get_table_name();
> SELECT * FROM table_name;

Yes, kind of. I.e., you can probably do what you want but not with the
syntax you are showing. See SELECT INTO and EXECUTE in chapter 36 of the
online docs. 

> 3) faq 4.11.1 says
> 
>>CREATE TABLE person (
>>id   SERIAL,
>>name TEXT
>>);
>> 
>> is automatically translated into this:
>> 
>>CREATE SEQUENCE person_id_seq;
>>CREATE TABLE person (
>>id   INT4 NOT NULL DEFAULT nextval('person_id_seq'),
>>name TEXT
>>);
> 
> how can I do it with a INT8 instead of a INT4?


Use BIGSERIAL instead.

Best,

Keary Suska
Esoteritech, Inc.
"Demystifying technology for your home or business"


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] database design questions

2006-04-03 Thread Richard Broersma Jr


--- Tomi NA <[EMAIL PROTECTED]> wrote:

> On 4/3/06, Ottavio Campana <[EMAIL PROTECTED]> wrote:
> 
> 
> > 3) faq 4.11.1 says
> >
> > >CREATE TABLE person (
> > >id   SERIAL,
> > >name TEXT
> > >);
> > >
> > >is automatically translated into this:
> > >
> > >CREATE SEQUENCE person_id_seq;
> > >CREATE TABLE person (
> > >id   INT4 NOT NULL DEFAULT nextval('person_id_seq'),
> > >name TEXT
> > >);
> >
> > how can I do it with a INT8 instead of a INT4?
> >
> > Thank you
> >
> 
> Is there a reason not to write explicitly?
> 
> CREATE SEQUENCE person_id_seq;
> CREATE TABLE person (
> id   INT8 NOT NULL DEFAULT nextval('person_id_seq'),
> name TEXT
> );

you could also do:

CREATE TABLE person (
id   BIGSERIAL,
name TEXT
);


Regards,

Richard


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] database design questions

2006-04-03 Thread Tomi NA
On 4/3/06, Ottavio Campana <[EMAIL PROTECTED]> wrote: 
3) faq 4.11.1 says>CREATE TABLE person (>id   SERIAL,>name TEXT>);>>is automatically translated into this:>>CREATE SEQUENCE person_id_seq;
>CREATE TABLE person (>id   INT4 NOT NULL DEFAULT nextval('person_id_seq'),>name TEXT>);how can I do it with a INT8 instead of a INT4?Thank you
Is there a reason not to write explicitly?CREATE SEQUENCE person_id_seq;CREATE TABLE person (id   INT8 NOT NULL DEFAULT nextval('person_id_seq'),name TEXT);Tomislav


[GENERAL] database design questions

2006-04-03 Thread Ottavio Campana
Hello,

I'm designing a  database and I'm having  some problems, so I  ask you a
suggestion.

1) The database I'm going to develop is  a big list with a catalog of
items and I  want to  store subsets  of this  list representing  the
available items in several places.

My idea  is to create the  big table with  all the elements and  then to
create another  table, where each  row holds a pair  (id_item, id_place)
and  thanks to  this create  several views,  joining the  two tables
and selecting the rows with a give id_place.

Do you think it's too heavy? Is there a simpler way to do it?

2) do you think it's possible in  a plpgsql procedure select the name of
a table into a variable and use that variable in the query?

I mean, can I do something like

SELECT INTO table_name get_table_name();
SELECT * FROM table_name;

?

3) faq 4.11.1 says

>CREATE TABLE person (
>id   SERIAL,
>name TEXT
>);
>
>is automatically translated into this:
>
>CREATE SEQUENCE person_id_seq;
>CREATE TABLE person (
>id   INT4 NOT NULL DEFAULT nextval('person_id_seq'),
>name TEXT
>);

how can I do it with a INT8 instead of a INT4?

Thank you

-- 
Non c'è più forza nella normalità, c'è solo monotonia.



signature.asc
Description: OpenPGP digital signature