Re: [GENERAL] suse 9.2

2005-06-14 Thread Ioannis Theoharis


SUSE's dvd includes PostgreSQL 7.4

and i think it's also avalable via ftp from various mirrors.
Search at Google, or at Novell site.

But if you search a newer version, i can't help you.


On Mon, 13 Jun 2005, martin wrote:

 hi all
 i try to find a new postgre rpm package for suse 9.2 but i didnt find some.
 plz help a newbee
 thx
 martin

 ---(end of broadcast)---
 TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [GENERAL] About Access paths

2005-03-07 Thread Ioannis Theoharis



 It's a fairly corner case feature, only for the case where you're
 looking for the existance of an index key but don't want any other
 data. I don't personally have any queries that could use such a
 construct, though maybe I'm missing the point.

I understand the difficulty of implementation of such a new feature and
the advantages and disadvantages coming with it.

But let me give you one of the most important usage of index only scan:

if one has two relations A(a1, a2, ..., an) and B(b1, b2, ..., bn) and let
b1 be a foreign key of one of the ai of A, (e.g of aj).

And let q be a query like:

select  A.ai
fromA, B
where   aj = b1

which is one of most common join queries,

then there if one has an index on b1 on relation B, there is no reason to
scann the whole table B, but only the b1 attribute. A good optimizer would
select index only access path, because it's the most selective acess path.

Eg. index nested loop algorithm for join, taking relation A as outer, and
B as inner, has to scan the whole relation B (Vs the whole index on B) for
each tuple(or block) of outer rel A.

Also, there are other less frequent examples in which index only acess
path is the optimal one.



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[GENERAL] About Access paths

2005-03-06 Thread Ioannis Theoharis


Hi,

in case that one has a table R(att0 int4, att1 int4), an index on att1
and a query selecting only att1 whith a range condition over att1, then
the optimal access path for a rdbms would be INDEX ONLY SCAN, which means
scan only the index, and not the relation (all the needed information
exists in index, because the select clause has only the att1 attribute).
This access path is selected in this case by Oracle.

As far as i can see, postgresql select a simple INDEX SCAN using index
on table.

Is there any tuning that one has to do, in order to make postgresql's
optimizer to select INDEX ONLY SCAN ?
Or postgresql doesn't support this feature at all?

thanks.



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [GENERAL] About Access paths

2005-03-06 Thread Ioannis Theoharis


 
  As far as i can see, postgresql select a simple INDEX SCAN using index
  on table.

 That's because the index does not hold all the information. So INDEX
 ONLY is impossible.


Thank you for your quick answer, but there is something a don't
understand:

the index holds all attributes' values (one index entry/one table entry).
Right?
given a query with only this attribute on SELECT clause, which is the
missing information, that do not exist in index?


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [GENERAL] About Access paths

2005-03-06 Thread Ioannis Theoharis



 Visibility information. An index contains references to all rows in all
 transactions. You need to go back to the table to work out if the row
 is visible in your transaction.

 Hope this helpsm

Yes at all.
You have a lot of work for future postgresql' versions :-))


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] Index size

2005-03-03 Thread Ioannis Theoharis


All you said are wright. But it 's not so difficult for postgresql to hold
on a bit attribute attached to each table the information, whether there
is done an insertion/deletion/update to a clustered table or not.

And i guess, postgresql would already implement this simply alternative.


 Easy, if you CLUSTER a table, it's CLUSTERed then. But it doesn't stay
 that way. As soon as you insert a new row, or update an old one, it
 gets added to the end (the only place with space) and now it's not
 clustered anymore. It's almost clustered and from a caching point of
 view it's fine. But postgresql can't assume at any point a table will
 stay clustered, an insert could happen in the middle of your
 processing.

 Logically you can't magically add space in the middle of a file, you
 have to move everything else up. If you know an efficient way to keep a
 table clustered while handling arbitrary inserts and updates, I'd be
 curious to know...

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] Index size

2005-03-02 Thread Ioannis Theoharis


On Tue, 1 Mar 2005, Tom Lane wrote:

 Tatsuo Ishii [EMAIL PROTECTED] writes:
  So it seems Ioannis' number was not taken immediately after a CREATE
  INDEX operation?

 I would guess not, but it's up to him to say.  If it is a number derived
 after some period of normal operation, then his result agrees with the
 theory that says 70% is the steady-state figure ...

yes, my number was taken after a large amount of inserts.

Your comments about the block usage in case of b-tree indexes are
absolutely interesting.
Where can i find a documentation with technical analysis for all (if
possible) of components of postgres?

All documentations that i have found are very general and refer to simple
users.



   regards, tom lane


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] Index size

2005-03-02 Thread Ioannis Theoharis


On Wed, 2 Mar 2005, Tatsuo Ishii wrote:

  An other question:
 
  Is there any way to prevent duplicates on btree index attribute,
  PERMITTING them on table?

 I can't think of any usefull usage for such an index. Can you explain
 why you need it?

I have a relation like this: (att0 varchar(1000), att1 int4)

i create a b-tree index on att1 ()
i cluster my raltion according to index

now i have a query
select  *
formtc20
where   att1=9 and att1=0 ;

As far as i can see from explain analyze an index scan is used:
Index Scan using inst_id_idx on tc20
  Index Cond: ((att1 = 9) AND (att1 = 0))

If for each entry in table, an entry in index is beeing held, then the
index size is populated too fast.

I guess, that postgres uses index to find the first entry satisfying the
index conition, after find the last one and then do a sequential scan on
the appropriate fraction of the table (to take advantage of physical
clustering).

In my case, discrete values on att1 are orders of magnitude less than
number of table raws.

Thus, the big index size is useless for me. I want to avoid the overhead
of scanning such a big index, just permitting ONLY the discrete values to
entry in index. In such a way the whole scenario i presented before for
how i guess, that postgres evaluates my query, is still in use.

I think there must be a way to change the way of index_usage to alter it
to what i 'm looking for.


 --
 Tatsuo Ishii


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[GENERAL] Index size

2005-03-01 Thread Ioannis Theoharis


Hi,

I have created a btree index on a 'int4' attribute of a table.

After i have inserted 1,000,000 raws in my table, i can see that my index
size is 2745 Blocks (8KB each) from pg_class. That means about 21,960 KB
size.

I try to understand hows is this number generated, because thought that
for each new entry in table, there is a new entry in index and that each
entry of the index is:

4 Bytes for the int4 attribute
and
40 Bytes for oid

So 44 * 1,000,000 ~ 42,969 KB

Can anybody inform me where I do the mistake?


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] Index size

2005-03-01 Thread Ioannis Theoharis


Thanks a lot.

An other question:

Is there any way to prevent duplicates on btree index attribute,
PERMITTING them on table?




On Tue, 1 Mar 2005, Tatsuo Ishii wrote:

  I have created a btree index on a 'int4' attribute of a table.
 
  After i have inserted 1,000,000 raws in my table, i can see that my index
  size is 2745 Blocks (8KB each) from pg_class. That means about 21,960 KB
  size.
 
  I try to understand hows is this number generated, because thought that
  for each new entry in table, there is a new entry in index and that each
  entry of the index is:
 
  4 Bytes for the int4 attribute
  and
  40 Bytes for oid
 
  So 44 * 1,000,000 ~ 42,969 KB
 
  Can anybody inform me where I do the mistake?

 There's no oid in index tuples. There is an 8-byte long header for
 each index tuple. Since you are inserting 4-byte long user data, you
 index tuples are 12-byte each. Each index tuple needs a pointer in a
 block, which is called item pointer and that is 4-byte long. Each
 block can hold up to floor((8192-24(page header)-16(special
 data))/(12+4)) = 509 tuples. ceil(1,000,000/509) = 1965 is the blocks
 you need for your index. In addition to this, you need a meta page
 and a root page. So it becomes 1965+1+1 = 1967. Also you need
 internal pages, whose numer is hard to guess since it depends on the
 actual index tree structure(for example, tree height). From my limited
 experience, for 1,000,000 tuples, you will need at least 7 internal
 pages. Now the number becomes 1967+7 = 1974. Still it's different from
 2745. If you don't have deleted tuples, the difference probably comes
 from the fact that a btree index can never be 100% occupied. IMO
 1974/2745 = 0.71 seems not so bad.
 --
 Tatsuo Ishii

 ---(end of broadcast)---
 TIP 7: don't forget to increase your free space map settings


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[GENERAL] Clustering in the presence of hierarchies (fwd)

2004-12-11 Thread Ioannis Theoharis


As far as i can see my mail was not sent to list. I forward it:


-- Forwarded message --
Date: Fri, 10 Dec 2004 05:15:42 +0200 (EET)
From: Ioannis Theoharis [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Clustering in the presence of hierarchies




I'd like to ask you about the clustering strategy that postgres
implements.

Exactly: I have created a hierachy of tables using 'inherits'
relationship. I have populated tables with a huge amount of data and then
I cluster each table according to an attribute that exists on Root table
(hence, this attribute is inherited by all tables).

In general, clustering for each of those table means to reorder on disc
tuples, in order to be sequential stored.

The question is:
Does postgres uses the knowledge of the hierarchy structure
to reorder tuples of each table to be stored almost after
its direct paent-table tuples?



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[GENERAL] Clustering in the presence of hierarchies

2004-12-10 Thread Ioannis Theoharis



I'd like to ask you about the clustering strategy that postgres
implements.

Exactly: I have created a hierachy of tables using 'inherits'
relationship. I have populated tables with a huge amount of data and then
I cluster each table according to an attribute that exists on Root table
(hence, this attribute is inherited by all tables).

In general, clustering for each of those table means to reorder on disc
tuples, in order to be sequential stored.

The question is:
Does postgres uses the knowledge of the hierarchy structure
to reorder tuples of each table to be stored almost after
its direct paent-table tuples?



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] About inheritance

2004-08-28 Thread Ioannis Theoharis


Thanks.

Time is little but visible affected for big chierarhies.


Let me do an other question.

I have again a Root table and a hierarchie of tables, all created with the
inherits relationship like:

create table father(att0 int4);
create table child1() inherits(father);
create table child2() inherits(father);
create table child11() inherits(child1);
create table child12() inherits(child1);
create table child21() inherits(child2);
create table child22() inherits(child2);



First i insert 1000 tuples into father table, and then i delete them and i
insert them into child22

I expekt explain analyze to give the same response time at both cases. But
i found that time increases as where as the level, where data are located,
increases.

Can anybody explain me the reason?





On Sun, 22 Aug 2004, Tom Lane wrote:

 Ioannis Theoharis [EMAIL PROTECTED] writes:
  I expekt to find the same plans because in both cases there is a union to
  be done, but i see that in second case there is an additional call to a
  routine. I meen the 'Subquery Scan *SELECT* X'

 The subquery scan step is in there because in a UNION construct, there
 may be a need to do transformations on the data before it can be
 unioned.  For instance you are allowed to UNION an int4 and an int8
 column, in which case the int4 values have to be promoted to int8 after
 they come out of the subplan.

 In the particular case you are showing, the subquery scan steps aren't
 really doing anything, but AFAIR the planner does not bother to optimize
 them out.  I'd be pretty surprised if they chew up any meaningful amount
 of runtime.

   regards, tom lane

 ---(end of broadcast)---
 TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html