Re: [GENERAL] suse 9.2
SUSE's dvd includes PostgreSQL 7.4 and i think it's also avalable via ftp from various mirrors. Search at Google, or at Novell site. But if you search a newer version, i can't help you. On Mon, 13 Jun 2005, martin wrote: hi all i try to find a new postgre rpm package for suse 9.2 but i didnt find some. plz help a newbee thx martin ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] About Access paths
It's a fairly corner case feature, only for the case where you're looking for the existance of an index key but don't want any other data. I don't personally have any queries that could use such a construct, though maybe I'm missing the point. I understand the difficulty of implementation of such a new feature and the advantages and disadvantages coming with it. But let me give you one of the most important usage of index only scan: if one has two relations A(a1, a2, ..., an) and B(b1, b2, ..., bn) and let b1 be a foreign key of one of the ai of A, (e.g of aj). And let q be a query like: select A.ai fromA, B where aj = b1 which is one of most common join queries, then there if one has an index on b1 on relation B, there is no reason to scann the whole table B, but only the b1 attribute. A good optimizer would select index only access path, because it's the most selective acess path. Eg. index nested loop algorithm for join, taking relation A as outer, and B as inner, has to scan the whole relation B (Vs the whole index on B) for each tuple(or block) of outer rel A. Also, there are other less frequent examples in which index only acess path is the optimal one. ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[GENERAL] About Access paths
Hi, in case that one has a table R(att0 int4, att1 int4), an index on att1 and a query selecting only att1 whith a range condition over att1, then the optimal access path for a rdbms would be INDEX ONLY SCAN, which means scan only the index, and not the relation (all the needed information exists in index, because the select clause has only the att1 attribute). This access path is selected in this case by Oracle. As far as i can see, postgresql select a simple INDEX SCAN using index on table. Is there any tuning that one has to do, in order to make postgresql's optimizer to select INDEX ONLY SCAN ? Or postgresql doesn't support this feature at all? thanks. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] About Access paths
As far as i can see, postgresql select a simple INDEX SCAN using index on table. That's because the index does not hold all the information. So INDEX ONLY is impossible. Thank you for your quick answer, but there is something a don't understand: the index holds all attributes' values (one index entry/one table entry). Right? given a query with only this attribute on SELECT clause, which is the missing information, that do not exist in index? ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] About Access paths
Visibility information. An index contains references to all rows in all transactions. You need to go back to the table to work out if the row is visible in your transaction. Hope this helpsm Yes at all. You have a lot of work for future postgresql' versions :-)) ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] Index size
All you said are wright. But it 's not so difficult for postgresql to hold on a bit attribute attached to each table the information, whether there is done an insertion/deletion/update to a clustered table or not. And i guess, postgresql would already implement this simply alternative. Easy, if you CLUSTER a table, it's CLUSTERed then. But it doesn't stay that way. As soon as you insert a new row, or update an old one, it gets added to the end (the only place with space) and now it's not clustered anymore. It's almost clustered and from a caching point of view it's fine. But postgresql can't assume at any point a table will stay clustered, an insert could happen in the middle of your processing. Logically you can't magically add space in the middle of a file, you have to move everything else up. If you know an efficient way to keep a table clustered while handling arbitrary inserts and updates, I'd be curious to know... ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] Index size
On Tue, 1 Mar 2005, Tom Lane wrote: Tatsuo Ishii [EMAIL PROTECTED] writes: So it seems Ioannis' number was not taken immediately after a CREATE INDEX operation? I would guess not, but it's up to him to say. If it is a number derived after some period of normal operation, then his result agrees with the theory that says 70% is the steady-state figure ... yes, my number was taken after a large amount of inserts. Your comments about the block usage in case of b-tree indexes are absolutely interesting. Where can i find a documentation with technical analysis for all (if possible) of components of postgres? All documentations that i have found are very general and refer to simple users. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] Index size
On Wed, 2 Mar 2005, Tatsuo Ishii wrote: An other question: Is there any way to prevent duplicates on btree index attribute, PERMITTING them on table? I can't think of any usefull usage for such an index. Can you explain why you need it? I have a relation like this: (att0 varchar(1000), att1 int4) i create a b-tree index on att1 () i cluster my raltion according to index now i have a query select * formtc20 where att1=9 and att1=0 ; As far as i can see from explain analyze an index scan is used: Index Scan using inst_id_idx on tc20 Index Cond: ((att1 = 9) AND (att1 = 0)) If for each entry in table, an entry in index is beeing held, then the index size is populated too fast. I guess, that postgres uses index to find the first entry satisfying the index conition, after find the last one and then do a sequential scan on the appropriate fraction of the table (to take advantage of physical clustering). In my case, discrete values on att1 are orders of magnitude less than number of table raws. Thus, the big index size is useless for me. I want to avoid the overhead of scanning such a big index, just permitting ONLY the discrete values to entry in index. In such a way the whole scenario i presented before for how i guess, that postgres evaluates my query, is still in use. I think there must be a way to change the way of index_usage to alter it to what i 'm looking for. -- Tatsuo Ishii ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[GENERAL] Index size
Hi, I have created a btree index on a 'int4' attribute of a table. After i have inserted 1,000,000 raws in my table, i can see that my index size is 2745 Blocks (8KB each) from pg_class. That means about 21,960 KB size. I try to understand hows is this number generated, because thought that for each new entry in table, there is a new entry in index and that each entry of the index is: 4 Bytes for the int4 attribute and 40 Bytes for oid So 44 * 1,000,000 ~ 42,969 KB Can anybody inform me where I do the mistake? ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Index size
Thanks a lot. An other question: Is there any way to prevent duplicates on btree index attribute, PERMITTING them on table? On Tue, 1 Mar 2005, Tatsuo Ishii wrote: I have created a btree index on a 'int4' attribute of a table. After i have inserted 1,000,000 raws in my table, i can see that my index size is 2745 Blocks (8KB each) from pg_class. That means about 21,960 KB size. I try to understand hows is this number generated, because thought that for each new entry in table, there is a new entry in index and that each entry of the index is: 4 Bytes for the int4 attribute and 40 Bytes for oid So 44 * 1,000,000 ~ 42,969 KB Can anybody inform me where I do the mistake? There's no oid in index tuples. There is an 8-byte long header for each index tuple. Since you are inserting 4-byte long user data, you index tuples are 12-byte each. Each index tuple needs a pointer in a block, which is called item pointer and that is 4-byte long. Each block can hold up to floor((8192-24(page header)-16(special data))/(12+4)) = 509 tuples. ceil(1,000,000/509) = 1965 is the blocks you need for your index. In addition to this, you need a meta page and a root page. So it becomes 1965+1+1 = 1967. Also you need internal pages, whose numer is hard to guess since it depends on the actual index tree structure(for example, tree height). From my limited experience, for 1,000,000 tuples, you will need at least 7 internal pages. Now the number becomes 1967+7 = 1974. Still it's different from 2745. If you don't have deleted tuples, the difference probably comes from the fact that a btree index can never be 100% occupied. IMO 1974/2745 = 0.71 seems not so bad. -- Tatsuo Ishii ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[GENERAL] Clustering in the presence of hierarchies (fwd)
As far as i can see my mail was not sent to list. I forward it: -- Forwarded message -- Date: Fri, 10 Dec 2004 05:15:42 +0200 (EET) From: Ioannis Theoharis [EMAIL PROTECTED] To: [EMAIL PROTECTED] Subject: Clustering in the presence of hierarchies I'd like to ask you about the clustering strategy that postgres implements. Exactly: I have created a hierachy of tables using 'inherits' relationship. I have populated tables with a huge amount of data and then I cluster each table according to an attribute that exists on Root table (hence, this attribute is inherited by all tables). In general, clustering for each of those table means to reorder on disc tuples, in order to be sequential stored. The question is: Does postgres uses the knowledge of the hierarchy structure to reorder tuples of each table to be stored almost after its direct paent-table tuples? ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[GENERAL] Clustering in the presence of hierarchies
I'd like to ask you about the clustering strategy that postgres implements. Exactly: I have created a hierachy of tables using 'inherits' relationship. I have populated tables with a huge amount of data and then I cluster each table according to an attribute that exists on Root table (hence, this attribute is inherited by all tables). In general, clustering for each of those table means to reorder on disc tuples, in order to be sequential stored. The question is: Does postgres uses the knowledge of the hierarchy structure to reorder tuples of each table to be stored almost after its direct paent-table tuples? ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] About inheritance
Thanks. Time is little but visible affected for big chierarhies. Let me do an other question. I have again a Root table and a hierarchie of tables, all created with the inherits relationship like: create table father(att0 int4); create table child1() inherits(father); create table child2() inherits(father); create table child11() inherits(child1); create table child12() inherits(child1); create table child21() inherits(child2); create table child22() inherits(child2); First i insert 1000 tuples into father table, and then i delete them and i insert them into child22 I expekt explain analyze to give the same response time at both cases. But i found that time increases as where as the level, where data are located, increases. Can anybody explain me the reason? On Sun, 22 Aug 2004, Tom Lane wrote: Ioannis Theoharis [EMAIL PROTECTED] writes: I expekt to find the same plans because in both cases there is a union to be done, but i see that in second case there is an additional call to a routine. I meen the 'Subquery Scan *SELECT* X' The subquery scan step is in there because in a UNION construct, there may be a need to do transformations on the data before it can be unioned. For instance you are allowed to UNION an int4 and an int8 column, in which case the int4 values have to be promoted to int8 after they come out of the subplan. In the particular case you are showing, the subquery scan steps aren't really doing anything, but AFAIR the planner does not bother to optimize them out. I'd be pretty surprised if they chew up any meaningful amount of runtime. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html