Re: [SQL] Multiple DB join

2006-08-15 Thread Andrew Sullivan
On Mon, Aug 14, 2006 at 05:26:10PM -0400, Sumeet Ambre wrote:
> Hi All,
> 
> I have a database which consists of 20 million records and I've split up 
> the db into 6-7 dbs. 

You can do this (as someone suggested with dblink), but I'm wondering
why the split?  20 million records isn't very big.
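(For the archives, a minimal sketch of what the dblink approach generally
looks like; the connection string, table and column names below are invented,
and the dblink contrib module has to be installed first:)

-- Join a local table against one living in another database via dblink.
SELECT l.id, l.name, r.val
FROM local_table l
JOIN dblink('dbname=otherdb host=localhost',
            'SELECT id, val FROM remote_table')
     AS r(id integer, val text)
  ON r.id = l.id;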

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
Unfortunately reformatting the Internet is a little more painful 
than reformatting your hard drive when it gets out of whack.
--Scott Morris

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [SQL] Multiple DB join

2006-08-15 Thread Michael Fuhr
On Tue, Aug 15, 2006 at 06:36:24AM -0400, Andrew Sullivan wrote:
> On Mon, Aug 14, 2006 at 05:26:10PM -0400, Sumeet Ambre wrote:
> > I have a database which consists of 20 million records and I've split up 
> > the db into 6-7 dbs. 
> 
> You can do this (as someone suggested with dblink), but I'm wondering
> why the split?  20 million records isn't very big.

And why multiple databases instead of multiple schemas within the
same database?  Or even all data in the same schema?  Is there a
reason for the segregation?
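(By way of contrast, a sketch of the schema route: the datasets live in one
database as separate schemas, and an ordinary join crosses them with no
dblink in the picture.  The schema, table, and column names are made up:)

-- Each dataset is a schema in the same database; a plain join works.
SELECT a.id, a.payload, b.extra
FROM dataset_a.records a
JOIN dataset_b.records b ON b.id = a.id;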

-- 
Michael Fuhr

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [SQL] Multiple DB join

2006-08-15 Thread Jorge Godoy
Michael Fuhr <[EMAIL PROTECTED]> writes:

> And why multiple databases instead of multiple schemas within the
> same database?  Or even all data in the same schema?  Is there a
> reason for the segregation?

I can think of spreading out processing requirements as one reason, and
distributing load as another.  Disk space can be solved with new disks and
tablespaces, but we can't yet distribute load across several servers
without partitioning.

-- 
Jorge Godoy  <[EMAIL PROTECTED]>

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [SQL] Multiple DB join

2006-08-15 Thread Andrew Sullivan
On Tue, Aug 15, 2006 at 09:01:03AM -0300, Jorge Godoy wrote:
> I can think of spreading out processing requirements as one reason, and
> distributing load as another.  Disk space can be solved with new disks and
> tablespaces, but we can't yet distribute load across several servers
> without partitioning.

The cost of inter-node communication isn't nothing, though.  It
strikes me as at least possible that the overhead of dblink is going
to be larger than whatever gains one makes from adding a new server. 
For only 20M rows, I find it pretty hard to believe the gain is going
to be demonstrable.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
Everything that happens in the world happens at some place.
--Jane Jacobs 

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [SQL] Multiple DB join

2006-08-15 Thread Jorge Godoy
Andrew Sullivan <[EMAIL PROTECTED]> writes:

> On Tue, Aug 15, 2006 at 09:01:03AM -0300, Jorge Godoy wrote:
>> I can think of spreading out processing requirements as one reason, and
>> distributing load as another.  Disk space can be solved with new disks and
>> tablespaces, but we can't yet distribute load across several servers
>> without partitioning.
>
> The cost of inter-node communication isn't nothing, though.  It
> strikes me as at least possible that the overhead of dblink is going
> to be larger than whatever gains one makes from adding a new server. 
> For only 20M rows, I find it pretty hard to believe the gain is going
> to be demonstrable.

I totally agree with you.  But when your database grows, it might start
getting interesting.

I'd like to see some sort of data partitioning in PostgreSQL.

-- 
Jorge Godoy  <[EMAIL PROTECTED]>


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [SQL] Query response time

2006-08-15 Thread Scott Marlowe
On Fri, 2006-08-11 at 08:58, Jonathan Sinclair wrote:
> Hi all. Thanks for your help so far. However, after configuring my system
> I am still getting major lag times with a couple of queries. The first,
> which suffers from the '538/539' bug (see old email included below), is
> caused by running the following statement:
> 
> SELECT t1.col1, SUM(test) test_col, SUM(col2)
> FROM table1 t1, table2 t2
> WHERE t1.date BETWEEN '01/10/2004' AND '01/10/2004'
> AND t1.col3 = t2.col1
> AND t1.col5 = t2.col2
> AND t2.col3 BETWEEN 50.00 AND 200.00
> GROUP BY t1.col1
> HAVING SUM(test) BETWEEN 95.00 AND 100.00
> ORDER BY 2 DESC, t1.col1;
> 
> I would like to know if anyone has any ideas why this problem arises.
> (It's not due to the date format being ambiguous; I have set it to
> European standard)

Have you looked at the plan for this query?

explain select...

If I remember correctly, the planner has a bug where if you do a between
with the same date, it does a seq scan instead of using an index.  But
my memory on that's a bit rough.

How does it work if you change the where clause to be "t1.date =
'01/10/2004'"?
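(A sketch combining the two suggestions above, reusing the placeholder table
and column names from the posted query: look at the plan, and try a plain
equality with an unambiguous ISO date literal instead of the one-day BETWEEN.)

EXPLAIN ANALYZE
SELECT t1.col1, SUM(test) AS test_col, SUM(col2)
FROM table1 t1, table2 t2
WHERE t1.date = '2004-10-01'      -- ISO format avoids any datestyle ambiguity
  AND t1.col3 = t2.col1
  AND t1.col5 = t2.col2
  AND t2.col3 BETWEEN 50.00 AND 200.00
GROUP BY t1.col1
HAVING SUM(test) BETWEEN 95.00 AND 100.00
ORDER BY 2 DESC, t1.col1;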

P.S. This is more of a performance question than a SQL question.  Not that
that's a big deal or anything; the SQL list is kinda slow and can likely use
the traffic. :)

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [SQL] Multiple DB join

2006-08-15 Thread Sumeet Ambre

Andrew Sullivan wrote:
> On Mon, Aug 14, 2006 at 05:26:10PM -0400, Sumeet Ambre wrote:
> > Hi All,
> > 
> > I have a database which consists of 20 million records and I've split up 
> > the db into 6-7 dbs. 
> 
> You can do this (as someone suggested with dblink), but I'm wondering
> why the split?  20 million records isn't very big.
> 
> A
The design of the database is because our organization wants to split up 
different datasets into different entities, and there might be a 
possibility that they'll run different instances of postgres for each 
dataset. The problem is that one of the databases consists of 14 million 
records, and when I query the base database, which consists of 20 million 
records, the query runs damn slow...below is a sample schema for 
my base table and the query I try to run on it...it takes me more than 
2-3 minutes to run a query...Is there any way I could speed this up?


Sample schema for my base table
-------------------------------

doc_id  |  title  |  authors  |  pub_year  |  abs  |  db_name

In the above schema, the field db_name is the name of the other database 
which contains the whole record.
I'm trying to run a query which searches on title, authors name, pub_year 
or abstract simultaneously...


e.g.  SELECT * FROM base WHERE ((title ILIKE '%something%') AND (authors 
~*   '.*something.*')  AND (db_name='something'));


Any suggestions to speed up this query?

Thanks,
Sumeet.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [SQL] Breaking up a query

2006-08-15 Thread Scott Marlowe
On Thu, 2006-08-10 at 17:53, Saad Anis wrote:
> Hi Guys,
> 
> A fellow at work has written the SQL below to retrieve some data from
> multiple tables. Obviously it is inefficient and unnecessarily complex, and
> I am trying to break it into 2 or more queries so as to enhance performance.

Nope, that's not true in PostgreSQL.  It is for some databases with
relatively simplistic query planners, but not PostgreSQL.

I'd check that you have indexes where you need them (generally when you
see a seq scan on a small set) including, especially, the foreign key
columns (i.e. the ones pointing to another table's primary keys).

On to your explain analyze, I noticed a lot of lines like this:

Index Scan using positions_pkey on positions p  (cost=0.00..32.00
rows=1000 width=28) (actual time=0.019..90.920 rows=13958 loop

Seeing as how the statistical default for a new, unanalyzed table is
1000, and you've got 1000 peppered all through your explain analyze, I'd
guess you've not analyzed your database.  Which means you've likely not
read the admin docs, which means you've likely not vacuumed the
database.

Read the admin docs (they're not that thick, and there's lots of good
info in there) and apply things like vacuum and analyze, and get back to
us on how things are doing then.
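(A hedged sketch of those two steps; the index and column names are invented
rather than taken from Saad's actual schema:)

-- Refresh planner statistics (and reclaim dead rows) for the whole database:
VACUUM ANALYZE;

-- Index foreign key columns that show up in the joins, for example a column
-- in "positions" that points at another table's primary key (the column name
-- here is hypothetical):
CREATE INDEX positions_account_id_idx ON positions (account_id);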

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [SQL] Multiple DB join

2006-08-15 Thread Andrew Sullivan
On Tue, Aug 15, 2006 at 10:11:41AM -0400, Sumeet Ambre wrote:
> >  
> The Design of the database is because our organization wants to split up 
> different datasets into different entities, and there might be a 
> possibility that they'll run different instances of postgres for each 
> dataset. 

It's this "different instances" thing that I'm having a tough time
with.  Is this because they want to be able to query local things
when disconnected or something?  I can think of applications for
this, for sure, I'm just suggesting that you make sure you're not
building an optimisation that is (1) premature and (2) possibly a
pessimal operation.

> records, the query runs damn slow...below is the sample schema for 
> my base table and the query i try to run on it...it takes me more than 
> 2-3 minutes to run a query...Is there any way i could speed this up..

The problem is not your design, nor even the size of the data
exactly, but the query: 

> e.g.  SELECT * FROM base WHERE ((title ILIKE '%something%') AND (authors 
> ~*   '.*something.*')  AND (db_name='something'));

You have two initially-unbound search terms there: ILIKE '%' and ~*
'.*' are automatically seqscans, because you have nowhere in the
index to start.  If you really want to do this kind of unbound-string
query, you need to look into full text search.  The above approach is
never going to be fast.
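(For the archives, a minimal sketch of what the full text search route looks
like, written with the to_tsvector/to_tsquery syntax that is built into
PostgreSQL 8.3 and later; the tsearch2 contrib module current at the time of
this thread provides equivalent functions with a slightly different setup.
The column names follow Sumeet's sample schema.)

-- Index the searchable columns once:
CREATE INDEX base_title_fts   ON base USING gin (to_tsvector('english', title));
CREATE INDEX base_authors_fts ON base USING gin (to_tsvector('english', authors));

-- Then query with @@ instead of unanchored ILIKE / ~* patterns:
SELECT *
FROM base
WHERE to_tsvector('english', title)   @@ to_tsquery('english', 'something')
  AND to_tsvector('english', authors) @@ to_tsquery('english', 'something')
  AND db_name = 'something';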

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
This work was visionary and imaginative, and goes to show that visionary
and imaginative work need not end up well. 
--Dennis Ritchie

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [SQL] Multiple DB join

2006-08-15 Thread Andrew Sullivan
On Tue, Aug 15, 2006 at 10:25:00AM -0300, Jorge Godoy wrote:
> I'd like to see some sort of data partitioning in PostgreSQL.

Sure, I think everybody would.  I think it's among the more
interesting problems we have.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The plural of anecdote is not data.
--Roger Brinner

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [SQL] Multiple DB join

2006-08-15 Thread Sumeet
On 8/15/06, Andrew Sullivan <[EMAIL PROTECTED]> wrote:
> On Tue, Aug 15, 2006 at 10:11:41AM -0400, Sumeet Ambre wrote:
> > The Design of the database is because our organization wants to split up
> > different datasets into different entities, and there might be a
> > possibility that they'll run different instances of postgres for each
> > dataset.
>
> It's this "different instances" thing that I'm having a tough time
> with.  Is this because they want to be able to query local things
> when disconnected or something?  I can think of applications for
> this, for sure, I'm just suggesting that you make sure you're not
> building an optimisation that is (1) premature and (2) possibly a
> pessimal operation.

The reason for splitting up the dbs into different instances is that in
case one of the postgres instances on the server goes down for some
reason, it doesn't affect the other instances which are running on the
same server.  Even I don't know the reason for this kind of approach, but
I have to deal with it.

> > records, the query runs damn slow...below is the sample schema for
> > my base table and the query i try to run on it...it takes me more than
> > 2-3 minutes to run a query...Is there any way i could speed this up..
>
> The problem is not your design, nor even the size of the data
> exactly, but the query:
>
> > e.g.  SELECT * FROM base WHERE ((title ILIKE '%something%') AND (authors
> > ~*   '.*something.*')  AND (db_name='something'));
>
> You have two initially-unbound search terms there: ILIKE '%' and ~*
> '.*' are automatically seqscans, because you have nowhere in the
> index to start.  If you really want to do this kind of unbound-string
> query, you need to look into full text search.  The above approach is
> never going to be fast.

I previously thought of doing the full text search indexing thing...but I
had an intuition that full text search is for fields which have very large
strings...but in my case the strings are not above 200 chars in length, so
would full text search give me the performance which we need?  Also, I
tried combined indexes on title and authors in my base table...would
indexes of any kind help me in this case?

Thanks for your prompt replies,
Sumeet.

-- 
Thanks,
Sumeet.


Re: [SQL] Multiple DB join

2006-08-15 Thread Andrew Sullivan
On Tue, Aug 15, 2006 at 10:35:36AM -0400, Sumeet wrote:
> 
> The reason for splitting up the dbs into different instances is that in case
> one of the postgres instances on the server goes down for some reason, it
> doesn't affect the other instances which are running on the same server.
> Even I don't know the reason for this kind of approach, but I have to deal
> with it.

I certainly sympathise with the situation where managers decide to
solve problems that never happen.  Given that these are on the same
server, it is _way_ more likely that you'll introduce a problem due
to running several instances of the back end than that one instance
of Postgres will magically die (especially in such a way that other
instances will continue to work).  But if Some Important Person
decreed it, you have my sympathy.

> I previously thought of doing the full text search indexing thing...but I
> had an intuition that full text search is for fields which have very
> large strings...but in my case the strings are not above 200 chars in

No, it's for data where you're going to be searching in random pieces
of the text.  I think you should look at tsearch2, probably.  

If you had any way to enforce bounded searches, it'd be a different
matter: strings with initial matches but an unbound end are fast. 
(You can do it the other way, too, by some tricks with reversing the
strings.)
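(A sketch of that bounded-search idea, using Sumeet's column names; the
indexes themselves are hypothetical.  text_pattern_ops matters when the
database locale is not C, and reverse() is built in only from PostgreSQL 9.1,
so on an 8.x server you would need a small immutable helper function with the
same behaviour.)

-- Left-anchored patterns can use an ordinary btree index:
CREATE INDEX base_title_idx ON base (title text_pattern_ops);
SELECT * FROM base WHERE title LIKE 'something%';

-- The reversed-string trick turns a suffix search into a prefix search:
CREATE INDEX base_title_rev_idx ON base ((reverse(title)) text_pattern_ops);
SELECT * FROM base WHERE reverse(title) LIKE reverse('something') || '%';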

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The fact that technology doesn't work is no bar to success in the marketplace.
--Philip Greenspun

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [SQL] Undo an update

2006-08-15 Thread Kis János Tamás
On Friday, August 11, 2006 at 19:26, Judith wrote:
>
> Is there a way to undo an update???
>

First of all: I speak little English, so I'm sorry.
If I understand what your problem is, maybe I can help you... Maybe...
When I want to save the inserted, deleted and/or updated rows,
I create an extra table and some triggers, like this:

-- Function: generate_log_table()
CREATE OR REPLACE FUNCTION generate_log_table()
  RETURNS "trigger" AS
$BODY$
DECLARE query text;
BEGIN
 IF (TG_OP = 'INSERT') THEN
   query := 'INSERT INTO data_table (a) VALUES (''' || NEW.a || ''');';
 ELSIF (TG_OP = 'UPDATE') THEN
   query := 'UPDATE data_table SET a = ''' || NEW.a || ''' WHERE id_table = ' || NEW.id_table || ';';
 ELSIF (TG_OP = 'DELETE') THEN
   query := 'DELETE FROM data_table WHERE id_table = ' || OLD.id_table || ';';
 END IF;
 -- Record when the change happened and the SQL that reproduces it.
 INSERT INTO log_table (fecha, instruction) VALUES (now(), query);
 RETURN NEW;  -- the return value is ignored for AFTER row triggers
END
$BODY$
  LANGUAGE 'plpgsql' VOLATILE;

-- Table: data_table
CREATE TABLE data_table
(
  a text,
  id_table serial NOT NULL,
  CONSTRAINT table_pkey PRIMARY KEY (id_table)
) 
WITH OIDS;

-- Trigger: generate_log_table on data_table
CREATE TRIGGER generate_log_table
  AFTER INSERT OR UPDATE OR DELETE
  ON data_table
  FOR EACH ROW
  EXECUTE PROCEDURE generate_log_table();

-- Table: log_table
CREATE TABLE log_table
(
  fecha timestamp,
  instruction text,
  id_table_log serial NOT NULL,
  CONSTRAINT log_table_pkey PRIMARY KEY (id_table_log)
) 
WITH OIDS;


So, if you send every insert, update, and delete command to a
logger table, then you can undo anything.
I think...
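(A variation on kjt's sketch, under the same table layout: instead of logging
a statement that replays the change, log the statement that would reverse it,
so an accidental UPDATE or DELETE can be undone by running the logged SQL.
quote_literal() keeps the quoting manageable; the function and trigger names
here are made up.)

-- Hypothetical undo-log trigger function: stores the SQL that reverses each change.
CREATE OR REPLACE FUNCTION generate_undo_log()
  RETURNS trigger AS
$BODY$
DECLARE query text;
BEGIN
 IF (TG_OP = 'INSERT') THEN
   -- Undo an insert by deleting the new row.
   query := 'DELETE FROM data_table WHERE id_table = ' || NEW.id_table || ';';
 ELSIF (TG_OP = 'UPDATE') THEN
   -- Undo an update by restoring the old value.
   query := 'UPDATE data_table SET a = ' || quote_literal(OLD.a)
         || ' WHERE id_table = ' || OLD.id_table || ';';
 ELSIF (TG_OP = 'DELETE') THEN
   -- Undo a delete by re-inserting the old row.
   query := 'INSERT INTO data_table (a, id_table) VALUES ('
         || quote_literal(OLD.a) || ', ' || OLD.id_table || ');';
 END IF;
 INSERT INTO log_table (fecha, instruction) VALUES (now(), query);
 RETURN NULL;  -- the return value is ignored for AFTER row triggers
END
$BODY$
  LANGUAGE 'plpgsql' VOLATILE;

-- Attached exactly like the logging trigger above:
CREATE TRIGGER generate_undo_log
  AFTER INSERT OR UPDATE OR DELETE
  ON data_table
  FOR EACH ROW
  EXECUTE PROCEDURE generate_undo_log();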

Bye,
kjt


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [SQL] The Right Way to manage schemas in SCM systems

2006-08-15 Thread Aaron Bono
On 11 Aug 2006 08:05:34 -0700, Andrew Hammond <[EMAIL PROTECTED]> wrote:
> I've been trying to figure out a good way to manage schema change
> control for a while now.  Since I have a development background, I
> really want to find some way to check the schema into a SCM system like
> CVS (for example).  Just using a pg_dump doesn't work very well because
> there's no guarantee of consistent ordering.  One of the things I find
> most useful with SCM systems is the ability to quickly spot changes.  A
> re-ordering within a single file makes for a lot of noise in the deltas.
>
> So far, the best idea I've come up with is the "file tree dump"
> approach: get a list of every object in the database and then dump it
> into a suitably named file.  Finally, create a master file which
> consists of only include lines.  There are a couple of problems I can
> see with this.
>
> 1) How to calculate the dependency graph (or worse, dealing with RI
> loops) to determine the right order to load things in isn't stupidly
> obvious.
> 2) If I just script this in bash, without making any changes to
> pg_dump, it doesn't happen in a single transaction.
> 3) No clever solution for relative vs absolute filenames (unless all
> this goes into a tar file, but that format already exists and doesn't
> solve my problem).
>
> So my question is, does anyone have a better way of managing schemas in
> SCM systems?  And if not, is there a perceived value in the community
> for a pg_dump --format scm_friendly option?  (In which case I'll take
> this thread over to hackers)

The approach you are taking here is difficult to implement but makes life
easy on developers who like to make changes to the database whenever
needed.  It can be great for rapid development, but I have chosen to take
a different approach - one that requires a little more discipline up front
but is better for controlling your testing and production environments.

For every project, I first create DDL scripts for my database.  I have a
code builder that does this for me, but you could also just create the
database and use pg_dump to get your initial DDL.

Next I save this DDL in CVS.  I break each schema into a separate script.
If you are worried about the order, do as pg_dump does - create the tables
first, then add the constraints like foreign keys after the structure is
there.

This is great for the initial setup.  To deal with change, I have a manual
process in place and use an auditing process to check that everything is
done correctly.  Here is how it works:

Development:
1. Copy the production database into the development database (we do this
at least once a week).
2. Make the change in development - the nice thing here is I use EMS
Manager, and whenever I make a change it provides the SQL required to make
the change.
3. Copy the SQL for the change (from EMS Manager) and put it into a
migration script (this will be executed in production when we deploy the
application) - order is very important here.
4. Make the change to the initial DDL and check that change into CVS.  Our
code builder helps out a lot with this when the changes are large.
5. Update my application code and test with the new changes in
development.  Again, our code builder does a lot of this for us.
6. Drop the development database, refresh it from production, run the
migration script, and test the new code - if all goes well it is ready for
production.
7. Deploy to production.  We never allow developers to directly make
changes to production.  Only our administrators have that capability.

Audit:
1. Get a copy of production and put it into development.
2. Run the DDL from CVS and put it into an audit database.
3. Run a database diff - we use DB Comparer (same company as EMS Manager).
4. Reconcile differences and put them into the DDL.

We also will, from time to time, test the application against the DDL
rather than against a copy of production.

Currently we are working on ways to automate this mostly manual process.
The nice thing is, the tools we use allow us to do the work fairly
quickly.  I like a little manual work in the whole process though, as it
keeps the developers better attuned to the database structure and the
changes that are being made to it.

If you were looking for a way to just back up the structure in CVS every
day, you may consider writing a script (Perl would be an excellent choice
for this) that reads the pg_dump output and splits it out into separate
files for each schema/table.

==
   Aaron Bono
   Aranya Software Technologies, Inc.
   http://www.aranya.com
   http://codeelixir.com
==


Re: [SQL] Using bitmap index scans-more efficient

2006-08-15 Thread Kyle Bateman

Tom Lane wrote:
> Kyle Bateman <[EMAIL PROTECTED]> writes:
>> But I'm assuming that using an interval-encoded project tree, I would
>> have to do something like the following to get a progeny group:
>>
>> select * from ledger l, proj p where p.proj_id = l.proj and p.left >
>> 1234 and p.right < 2345;
>
> btree has no idea about the constraint (that I imagine exists) that left
> <= right.  If you're just doing a simple index on (left, right) then the
> above query requires scanning all index entries with left > 1234.  It
> would probably help to say
>
> select * from ledger l, proj p where p.proj_id = l.proj and
>   p.left > 1234 and p.left < 2345 and p.right < 2345;
>
> so that you can constrain the range of "left" values scanned.


Thanks for the replies, Tom and Florian.

My problem is not that it is difficult (or costly) to determine the 
progeny of a given project.  I can determine this in about 90 msec 
regardless of whether I use an adjacency model, interval-encoding, or 
materialized path (current implementation).  The problem is, when I try 
to extract the ledger entries belonging to that progeny from a set of a 
million records, it seems to need to process all million records rather 
than being able to index right into the ones I want.


I'm not very good at reading explain output, but I tried to set up the 
query Tom suggests by creating an interval-encoded project table 
(proj_int) and then joining it to my ledger like so:


select l.* from ledger l, proj_int i
where l.proj = i.proj_id and i.lft >= 5283 and i.lft < 5300 and i.rgt <= 5300;


On my mini-test-ledger of 100,000 entries, this takes the longest time 
(5 seconds) with the following explain output:


Hash Join  (cost=19018.46..23411.52 rows=14 width=85)
  Hash Cond: ("outer".proj = "inner".proj_id)
  ->  Nested Loop Left Join  (cost=18994.38..23378.41 rows=1700 width=85)
        ->  Hash Join  (cost=18990.84..23340.87 rows=1700 width=81)
              Hash Cond: ("outer".vendid = "inner".org_id)
              ->  Merge Join  (cost=18935.35..23255.64 rows=1700 width=63)
                    Merge Cond: (("outer".vendid = "inner".vendid) AND (("outer".invnum)::text = "inner"."?column10?"))
                    ->  Index Scan using apinv_hdr_pkey on apinv_hdr h  (cost=0.00..3148.16 rows=51016 width=21)
                    ->  Sort  (cost=18935.35..19235.45 rows=120041 width=55)
                          Sort Key: i.vendid, (i.invnum)::text
                          ->  Seq Scan on apinv_items i  (cost=0.00..4152.99 rows=120041 width=55)
                                Filter: ((status = 'en'::bpchar) OR (status = 'cl'::bpchar) OR (status = 'pd'::bpchar))
              ->  Hash  (cost=50.99..50.99 rows=1799 width=26)
                    ->  Seq Scan on vend_org v  (cost=0.00..50.99 rows=1799 width=26)
        ->  Materialize  (cost=3.54..3.55 rows=1 width=4)
              ->  Seq Scan on acct a  (cost=0.00..3.54 rows=1 width=4)
                    Filter: ((code)::text = 'ap'::text)
  ->  Hash  (cost=24.06..24.06 rows=10 width=4)
        ->  Bitmap Heap Scan on proj_int i  (cost=2.26..24.06 rows=10 width=4)
              Recheck Cond: ((lft >= 5283) AND (lft < 5300) AND (rgt <= 5300))
              ->  Bitmap Index Scan on i_proj_int_lft_rgt  (cost=0.00..2.26 rows=10 width=0)
                    Index Cond: ((lft >= 5283) AND (lft < 5300) AND (rgt <= 5300))


That is roughly equivalent to my materialized path method:

select l.* from ledger l where l.proj in (select proj_id from proj_v
where 4737 = any(ppath));



And is quite slow compared to 150 msec when enumerating the progeny 
projects like so:


select l.* from ledger l where l.proj in 
(4737,4789,4892,4893,4894,4895,4933,4934,4935);


Nested Loop Left Join  (cost=19.73..4164.10 rows=7 width=85)
  ->  Nested Loop  (cost=19.73..4139.08 rows=7 width=81)
        ->  Nested Loop  (cost=19.73..4100.07 rows=7 width=63)
              ->  Bitmap Heap Scan on apinv_items i  (cost=19.73..1185.71 rows=487 width=55)
                    Recheck Cond: ((proj = 4737) OR (proj = 4789) OR (proj = 4892) OR (proj = 4893) OR (proj = 4894) OR (proj = 4895) OR (proj = 4933) OR (proj = 4934) OR (proj = 4935))
                    Filter: ((status = 'en'::bpchar) OR (status = 'cl'::bpchar) OR (status = 'pd'::bpchar))
                    ->  BitmapOr  (cost=19.73..19.73 rows=495 width=0)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4737)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4789)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4892)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)

   

Re: [SQL] Using bitmap index scans-more efficient

2006-08-15 Thread Kyle Bateman

Tom Lane wrote:
> Kyle Bateman <[EMAIL PROTECTED]> writes:
>> But I'm assuming that using an interval-encoded project tree, I would
>> have to do something like the following to get a progeny group:
>>
>> select * from ledger l, proj p where p.proj_id = l.proj and p.left >
>> 1234 and p.right < 2345;

Here's an interesting result:

I created a function proj_left(int4) that returns the left interval 
number for a given project.  Then I created an index on the underlying 
table for the ledger view (which took forever to build) like so:


create index i_test on apinv_items (proj_left(proj));

Now my query:

select * from ledger where proj_left(dr_proj) >= 5283 and 
proj_left(dr_proj) < 5300;


is very speedy.  Problem is, I had to mark the function proj_left() as 
immutable, which it can not be since the left and right values for a 
given project will change any time a project is added, removed, or moved 
around the hierarchy :(


So is there any good way to tell the planner to do several individual 
index scans for the projects involved in the desired progeny, OR the 
results together, and return that?  This is what it seems to be 
choosing in the case of the query:


 select * from ledger where proj in 
(4737,4789,4892,4893,4894,4895,4933,4934,4935);




---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org