If people are going to start listing features they want, here are some things I think would be nice. I have no idea, though, whether they would be useful to anyone else:

1) Hierarchical / recursive queries. I realize this has just been discussed at length, but since there was some question as to whether there's demand for it, I am weighing in to say that I think there is. I have to deal with hierarchy tables all the time, and I have several standard methods of dealing with them depending on the data set / format. But they all suck. I've just gotten used to the workarounds since there is nothing else. If you are not hearing the screams, I think it's because working around it has become a fact of life for most people (unless you're using Oracle). Everyone already has some code to do this, and they've already done it everywhere it needs to be done. And as long as you're a little bit clever you can always work around it without taking a big performance hit. But it would sure be nice to have next time I have to deal with a tree table.

2) PITR on a per-database basis. I think this would be nice, but I'm guessing that the work involved is big and that few people really care about or need it, so it will probably never happen.

3) A further refinement of PITR where some sort of daemon ships small log segments as they are created, so that the hot standby doesn't have to be updated in 16MB increments or wait for some timeout to occur. It could always have up-to-the-minute data.

4) All the Greenplum Bizgres MPP goodness. In reality (and I don't know if Bizgres MPP can actually do this) I'd like to have a cluster of cheap boxes. I'd like to install postgres on all of them and configure them in such a way that it automatically partitions and mirrors each table, so that each piece of data is always on two boxes and large tables and indexes get divided up intelligently. Sort of like RAID 10 at the database level. This way any one box could die and I would be fine. Enormous queries could be handled efficiently, and I could scale up by just dropping in new hardware.

Maybe Greenplum has done this. Maybe we will get their changes soon enough, maybe not. Maybe this sort of functionality will never happen. My guess is that all the little bits and pieces of this will trickle in over the next several years, and this sort of setup will be slowly converged on as lots of little things come together. Tablespaces and constraint exclusion come to mind here as things that could eventually evolve to contribute to a larger solution.

5) Somehow make it so I NEVER HAVE TO THINK ABOUT OR DEAL WITH VACUUM AGAIN. Once I get everything set up right, everything works great. But if there's one thing I think everyone would love, it would be getting postgres to the point where you don't even need to ship vacuumdb, because there's no way the user could outsmart postgres's attempts to do garbage collection on its own.

6) Genuine updatable views, such that you just add an updatable keyword when you create the view and it's automagically updatable. I'm guessing that we'll get something like that, but its real magic will be throwing an error when you try to make a view updatable and it can't figure out how to make the rules properly.

7) Allow some way to extract the data files for a single database and insert them into another database cluster. In many cases it would be a lot faster to copy the data files across the network than it is to dump, copy the dump file, and reload.

8) Some sort of standard "hooks" to be used for replication. I guess when the replication people all get their heads together and tell the core developers what they need, something like this could evolve.
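
As an illustration of item 1, one common workaround is to pull the parent/child pairs out of the tree table and walk them in application code. This is only a minimal Python sketch of that general idea, not anyone's production code: the (id, parent_id) row layout and the function name descendants are assumptions made up for the example.

```python
from collections import defaultdict, deque


def descendants(rows, root_id):
    """Return the ids of every node beneath root_id, breadth-first.

    rows is an iterable of (id, parent_id) pairs, as they might be
    fetched from a hypothetical tree table; parent_id is None for roots.
    """
    # Index the adjacency list once so each lookup is cheap.
    children = defaultdict(list)
    for node_id, parent_id in rows:
        children[parent_id].append(node_id)

    found = []
    seen = {root_id}
    queue = deque([root_id])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:  # guard against cycles in bad data
                seen.add(child)
                found.append(child)
                queue.append(child)
    return found


# Made-up sample data: 1 is the root of one tree, 6 is a lone root.
rows = [(1, None), (2, 1), (3, 1), (4, 2), (5, 4), (6, None)]
print(descendants(rows, 1))  # [2, 3, 4, 5]
```

It works, but the whole table (or at least the whole tree) has to come over the wire before the walk can start, which is exactly the kind of thing a recursive query in the database would avoid.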

Like I said, postgres more than satisfies my "needs". I am especially happy when you factor in the cost of the software (free), and the quality of the community support (excellent).

And you can definitely say that the "missing" list is shrinking. But I think of it like this. There are tiers of database functionality that different people need:

A) Correct me if I'm wrong, but as great as postgres is, there are still people out there who MUST HAVE Oracle or DB2 to get done what they need to get done. Those databases just do things that the others can't. They may be expensive. They may suck to use and administer, but the simple fact is that they have features that people need and that are not offered in less expensive databases.

B) Very, very powerful databases that lack the biggest, most complicated "enterprise" features.

C) Lightweight databases for taking care of the basic need to store data and query it with SQL. (Some would call these "toy" databases.)

D) Databases which are experimental, unreliable or have other limits that make them impractical compared with the other options.

I would say that with version 7.0 postgres moved from D to C (please don't get offended if this is way off base; I never used 6.x, but I heard it was prone to crashes and data corruption, and of course there was that pesky row size limit). It then proceeded to move up within tier C, becoming the best of its class and pushing up into tier B. With 8.0 it was firmly in tier B. It was fast, efficient and powerful, and it began adding lots of really, really big features like PITR, savepoints, tablespaces, etc. Add-ons like Slony also allowed it to be used in places where it otherwise wouldn't have measured up.

Now there are only a few features left in the B range, so there are tons of situations that postgres can take care of now that were out of its reach just a few years ago. Once those features are all done, there will still be some very big, very difficult features on the table that, once completed, will begin to remove any advantage that the really big guys have. I'm thinking especially of #4 above here. But they will definitely take a while.
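
To give a rough feel for the "each piece of data is always on two boxes" part of #4, here is a toy Python sketch. It is purely illustrative and is not how Bizgres MPP or any real system does it: the hash-then-next-box scheme and the names placement and survives are assumptions invented for this example.

```python
import hashlib


def placement(key, boxes):
    """Pick a primary and a mirror box for a row key (hypothetical scheme)."""
    if len(boxes) < 2:
        raise ValueError("need at least two boxes to mirror")
    # Use a stable hash so the same key always maps to the same boxes.
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    primary = int.from_bytes(digest[:8], "big") % len(boxes)
    mirror = (primary + 1) % len(boxes)  # always a different box
    return boxes[primary], boxes[mirror]


def survives(keys, boxes, dead_box):
    """True if every key still has at least one copy after dead_box fails."""
    return all(
        any(box != dead_box for box in placement(key, boxes))
        for key in keys
    )


boxes = ["box1", "box2", "box3", "box4"]  # made-up host names
print(placement(42, boxes))
print(all(survives(range(1000), boxes, dead) for dead in boxes))
```

The easy part is shown here. With plain modulo hashing, adding a box remaps most keys, so "scale up by just dropping in new hardware" needs something smarter, which is part of why this counts as a hard problem.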

I may have tons of details wrong here, but my point is that postgres isn't just taking stuff off a big to-do list. Rather, it is pushing itself upwards and is now in a position to start working on some very hard problems that, once completed, will put it into a very elite class of database systems. The "missing" list for tier B type problems is shrinking down to almost nothing, and items from the tier A missing list are starting to come into view.

Maybe I'm way off base here but that's how I see it. Postgres has come a long, long way, but the problems ahead are bigger and meaner than the ones behind.


On Aug 4, 2006, at 12:02 AM, David Fetter wrote:

On Fri, Aug 04, 2006 at 12:37:10AM -0400, Tom Lane wrote:
Bruce Momjian <[EMAIL PROTECTED]> writes:
To me new things are like PITR, Win32, savepoints, two-phase
commit, partitioned tables, tablespaces.  These are from 8.0 and
8.1.  What is there in 8.2 like that?

[ shrug... ]  Five out of your six items have no basis in the SQL
spec.  So it's not clear to me what your definition of "major
feature" is, unless maybe it's "anything except what we did for
8.2".  Can you enumerate ten things you would consider comparable to
the above features that aren't done yet?

First, I'd like to say people are doing a fantastic job here.  Kudos!

One huge thing missing from the "done" list is that crucial bit of
infrastructure and process that has shortened feedback loops--hence
the beta period--by weeks if not months: the build farm.  It's now
smoothly integrated into the development process, and as a
consequence, we can realistically have a release each year. :)

As far as big missing features go, here's a short list:

* Splitting queries among CPUs--possibly even among machines--for OLAP
  loads

* In-place upgrades (pg_upgrade)

* Several varieties of replication, which I believe we as a project
  will eventually endorse and ship

* CALL

* WITH RECURSIVE

* MERGE

* Windowing functions

* On-the-fly in-line calls out to PL/your_choice without needing to
  issue DDL

* Wild-eyed feral bits of the SQL standard like SQL/MED and SQL/XML

But all that leaves out the oldest, most honored Postgres tradition:

    Breaking New Ground.

We're definitely not done yet. :)

Cheers,
D
--
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666
                              Skype: davidfetter

Remember to vote!

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match