Simon,

I think you should read the thread and the patch before making any false statements like you did in your email.

1. The patch does not use any trigger for routing.
2. This is just an option for COPY that is useful for loading operations in the datawarehouse world. It is not meant to implement full partitioning as explained many times already in this thread. 3. This patch elaborates on existing mechanisms and cannot rely on a meta-data representation of partitions which does not exist yet and will probably not exist in 8.5

You should justify your statements when you say 'potentially buggy in its approach to developing a cache and using trigger-like stuff'. I understand that you don't like it because this is not what you want but this is not my fault. This is not an implementation of partitioning like COPY does not do update/delete/alter/... And yes the use case is 'narrow' like any option in COPY. It is like complaining that the CSV option is not useful because you want to load binary dumps.

If Itagaki gets the support of the community to get his implementation accepted, I will gladly use it. Contributing? If Aster is willing to contribute a code monkey to implement your specs, why not but you will have to convince them.

You should really think twice about the style of your emails that cast a detestable tone to discussions on pg-hackers.

Emmanuel


On Wed, 2009-11-11 at 19:53 -0500, Emmanuel Cecchet wrote:
Hi,
I have extracted the partitioning option for COPY (removed the error logging part) from the previous patch.
We can use an INSERT trigger to route tuples into partitions even now.
Why do you need an additional router for COPY?
Tom has already explained on the list why using a trigger was a bad idea (and I know we can use a trigger since I am the one who wrote it). If you look at the code you will see that you can do optimizations in the COPY code that you cannot do in the trigger.

 Also, it would be nicer
that the router can works not only in COPY but also in INSERT.
As 8.5 will at best provide a syntactic hack on top of the existing constraint implementation, I think that it will not hurt to have routing in COPY since we will not have it anywhere otherwise.
BTW, I'm working on meta data of partitioning now. Your "partitioning"
option in COPY could be replaced with the catalog.
This implementation is only for the current 8.5 and it will not be needed anymore once we get a fully functional partitioning in Postgres which seems to be for a future version.

Yes, the trigger way of doing this is a bad way.
I regret to say that the way proposed here isn't much better, AFAICS.
Let me explain why I think that, but -1 to anyone applying this patch.

This patch proposes keeping a cache of last visited partitions to reduce
the overhead of data routing.

What I've requested is that partitioning work by using a data structure
held in relcache for inheritance parents. This differs in 3 ways from
this patch
a) it has a clearly defined location for the cached metadata, with
clearly identified and well worked out mechanisms for cache invalidation
b) the cache can be built once when it is first needed, not slowly grown
as parts of the metadata are used
c) it would be available for all parts of the server, not just COPY.

The easiest way to build that metadata is when structured partitioning
info is available. i.e. the best next action is to complete and commit
Itagaki's partitioning syntax patch. Then we can easily build the
metadata for partitioning, which can then be used in COPY for data
routing.

Anyway, I want data routing, as is the intention of this patch. I just
don't think this patch is a useful way to do it. It is too narrow in its
scope and potentially buggy in its approach to developing a cache and
using trigger-like stuff.
ISTM that with the right metadata in the right place, a cleaner and
easier solution is still possible for 8.5. The code within COPY should
really just reduce to a small piece of code to derive the correct
relation for the desired row and then use that during heap_insert().

I have just discussed partitioning with Itagaki-san at JPUG, so I know
his plans. Itagaki-san and Manu, please can you work together to make
this work for 8.5?
---

A more detailed explanation of Partitioning Metadata:

Partitioning Metadata is information held on the relcache for a table
that has child partitions. Currently, a table does not cache info about
its children, which prevents various optimisations.

We would have an extra pointer on the Relation struct that points to a
PartitioningMetadata struct. We can fill in this information when we
construct the relcache for a relation, or we can populate it on demand
the first time we attempt to use that information (if it exists).

We want to hold an array of partition boundary values. This will then
allow us to use bsearch to find the partition that a specific value
applies to. Thus it can be used for routing data from INSERTs or COPY,
can be used for identifying which partitions need to be
included/excluded from an APPEND node. Using this will be O(logN) rather
than O(N), so allowing us to have much larger number of partitions when
required. Note that it can also be used within the executor to perform
dynamic partition elimination, thus allowing us to easily implement
partition aware joins etc.

To construct the array we must sort the partition boundary values and
prove that the partition definitions do not overlap. That is much easier
to do when the partitions are explicitly defined. (Plus, there is no
requirement to have, or mechanism to specify, unique partitions
currently, although most users assume this in their usage).

I imagine we would have an API called something like
RelationIdentifyPartition() where we provide value(s) for the
PartitioningKey column(s) and we then return the Oid of the partition
that holds that value. That function would build the metadata, if not
already cached, then bsearch it to provide the Oid.



--
Emmanuel Cecchet
Aster Data
Web: http://www.asterdata.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to