Hi Emmanuel, Thanks for your time. This is a WIP patch and we will integrate your suggestions/comments as appropriate.
Regards, -- Kedar. On Fri, Mar 27, 2009 at 3:38 AM, Emmanuel Cecchet <m...@asterdata.com>wrote: > Hi Kedar, > > First of all, congratulations for the excellent work. > I have some comments and questions. > > In get_relevent_partition (btw, relevant is spelled with an a) you are > maintaining 2 lists. > Oops! 'a' typographical error. I guess this is only useful for multi-column partitions, right? > If you have a single column partition (without subpartitions), I think you > could directly return on the first match (without maintaining any list) > since you guarantee that there is no overlap between partitions. > A simple but effective optimization for inserts consists of caching the > last partition used (consecutive inserts often go to the same partition) and > try it first before going through the whole loop. > Yep. > > > The update trigger should first check if the tuple needs to be moved. If > the updated tuple still matches the constraints of the partitions it will > not have to be moved and will save a lot of overhead. > Yes. We agree on that. > > > The COPY operation should probably be optimized to use the same code as the > one in the insert trigger for partitioned tables. I guess some code could be > factorized in COPY to make the inserts more efficient. > > The current trigger approach should prevent other triggers to be added to > the table, or you should make sure that the partition trigger is always the > one to execute last. > As triggers are executed in order of their names, we've prefixed the trigger names with "zz". This should work fine as long as no-one uses trigger-name which starts with "zz". > > As we don't have automatic partition creation, it would be interesting to > have an optional mechanism to deal with tuples that don't match any > partition (very useful when you do bulk insert and some new data require a > new partition). Having a simple overflow partition or an error logging > mechanism would definitely help to identify these tuples and prevent things > like large COPY operations to fail. > Will get back on this. > > > > Looking forward to your responses, > Emmanuel > > >> We are implementing table partitioning feature to support Range and Hash >> partitions. Please find attached, the WIP patch and test-cases. >> >> The syntax used conforms to most of the suggestions mentioned in >> http://archives.postgresql.org/pgsql-hackers/2008-01/msg00413.php, >> barring the following: >> -- Specification of partition names is optional. System will be able to >> generate partition names in such cases. >> -- Sub partitioning >> >> We are maintaining a system catalog(pg_partition) for partition meta-data. >> System will look-up this table to find appropriate partition to operate on. >> System internally uses low-level 'C' triggers to row-movement. >> >> Regards, >> -- >> Kedar. >> >> >> >> ------------------------------------------------------------------------ >> >> >> >> > > > -- > Emmanuel Cecchet > Aster Data Systems > Web: http://www.asterdata.com > >