On 17 April 2015 at 14:54, Petr Jelinek <p...@2ndquadrant.com> wrote:


> I agree that the DDL patch is not that important to get in (and I made it
> the last patch in the series now), which does not mean somebody can't write
> an extension with a new tablesample method.
>
>
> In any case, attached is another version.
>
> Changes:
> - I addressed the comments from Michael
>
> - I moved the interface between nodeSampleScan and the actual sampling
> method to its own .c file and added a TableSampleDesc struct for it. This
> makes the interface cleaner and will make it more straightforward to extend
> for subqueries in the future (nothing really changes just some functions
> were renamed and moved). Amit suggested this at some point and I thought
> it was not needed at that time, but with the possible future extension to
> subquery support I changed my mind.
>
> - renamed heap_beginscan_ss to heap_beginscan_sampling to avoid confusion
> with sync scan
>
> - reworded some things and more typo fixes
>
> - Added two sample contrib modules demonstrating row-limited and
> time-limited sampling. I am using linear probing for both of those, as the
> built-in block sampling is not well suited for row-limited or time-limited
> sampling. For row-limited sampling I originally thought of using Vitter's
> reservoir sampling, but that does not fit well with the executor, as it needs
> to keep the reservoir of all the output tuples in memory, which would have
> horrible memory requirements if the limit was high. Linear probing
> seems to work quite well for the use case of "give me 500 random rows from
> table".
>
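
To make the trade-off in the last item concrete, here is a rough sketch of
row-limited sampling by linear probing. It is an illustration only, not code
from the patch: it assumes "linear probing" means starting at a random block
and stepping by a stride that is coprime with the number of blocks, so that
every block is visited at most once. The names sample_rows and coprime_step,
and the list-of-lists table model, are hypothetical.

```python
import math
import random


def coprime_step(nblocks, rng):
    """Pick a stride in [1, nblocks) that shares no factor with nblocks."""
    if nblocks <= 2:
        return 1
    while True:
        step = rng.randrange(1, nblocks)
        if math.gcd(step, nblocks) == 1:
            return step


def sample_rows(blocks, limit, seed=None):
    """Yield up to `limit` rows, visiting blocks in a probed order.

    `blocks` is a list of lists of rows, a stand-in for heap pages.
    Rows are emitted as soon as they are read, so no reservoir of
    `limit` tuples has to be held in memory.
    """
    rng = random.Random(seed)
    nblocks = len(blocks)
    if nblocks == 0 or limit <= 0:
        return
    start = rng.randrange(nblocks)
    step = coprime_step(nblocks, rng)
    emitted = 0
    for i in range(nblocks):
        # A stride coprime with nblocks touches each block exactly once.
        block = blocks[(start + i * step) % nblocks]
        for row in block:
            yield row
            emitted += 1
            if emitted >= limit:
                return
```

A reservoir sampler would instead have to buffer `limit` rows until the whole
input had been read before returning any of them, which is the memory concern
raised above.
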

For me, the DDL changes are something we can leave out for now, as a way to
minimize the change surface.

I'm now moving to final review of patches 1-5. Michael requested patch 1 to
be split out. If I commit, I will keep that split, but I am considering all
of this as a single patchset. I've already spent a few days reviewing, so I
don't expect this will take much longer.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
