Hi,

Attached is an updated version of the patch series, fixing the issues reported by Mark Dilger:

1) Fix fabs() issue in histogram.c.

2) Do not rely on extra_data being StdAnalyzeData, and instead look up the LT operator explicitly. This also adds a simple regression test to make sure ANALYZE on arrays works fine, but perhaps we should invent some simple queries too.

3) I've removed / clarified some of the comments mentioned by Mark.

4) I haven't changed how the statistics kinds are defined in relation.h, but I agree there should be a comment explaining how STATS_EXT_INFO_* relate to StatisticExtInfo.kinds.

5) The most significant change happened to histograms. There used to be two structures for histograms:

  - MVHistogram - expanded (no deduplication etc.), result of histogram build and never used for estimation

  - MVSerializedHistogram - deduplicated to save space, produced from MVHistogram before storing in pg_statistic_ext and then used for estimation

So there wasn't really any reason to expose the "non-serialized" version outside histogram.c. It was just confusing and unnecessary, so I've moved MVHistogram to histogram.c (and renamed it to MVHistogramBuild), and renamed MVSerializedHistogram to MVHistogram. And same for the MVBucket stuff.

So now we only deal with MVHistogram everywhere, except in histogram.c.

6) I've also made MVHistogram include a varlena header directly (and be packed as a bytea), which allows us to store it without having to call any serialization functions.

I wonder if we should do (5) and (6) for the MCV lists too; it seems more convenient than the current approach. And perhaps even for the statistics added in PostgreSQL 10 (it does not change the storage format).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
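
To illustrate the idea behind (6), here is a rough standalone sketch of a struct that carries its own length header, so the in-memory form and the stored form are the same bytes. This is not code from the patch: DemoHistogram, demo_histogram_build, demo_store and demo_load are hypothetical names, the fields are made up, and a plain uint32_t stands in for the real varlena header (which in PostgreSQL is set and read with SET_VARSIZE / VARSIZE).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a histogram with an embedded length header.
 * total_size plays the role of the varlena header: it holds the size of
 * the whole value in bytes, header included.
 */
typedef struct DemoHistogram
{
    uint32_t    total_size;     /* size of the whole struct in bytes */
    uint32_t    nbuckets;       /* number of entries in frequencies[] */
    double      frequencies[];  /* flexible array, stored inline */
} DemoHistogram;

/* Build the histogram as one contiguous allocation. */
static DemoHistogram *
demo_histogram_build(const double *freqs, uint32_t nbuckets)
{
    size_t      len = offsetof(DemoHistogram, frequencies)
                      + (size_t) nbuckets * sizeof(double);
    DemoHistogram *hist = malloc(len);

    if (hist == NULL)
        return NULL;

    hist->total_size = (uint32_t) len;
    hist->nbuckets = nbuckets;
    memcpy(hist->frequencies, freqs, (size_t) nbuckets * sizeof(double));

    return hist;
}

/* "Store" the value: a plain byte copy, no serialization step needed. */
static void *
demo_store(const DemoHistogram *hist)
{
    void       *blob = malloc(hist->total_size);

    if (blob != NULL)
        memcpy(blob, hist, hist->total_size);

    return blob;
}

/* "Load" the value: the stored bytes already are a valid struct. */
static const DemoHistogram *
demo_load(const void *blob)
{
    assert(blob != NULL);
    return (const DemoHistogram *) blob;
}
```

The point of the sketch is only that demo_store and demo_load have nothing to convert. Because there are no pointers inside the struct, copying total_size bytes is enough, which is what removes the need for separate serialize / deserialize functions.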
0001-multivariate-MCV-lists.patch.gz
Description: application/gzip
0002-multivariate-histograms.patch.gz
Description: application/gzip