Hi,
Reaching out to the development community to learn a little history. For a single-partitioned table T, with primary key columns A, B, C, if I do an UPDATE STATISTICS ON EVERY KEY (or ON EVERY COLUMN), I will get multi-column histograms for prefixes of the key. So, I’ll get a multi-column histogram for (A,B) and for (A,B,C). However, if I now salt that table, then an additional column, “_SALT_”, is prepended to the primary key. And the behavior of UPDATE STATISTICS is different. Instead of getting multi-column histograms for (“_SALT_”,A), (“_SALT_”,A,B), and (“_SALT_”,A,B,C), I get only (“_SALT_”,A,B,C). Indeed, JIRA TRAFODION-1467 complains about this very behavior. See https://issues.apache.org/jira/browse/TRAFODION-1467. Now, I was looking into fixing this JIRA, and I found the following code in sql/ustat/hs_parser.cpp: // For salted table, generate only the longest MC for the key (subject // to max cols determined above) unless a cqd is set to gen all MCs of // allowable sizes. if (CmpCommon::getDefault(USTAT_ADD_SALTED_KEY_PREFIXES_FOR_MC) == DF_OFF && hs_globals->objDef->getColNum("_SALT_", FALSE) >= 0) minMCGroupSz = numKeys; And indeed this CQD has the default value of ‘OFF’. So it appears this behavior is intentional. My question is, why? On the surface it seems a bit arbitrary to me. Depending on the answer, I’ll address the JIRA in one of the following ways: 1. Say, it’s working as intended. And if you want all the multi-column histograms, set CQD USTAT_ADD_SALTED_KEY_PREFIXES_FOR_MC ‘ON’ prior to issuing the UPDATE STATISTICS command. 2. Change the default to ‘ON’. And you’ll have to set CQD USTAT_ADD_SALTED_KEY_PREFIXES_FOR_MC ‘OFF’ to reduce the number of multi-column histograms on key prefixes in salted tables. 3. Remove the CQD altogether if there was no good reason for it. Which would give us the ‘ON’ behavior all the time. Dave
