yjhjstz opened a new issue, #1293:
URL: https://github.com/apache/cloudberry/issues/1293
### Apache Cloudberry version
2.1.0-devel
### What happened
When attempting to collect extended statistics (dependencies) on a large
table with 10 million rows, ANALYZE fails with the following error:
ERROR: too many sample rows received from gp_acquire_sample_rows (analyze.c:2841)
This appears to be a failure in the sampling process used by extended
statistics.
### What you think should happen instead
ANALYZE should successfully collect dependency statistics for the specified
columns.
### How to reproduce
```sql
-- Step 1: Create test table
CREATE TABLE tbl (
    col1 int,
    col2 int
);
-- Step 2: Insert 10 million rows with grouped values
INSERT INTO tbl
SELECT i / 10000, i / 100000
FROM generate_series(1, 10000000) s(i);
-- Step 3: Run initial ANALYZE
ANALYZE tbl;
-- Step 4: Create extended statistics on col1, col2
CREATE STATISTICS s1 (dependencies) ON col1, col2 FROM tbl;
-- Step 5: Trigger extended stats collection
ANALYZE tbl;
```
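Once ANALYZE completes without error, queries along these lines could confirm that the dependency statistics were actually built. This is only a sketch: it assumes Cloudberry exposes the `pg_stats_ext` view as upstream PostgreSQL does, and it reuses the `tbl` and `s1` names from the steps above.

```sql
-- Sanity check of the test data: with integer division, col2 = col1 / 10,
-- so col1 functionally determines col2.
SELECT count(DISTINCT col1) AS col1_distinct,
       count(DISTINCT col2) AS col2_distinct
FROM tbl;

-- After a successful ANALYZE, the dependencies column should be non-NULL for s1.
-- Assumption: the pg_stats_ext view from upstream PostgreSQL is available here.
SELECT statistics_name, attnames, dependencies
FROM pg_stats_ext
WHERE tablename = 'tbl'
  AND statistics_name = 's1';
```

A NULL `dependencies` value, or no row at all, would suggest the extended statistics were not collected.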
### Operating System
CentOS 9
### Anything else
_No response_
### Are you willing to submit PR?
- [x] Yes, I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/cloudberry/blob/main/CODE_OF_CONDUCT.md).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]