This is an automated email from the ASF dual-hosted git repository.
thisisnic pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/main by this push:
new 3c9cb0c498 GH-47100: [Docs] Correct the Statistics schema
specification (#50092)
3c9cb0c498 is described below
commit 3c9cb0c498a11bcadae7392634713bbcfba49c9a
Author: Nic Crane <[email protected]>
AuthorDate: Thu Jun 4 10:13:12 2026 +0100
GH-47100: [Docs] Correct the Statistics schema specification (#50092)
### Rationale for this change
The Statistics schema documentation said statistics could describe a whole table; they describe a column, record batch, or array.
### What changes are included in this PR?
Replace references to "table" with "record batch" or "array" in `docs/source/format/StatisticsSchema.rst`.
### Are these changes tested?
No
### Are there any user-facing changes?
No
* GitHub Issue: #47100
Authored-by: Nic Crane <[email protected]>
Signed-off-by: Nic Crane <[email protected]>
---
docs/source/format/StatisticsSchema.rst | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/docs/source/format/StatisticsSchema.rst b/docs/source/format/StatisticsSchema.rst
index 0e0752ff56..0b7ea4b04b 100644
--- a/docs/source/format/StatisticsSchema.rst
+++ b/docs/source/format/StatisticsSchema.rst
@@ -104,14 +104,14 @@ Here is the details of top-level ``struct``:
- ``int32``
- ``true``
- The zero-based column index, or null if the statistics
- describe the whole table or record batch.
+ describe the whole record batch or array.
The column index is computed as the same rule used by
:ref:`ipc-recordbatch-message`.
* - ``statistics``
- ``map``
- ``false``
- - Statistics for the target column, table or record batch. See
+ - Statistics for the target column, record batch or array. See
the separate table below for details.
Here is the details of the ``map`` of the ``statistics``:
@@ -151,7 +151,7 @@ Standard statistics
-------------------
Each statistic kind has a name that appears as a key in the statistics
-map for each column or entire table. ``dictionary<values: utf8,
+map for each column or entire record batch or array. ``dictionary<values: utf8,
indices: int32>`` is used to encode the name for space-efficiency.
We assign different names for variations of the same statistic instead
@@ -217,11 +217,11 @@ Here are pre-defined statistics names:
- The number of nulls in the target column. (approximate)
* - ``ARROW:row_count:exact``
- ``int64``
- - The number of rows in the target table, record batch or
+ - The number of rows in the target record batch or
array. (exact)
* - ``ARROW:row_count:approximate``
- ``float64``
- - The number of rows in the target table, record batch or
+ - The number of rows in the target record batch or
array. (approximate)
If you find a statistic that might be useful to multiple systems,