Hello,
Let me share our experiment in supporting min/max statistics per record batch.
https://github.com/heterodb/pg-strom/wiki/806:-Apache-Arrow-Min-Max-Statistics-Hint
The latest pg2arrow supports a --stat option that specifies the columns
for which min/max statistics are included in each record batch.
The st
Unfortunately FieldNode is a `struct` instead of a `table`, so fields may
not be added or deprecated.
On Thu, Feb 18, 2021, 04:38 Antoine Pitrou wrote:
>
> On 18/02/2021 at 04:37, Micah Kornfield wrote:
> > There is key-value metadata available on Message which might be able to
> > work in the
On 18/02/2021 at 04:37, Micah Kornfield wrote:
> There is key-value metadata available on Message which might be able to
> work in the short term (some sort of encoded message). I think
> standardizing how we store statistics per batch does make sense.
>
> We unfortunately can't add anything
>
> What does the parallel list mean?
Something like:
table RecordBatch {
  nodes: [FieldNode];
  // Statistics related to the data represented by each FieldNode
  // This field is either length=0 or has the same length as nodes.
  statistics: [Statistic];
}
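
To make the "parallel list" idea concrete, here is a rough sketch in Python of how a reader might consume it. This is only an illustration of the semantics described in the schema comment above, not anything that exists in Arrow today: the Statistic type, its min/max members, and the stat_for and may_contain helpers are all hypothetical names, and integer values are assumed for simplicity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FieldNode:
    length: int
    null_count: int


@dataclass
class Statistic:
    # Hypothetical shape; the proposal above does not define this type.
    min: Optional[int]
    max: Optional[int]


@dataclass
class RecordBatch:
    nodes: List[FieldNode]
    # Either empty (no statistics recorded) or the same length as nodes.
    statistics: List[Statistic]

    def stat_for(self, node_index: int) -> Optional[Statistic]:
        """Return the statistic parallel to nodes[node_index], if any."""
        if not self.statistics:
            return None
        if len(self.statistics) != len(self.nodes):
            raise ValueError("statistics must be empty or parallel to nodes")
        return self.statistics[node_index]


def may_contain(batch: RecordBatch, node_index: int, lo: int, hi: int) -> bool:
    """Return False only when the statistics prove no value lies in [lo, hi]."""
    stat = batch.stat_for(node_index)
    if stat is None or stat.min is None or stat.max is None:
        return True  # no statistics available, so the batch cannot be skipped
    return stat.max >= lo and stat.min <= hi
```

Because an empty list means "no statistics", a writer that does not know about the new field still produces batches a reader can handle; the reader simply scans them in full.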
On Wed, Feb 17, 2021 at 8:34 PM Kohei KaiGai
Thanks for the clarification.
> There is key-value metadata available on Message which might be able to
> work in the short term (some sort of encoded message). I think
> standardizing how we store statistics per batch does make sense.
>
For example, a JSON array of min/max values as a key-value me
There is key-value metadata available on Message which might be able to
work in the short term (some sort of encoded message). I think
standardizing how we store statistics per batch does make sense.
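
As a rough illustration of the "encoded message" idea, the sketch below packs per-column min/max values into a single JSON string stored under one metadata key. It uses only the Python standard library and models the key-value metadata as a plain dict of strings; the key name "min_max_stats" and the JSON layout are assumptions made up for this example, not an agreed convention.

```python
import json

# Hypothetical metadata key; nothing in the Arrow format reserves this name.
STATS_KEY = "min_max_stats"


def encode_stats(stats):
    """Encode per-column (min, max) pairs as one key-value metadata entry.

    `stats` has one entry per column; use None for a column without statistics.
    """
    payload = [
        None if s is None else {"min": s[0], "max": s[1]}
        for s in stats
    ]
    # Key-value metadata holds strings, so the array is serialized to JSON.
    return {STATS_KEY: json.dumps(payload)}


def decode_stats(metadata):
    """Decode the entry written by encode_stats; return [] when it is absent."""
    raw = metadata.get(STATS_KEY)
    if raw is None:
        return []
    return [
        None if entry is None else (entry["min"], entry["max"])
        for entry in json.loads(raw)
    ]
```

A reader that does not recognize the key just ignores it, which is what makes this workable in the short term without a format change.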
We unfortunately can't add anything to field-node without breaking
compatibility. But another o
Hello,
Does Apache Arrow have any standard way to embed min/max values of the fields
on a per-record-batch basis?
It looks like FieldNode supports neither a dedicated min/max attribute nor
custom metadata.
https://github.com/apache/arrow/blob/master/format/Message.fbs#L28
If we embed an array of min/max valu