Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-25 Thread via GitHub


andishgar commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2907728585

   > Can we use `Fn` or something instead? (`*data.statistics` doesn't work when `data.statistics.get() == nullptr`.)
   > 
   > diff --git a/cpp/src/arrow/array/data.cc b/cpp/src/arrow/array/data.cc
   > index 2e55668fb9..6da4bb5317 100644
   > --- a/cpp/src/arrow/array/data.cc
   > +++ b/cpp/src/arrow/array/data.cc
   > @@ -165,7 +165,11 @@ Result<std::shared_ptr<ArrayData>> CopyToImpl(const ArrayData& data,
   >      ARROW_ASSIGN_OR_RAISE(output->dictionary, CopyToImpl(*data.dictionary, to, copy_fn));
   >    }
   >  
   > -  output->statistics = data.statistics;
   > +  if (std::is_same_v) {
   > +    output->statistics = data.statistics;
   > +  } else {
   > +    output->statistics = std::make_shared<ArrayStatistics>(*data.statistics);
   > +  }
   >  
   >    return output;
   >  }
   
   Because of this, the following line is not completely correct. However, let's discuss it in another issue that I will create soon.
   
   
https://github.com/apache/arrow/blob/153da3053b62f8f9bcafb87546e12a979e605734/cpp/src/arrow/array/data.cc#L179-L182


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-25 Thread via GitHub


kou commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2907722275

   Can we use `Fn` or something instead?
   (`*data.statistics` doesn't work when `data.statistics.get() == nullptr`.)
   
   ```diff
   diff --git a/cpp/src/arrow/array/data.cc b/cpp/src/arrow/array/data.cc
   index 2e55668fb9..6da4bb5317 100644
   --- a/cpp/src/arrow/array/data.cc
   +++ b/cpp/src/arrow/array/data.cc
   @@ -165,7 +165,11 @@ Result<std::shared_ptr<ArrayData>> CopyToImpl(const ArrayData& data,
        ARROW_ASSIGN_OR_RAISE(output->dictionary, CopyToImpl(*data.dictionary, to, copy_fn));
      }
    
   -  output->statistics = data.statistics;
   +  if (std::is_same_v) {
   +    output->statistics = data.statistics;
   +  } else {
   +    output->statistics = std::make_shared<ArrayStatistics>(*data.statistics);
   +  }
    
      return output;
    }
   ```
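
The null-pointer concern and the share-or-clone decision can be illustrated with a minimal, self-contained sketch. It does not use Arrow: `Stats`, `ViewOrCopyTag`, `DeepCopyTag`, and `PropagateStatistics` are hypothetical stand-ins invented for this example, and the tag types only assume that the two code paths could be told apart by the type of the callable.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <type_traits>

// Hypothetical stand-in for arrow::ArrayStatistics (not the real class).
struct Stats {
  std::optional<long long> null_count;
};

// Hypothetical tag types standing in for the two buffer-copy strategies.
struct ViewOrCopyTag {};
struct DeepCopyTag {};

// Shares the statistics object on the view path and clones it on the copy path.
// A null input is returned as null instead of being dereferenced.
template <typename Fn>
std::shared_ptr<Stats> PropagateStatistics(const std::shared_ptr<Stats>& in, Fn) {
  if constexpr (std::is_same_v<Fn, ViewOrCopyTag>) {
    return in;  // same object, shared ownership
  } else {
    if (!in) {
      return nullptr;  // nothing to clone
    }
    return std::make_shared<Stats>(*in);  // independent clone
  }
}
```

With `if constexpr` the branch that is not taken is discarded at compile time, so the choice costs nothing at run time. Whether the real callables have distinguishable types is a separate question that this sketch does not answer.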





Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-24 Thread via GitHub


andishgar commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2907631297

   > Given this logic, it seems the current implementation should be modified 
to copy the arrow::ArrayStatistics instead of creating a new one during the 
copy operation.
   
   1- Sorry for the confusion in my message above. What I meant to say is that the current implementation should be updated to copy the `arrow::ArrayStatistics` rather than share it during the copy operation.
   Note: `arrow::ArrayData::statistics` is a `std::shared_ptr<arrow::ArrayStatistics>`, so the following line shares the object instead of copying it.
   
https://github.com/apache/arrow/blob/8afaf95b38459f97ab12b22ebb7e1913b31025bf/cpp/src/arrow/array/data.cc#L167-L169
   
   2- Both `arrow::ArrayData::ViewOrCopy` and `arrow::ArrayData::CopyTo` use [CopyToImpl](https://github.com/apache/arrow/blob/153da3053b62f8f9bcafb87546e12a979e605734/cpp/src/arrow/array/data.cc#L146-L171). Therefore, the line below should distinguish whether the array is actually being copied. If the array is not copied, then `arrow::ArrayData::statistics` should be shared (which is the current behavior). Otherwise, it should be copied.
   
https://github.com/apache/arrow/blob/8afaf95b38459f97ab12b22ebb7e1913b31025bf/cpp/src/arrow/array/data.cc#L167-L169
   
   My suggestion is to use the following code instead of the one above
   
   ```c++
   if (output->buffers[0]->address() == data.buffers[0]->address()) {
     // The output is a view
     output->statistics = data.statistics;
   } else {
     // The output is a copy
     output->statistics = std::make_shared<ArrayStatistics>(*data.statistics);
   }
 
   ``` 
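
The address comparison suggested above can be sketched without Arrow. Because the first buffer of an array (the validity bitmap) may be absent, this version checks every buffer and tolerates null entries. `FakeBuffer`, `BufferVector`, and `SharesAllBuffers` are hypothetical names for illustration only, not Arrow APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for arrow::Buffer: it simply owns some bytes.
struct FakeBuffer {
  std::vector<std::uint8_t> bytes;
  const std::uint8_t* data() const { return bytes.data(); }
};

using BufferVector = std::vector<std::shared_ptr<FakeBuffer>>;

// Returns true when each output buffer points at the same memory as the
// matching input buffer. Null entries must be null on both sides.
bool SharesAllBuffers(const BufferVector& in, const BufferVector& out) {
  if (in.size() != out.size()) {
    return false;
  }
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (!in[i] || !out[i]) {
      if (in[i] != out[i]) {
        return false;  // present on one side only
      }
      continue;  // absent on both sides
    }
    if (in[i]->data() != out[i]->data()) {
      return false;  // different memory, so this buffer was copied
    }
  }
  return true;
}
```

Pointer equality is only a heuristic here: it says that no buffer was duplicated, not which code path produced the output.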
   





Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-24 Thread via GitHub


kou commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2907563853

   1. What is the difference between "copy" and "creating a new one" in "copy the `arrow::ArrayStatistics` instead of creating a new one during the copy operation"? I think "copy" creates a new one with the same content.
   2. Where do we need it?





Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-23 Thread via GitHub


andishgar commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2903984104

   @kou, I have several questions regarding the `CopyTo` and `ViewOrCopy` operations.
   
   I understand that `arrow::Array` is designed to be immutable, and based on that design, it's acceptable to share buffer ownership among multiple arrays (as I've seen in the casting compute function). However, when performing a deep copy on an `arrow::Array`, the intent is to produce a new array with its own unique buffer.
   
   Given this logic, it seems the current implementation should be modified to copy the `arrow::ArrayStatistics` instead of creating a new one during the copy operation.
   
   
https://github.com/apache/arrow/blob/8afaf95b38459f97ab12b22ebb7e1913b31025bf/cpp/src/arrow/array/data.cc#L167-L169
   1- Is my observation correct?
   2- Can the following code be used to distinguish between View and Copy operations on an `arrow::Buffer`?
   ```c++
   buffer_in->data() == buffer_out->data()
   ``` 
   By the way, if the answer to question one is yes, I'd like to address that first.





Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-21 Thread via GitHub


andishgar commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2897972267

   take





Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-20 Thread via GitHub


pitrou commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2893878388

   Yes, please open an issue, thank you!





Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-20 Thread via GitHub


andishgar commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2893796275

   > > This happens because `fixed_size_list` uses a single bitmap layout. Consequently, the `int32` bitmap layout is ignored in the below code, and only one data layout remains, which coincidentally matches the layout of `int32`.
   > 
   > Ah, you're right. That code is certainly too naive...
   
   @pitrou
   Should I open a separate issue for this? I may not be able to address it in the near future due to both my current knowledge limitations and the number of other issues I'm already involved with.





Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-20 Thread via GitHub


andishgar commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2893786141

   > > Ah, wait. Are statistics still valid with a different data type...?
   > 
   > Certainly not. If you view an `int32` array as `uint32`, then the min and max may be different if there are negative integers.
   
   Our focus is on preserving the statistics attributes as much as possible, as I mentioned [here](https://github.com/apache/arrow/issues/46485#issuecomment-2888790138), for example `ARROW:average_byte_width:exact`. However, with the current behavior of `arrow::Array::View`, it is not possible to preserve them.
   
   





Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-20 Thread via GitHub


pitrou commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2893747015

   > This happens because `fixed_size_list` uses a single bitmap layout. Consequently, the `int32` bitmap layout is ignored in the below code, and only one data layout remains, which coincidentally matches the layout of `int32`.
   
   Ah, you're right. That code is certainly too naive...





Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-20 Thread via GitHub


andishgar commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2893733972

   > > Hmm. The important part is that `fixed_size_list(int32(), 2)` -view-> 
`list_view(int32())` -view-> `fixed_size_list(int32(), 2)` doesn't work, right? 
[@pitrou](https://github.com/pitrou) What do you think about this behavior?
   > 
   > I don't know how it would be possible to view a fixed-size list as a 
variable-size list, even though there is no offsets or views buffer available. 
It should probably return a TypeError.
   
   @pitrou As I mentioned [here](https://github.com/apache/arrow/issues/46485#issuecomment-2893193246), no type error is emitted during the conversion from `list_view(fixed_size_list(int32, 2))` to `list_view(int32)`, even when the array is built with only half the expected number of values (0..8). This happens because `fixed_size_list` uses a single bitmap layout. Consequently, the `int32` bitmap layout is ignored in the code, and only one data layout remains, which coincidentally matches the layout of `int32`.
   
   
https://github.com/apache/arrow/blob/a1707dba7c9ec4265c048586c485c526048cf162/cpp/src/arrow/array/data.cc#L873-L880





Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-20 Thread via GitHub


andishgar commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2893719092

   > > Hmm. The important part is that `fixed_size_list(int32(), 2)` -view-> 
`list_view(int32())` -view-> `fixed_size_list(int32(), 2)` doesn't work, right? 
[@pitrou](https://github.com/pitrou) What do you think about this behavior?
   > 
   > I don't know how it would be possible to view a fixed-size list as a 
variable-size list, even though there is no offsets or views buffer available. 
It should probably return a TypeError.
   
   @pitrou As I mentioned 
[here](https://github.com/apache/arrow/issues/46485#issuecomment-2893193246), 
no type error is emitted, even when the array is built with only half the 
expected number of values (0..8). This is because fixed_size_list uses a single 
bitmap layout. As a result, the int32() bitmap layout is ignored in the code 
below, and only one data layout remains — which happens to match the int32 
layout.
   
   
https://github.com/apache/arrow/blob/a1707dba7c9ec4265c048586c485c526048cf162/cpp/src/arrow/array/data.cc#L873-L880
   
   





Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-20 Thread via GitHub


pitrou commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2893640268

   > Hmm. The important part is that `fixed_size_list(int32(), 2)` -view-> 
`list_view(int32())` -view-> `fixed_size_list(int32(), 2)` doesn't work, right? 
[@pitrou](https://github.com/pitrou) What do you think about this behavior?
   
   I don't know how it would be possible to view a fixed-size list as a 
variable-size list, even though there is no offsets or views buffer available. 
It should probably return a TypeError.





Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-20 Thread via GitHub


pitrou commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2893637216

   > Ah, wait. Are statistics still valid with a different data type...?
   
   Certainly not. If you view an `int32` array as `uint32`, then the min and max may be different if there are negative integers.
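
A small standalone sketch shows the effect. It reinterprets the same 32-bit values as unsigned and recomputes the extremes. `MinMax` and `ViewAsUInt32` are hypothetical helpers written for this example, and the byte copy only imitates what a zero-copy view would expose.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Returns {min, max} of a non-empty vector.
template <typename T>
std::pair<T, T> MinMax(const std::vector<T>& values) {
  auto [lo, hi] = std::minmax_element(values.begin(), values.end());
  return {*lo, *hi};
}

// Reinterprets the same bytes as unsigned 32-bit integers.
std::vector<std::uint32_t> ViewAsUInt32(const std::vector<std::int32_t>& values) {
  std::vector<std::uint32_t> out(values.size());
  if (!values.empty()) {
    std::memcpy(out.data(), values.data(), values.size() * sizeof(std::int32_t));
  }
  return out;
}
```

For the values `{-1, 0, 5}`, the signed minimum and maximum are -1 and 5, while the unsigned interpretation yields 0 and 4294967295, so min/max computed for one type cannot simply be carried over to the other.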





Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-20 Thread via GitHub


andishgar commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2893193246

   > Hmm. The important part is that `fixed_size_list(int32(), 2)` -view-> 
`list_view(int32())` -view-> `fixed_size_list(int32(), 2)` doesn't work, right? 
[@pitrou](https://github.com/pitrou) What do you think about this behavior?
   
   There are two issues:
   
   1- During the conversion from `fixed_size_list(int32(), 2)` to `list_view(int32())`, half of the data is lost: only the first 9 elements (`0 to 8`) are visible in `list_view(int32())`, even though the original range was `0 to 17`.
   
   2- As you mentioned, converting from `list_view(int32())` back to `fixed_size_list(int32(), 2)` results in an invalid array.
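
The loss in the first point comes down to index arithmetic, which a short sketch can make concrete. It assumes the list-view offsets and sizes are reused unchanged while the child element width drops from 2 values to 1, and that the flat value buffer holds 0..17 so that each value equals its index. `VisibleValues` is a hypothetical helper, not an Arrow function.

```cpp
#include <cassert>
#include <vector>

// Values reachable through one list-view slot when each child element
// spans `width` consecutive int32 values of the flat value buffer.
std::vector<int> VisibleValues(int offset, int size, int width) {
  std::vector<int> out;
  for (int child = offset; child < offset + size; ++child) {
    for (int k = 0; k < width; ++k) {
      out.push_back(child * width + k);
    }
  }
  return out;
}
```

With width 2 the slot at offset 3 and size 3 covers the values 6 through 11. With width 1 the same slot covers only 3, 4, and 5, and the last slot (offset 6) stops at 8, so the values 9 through 17 are never reached.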





Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-19 Thread via GitHub


kou commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2892605420

   > Doesn't the following code risk causing a stack overflow? Shouldn't a 
depth limit be considered to prevent that?
   
   Because we use recursive function calls here, right?
   
   In general, the data type for a view is provided by the user, so the user controls the nesting depth. If a user hits a stack overflow, they will report it to us. So I think that we don't need to care about it for now.
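
For reference, a depth limit on a recursive walk over a nested type could look like the sketch below. It is only an illustration of the idea raised in the question: `TypeNode` and `VisitWithLimit` are hypothetical, and nothing here reflects how `Array::View` is actually implemented.

```cpp
#include <cassert>
#include <vector>

// Hypothetical nested type: a node with zero or more child types.
struct TypeNode {
  std::vector<TypeNode> children;
};

// Walks the type tree recursively and returns false as soon as the nesting
// is deeper than `remaining_depth` levels, instead of recursing further.
bool VisitWithLimit(const TypeNode& node, int remaining_depth) {
  if (remaining_depth == 0) {
    return false;  // limit reached before this node could be visited
  }
  for (const TypeNode& child : node.children) {
    if (!VisitWithLimit(child, remaining_depth - 1)) {
      return false;
    }
  }
  return true;
}
```

A caller would turn a `false` result into an error status rather than letting the recursion exhaust the stack.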





Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-19 Thread via GitHub


kou commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2892596991

   > 2-Should I think of this method as a form of casting based on 
`arrow::DataTypeLayout`?
   
   Yes.
   
   > 3-Is the following output for `view` and `view_2` acceptable?(If yes, I 
think it might make sense to drop `arrow::ArrayStatistics`.)
   
   Hmm. The important part is that `fixed_size_list(int32(), 2)` -view-> 
`list_view(int32())` -view-> `fixed_size_list(int32(), 2)` doesn't work, right? 
@pitrou What do you think about this behavior?
   
   Note that `Array::View` was implemented by #4482.





Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-19 Thread via GitHub


andishgar commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2890156783

   I have another question that's not directly related to this issue, but it's 
about arrow::Array::View.
   
   Doesn't the following code risk causing a stack overflow? Shouldn't a depth 
limit be considered to prevent that?
   
   
https://github.com/apache/arrow/blob/7f645d404a16e8c7c939dec70ad61f4ae4de7730/cpp/src/arrow/array/data.cc#L739-L753





Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-19 Thread via GitHub


andishgar commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2890083612

   @kou
   I have two questions regarding `arrow::Array::View`:
   2- Should I think of this method as a form of casting based on `arrow::DataTypeLayout`?
   3- Is the following output for `view` and `view_2` acceptable? (If yes, I think it might make sense to drop `arrow::ArrayStatistics`.)
   
   ```c++
   Result<std::shared_ptr<Array>> MakeListView() {
     auto list_view_type = list_view(fixed_size_list(int32(), 2));
     auto builder = MakeBuilder(list_view_type).ValueOrDie();
     auto list_view_builder =
         internal::checked_pointer_cast<ListViewBuilder>(std::move(builder));
     auto fixed_size_builder =
         internal::checked_cast<FixedSizeListBuilder*>(list_view_builder->value_builder());
     auto int32_builder =
         internal::checked_cast<Int32Builder*>(fixed_size_builder->value_builder());
     // Three list-view slots, each holding three fixed-size lists
     for (int32_t i = 0; i < 9; ++i) {
       if (i % 3 == 0) {
         ARROW_RETURN_NOT_OK(list_view_builder->Append(true, 3));
       }
       ARROW_RETURN_NOT_OK(fixed_size_builder->Append());
     }
     ARROW_RETURN_NOT_OK(list_view_builder->AppendNull());
     // 9 fixed-size lists of width 2 need 18 int32 values
     for (int i = 0; i < 18; i++) {
       ARROW_RETURN_NOT_OK(int32_builder->Append(i));
     }
     return list_view_builder->Finish();
   }
   TEST(View, Test) {
     ASSERT_OK_AND_ASSIGN(auto array, MakeListView());
     auto view = array->View(list_view(int32())).ValueOrDie();
     ARROW_LOGGER_INFO("", array->ToString());
     ARROW_LOGGER_INFO("", view->ToString());
     auto view_2 = view->View(list_view(fixed_size_list(int32(), 2))).ValueOrDie();
     ARROW_LOGGER_INFO("", view_2->ToString());
   }
   ``` 
   
   output
   
   
   
   ```
   //  ARROW_LOGGER_INFO("", array->ToString())
   
   [
     [
       [
         0,
         1
       ],
       [
         2,
         3
       ],
       [
         4,
         5
       ]
     ],
     [
       [
         6,
         7
       ],
       [
         8,
         9
       ],
       [
         10,
         11
       ]
     ],
     [
       [
         12,
         13
       ],
       [
         14,
         15
       ],
       [
         16,
         17
       ]
     ],
     null
   ]
   
   //  ARROW_LOGGER_INFO("", view->ToString());
   
   [
     [
       0,
       1,
       2
     ],
     [
       3,
       4,
       5
     ],
     [
       6,
       7,
       8
     ],
     null
   ]
   
   //  ARROW_LOGGER_INFO("", view_2->ToString());
   
   
   
   ``` 
   
   
   






Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-17 Thread via GitHub


andishgar commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2888790138

   Statistical attributes might be grouped into two types: target-independent attributes, like `ARROW:average_byte_width:exact`, and target-dependent attributes, such as `min` and `max`. It seems possible to preserve target-independent attributes. I'll look into it more and share my final thoughts.
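
The proposed grouping can be sketched as a filter that keeps only the target-independent fields when the type changes. The struct below is a hypothetical three-field stand-in, not the real `arrow::ArrayStatistics`, and the classification of each field simply follows the grouping suggested above.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for arrow::ArrayStatistics with only three fields.
struct Stats {
  std::optional<double> average_byte_width;  // treated here as target-independent
  std::optional<long long> min;              // depends on how the bytes are interpreted
  std::optional<long long> max;              // depends on how the bytes are interpreted
};

// Statistics to attach to a view: everything if the type is unchanged,
// otherwise only the fields assumed to be target-independent.
Stats StatsForView(const Stats& in, bool same_type) {
  if (same_type) {
    return in;
  }
  Stats out;
  out.average_byte_width = in.average_byte_width;
  return out;
}
```

Whether a given attribute really is independent of the target type would need to be checked case by case before relying on such a filter.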





Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-17 Thread via GitHub


kou commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2888741367

   Ah, wait. Are statistics still valid with a different data type...?





Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-17 Thread via GitHub


kou commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2888740860

   Good catch. We don't need to discard `arrow::ArrayStatistics` by 
`arrow::Array::View()`.
   
   Could you open a new issue for it?





Re: [I] [C++][Statistics][Docs] Clarify whether arrow::ArrayStatistics is discarded during View, Slice, and Copy operations in the documentation for arrow::Array and arrow::ArrayData [arrow]

2025-05-17 Thread via GitHub


andishgar commented on issue #46485:
URL: https://github.com/apache/arrow/issues/46485#issuecomment-2888472845

   @kou, arrow::ArrayStatistics is discarded in arrow::Array::View. Is this the 
expected behavior?

