sjperkins commented on a change in pull request #8510:
URL: https://github.com/apache/arrow/pull/8510#discussion_r722173704



##########
File path: cpp/src/arrow/extension_type_test.cc
##########
@@ -333,4 +334,144 @@ TEST_F(TestExtensionType, ValidateExtensionArray) {
   ASSERT_OK(ext_arr4->ValidateFull());
 }
 
+class TensorArray : public ExtensionArray {
+ public:
+  using ExtensionArray::ExtensionArray;
+};
+
+class TensorArrayType : public ExtensionType {
+ public:
+  explicit TensorArrayType(const std::shared_ptr<DataType>& type,
+                           const std::vector<int64_t>& shape,
+                           const std::vector<int64_t>& strides)
+      : ExtensionType(type), type_(type), shape_(shape), strides_(strides) {}
+
+  std::shared_ptr<DataType> type() const { return type_; }
+  std::vector<int64_t> shape() const { return shape_; }
+  std::vector<int64_t> strides() const { return strides_; }
+
+  std::string extension_name() const override {
+    std::stringstream s;
+    s << "ext-array-tensor-type<type=" << *storage_type() << ", shape=(";
+    for (uint64_t i = 0; i < shape_.size(); i++) {
+      s << shape_[i];
+      if (i < shape_.size() - 1) {
+        s << ", ";
+      }
+    }
+    s << "), strides=(";
+    for (uint64_t i = 0; i < strides_.size(); i++) {
+      s << strides_[i];
+      if (i < strides_.size() - 1) {
+        s << ", ";
+      }
+    }
+    s << ")>";
+    return s.str();
+  }
+
+  bool ExtensionEquals(const ExtensionType& other) const override {
+    return this->shape() == static_cast<const TensorArrayType&>(other).shape();

Review comment:
       > In that case do we even want to keep ndim for equality comparison?
   
   This is a good question. I lean towards no, but I'm not a maintainer. I 
guess it depends on how the type is used throughout the rest of the Arrow 
ecosystem; you mentioned the Compute Engine, for example.
   
   For instance, Numba parameterises its 
[Array](https://github.com/numba/numba/blob/e617b39a0b4b23d7b69d16f482fd66b4ac6cc307/numba/core/types/npytypes.py#L423)
 type on `dtype`, `ndim`, `layout`, `readonly` and `aligned`. I would guess 
this is because these properties are useful for generating efficient LLVM code.
   
   I'm not suggesting that Arrow should copy Numba's Type parameterisation, but 
there are enough similarities to provide food for thought.
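
To make the trade-off concrete, here is a minimal standalone sketch of the two equality policies under discussion. It is not Arrow code: `TensorParams`, `EqualsByShape`, `EqualsByNdim` and `Example` are hypothetical names I made up for illustration, and the strides assume row-major `int64` data.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the parameters carried by TensorArrayType.
struct TensorParams {
  std::vector<int64_t> shape;
  std::vector<int64_t> strides;
};

// Policy A: equal only when the full shape matches
// (this mirrors the shape comparison in the diff above).
inline bool EqualsByShape(const TensorParams& a, const TensorParams& b) {
  return a.shape == b.shape;
}

// Policy B: equal whenever the number of dimensions matches,
// regardless of the extent of each dimension.
inline bool EqualsByNdim(const TensorParams& a, const TensorParams& b) {
  return a.shape.size() == b.shape.size();
}

inline void Example() {
  TensorParams a{{2, 3}, {24, 8}};         // 2-D
  TensorParams b{{4, 5}, {40, 8}};         // 2-D, different extents
  TensorParams c{{2, 3, 4}, {96, 32, 8}};  // 3-D
  assert(!EqualsByShape(a, b));  // different shapes: distinct under policy A
  assert(EqualsByNdim(a, b));    // same rank: equal under policy B
  assert(!EqualsByNdim(a, c));   // different rank: distinct under both
}
```

Under policy B, every 2-D tensor array would share one type, which is closer to what Numba does; under policy A, each distinct shape is its own type.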





