[arrow-nanoarrow] branch main updated: ci: Add codespell to pre-commit (#274)

paleolimbot Tue, 15 Aug 2023 07:53:55 -0700

This is an automated email from the ASF dual-hosted git repository.

paleolimbot pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-nanoarrow.git



The following commit(s) were added to refs/heads/main by this push:
     new 9ce719a  ci: Add codespell to pre-commit (#274)
9ce719a is described below

commit 9ce719a7c73b7e6b45422dc3d1373213640f2fae
Author: William Ayd <[email protected]>
AuthorDate: Tue Aug 15 10:53:45 2023 -0400

    ci: Add codespell to pre-commit (#274)
---
 .pre-commit-config.yaml                                       | 11 +++++++++++
 dev/release/README.md                                         |  6 +++---
 docs/source/getting-started.md                                |  8 ++++----
 extensions/nanoarrow_device/README.md                         |  2 +-
 extensions/nanoarrow_ipc/src/nanoarrow/nanoarrow_ipc.h        |  2 +-
 .../nanoarrow_ipc/src/nanoarrow/nanoarrow_ipc_decoder.c       |  6 +++---
 extensions/nanoarrow_ipc/src/nanoarrow/nanoarrow_ipc_reader.c |  2 +-
 python/README.md                                              |  2 +-
 r/src/array.h                                                 |  2 +-
 r/src/array_view.h                                            |  2 +-
 r/src/convert.h                                               |  2 +-
 src/nanoarrow/array_test.cc                                   |  4 ++--
 src/nanoarrow/schema_test.cc                                  |  2 +-
 13 files changed, 31 insertions(+), 20 deletions(-)

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index 0f731a8..803299d 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -57,5 +57,16 @@ repos:
     - id: isort
       types_or: [python]
       exclude: "__init__.py$"
+  - repo: https://github.com/codespell-project/codespell
+    rev: v2.2.5
+    hooks:
+    -   id: codespell
+        types_or: [rst, markdown, c, c++]
+        additional_dependencies: [tomli]
+        exclude: |
+            (?x)
+            ^extensions/nanoarrow_ipc/thirdparty
+            |nanoarrow_ipc_flatcc_generated.h
+            |nanoarrow_device_metal.cc
 
 exclude: "^dist"
diff --git a/dev/release/README.md b/dev/release/README.md
index c9c74e1..474e3fe 100644
--- a/dev/release/README.md
+++ b/dev/release/README.md
@@ -102,7 +102,7 @@ You can install R using the instructions provided on the
 ### Conda (Linux and MacOS)
 
 Using `conda`, one can install all requirements needed for verification on Linux
-or MacOS. Users are reccomended to install `gnupg` using
+or MacOS. Users are recommended to install `gnupg` using
 a system installer because of interactions with other installations that
 may cause a crash.
 
@@ -281,7 +281,7 @@ is release-candidate worthy, `git push` the tag to the `upstream` repository
 This will kick off a
 [packaging workflow](https://github.com/apache/arrow-nanoarrow/blob/main/.github/workflows/packaging.yaml)
 that will create a GitHub release and upload assets that are required for
-later steps. This step can be done by any Arrow comitter.
+later steps. This step can be done by any Arrow committer.
 
 Next, all assets need to be signed by somebody whose GPG key is listed in the
 [Arrow developers KEYS file](https://dist.apache.org/repos/dist/dev/arrow/KEYS)
@@ -299,7 +299,7 @@ dev/release/02-sign.sh 0.2.0 0
 
 Finally, run
 [03-source.sh](https://github.com/apache/arrow-nanoarrow/blob/main/dev/release/03-source.sh).
-This step can be done by any Arrow comitter. The caller of this script does not need to
+This step can be done by any Arrow committer. The caller of this script does not need to
 be on any particular branch but *does* need the
 [dev/release/.env](https://github.com/apache/arrow-nanoarrow/blob/main/dev/release/.env.example)
 file to exist setting the appropriate `APACHE_USERNAME` environment variable.
diff --git a/docs/source/getting-started.md b/docs/source/getting-started.md
index 41b3515..b4c61c1 100644
--- a/docs/source/getting-started.md
+++ b/docs/source/getting-started.md
@@ -181,7 +181,7 @@ interface and a few conventions used in the nanoarrow implementation.
 
 First, let's discuss the `ArrowSchema` and the `ArrowArray`. You can think of an
 `ArrowSchema` as an expression of a data type, whereas an `ArrowArray` is the
-data itself. These structures accomodate nested types: columns are encoded in
+data itself. These structures accommodate nested types: columns are encoded in
 the `children` member of each. You always need to know the data type of an
 `ArrowArray` before accessing its contents. In our case we only operate on arrays
 of one type ("string") and document that in our interface; for functions that
@@ -307,7 +307,7 @@ target_link_libraries(linesplitter PRIVATE nanoarrow)
 ```
 
 After saving `CMakeLists.txt`, you may have to close and re-open the `linesplitter`
-directory in VSCode to activate the CMake integration. From the command pallete
+directory in VSCode to activate the CMake integration. From the command palette
 (i.e., Control/Command-Shift-P), choose **CMake: Build**. If all went well, you should
 see a few lines of output indicating progress towards building and linking `linesplitter`.
 
@@ -472,8 +472,8 @@ gtest_discover_tests(linesplitter_test)
 
 After you're done, build the project again using the **CMake: Build** command from
 the command palette. If all goes well, choose **CMake: Refresh Tests** and then
-**Test: Run All Tests** from the command pallete to run them! You should see some
-output indiciating that tests ran successfully, or you can use VSCode's "Testing"
+**Test: Run All Tests** from the command palette to run them! You should see some
+output indicating that tests ran successfully, or you can use VSCode's "Testing"
 panel to visually inspect which tests passed.
 
 ```{=rst}
diff --git a/extensions/nanoarrow_device/README.md b/extensions/nanoarrow_device/README.md
index e4fd6f6..1b6d9a9 100644
--- a/extensions/nanoarrow_device/README.md
+++ b/extensions/nanoarrow_device/README.md
@@ -24,7 +24,7 @@ extended to the
 [Arrow C Device](https://arrow.apache.org/docs/dev/format/CDeviceDataInterface.html)
 interfaces in the Arrow specification.
 
-Currently, this extension provides an implementation fof CUDA devices
+Currently, this extension provides an implementation of CUDA devices
 and an implementation for the default Apple Metal device on MacOS/M1.
 These implementation are preliminary/experimental and are under active
 development.
diff --git a/extensions/nanoarrow_ipc/src/nanoarrow/nanoarrow_ipc.h b/extensions/nanoarrow_ipc/src/nanoarrow/nanoarrow_ipc.h
index 4078708..84d0885 100644
--- a/extensions/nanoarrow_ipc/src/nanoarrow/nanoarrow_ipc.h
+++ b/extensions/nanoarrow_ipc/src/nanoarrow/nanoarrow_ipc.h
@@ -222,7 +222,7 @@ ArrowErrorCode ArrowIpcDecoderVerifyHeader(struct ArrowIpcDecoder* decoder,
 /// decoder.codec is set and a successful call can be followed by a call to
 /// ArrowIpcDecoderDecodeArray().
 ///
-/// In almost all cases this should be preceeded by a call to
+/// In almost all cases this should be preceded by a call to
 /// ArrowIpcDecoderVerifyHeader() to ensure decoding does not access data outside of the
 /// specified buffer.
 ///
diff --git a/extensions/nanoarrow_ipc/src/nanoarrow/nanoarrow_ipc_decoder.c b/extensions/nanoarrow_ipc/src/nanoarrow/nanoarrow_ipc_decoder.c
index 9583565..1f37aaa 100644
--- a/extensions/nanoarrow_ipc/src/nanoarrow/nanoarrow_ipc_decoder.c
+++ b/extensions/nanoarrow_ipc/src/nanoarrow/nanoarrow_ipc_decoder.c
@@ -56,7 +56,7 @@ struct ArrowIpcField {
   // array is scratch space for any intermediary allocations (i.e., it is never moved
   // to the user).
   struct ArrowArray* array;
-  // The cumulative number of buffers preceeding this node.
+  // The cumulative number of buffers preceding this node.
   int64_t buffer_offset;
 };
 
@@ -1068,7 +1068,7 @@ ArrowErrorCode ArrowIpcDecoderDecodeHeader(struct ArrowIpcDecoder* decoder,
                     ns(MessageHeader_type_name(decoder->message_type)));
       return ENOTSUP;
     default:
-      ArrowErrorSet(error, "Unnown message type: %d", (int)(decoder->message_type));
+      ArrowErrorSet(error, "Unknown message type: %d", (int)(decoder->message_type));
       return EINVAL;
   }
 
@@ -1570,7 +1570,7 @@ static ArrowErrorCode ArrowIpcDecoderDecodeArrayInternal(
 
   // If validation is going to happen it has already occurred; however, the part of
   // ArrowArrayFinishBuilding() that allocates a data buffer if the data buffer is
-  // NULL (required for compatability with Arrow <= 9.0.0) assumes CPU data access
+  // NULL (required for compatibility with Arrow <= 9.0.0) assumes CPU data access
   // and thus needs a validation level >= default.
   if (validation_level >= NANOARROW_VALIDATION_LEVEL_DEFAULT) {
     NANOARROW_RETURN_NOT_OK(
diff --git a/extensions/nanoarrow_ipc/src/nanoarrow/nanoarrow_ipc_reader.c b/extensions/nanoarrow_ipc/src/nanoarrow/nanoarrow_ipc_reader.c
index 15b7c4b..2d18900 100644
--- a/extensions/nanoarrow_ipc/src/nanoarrow/nanoarrow_ipc_reader.c
+++ b/extensions/nanoarrow_ipc/src/nanoarrow/nanoarrow_ipc_reader.c
@@ -212,7 +212,7 @@ static int ArrowIpcArrayStreamReaderNextHeader(
   if (bytes_read == 0) {
     // The caller might not use this error message (e.g., if the end of the stream
     // is one of the valid outcomes) but we set the error anyway in case it gets
-    // propagated higher (e.g., if the stream is emtpy and there's no schema message)
+    // propagated higher (e.g., if the stream is empty and there's no schema message)
     ArrowErrorSet(&private_data->error, "No data available on stream");
     return ENODATA;
   } else if (bytes_read != 8) {
diff --git a/python/README.md b/python/README.md
index 3a80ee9..2051180 100644
--- a/python/README.md
+++ b/python/README.md
@@ -157,7 +157,7 @@ reader = pa.RecordBatchReader.from_batches(pa_array.schema, [pa_array])
 array_stream = na.array_stream(reader)
 ```
 
-You can pull the next array from the stream using `.get_next()` or use it like an interator. The `.get_next()` method will return `None` when there are no more arrays in the stream.
+You can pull the next array from the stream using `.get_next()` or use it like an iterator. The `.get_next()` method will return `None` when there are no more arrays in the stream.
 
 
 ```python
diff --git a/r/src/array.h b/r/src/array.h
index 5d0b9ac..f45d634 100644
--- a/r/src/array.h
+++ b/r/src/array.h
@@ -140,7 +140,7 @@ static inline void array_export(SEXP array_xptr, struct ArrowArray* array_copy)
     UNPROTECT(1);
   }
 
-  // Swap out any children for independently releasable chilren and export them
+  // Swap out any children for independently releasable children and export them
   // into array_copy->children
   result = ArrowArrayAllocateChildren(array_copy, array->n_children);
   if (result != NANOARROW_OK) {
diff --git a/r/src/array_view.h b/r/src/array_view.h
index a2ab79c..8eeeccd 100644
--- a/r/src/array_view.h
+++ b/r/src/array_view.h
@@ -25,7 +25,7 @@
 
 // Creates an external pointer to a struct ArrowArrayView, erroring
 // if the validation inherent in its creation fails (i.e., calling
-// this will aslo validate the array). This requires that array_xptr
+// this will also validate the array). This requires that array_xptr
 // has a schema attached. The ArrowArrayView is an augmented structure
 // provided by the nanoarrow C library that makes it easier to access
 // elements and buffers. This is not currently exposed at the R
diff --git a/r/src/convert.h b/r/src/convert.h
index 0397447..3641295 100644
--- a/r/src/convert.h
+++ b/r/src/convert.h
@@ -45,7 +45,7 @@ int nanoarrow_converter_set_array(SEXP converter_xptr, SEXP array_xptr);
 int nanoarrow_converter_reserve(SEXP converter_xptr, R_xlen_t additional_size);
 
 // Materialize the next n elements into the output. Returns the number of elements
-// that were actualy materialized which may be less than n.
+// that were actually materialized which may be less than n.
 R_xlen_t nanoarrow_converter_materialize_n(SEXP converter_xptr, R_xlen_t n);
 
 // Materialize the entire array into the output. Returns an errno code.
diff --git a/src/nanoarrow/array_test.cc b/src/nanoarrow/array_test.cc
index 0454299..d6702f2 100644
--- a/src/nanoarrow/array_test.cc
+++ b/src/nanoarrow/array_test.cc
@@ -1744,7 +1744,7 @@ TEST(ArrayTest, ArrayViewTestStruct) {
   EXPECT_EQ(array_view.layout.buffer_type[0], NANOARROW_BUFFER_TYPE_VALIDITY);
   EXPECT_EQ(array_view.layout.element_size_bits[0], 1);
 
-  // Exepct error for out-of-memory
+  // Expect error for out-of-memory
   EXPECT_EQ(ArrowArrayViewAllocateChildren(
                 &array_view, std::numeric_limits<int64_t>::max() / sizeof(void*)),
             ENOMEM);
@@ -1760,7 +1760,7 @@ TEST(ArrayTest, ArrayViewTestStruct) {
   EXPECT_EQ(array_view.buffer_views[0].size_bytes, 1);
   EXPECT_EQ(array_view.children[0]->buffer_views[1].size_bytes, 5 * sizeof(int32_t));
 
-  // Exepct error for attempting to allocate a children array that already exists
+  // Except error for attempting to allocate a children array that already exists
   EXPECT_EQ(ArrowArrayViewAllocateChildren(&array_view, 1), EINVAL);
 
   ArrowArrayViewReset(&array_view);
diff --git a/src/nanoarrow/schema_test.cc b/src/nanoarrow/schema_test.cc
index c7419e9..f2d291f 100644
--- a/src/nanoarrow/schema_test.cc
+++ b/src/nanoarrow/schema_test.cc
@@ -340,7 +340,7 @@ TEST(SchemaTest, SchemaInitUnion) {
             NANOARROW_OK);
   EXPECT_STREQ(schema.format, "+us:");
   EXPECT_EQ(schema.n_children, 0);
-  // The zero-case union isn't supported by Arrow C++'s C data inferface implementation
+  // The zero-case union isn't supported by Arrow C++'s C data interface implementation
   schema.release(&schema);
 
   ArrowSchemaInit(&schema);
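
As a reading aid for the `exclude` block added to `.pre-commit-config.yaml` in this commit, here is a minimal sketch of how that multi-line pattern would behave when matched against file paths. It assumes pre-commit applies `exclude` as a Python regular expression with `re.search` semantics; the helper name `is_excluded` and the sample paths in the usage note are hypothetical and are not part of the commit.

```python
import re

# Pattern text copied from the hook's `exclude` key. The YAML literal block
# keeps the newlines; the (?x) verbose flag tells the regex engine to ignore
# that whitespace, so the lines act as one alternation.
EXCLUDE = r"""(?x)
^extensions/nanoarrow_ipc/thirdparty
|nanoarrow_ipc_flatcc_generated.h
|nanoarrow_device_metal.cc
"""


def is_excluded(path: str) -> bool:
    """Hypothetical helper: True if `path` matches the exclude pattern.

    Assumes search (not full-match) semantics, so only the first
    alternative, which starts with ^, is anchored to the start of the path.
    """
    return re.search(EXCLUDE, path) is not None
```

Under these assumptions, a path such as `extensions/nanoarrow_ipc/thirdparty/flatcc/builder.c` is skipped by the hook, while `src/nanoarrow/array_test.cc` is still checked. The two unanchored alternatives match anywhere in a path, so the generated header and the Metal source are skipped wherever they live.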
