[jira] [Created] (ARROW-17300) [Java][Docs] Compare/contrast the Netty and Unsafe memory backends
David Li created ARROW-17300: Summary: [Java][Docs] Compare/contrast the Netty and Unsafe memory backends Key: ARROW-17300 URL: https://issues.apache.org/jira/browse/ARROW-17300 Project: Apache Arrow Issue Type: Improvement Components: Documentation, Java Reporter: David Li We should compare why you might want to use each. Are there benchmarks in the Java benchmark suite that might also be useful? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ARROW-17299) Expose the Scanner kDefaultBatchReadahead and kDefaultFragmentReadahead parameters
Ziheng Wang created ARROW-17299: --- Summary: Expose the Scanner kDefaultBatchReadahead and kDefaultFragmentReadahead parameters Key: ARROW-17299 URL: https://issues.apache.org/jira/browse/ARROW-17299 Project: Apache Arrow Issue Type: New Feature Components: C++, Python Reporter: Ziheng Wang Assignee: Ziheng Wang In the Scanner, the parameters kDefaultFragmentReadahead and kDefaultBatchReadahead are currently set to fixed values that cannot be changed. This is a problem because tuning these values is the key to trading off RAM usage against network I/O utilization during reads. For example, on an i3.2xlarge instance on AWS you can reach peak throughput only by quadrupling kDefaultFragmentReadahead from its default. The current settings are very conservative and assume a < 1 Gbps network. Exposing them would allow people to tune the Scanner's behavior to their own hardware.
[jira] [Created] (ARROW-17298) [C++][Docs] Add Acero project example in Getting Started Section
Will Jones created ARROW-17298: -- Summary: [C++][Docs] Add Acero project example in Getting Started Section Key: ARROW-17298 URL: https://issues.apache.org/jira/browse/ARROW-17298 Project: Apache Arrow Issue Type: Improvement Components: Documentation Reporter: Will Jones Fix For: 10.0.0
From [~westonpace]:
{quote}
A request I've seen a few times (and just received now) has been... Can you point me at a sample C++ starter project that links against Acero? For example, I tend to use a CMakeLists.txt that looks something like...
{code}
cmake_minimum_required(VERSION 3.10)

set(CMAKE_MODULE_PATH "${CMAKE_SOURCE_DIR}/cmake;${CMAKE_MODULE_PATH}")
set(CMAKE_CXX_FLAGS "-Wall -Wextra")
# set(CMAKE_CXX_FLAGS_DEBUG "-g")
set(CMAKE_CXX_FLAGS_RELEASE "-O3")

# set the project name
project(Experiments VERSION 1.0)

# specify the C++ standard
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED True)
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)

if(NOT DEFINED CONDA_HOME)
  message(FATAL_ERROR "CONDA_HOME is a required variable")
endif()

include_directories(SYSTEM ${CONDA_HOME}/include)
link_directories(${CONDA_HOME}/lib64)
link_directories(${CONDA_HOME}/lib)

function(experiment TARGET)
  add_executable(
    ${TARGET}
    ${TARGET}.cc
  )
  target_link_libraries(
    ${TARGET}
    arrow
    arrow_dataset
    parquet
    aws-cpp-sdk-core
    aws-cpp-sdk-s3
    glog
    pthread
    re2
    utf8proc
    lz4
    snappy
    z
    zstd
    aws-cpp-sdk-identity-management
    thrift
  )
  if (MSVC)
    target_compile_options(${TARGET} PRIVATE /W4 /WX)
  else ()
    target_compile_options(${TARGET} PRIVATE -Wall -Wextra -Wpedantic -Werror)
  endif ()
endfunction()

experiment(arrow_16642)
{code}
{quote}
[GitHub] [arrow-nanoarrow] lidavidm commented on pull request #10: Implement bitmap setters, getters, and element-wise builder
lidavidm commented on PR #10: URL: https://github.com/apache/arrow-nanoarrow/pull/10#issuecomment-1204391711 > I'm not sure which buffer you mean! The `struct ArrowBuffer`? Sorry, I mean the buffer of values inside the bitmap builder -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow-nanoarrow] paleolimbot commented on pull request #10: Implement bitmap setters, getters, and element-wise builder
paleolimbot commented on PR #10: URL: https://github.com/apache/arrow-nanoarrow/pull/10#issuecomment-1204391097 I'm not sure which buffer you mean! The `struct ArrowBuffer`?
[jira] [Created] (ARROW-17297) [Java] [Doc] C++ to Java via C Data Interface (ImportArray, ImportRecordBatch)
David Dali Susanibar Arce created ARROW-17297: - Summary: [Java] [Doc] C++ to Java via C Data Interface (ImportArray, ImportRecordBatch) Key: ARROW-17297 URL: https://issues.apache.org/jira/browse/ARROW-17297 Project: Apache Arrow Issue Type: Task Components: Documentation, Java Reporter: David Dali Susanibar Arce Add detailed documentation on how to pass data from C++ to Java via the C Data Interface for: * ImportArray * ImportRecordBatch
[jira] [Created] (ARROW-17296) [Python] Doctest failure in pyarrow.parquet.read_metadata after 10.0.0 dev version update
Wes McKinney created ARROW-17296: Summary: [Python] Doctest failure in pyarrow.parquet.read_metadata after 10.0.0 dev version update Key: ARROW-17296 URL: https://issues.apache.org/jira/browse/ARROW-17296 Project: Apache Arrow Issue Type: Bug Components: Python Reporter: Wes McKinney Fix For: 10.0.0 The version update caused the doctest in this function to fail
[GitHub] [arrow-nanoarrow] lidavidm commented on a diff in pull request #10: Implement bitmap setters, getters, and element-wise builder
lidavidm commented on code in PR #10: URL: https://github.com/apache/arrow-nanoarrow/pull/10#discussion_r937046260 ## src/nanoarrow/typedefs_inline.h: ## @@ -0,0 +1,172 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +#ifndef NANOARROW_TYPEDEFS_INLINE_H_INCLUDED +#define NANOARROW_TYPEDEFS_INLINE_H_INCLUDED + +#include + +/// \defgroup nanoarrow-inline-typedef Type definitions used in inlined implementations Review Comment: This still needs the `extern "C"` ## src/nanoarrow/bitmap_inline.h: ## @@ -0,0 +1,131 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. 
See the License for the +// specific language governing permissions and limitations +// under the License. + +#ifndef NANOARROW_BITMAP_INLINE_H_INCLUDED +#define NANOARROW_BITMAP_INLINE_H_INCLUDED + +#include +#include + +#include "buffer_inline.h" +#include "typedefs_inline.h" + +static inline int8_t ArrowBitmapElement(const void* bitmap, int64_t i) { Review Comment: It may be worth stealing Arrow's implementations of these utilities: https://github.com/apache/arrow/blob/3b987d92d14ce7b5f5ccd2afb7366273e20348d4/cpp/src/arrow/util/bit_util.h#L296 ## src/nanoarrow/bitmap_inline.h: ## @@ -0,0 +1,131 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. 
+ +#ifndef NANOARROW_BITMAP_INLINE_H_INCLUDED +#define NANOARROW_BITMAP_INLINE_H_INCLUDED + +#include +#include + +#include "buffer_inline.h" +#include "typedefs_inline.h" + +static inline int8_t ArrowBitmapElement(const void* bitmap, int64_t i) { + const int8_t* bitmap_char = (const int8_t*)bitmap; + return 0 != (bitmap_char[i / 8] & ((int8_t)0x01) << (i % 8)); +} + +static inline void ArrowBitmapSetElement(void* bitmap, int64_t i, int8_t value) { + int8_t* bitmap_char = (int8_t*)bitmap; + int8_t mask = 0x01 << (i % 8); + if (value) { +bitmap_char[i / 8] |= mask; + } else { +bitmap_char[i / 8] &= ~mask; + } +} + +static inline int64_t ArrowBitmapCountTrue(const void* bitmap, int64_t i_from, + int64_t i_to) { + int64_t count = 0; + for (int64_t i = i_from; i < i_to; i++) { +count += ArrowBitmapElement(bitmap, i); Review Comment: I wonder if this optimized to popcnt or not…but Arrow also has implementations to steal! ## src/nanoarrow/nanoarrow.c: ## @@ -17,6 +17,7 @@ #include "allocator.c" #include "buffer.c" +#include "bitmap.c" Review Comment: Does this file exist? ## src/nanoarrow/nanoarrow.h: ## @@ -483,82 +372,117 @@ ArrowErrorCode ArrowSchemaViewInit(struct ArrowSchemaView* schema_view, /// }@ -/// \defgroup nanoarrow-buffer-builder Growable buffer builders - -/// \brief An owning mutable view of a buffer -struct ArrowBuffer { - /// \brief A pointer t
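A note on the popcount question raised in the review above: one common way to avoid a bit-at-a-time loop is to count the unaligned head and tail bits individually and the whole bytes in between with a per-byte popcount. The sketch below assumes the same LSB-first bit order as the diff. The names `ExampleBitmapGetBit`, `ExamplePopcountByte`, and `ExampleBitmapCountSet` are hypothetical and are not part of nanoarrow or Arrow; whether a given compiler lowers any of this to a `popcnt` instruction is not something the sketch verifies.

```c
#include <stdint.h>

/* Hypothetical helper: read bit i of an LSB-first bitmap (returns 0 or 1). */
static inline int8_t ExampleBitmapGetBit(const uint8_t* bitmap, int64_t i) {
  return (int8_t)((bitmap[i / 8] >> (i % 8)) & 1);
}

/* Portable popcount of one byte: each iteration clears the lowest set bit. */
static inline int ExamplePopcountByte(uint8_t b) {
  int n = 0;
  while (b) {
    b &= (uint8_t)(b - 1);
    n++;
  }
  return n;
}

/* Count set bits in the half-open bit range [i_from, i_to). */
static inline int64_t ExampleBitmapCountSet(const uint8_t* bitmap, int64_t i_from,
                                            int64_t i_to) {
  int64_t count = 0;
  int64_t i = i_from;

  /* Leading bits, one at a time, until we reach a byte boundary. */
  while (i < i_to && (i % 8) != 0) {
    count += ExampleBitmapGetBit(bitmap, i);
    i++;
  }

  /* Whole bytes in the middle. */
  while (i + 8 <= i_to) {
    count += ExamplePopcountByte(bitmap[i / 8]);
    i += 8;
  }

  /* Trailing bits that do not fill a byte. */
  while (i < i_to) {
    count += ExampleBitmapGetBit(bitmap, i);
    i++;
  }

  return count;
}
```

The bytes are treated as `uint8_t` rather than `int8_t` so that shifts and masks never involve negative values.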
[GitHub] [arrow-nanoarrow] paleolimbot commented on pull request #10: Implement bitmap setters, getters, and element-wise builder
paleolimbot commented on PR #10: URL: https://github.com/apache/arrow-nanoarrow/pull/10#issuecomment-1204376962 Ok, made the definitions inline, although perhaps there is a prettier way to do it.
[GitHub] [arrow-nanoarrow] paleolimbot opened a new pull request, #12: Add metadata builder functions
paleolimbot opened a new pull request, #12: URL: https://github.com/apache/arrow-nanoarrow/pull/12 Fixes #6. This PR adds functions `ArrowMetadataBuilderAppend()` (for blindly but efficiently adding a key/value pair to the end of some metadata) and `ArrowMetadataBuilderSet()` (to less efficiently replace or remove the value for a key). The use case I had in mind is building extension type metadata from some input (i.e., make an output schema like the input except with a new extension type or with new serialized extension type metadata). It's rather difficult to replicate the "replace" or "remove" behaviour otherwise.
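For context on why a "blind" append can be cheap: in the Arrow C Data Interface, schema metadata is a single binary blob holding an int32 pair count followed, for each pair, by an int32 key length, the key bytes, an int32 value length, and the value bytes (native endianness, no null terminators). Appending only has to write the new pair at the end and bump the count, whereas replacing or removing a key requires scanning and rewriting. The sketch below illustrates the append case over a caller-provided fixed-size buffer. `ExampleMetadataInit` and `ExampleMetadataAppend` are hypothetical names with made-up signatures; they are not the functions added by the PR.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical: start an empty metadata blob (a pair count of zero).
   Returns the number of bytes used so far. */
static int64_t ExampleMetadataInit(char* buf) {
  int32_t n_pairs = 0;
  memcpy(buf, &n_pairs, sizeof(n_pairs));
  return (int64_t)sizeof(n_pairs);
}

/* Hypothetical: append one key/value pair without checking for duplicates.
   Returns the new size in bytes, or -1 if the pair does not fit. */
static int64_t ExampleMetadataAppend(char* buf, int64_t size, int64_t capacity,
                                     const char* key, const char* value) {
  int32_t key_len = (int32_t)strlen(key);
  int32_t value_len = (int32_t)strlen(value);
  int64_t needed = 2 * (int64_t)sizeof(int32_t) + key_len + value_len;
  if (size + needed > capacity) {
    return -1;
  }

  /* Write [key_len][key bytes][value_len][value bytes] at the end. */
  char* out = buf + size;
  memcpy(out, &key_len, sizeof(key_len));
  out += sizeof(key_len);
  memcpy(out, key, (size_t)key_len);
  out += key_len;
  memcpy(out, &value_len, sizeof(value_len));
  out += sizeof(value_len);
  memcpy(out, value, (size_t)value_len);

  /* Bump the pair count stored in the first four bytes. */
  int32_t n_pairs;
  memcpy(&n_pairs, buf, sizeof(n_pairs));
  n_pairs++;
  memcpy(buf, &n_pairs, sizeof(n_pairs));

  return size + needed;
}
```

`memcpy` is used for every integer read and write so the sketch does not depend on the buffer being aligned.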
[jira] [Created] (ARROW-17295) [C++] Build separate bundled_dependencies.so
Will Jones created ARROW-17295: -- Summary: [C++] Build separate bundled_dependencies.so Key: ARROW-17295 URL: https://issues.apache.org/jira/browse/ARROW-17295 Project: Apache Arrow Issue Type: Improvement Components: C++ Affects Versions: 8.0.1, 8.0.0 Reporter: Will Jones When building arrow _static_ libraries with bundled dependencies, we produce {{arrow_bundled_dependencies.a}}. But when building dynamic libraries, the bundled dependencies are statically linked directly into the arrow libraries (libarrow, libarrow_flight, etc.). This means that users can access the symbols of bundled dependencies in the static case, but not in the dynamic library case. One use case of this is being able to pass in gRPC configuration to a Flight server, which requires access to gRPC symbols. Could we change the dynamic library build to produce an {{arrow_bundled_dependencies.so}} so that the symbols are accessible?
[GitHub] [arrow-nanoarrow] paleolimbot commented on pull request #10: Implement bitmap setters, getters, and element-wise builder
paleolimbot commented on PR #10: URL: https://github.com/apache/arrow-nanoarrow/pull/10#issuecomment-1204110572 You convinced me! I didn't know you could split up an `inline` declaration and definition and that solves the conundrum I was having.
[GitHub] [arrow-nanoarrow] lidavidm commented on pull request #10: Implement bitmap setters, getters, and element-wise builder
lidavidm commented on PR #10: URL: https://github.com/apache/arrow-nanoarrow/pull/10#issuecomment-1204107639 But if it's easier to do things in stages then we can just review the current PR?
[GitHub] [arrow-nanoarrow] lidavidm commented on pull request #10: Implement bitmap setters, getters, and element-wise builder
lidavidm commented on PR #10: URL: https://github.com/apache/arrow-nanoarrow/pull/10#issuecomment-1204106882 I don't think we need to split up header files or rearrange things, though, and I don't think we need to inline the buffer functions - it could just be moving the definitions from `bitmap.c` to `nanoarrow.h` and adding `inline`
[GitHub] [arrow-adbc] lidavidm opened a new issue, #53: [Python] Driver distribution
lidavidm opened a new issue, #53: URL: https://github.com/apache/arrow-adbc/issues/53 We don't want to/can't futz with shared libraries when distributing packages. Instead, since drivers have an entrypoint, just expose the entrypoint in Python? e.g. `adbc_driver_sqlite` will bundle the driver and expose `adbc_driver_sqlite._entrypoint(driver_address: int) -> int` ~= `AdbcStatusCode _entrypoint(struct AdbcDriver* driver)`. Then the driver manager can just import the package and call the entrypoint. (There'll need to be some support for this on the C++ side too) That way we can more easily distribute pip-installable packages
[GitHub] [arrow-adbc] lidavidm opened a new issue, #52: [Python] Implement DBAPI
lidavidm opened a new issue, #52: URL: https://github.com/apache/arrow-adbc/issues/52 Will make implementing Ibis backends easier since we can reuse more of the code for SQLAlchemy-based backends
[GitHub] [arrow-nanoarrow] pitrou commented on issue #11: Inline performance-sensitive functions and their dependencies
pitrou commented on issue #11: URL: https://github.com/apache/arrow-nanoarrow/issues/11#issuecomment-1204071180 Note that inline functions (or macros) require you to be very careful if you want to maintain some ABI compatibility. I don't know if that's a goal of nanoarrow, but worth remembering.
[GitHub] [arrow-nanoarrow] paleolimbot commented on pull request #10: Implement bitmap setters, getters, and element-wise builder
paleolimbot commented on PR #10: URL: https://github.com/apache/arrow-nanoarrow/pull/10#issuecomment-1204069688 I made #11 to set up a PR for the inlining process...that's definitely important but requires a rather different set of changes (splitting up header files, rearranging stuff, defines, maybe some benchmarking) that transcend the scope of the bitmap (since these implementations use the buffer functions). If it works for all of you, this and perhaps the next few PRs are more about scope, syntax, and correctness (i.e., tests).
[GitHub] [arrow-nanoarrow] lidavidm commented on issue #11: Inline performance-sensitive functions and their dependencies
lidavidm commented on issue #11: URL: https://github.com/apache/arrow-nanoarrow/issues/11#issuecomment-1204065088 I don't think you need to split up the headers, just
```c
/// \brief Docstring...
inline ArrowErrorCode ArrowBufferAppend() {
  // Definition...
}
```
directly in the header and continue as usual. If having definitions inline is messy, you could still split the declaration and definition (and then shunt all the definitions to the bottom)
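The second suggestion in the comment above (declare near the top with the docstring, define at the bottom of the same header) might look like the following sketch. `struct ExampleBuffer` and `ExampleBufferAppendByte` are hypothetical stand-ins, not nanoarrow's actual types or functions. The sketch uses `static inline`, as the `bitmap_inline.h` diff in PR #10 does, because in C99 a plain `inline` definition does not by itself provide an external definition and can lead to link errors when the call is not inlined.

```c
#include <stdint.h>

/* ---- Declarations: the documented, readable part of the header ---- */

/* Hypothetical fixed-capacity buffer used only for this illustration. */
struct ExampleBuffer {
  uint8_t* data;
  int64_t size_bytes;
  int64_t capacity_bytes;
};

/// \brief Append one byte to the buffer
///
/// Returns 0 on success or 1 if the buffer is already full.
static inline int ExampleBufferAppendByte(struct ExampleBuffer* buffer, uint8_t value);

/* ---- Definitions: shunted to the bottom of the same header ---- */

static inline int ExampleBufferAppendByte(struct ExampleBuffer* buffer, uint8_t value) {
  if (buffer->size_bytes >= buffer->capacity_bytes) {
    return 1;
  }
  buffer->data[buffer->size_bytes++] = value;
  return 0;
}
```

With this layout a reader can skim the declarations and docstrings without wading through function bodies, while the compiler still sees every definition in each translation unit that includes the header.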
[GitHub] [arrow-nanoarrow] paleolimbot opened a new issue, #11: Inline performance-sensitive functions and their dependencies
paleolimbot opened a new issue, #11: URL: https://github.com/apache/arrow-nanoarrow/issues/11 A number of functions should be defined as `inline` with their definition visible to the compiler when a client types `#include "nanoarrow.h"`. Buffer appenders, element-wise builder, and bitmap functions are in this category. My expertise is pretty limited with respect to strategies of how to do this while maintaining nice documentation...I imagine the headers would get split up (error.h, buffer.h, bitmap.h, builder.h) and nanoarrow.h would get reduced to a set of includes. I rather liked reading through adbc.h to get a sense of what ADBC is and what it does, so perhaps there's another strategy that can be used to maintain nanoarrow.h in something like its current form.
[GitHub] [arrow-adbc] lidavidm commented on pull request #51: [Python] Set up package with Poetry
lidavidm commented on PR #51: URL: https://github.com/apache/arrow-adbc/pull/51#issuecomment-1204017482 I also grabbed https://pypi.org/project/adbc-driver-manager/
[GitHub] [arrow-adbc] lidavidm commented on a diff in pull request #51: [Python] Set up package with Poetry
lidavidm commented on code in PR #51: URL: https://github.com/apache/arrow-adbc/pull/51#discussion_r936686336 ## python/adbc_driver_manager/requirements-dev.txt: ## @@ -1,19 +1,124 @@ -# Licensed to the Apache Software Foundation (ASF) under one -# or more contributor license agreements. See the NOTICE file -# distributed with this work for additional information -# regarding copyright ownership. The ASF licenses this file -# to you under the Apache License, Version 2.0 (the -# "License"); you may not use this file except in compliance -# with the License. You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, -# software distributed under the License is distributed on an -# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -# KIND, either express or implied. See the License for the -# specific language governing permissions and limitations -# under the License. - -Cython -pytest +atomicwrites==1.4.1; python_version >= "3.7" and python_full_version < "3.0.0" and sys_platform == "win32" or sys_platform == "win32" and python_version >= "3.7" and python_full_version >= "3.4.0" \ Review Comment: Yup, it's via `poetry export`; I'll update CONTRIBUTING.md and figure out the CI changes here
[GitHub] [arrow-adbc] pitrou commented on a diff in pull request #51: [Python] Set up package with Poetry
pitrou commented on code in PR #51: URL: https://github.com/apache/arrow-adbc/pull/51#discussion_r936684861 ## python/adbc_driver_manager/requirements-dev.txt: ## @@ -1,19 +1,124 @@ -# Licensed to the Apache Software Foundation (ASF) under one -# or more contributor license agreements. See the NOTICE file -# distributed with this work for additional information -# regarding copyright ownership. The ASF licenses this file -# to you under the Apache License, Version 2.0 (the -# "License"); you may not use this file except in compliance -# with the License. You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, -# software distributed under the License is distributed on an -# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -# KIND, either express or implied. See the License for the -# specific language governing permissions and limitations -# under the License. - -Cython -pytest +atomicwrites==1.4.1; python_version >= "3.7" and python_full_version < "3.0.0" and sys_platform == "win32" or sys_platform == "win32" and python_version >= "3.7" and python_full_version >= "3.4.0" \ Review Comment: Is this auto-generated somehow? Would be nice to add some comment explaining how.
[jira] [Created] (ARROW-17294) [Release] Update remove old artifacts release script
Krisztian Szucs created ARROW-17294: --- Summary: [Release] Update remove old artifacts release script Key: ARROW-17294 URL: https://issues.apache.org/jira/browse/ARROW-17294 Project: Apache Arrow Issue Type: Bug Components: Developer Tools Reporter: Krisztian Szucs Fix For: 10.0.0 I just executed the remove old artifacts release script which also removed the previously created three patch releases for 6.0.2, 7.0.1, 8.0.1. That's not desirable since those have just been released so I had to revert to an earlier revision. cc [~kou] [~assignUser] [~raulcd]
[jira] [Created] (ARROW-17293) [Java][CI] Prune java nightly builds
Jacob Wujciak-Jens created ARROW-17293: -- Summary: [Java][CI] Prune java nightly builds Key: ARROW-17293 URL: https://issues.apache.org/jira/browse/ARROW-17293 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration, Java Reporter: Jacob Wujciak-Jens Currently we are accumulating a huge number of nightly java jars. We should prune them to keep max. 14 versions around. (see r_nightly.yml) It might also be nice to always rename/copy the most recent jars to something fixed so there is no need to update your local config to always have the newest version? (but up to the java users if this is necessary/worth it). cc [~dsusanibara] [~ljw1001]
[jira] [Created] (ARROW-17292) [C++] Segmentation fault on arrow-compute-hash-join-node-test on macos nightlies
Raúl Cumplido created ARROW-17292: - Summary: [C++] Segmentation fault on arrow-compute-hash-join-node-test on macos nightlies Key: ARROW-17292 URL: https://issues.apache.org/jira/browse/ARROW-17292 Project: Apache Arrow Issue Type: Bug Components: C++ Reporter: Raúl Cumplido Fix For: 10.0.0 Some of our nightly builds are failing due to a segmentation fault on hash-join tests: {code:java} 33/90 Test #35: arrow-compute-hash-join-node-test .***Failed 1.21 sec Running arrow-compute-hash-join-node-test, redirecting output into /var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/arrow-HEAD.X.W72iCJcj/cpp-build/build/test-logs/arrow-compute-hash-join-node-test.txt (attempt 1/1) /Users/runner/work/crossbow/crossbow/arrow/cpp/build-support/run-test.sh: line 88: 78018 Segmentation fault: 11 $TEST_EXECUTABLE "$@" > $LOGFILE.raw 2>&1 Running main() from /var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/arrow-HEAD.X.W72iCJcj/cpp-build/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest_main.cc [==] Running 29 tests from 4 test suites. [--] Global test environment set-up. [--] 10 tests from HashJoin [ RUN ] HashJoin.Suffix [ OK ] HashJoin.Suffix (4 ms) [ RUN ] HashJoin.Random /private/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/arrow-HEAD.X.W72iCJcj/cpp-build/src/arrow/compute/exec {code} The failures can be seen. It seems to be only related to macos from the failed jobs: [verify-rc-source-cpp-macos-conda-amd64|https://github.com/ursacomputing/crossbow/runs/7631965199?check_suite_focus=true] [verify-rc-source-integration-macos-conda-amd64|https://github.com/ursacomputing/crossbow/runs/7631969879?check_suite_focus=true] [verify-rc-source-python-macos-amd64|https://github.com/ursacomputing/crossbow/runs/7631926429?check_suite_focus=true]
[GitHub] [arrow-nanoarrow] pitrou commented on issue #8: Implement `struct ArrowArrayBuilder`
pitrou commented on issue #8: URL: https://github.com/apache/arrow-nanoarrow/issues/8#issuecomment-1203706719 > One way to do this is a bunch of callables that are something like the reverse of SQLite's result interface So this is for building arrays whose type is not strictly known at compile-time? Going through function pointers will make this slower than if you expose type-specific inline functions/macros, at least when appending one item at a time. We have to think about the use cases for nanoarrow. Mostly, it will be used for bridging/converting with non-Arrow systems, and in this scheme people will probably have runtime switch/case statements to dispatch based on datatype anyway. So I'm not sure an untyped convention is really useful.
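The pattern described in the comment above (a runtime switch on the data type in the caller, then type-specific inline appenders in the inner loop) could be sketched as follows. Everything here is hypothetical and only illustrates the shape of the code: `enum ExampleType`, `struct ExampleBuilder`, and the `Example*` functions are not nanoarrow APIs, and the sketch makes no measured claim about performance. The builder does no capacity checking; the caller is assumed to have reserved enough space.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical type tag known only at runtime. */
enum ExampleType { EXAMPLE_INT32, EXAMPLE_DOUBLE };

/* Hypothetical builder over caller-owned storage (no bounds checking). */
struct ExampleBuilder {
  uint8_t* data;
  int64_t size_bytes;
};

/* Type-specific appenders: small enough for the compiler to inline. */
static inline void ExampleAppendInt32(struct ExampleBuilder* b, int32_t v) {
  memcpy(b->data + b->size_bytes, &v, sizeof(v));
  b->size_bytes += (int64_t)sizeof(v);
}

static inline void ExampleAppendDouble(struct ExampleBuilder* b, double v) {
  memcpy(b->data + b->size_bytes, &v, sizeof(v));
  b->size_bytes += (int64_t)sizeof(v);
}

/* Convert n source values into a column of the requested type.
   The switch runs once per column; each loop body calls a typed appender
   directly rather than going through a function pointer per element.
   Returns 0 on success or 1 for an unknown type. */
static int ExampleConvertColumn(struct ExampleBuilder* b, enum ExampleType type,
                                const double* src, int64_t n) {
  switch (type) {
    case EXAMPLE_INT32:
      for (int64_t i = 0; i < n; i++) {
        ExampleAppendInt32(b, (int32_t)src[i]);
      }
      return 0;
    case EXAMPLE_DOUBLE:
      for (int64_t i = 0; i < n; i++) {
        ExampleAppendDouble(b, src[i]);
      }
      return 0;
  }
  return 1;
}
```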