[GitHub] [arrow] jorisvandenbossche commented on pull request #12231: ARROW-14783: [C++][Python] Fix the write ORC in BytesIO issue

2022-01-25 Thread GitBox


jorisvandenbossche commented on pull request #12231:
URL: https://github.com/apache/arrow/pull/12231#issuecomment-1021950791


   @iajoiner can you check 
https://github.com/apache/arrow/pull/12231/files#r792361651 ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] ursabot edited a comment on pull request #12264: ARROW-15457: [Packaging][deb] Specify CUDAToolkit_ROOT explicitly

2022-01-25 Thread GitBox


ursabot edited a comment on pull request #12264:
URL: https://github.com/apache/arrow/pull/12264#issuecomment-1021931648


   Benchmark runs are scheduled for baseline = 
c5f400461f6d2be836d30df4626fed4d59107015 and contender = 
0b95b625cc5f2423498bdafdcc5acad968909933. 
0b95b625cc5f2423498bdafdcc5acad968909933 is a master commit associated with 
this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] 
[ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/b6266e792e88462badd7a4bcea8bd43f...535fc5a1a80d439ca0ce502e093f9494/)
   [Scheduled] 
[ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/a4ffc026754a469f821ccdfec52dddf8...1722adf096634604866c4afcce34b00a/)
   [Scheduled] 
[ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/93e5a4880f23490595da95eb0c96e661...e71636c2a20148f5b3fdb1f0a849b23c/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only 
benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   






[GitHub] [arrow] ursabot edited a comment on pull request #12255: ARROW-15447: [C++] Avoid conflict between ORC options API and glibc-defined macro

2022-01-25 Thread GitBox


ursabot edited a comment on pull request #12255:
URL: https://github.com/apache/arrow/pull/12255#issuecomment-1021376927


   Benchmark runs are scheduled for baseline = 
0fa4b9ca1e7b13ac230f85075d86d783eeea6a74 and contender = 
85f67d71381c4dbfbf55377e646b785a643daa0b. 
85f67d71381c4dbfbf55377e646b785a643daa0b is a master commit associated with 
this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] 
[ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/ec29b5b27ede4f84b0b57463365ac53f...68a10e5a92d241fabd0750358a6d4e18/)
   [Failed :arrow_down:0.0% :arrow_up:0.0%] 
[ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/558272c1011141a184f2376011f85fb2...139a17a167904fa1a6f5551851f051eb/)
   [Finished :arrow_down:0.3% :arrow_up:0.0%] 
[ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/28075e1916b54bc1bf42fbad27900d56...06bdde9d71844bbaa5344520f1585978/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only 
benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   






[GitHub] [arrow] iajoiner commented on pull request #12231: ARROW-14783: [C++][Python] Fix the write ORC in BytesIO issue

2022-01-25 Thread GitBox


iajoiner commented on pull request #12231:
URL: https://github.com/apache/arrow/pull/12231#issuecomment-1021936293


   @pitrou This has been fixed. Please review again. :)






[GitHub] [arrow] iajoiner commented on a change in pull request #12231: ARROW-14783: [C++][Python] Fix the write ORC in BytesIO issue

2022-01-25 Thread GitBox


iajoiner commented on a change in pull request #12231:
URL: https://github.com/apache/arrow/pull/12231#discussion_r792367435



##
File path: python/pyarrow/tests/test_orc.py
##
@@ -171,7 +171,26 @@ def test_orcfile_empty(datadir):
     assert table.schema == expected_schema
 
 
-def test_readwrite(tmpdir):
+def test_filesystem_uri(tmpdir):
+    from pyarrow import orc
+    table = pa.table({"a": [1, 2, 3]})
+
+    directory = tmpdir / "data_dir"
+    directory.mkdir()
+    path = directory / "data.orc"
+    orc.write_table(table, str(path))
+
+    # filesystem object
+    result = orc.read_table(path, filesystem=fs.LocalFileSystem())
+    assert result.equals(table)
+
+    # filesystem URI
+    result = orc.read_table(
+        "data_dir/data.orc", filesystem=util._filesystem_uri(tmpdir))

Review comment:
   Yes it does.








[GitHub] [arrow] ursabot commented on pull request #12264: ARROW-15457: [Packaging][deb] Specify CUDAToolkit_ROOT explicitly

2022-01-25 Thread GitBox


ursabot commented on pull request #12264:
URL: https://github.com/apache/arrow/pull/12264#issuecomment-1021931648


   Benchmark runs are scheduled for baseline = 
c5f400461f6d2be836d30df4626fed4d59107015 and contender = 
0b95b625cc5f2423498bdafdcc5acad968909933. 
0b95b625cc5f2423498bdafdcc5acad968909933 is a master commit associated with 
this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Scheduled] 
[ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/b6266e792e88462badd7a4bcea8bd43f...535fc5a1a80d439ca0ce502e093f9494/)
   [Scheduled] 
[ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/a4ffc026754a469f821ccdfec52dddf8...1722adf096634604866c4afcce34b00a/)
   [Scheduled] 
[ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/93e5a4880f23490595da95eb0c96e661...e71636c2a20148f5b3fdb1f0a849b23c/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only 
benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   






[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12231: ARROW-14783: [C++][Python] Fix the write ORC in BytesIO issue

2022-01-25 Thread GitBox


jorisvandenbossche commented on a change in pull request #12231:
URL: https://github.com/apache/arrow/pull/12231#discussion_r792361651



##
File path: python/pyarrow/orc.py
##
@@ -330,7 +330,9 @@ def read_table(source, columns=None, filesystem=None):
     """
 
 
-def write_table(table, where, *, file_version='0.12',
+def write_table(table, where, *,
+                close_file=False,

Review comment:
   The argument is no longer passed down, though, so I don't think it does 
anything. I also don't see it used in the tests anymore.








[GitHub] [arrow] kou closed pull request #12264: ARROW-15457: [Packaging][deb] Specify CUDAToolkit_ROOT explicitly

2022-01-25 Thread GitBox


kou closed pull request #12264:
URL: https://github.com/apache/arrow/pull/12264


   






[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12231: ARROW-14783: [C++][Python] Fix the write ORC in BytesIO issue

2022-01-25 Thread GitBox


jorisvandenbossche commented on a change in pull request #12231:
URL: https://github.com/apache/arrow/pull/12231#discussion_r792361651



##
File path: python/pyarrow/orc.py
##
@@ -330,7 +330,9 @@ def read_table(source, columns=None, filesystem=None):
     """
 
 
-def write_table(table, where, *, file_version='0.12',
+def write_table(table, where, *,
+                close_file=False,

Review comment:
   The argument is no longer passed down, though, so I don't think it does 
anything.








[GitHub] [arrow] kou commented on pull request #12264: ARROW-15457: [Packaging][deb] Specify CUDAToolkit_ROOT explicitly

2022-01-25 Thread GitBox


kou commented on pull request #12264:
URL: https://github.com/apache/arrow/pull/12264#issuecomment-1021926531


   +1






[GitHub] [arrow] liyafan82 commented on a change in pull request #8949: ARROW-10880: [Java] Support compressing RecordBatch IPC buffers by LZ4

2022-01-25 Thread GitBox


liyafan82 commented on a change in pull request #8949:
URL: https://github.com/apache/arrow/pull/8949#discussion_r792333746



##
File path: java/compression/src/main/java/org/apache/arrow/compression/Lz4CompressionCodec.java
##
@@ -0,0 +1,159 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.compression;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+
+import org.apache.arrow.flatbuf.CompressionType;
+import org.apache.arrow.memory.ArrowBuf;
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.memory.util.MemoryUtil;
+import org.apache.arrow.util.Preconditions;
+import org.apache.arrow.vector.compression.CompressionCodec;
+import org.apache.arrow.vector.compression.CompressionUtil;
+import org.apache.commons.compress.compressors.lz4.FramedLZ4CompressorInputStream;
+import org.apache.commons.compress.compressors.lz4.FramedLZ4CompressorOutputStream;
+import org.apache.commons.compress.utils.IOUtils;
+
+import io.netty.util.internal.PlatformDependent;
+
+/**
+ * Compression codec for the LZ4 algorithm.
+ */
+public class Lz4CompressionCodec implements CompressionCodec {
+
+  @Override
+  public ArrowBuf compress(BufferAllocator allocator, ArrowBuf uncompressedBuffer) {
+    Preconditions.checkArgument(uncompressedBuffer.writerIndex() <= Integer.MAX_VALUE,
+        "The uncompressed buffer size exceeds the integer limit");
+
+    if (uncompressedBuffer.writerIndex() == 0L) {
+      // shortcut for empty buffer
+      ArrowBuf compressedBuffer = allocator.buffer(CompressionUtil.SIZE_OF_UNCOMPRESSED_LENGTH);
+      compressedBuffer.setLong(0, 0);
+      compressedBuffer.writerIndex(CompressionUtil.SIZE_OF_UNCOMPRESSED_LENGTH);
+      uncompressedBuffer.close();
+      return compressedBuffer;
+    }
+
+    try {
+      ArrowBuf compressedBuffer = doCompress(allocator, uncompressedBuffer);
+      long compressedLength = compressedBuffer.writerIndex() - CompressionUtil.SIZE_OF_UNCOMPRESSED_LENGTH;
+      if (compressedLength > uncompressedBuffer.writerIndex()) {
+        // compressed buffer is larger, send the raw buffer
+        compressedBuffer.close();
+        compressedBuffer = CompressionUtil.compressRawBuffer(allocator, uncompressedBuffer);
+      }
+
+      uncompressedBuffer.close();
+      return compressedBuffer;
+    } catch (IOException e) {
+      throw new RuntimeException(e);
+    }
+  }
+
+  private ArrowBuf doCompress(BufferAllocator allocator, ArrowBuf uncompressedBuffer) throws IOException {
+    byte[] inBytes = new byte[(int) uncompressedBuffer.writerIndex()];
+    PlatformDependent.copyMemory(uncompressedBuffer.memoryAddress(), inBytes, 0, uncompressedBuffer.writerIndex());
+    ByteArrayOutputStream baos = new ByteArrayOutputStream();
+    try (InputStream in = new ByteArrayInputStream(inBytes);
+         OutputStream out = new FramedLZ4CompressorOutputStream(baos)) {
+      IOUtils.copy(in, out);
+    }
+
+    byte[] outBytes = baos.toByteArray();
+
+    ArrowBuf compressedBuffer = allocator.buffer(CompressionUtil.SIZE_OF_UNCOMPRESSED_LENGTH + outBytes.length);
+
+    long uncompressedLength = uncompressedBuffer.writerIndex();
+    if (!MemoryUtil.LITTLE_ENDIAN) {
+      uncompressedLength = Long.reverseBytes(uncompressedLength);
+    }
+    // first 8 bytes reserved for uncompressed length, to be consistent with the
+    // C++ implementation.
+    compressedBuffer.setLong(0, uncompressedLength);
+
+    PlatformDependent.copyMemory(
+        outBytes, 0, compressedBuffer.memoryAddress() + CompressionUtil.SIZE_OF_UNCOMPRESSED_LENGTH, outBytes.length);
+    compressedBuffer.writerIndex(CompressionUtil.SIZE_OF_UNCOMPRESSED_LENGTH + outBytes.length);
+    return compressedBuffer;
+  }
+
+  @Override
+  public ArrowBuf decompress(BufferAllocator allocator, ArrowBuf compressedBuffer) {
+    Preconditions.checkArgument(compressedBuffer.writerIndex() <= Integer.MAX_VALUE,
+        "The compressed buffer size exceeds the integer limit");
+
+    Preconditions.checkArgument(compressedBuffer.writerIndex() >= 

[GitHub] [arrow] ursabot edited a comment on pull request #12149: ARROW-15331: [Go][Parquet] Add pqarrow package for direct Parquet <--> Arrow conversion

2022-01-25 Thread GitBox


ursabot edited a comment on pull request #12149:
URL: https://github.com/apache/arrow/pull/12149#issuecomment-1021770422


   Benchmark runs are scheduled for baseline = 
345609b61674cedc1daa3f1bf5d6b64266b19b26 and contender = 
c5f400461f6d2be836d30df4626fed4d59107015. 
c5f400461f6d2be836d30df4626fed4d59107015 is a master commit associated with 
this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] 
[ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/77919f1fe9384f03b857f977a428e0dd...b6266e792e88462badd7a4bcea8bd43f/)
   [Scheduled] 
[ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/c604c4784e1147dc91533b54a002d5b4...a4ffc026754a469f821ccdfec52dddf8/)
   [Finished :arrow_down:0.52% :arrow_up:0.04%] 
[ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/cb3cee8911dc4de9b38960d52b0738ee...93e5a4880f23490595da95eb0c96e661/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only 
benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   






[GitHub] [arrow] github-actions[bot] commented on pull request #12264: ARROW-15457: [Packaging][deb] Specify CUDAToolkit_ROOT explicitly

2022-01-25 Thread GitBox


github-actions[bot] commented on pull request #12264:
URL: https://github.com/apache/arrow/pull/12264#issuecomment-1021875556


   Revision: ddb038a51f9ec7370f11e4af5fc23c435776ebbc
   
   Submitted crossbow builds: [ursacomputing/crossbow @ 
actions-1505](https://github.com/ursacomputing/crossbow/branches/all?query=actions-1505)
   
   |Task|Status|
   |----|------|
   |almalinux-8-amd64|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1505-github-almalinux-8-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1505-github-almalinux-8-amd64)|
   |almalinux-8-arm64|[![TravisCI](https://img.shields.io/travis/ursacomputing/crossbow/actions-1505-travis-almalinux-8-arm64.svg)](https://app.travis-ci.com/github/ursacomputing/crossbow/branches)|
   |amazon-linux-2-amd64|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1505-github-amazon-linux-2-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1505-github-amazon-linux-2-amd64)|
   |centos-7-amd64|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1505-github-centos-7-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1505-github-centos-7-amd64)|
   |debian-bookworm-amd64|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1505-github-debian-bookworm-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1505-github-debian-bookworm-amd64)|
   |debian-bookworm-arm64|[![TravisCI](https://img.shields.io/travis/ursacomputing/crossbow/actions-1505-travis-debian-bookworm-arm64.svg)](https://app.travis-ci.com/github/ursacomputing/crossbow/branches)|
   |debian-bullseye-amd64|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1505-github-debian-bullseye-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1505-github-debian-bullseye-amd64)|
   |debian-bullseye-arm64|[![TravisCI](https://img.shields.io/travis/ursacomputing/crossbow/actions-1505-travis-debian-bullseye-arm64.svg)](https://app.travis-ci.com/github/ursacomputing/crossbow/branches)|
   |debian-buster-amd64|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1505-github-debian-buster-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1505-github-debian-buster-amd64)|
   |debian-buster-arm64|[![TravisCI](https://img.shields.io/travis/ursacomputing/crossbow/actions-1505-travis-debian-buster-arm64.svg)](https://app.travis-ci.com/github/ursacomputing/crossbow/branches)|
   |ubuntu-bionic-amd64|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1505-github-ubuntu-bionic-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1505-github-ubuntu-bionic-amd64)|
   |ubuntu-bionic-arm64|[![TravisCI](https://img.shields.io/travis/ursacomputing/crossbow/actions-1505-travis-ubuntu-bionic-arm64.svg)](https://app.travis-ci.com/github/ursacomputing/crossbow/branches)|
   |ubuntu-focal-amd64|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1505-github-ubuntu-focal-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1505-github-ubuntu-focal-amd64)|
   |ubuntu-focal-arm64|[![TravisCI](https://img.shields.io/travis/ursacomputing/crossbow/actions-1505-travis-ubuntu-focal-arm64.svg)](https://app.travis-ci.com/github/ursacomputing/crossbow/branches)|
   |ubuntu-hirsute-amd64|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1505-github-ubuntu-hirsute-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1505-github-ubuntu-hirsute-amd64)|
   |ubuntu-hirsute-arm64|[![TravisCI](https://img.shields.io/travis/ursacomputing/crossbow/actions-1505-travis-ubuntu-hirsute-arm64.svg)](https://app.travis-ci.com/github/ursacomputing/crossbow/branches)|
   |ubuntu-impish-amd64|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1505-github-ubuntu-impish-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1505-github-ubuntu-impish-amd64)|
   |ubuntu-impish-arm64|[![TravisCI](https://img.shields.io/travis/ursacomputing/crossbow/actions-1505-travis-ubuntu-impish-arm64.svg)](https://app.travis-ci.com/github/ursacomputing/crossbow/branches)|



[GitHub] [arrow] kou commented on pull request #12264: ARROW-15457: [Packaging][deb] Specify CUDAToolkit_ROOT explicitly

2022-01-25 Thread GitBox


kou commented on pull request #12264:
URL: https://github.com/apache/arrow/pull/12264#issuecomment-1021875073


   @github-actions crossbow submit -g linux






[GitHub] [arrow-datafusion] houqp commented on issue #1675: Improvements to Ballista extensibility

2022-01-25 Thread GitBox


houqp commented on issue #1675:
URL: https://github.com/apache/arrow-datafusion/issues/1675#issuecomment-1021874217


   FWIW, Andy wrote a substrait rust implementation: 
https://github.com/andygrove/substrait-rs.






[GitHub] [arrow] ursabot edited a comment on pull request #12245: ARROW-15437: [Python][FlightRPC] Fix flaky test test_interrupt

2022-01-25 Thread GitBox


ursabot edited a comment on pull request #12245:
URL: https://github.com/apache/arrow/pull/12245#issuecomment-1021211208


   Benchmark runs are scheduled for baseline = 
231d0a6b30e7017f0e07162b6716f3da49a674d0 and contender = 
0fa4b9ca1e7b13ac230f85075d86d783eeea6a74. 
0fa4b9ca1e7b13ac230f85075d86d783eeea6a74 is a master commit associated with 
this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] 
[ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/708a4484bf054ea39e277489f07f327f...ec29b5b27ede4f84b0b57463365ac53f/)
   [Failed :arrow_down:0.0% :arrow_up:0.0%] 
[ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/01cfa1661c444ba0a1ed9f1ff691cec8...558272c1011141a184f2376011f85fb2/)
   [Finished :arrow_down:0.35% :arrow_up:0.0%] 
[ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/0e434b6c358342a3be2253a61c549839...28075e1916b54bc1bf42fbad27900d56/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only 
benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   






[GitHub] [arrow] github-actions[bot] commented on pull request #12262: [Release] Verify 7.0.0 RC8 [WIP]

2022-01-25 Thread GitBox


github-actions[bot] commented on pull request #12262:
URL: https://github.com/apache/arrow/pull/12262#issuecomment-1021842924


   ```
   Invalid group(s) {'verify-rc-wheels-windows'}. Must be one of {'wheel', 'nightly', 'packaging', 'homebrew', 'verify-rc-jars', 'linux-amd64', 'test', 'verify-rc-source', 'linux', 'r', 'c-glib', 'example', 'integration', 'conda', 'vcpkg', 'verify-rc-source-macos', 'verify-rc-binaries', 'python', 'verify-rc-source-linux', 'linux-arm64', 'verify-rc-wheels', 'verify-rc', 'fuzz', 'cpp', 'ruby', 'example-cpp'}
   The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/1749038993
   ```






[GitHub] [arrow] kszucs commented on pull request #12262: [Release] Verify 7.0.0 RC8 [WIP]

2022-01-25 Thread GitBox


kszucs commented on pull request #12262:
URL: https://github.com/apache/arrow/pull/12262#issuecomment-1021842277


   @github-actions crossbow submit -g verify-rc-wheels-windows --param release=7.0.0 --param rc=8






[GitHub] [arrow-datafusion] houqp commented on a change in pull request #1665: Skip some path in list_file_with_suffix.

2022-01-25 Thread GitBox


houqp commented on a change in pull request #1665:
URL: https://github.com/apache/arrow-datafusion/pull/1665#discussion_r792300171



##
File path: datafusion/src/datasource/object_store/mod.rs
##
@@ -159,6 +159,16 @@ pub trait ObjectStore: Sync + Send + Debug {
     /// Get object reader for one file
     fn file_reader(&self, file: SizedFile) -> Result<Arc<dyn ObjectReader>>;
 }
+/// Checks if we should filter out this path name
+/// https://github.com/apache/spark/blob/5b91381b9bcea27b7be6d9cef852efbd6c23d98a/core/src/main/scala/org/apache/spark/util/HadoopFSUtils.scala#L358
+fn do_path_filter_with_suffix(path: &str, suffix: &str) -> bool {
+    let exclude = path.starts_with('_') && !path.contains('=')
+        || path.starts_with('.')
+        || path.ends_with("._COPYING_");

Review comment:
   > Sometimes read parquet file not pass the suffix option
   
   Isn't this a bug we should also fix?








[GitHub] [arrow-datafusion] houqp commented on a change in pull request #1665: Skip some path in list_file_with_suffix.

2022-01-25 Thread GitBox


houqp commented on a change in pull request #1665:
URL: https://github.com/apache/arrow-datafusion/pull/1665#discussion_r79238



##
File path: datafusion/src/datasource/object_store/mod.rs
##
@@ -159,6 +159,16 @@ pub trait ObjectStore: Sync + Send + Debug {
     /// Get object reader for one file
     fn file_reader(&self, file: SizedFile) -> Result<Arc<dyn ObjectReader>>;
 }
+/// Checks if we should filter out this path name
+/// https://github.com/apache/spark/blob/5b91381b9bcea27b7be6d9cef852efbd6c23d98a/core/src/main/scala/org/apache/spark/util/HadoopFSUtils.scala#L358
+fn do_path_filter_with_suffix(path: &str, suffix: &str) -> bool {
+    let exclude = path.starts_with('_') && !path.contains('=')
+        || path.starts_with('.')
+        || path.ends_with("._COPYING_");
+    let include = path.starts_with("_common_metadata") || path.starts_with("_metadata");
+    let is_suffix = path.ends_with(suffix);
+    (!exclude || include) && is_suffix

Review comment:
   I recommend changing the evaluation order so we can exit early and 
avoid unnecessary string operations. For example:
   
   ```rust
   if !path.ends_with(suffix) {
       return false;
   }
   
   let exclude = ;
   if exclude {
       let include = path.starts_with("_common_metadata") || path.starts_with("_metadata");
       if !include {
           return false;
       }
   }
   
   return true;
   ```
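
A fuller sketch of the early-exit ordering suggested above, for illustration only. It assumes the elided `exclude` expression is the one from the diff under review; the function names `filter_original` and `filter_early_exit` are hypothetical and not part of the PR.

```rust
/// Order used in the diff: every check runs for every path.
fn filter_original(path: &str, suffix: &str) -> bool {
    let exclude = (path.starts_with('_') && !path.contains('='))
        || path.starts_with('.')
        || path.ends_with("._COPYING_");
    let include = path.starts_with("_common_metadata") || path.starts_with("_metadata");
    let is_suffix = path.ends_with(suffix);
    (!exclude || include) && is_suffix
}

/// Early-exit order: the suffix test runs first, and the `include`
/// check only runs for paths that would otherwise be excluded.
fn filter_early_exit(path: &str, suffix: &str) -> bool {
    if !path.ends_with(suffix) {
        return false;
    }

    // Assumption: same exclusion rule as in the diff.
    let exclude = (path.starts_with('_') && !path.contains('='))
        || path.starts_with('.')
        || path.ends_with("._COPYING_");
    if exclude {
        // Metadata files are kept even though they start with '_'.
        return path.starts_with("_common_metadata") || path.starts_with("_metadata");
    }

    true
}
```

Both functions should return the same result for any input; the second simply skips the prefix checks when the suffix does not match.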








[GitHub] [arrow-datafusion] houqp commented on a change in pull request #1665: Skip some path in list_file_with_suffix.

2022-01-25 Thread GitBox


houqp commented on a change in pull request #1665:
URL: https://github.com/apache/arrow-datafusion/pull/1665#discussion_r792298940



##
File path: datafusion/src/physical_plan/file_format/parquet.rs
##
@@ -217,6 +217,7 @@ impl ExecutionPlan for ParquetExec {
 
         let file_schema_ref = self.base_config().file_schema.clone();
         let join_handle = task::spawn_blocking(move || {
+            let f = partition.clone();

Review comment:
   Perhaps we can pass partition as a reference to `read_partition` to 
avoid the clone here? I noticed `read_partition` is already performing a clone 
inside.
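
A minimal sketch of the borrowing idea, using only the standard library. `PartitionedFile`, `read_partition`, and `spawn_read` here are simplified, hypothetical stand-ins (a plain thread replaces `task::spawn_blocking`), not the DataFusion definitions.

```rust
use std::thread;

// Hypothetical stand-in for the partition's file entries.
#[derive(Debug, Clone)]
struct PartitionedFile {
    path: String,
}

// Takes the partition by reference, so the caller keeps ownership
// and does not need to clone before the call.
fn read_partition(partition: &[PartitionedFile]) -> Result<usize, String> {
    if partition.is_empty() {
        return Err("empty partition".to_string());
    }
    Ok(partition.iter().map(|f| f.path.len()).sum())
}

// The closure owns `partition` (moved in once); it lends it to
// `read_partition` and can still use it afterwards, e.g. in an error message.
fn spawn_read(partition: Vec<PartitionedFile>) -> thread::JoinHandle<String> {
    thread::spawn(move || match read_partition(&partition) {
        Ok(n) => format!("read {} path bytes from {} files", n, partition.len()),
        Err(e) => format!("failed to read {:?}: {}", partition, e),
    })
}
```

Because the closure already owns the vector, lending it with `&partition` is enough; a clone is only needed if the callee must keep its own copy.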








[GitHub] [arrow-datafusion] houqp commented on a change in pull request #1665: Skip some path in list_file_with_suffix.

2022-01-25 Thread GitBox


houqp commented on a change in pull request #1665:
URL: https://github.com/apache/arrow-datafusion/pull/1665#discussion_r792297397



##
File path: datafusion/src/datasource/object_store/mod.rs
##
@@ -141,7 +141,7 @@ pub trait ObjectStore: Sync + Send + Debug {
         let suffix = suffix.to_owned();
         Ok(Box::pin(file_stream.filter(move |fr| {
             let has_suffix = match fr {
-                Ok(f) => f.path().ends_with(&suffix),
+                Ok(f) => do_path_filter_with_suffix(f.path(), &suffix),

Review comment:
   Forcing the filter here will restrict the usefulness of the object 
store abstraction. For example, I won't be able to use `list_file_with_suffix` 
in delta-rs to manage Delta Lake metadata. Object store should be a very 
generic module that is agnostic of this type of application-specific file 
naming convention. I think a better place to host this particular logic is 
the ListingTable struct.
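
A minimal sketch of the layering described above, assuming hypothetical helper names (`list_with_suffix`, `is_hidden`, `listing_table_files`) and plain string slices in place of the real object store stream: the store-level helper only checks the suffix, and the table-level caller applies its own naming convention on top.

```rust
/// Store-level helper (hypothetical): generic, only checks the suffix.
fn list_with_suffix<'a>(paths: &'a [&'a str], suffix: &str) -> Vec<&'a str> {
    paths.iter().copied().filter(|p| p.ends_with(suffix)).collect()
}

/// Application-level convention (hypothetical): treat names starting
/// with '_' or '.' as hidden. Only the last path component is inspected.
fn is_hidden(path: &str) -> bool {
    let name = path.rsplit('/').next().unwrap_or(path);
    name.starts_with('_') || name.starts_with('.')
}

/// Table-level listing (hypothetical): composes the generic listing
/// with the convention-specific filter.
fn listing_table_files<'a>(paths: &'a [&'a str], suffix: &str) -> Vec<&'a str> {
    list_with_suffix(paths, suffix)
        .into_iter()
        .filter(|p| !is_hidden(p))
        .collect()
}
```

With this split, a caller that manages metadata files can still list underscore-prefixed directories through the generic helper, while a table scan that wants to skip them opts in to the extra filter.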








[GitHub] [arrow] github-actions[bot] commented on pull request #12262: [Release] Verify 7.0.0 RC8 [WIP]

2022-01-25 Thread GitBox


github-actions[bot] commented on pull request #12262:
URL: https://github.com/apache/arrow/pull/12262#issuecomment-1021834940


   Revision: 80c8fc8739c4425fba10e8f8a9871a41f9f7f1db
   
   Submitted crossbow builds: [ursacomputing/crossbow @ 
actions-1504](https://github.com/ursacomputing/crossbow/branches/all?query=actions-1504)
   
   |Task|Status|
   |----|------|
   |verify-rc-wheels-linux-amd64|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1504-github-verify-rc-wheels-linux-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1504-github-verify-rc-wheels-linux-amd64)|
   |verify-rc-wheels-macos-10.15-amd64|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1504-github-verify-rc-wheels-macos-10.15-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1504-github-verify-rc-wheels-macos-10.15-amd64)|
   |verify-rc-wheels-macos-11-amd64|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1504-github-verify-rc-wheels-macos-11-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1504-github-verify-rc-wheels-macos-11-amd64)|
   |verify-rc-wheels-macos-11-arm64|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1504-github-verify-rc-wheels-macos-11-arm64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1504-github-verify-rc-wheels-macos-11-arm64)|
   |verify-rc-wheels-windows|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1504-github-verify-rc-wheels-windows)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1504-github-verify-rc-wheels-windows)|






[GitHub] [arrow] kszucs removed a comment on pull request #12262: [Release] Verify 7.0.0 RC8 [WIP]

2022-01-25 Thread GitBox


kszucs removed a comment on pull request #12262:
URL: https://github.com/apache/arrow/pull/12262#issuecomment-1021834473


   @github-actions crossbow submit verify-rc-wheels






[GitHub] [arrow] kszucs commented on pull request #12262: [Release] Verify 7.0.0 RC8 [WIP]

2022-01-25 Thread GitBox


kszucs commented on pull request #12262:
URL: https://github.com/apache/arrow/pull/12262#issuecomment-1021834473


   @github-actions crossbow submit verify-rc-wheels






[GitHub] [arrow] kszucs commented on pull request #12262: [Release] Verify 7.0.0 RC8 [WIP]

2022-01-25 Thread GitBox


kszucs commented on pull request #12262:
URL: https://github.com/apache/arrow/pull/12262#issuecomment-1021834555


   @github-actions crossbow submit -g verify-rc-wheels --param release=7.0.0 
--param rc=8






[GitHub] [arrow] github-actions[bot] commented on pull request #12262: [Release] Verify 7.0.0 RC8 [WIP]

2022-01-25 Thread GitBox


github-actions[bot] commented on pull request #12262:
URL: https://github.com/apache/arrow/pull/12262#issuecomment-1021834451


   Revision: 80c8fc8739c4425fba10e8f8a9871a41f9f7f1db
   
   Submitted crossbow builds: [ursacomputing/crossbow @ 
actions-1503](https://github.com/ursacomputing/crossbow/branches/all?query=actions-1503)
   
   |Task|Status|
   |----|------|
   |verify-rc-jars-amd64|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1503-github-verify-rc-jars-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1503-github-verify-rc-jars-amd64)|






[GitHub] [arrow] kszucs commented on pull request #12262: [Release] Verify 7.0.0 RC8 [WIP]

2022-01-25 Thread GitBox


kszucs commented on pull request #12262:
URL: https://github.com/apache/arrow/pull/12262#issuecomment-1021834041


   @github-actions crossbow submit -g verify-rc-jars --param release=7.0.0 
--param rc=8






[GitHub] [arrow] iajoiner commented on a change in pull request #12231: ARROW-14783: [C++][Python] Fix the write ORC in BytesIO issue

2022-01-25 Thread GitBox


iajoiner commented on a change in pull request #12231:
URL: https://github.com/apache/arrow/pull/12231#discussion_r792294552



##
File path: python/pyarrow/tests/test_orc.py
##
@@ -171,7 +171,26 @@ def test_orcfile_empty(datadir):
 assert table.schema == expected_schema
 
 
-def test_readwrite(tmpdir):
+def test_filesystem_uri(tmpdir):
+from pyarrow import orc
+table = pa.table({"a": [1, 2, 3]})
+
+directory = tmpdir / "data_dir"
+directory.mkdir()
+path = directory / "data.orc"
+orc.write_table(table, str(path))
+
+# filesystem object
+result = orc.read_table(path, filesystem=fs.LocalFileSystem())
+assert result.equals(table)
+
+# filesystem URI
+result = orc.read_table(
+"data_dir/data.orc", filesystem=util._filesystem_uri(tmpdir))

Review comment:
   I think so. That’s a test @dongjoon-hyun added which I moved. I will add 
that one and see whether it still works.








[GitHub] [arrow] zhixingheyi-tian commented on a change in pull request #11763: ARROW-14153: [C++][Dataset] Add support for batch_size in the ORC Scanner

2022-01-25 Thread GitBox


zhixingheyi-tian commented on a change in pull request #11763:
URL: https://github.com/apache/arrow/pull/11763#discussion_r792293256



##
File path: cpp/src/arrow/adapters/orc/adapter.h
##
@@ -231,6 +231,19 @@ class ARROW_EXPORT ORCFileReader {
   Status NextStripeReader(int64_t batch_size, const std::vector& 
include_indices,
   std::shared_ptr* out);
 
+  /// \brief Get a stripe level record batch iterator with specified row count
+  /// in each record batch. NextStripeReader serves as a fine grain
+  /// alternative to ReadStripe which may cause OOM issue by loading
+  /// the whole stripes into memory.
+  ///
+  /// \param[in] batch_size Get a stripe level record batch iterator with 
specified row
+  /// count in each record batch.
+  ///
+  /// \param[in] include_names the selected field names to read
+  /// \param[out] out the returned stripe reader
+  Status NextBatchReader(int64_t batch_size, const std::vector& 
include_names,

Review comment:
   @pitrou Why return Result? This follows the convention of the other 
interfaces, for example `Status NextStripeReader()`.








[GitHub] [arrow-datafusion] realno edited a comment on issue #1675: Improvements to Ballista extensibility

2022-01-25 Thread GitBox


realno edited a comment on issue #1675:
URL: 
https://github.com/apache/arrow-datafusion/issues/1675#issuecomment-1021830009


   > Maybe it's better to introduce the substrait integration into the roadmap.
   
   +1
   
   






[GitHub] [arrow-datafusion] realno commented on issue #1675: Improvements to Ballista extensibility

2022-01-25 Thread GitBox


realno commented on issue #1675:
URL: 
https://github.com/apache/arrow-datafusion/issues/1675#issuecomment-1021830009


   > Maybe it's better to introduce the substrait integration into the roadmap.
   +1
   
   






[GitHub] [arrow] ursabot edited a comment on pull request #12243: ARROW-15436: [Release][Python] Disable flaky csv::test_cancellation test on apple M1

2022-01-25 Thread GitBox


ursabot edited a comment on pull request #12243:
URL: https://github.com/apache/arrow/pull/12243#issuecomment-1021558572


   Benchmark runs are scheduled for baseline = 
f70acb2bcc17b11dcdb372b6e8047ca5805ec2ed and contender = 
345609b61674cedc1daa3f1bf5d6b64266b19b26. 
345609b61674cedc1daa3f1bf5d6b64266b19b26 is a master commit associated with 
this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] 
[ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/b7312fa64122474f90926a9ee7c5bfc0...77919f1fe9384f03b857f977a428e0dd/)
   [Scheduled] 
[ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/bdbd5f6bebf2494d8ab961a82071245a...c604c4784e1147dc91533b54a002d5b4/)
   [Finished :arrow_down:0.6% :arrow_up:0.13%] 
[ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/bfabf432111f470295d371af2aa6b989...cb3cee8911dc4de9b38960d52b0738ee/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only 
benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   






[GitHub] [arrow] kszucs opened a new pull request #12265: MINOR: [Python][Packaging] Update crossbow cache key for vcpkg in the macos wheel builds

2022-01-25 Thread GitBox


kszucs opened a new pull request #12265:
URL: https://github.com/apache/arrow/pull/12265


   This needs to correspond to 
https://github.com/ursacomputing/crossbow/blob/master/.github/workflows/cache_vcpkg.yml#L68






[GitHub] [arrow] github-actions[bot] commented on pull request #12262: [Release] Verify 7.0.0 RC8 [WIP]

2022-01-25 Thread GitBox


github-actions[bot] commented on pull request #12262:
URL: https://github.com/apache/arrow/pull/12262#issuecomment-1021825272


   Revision: 80c8fc8739c4425fba10e8f8a9871a41f9f7f1db
   
   Submitted crossbow builds: [ursacomputing/crossbow @ 
actions-1502](https://github.com/ursacomputing/crossbow/branches/all?query=actions-1502)
   
   |Task|Status|
   |----|------|
   |verify-rc-binaries-apt-amd64|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1502-github-verify-rc-binaries-apt-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1502-github-verify-rc-binaries-apt-amd64)|
   |verify-rc-binaries-binary-amd64|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1502-github-verify-rc-binaries-binary-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1502-github-verify-rc-binaries-binary-amd64)|
   |verify-rc-binaries-yum-amd64|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1502-github-verify-rc-binaries-yum-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1502-github-verify-rc-binaries-yum-amd64)|






[GitHub] [arrow-datafusion] hntd187 commented on issue #1505: Renaming Tests Discussion

2022-01-25 Thread GitBox


hntd187 commented on issue #1505:
URL: 
https://github.com/apache/arrow-datafusion/issues/1505#issuecomment-1021824887


   No @alamb, this is the renaming, which we didn’t do in #1491; we agreed I’d 
do it after some discussion in this issue. I planned to get to this this weekend. 






[GitHub] [arrow] kszucs commented on pull request #12262: [Release] Verify 7.0.0 RC8 [WIP]

2022-01-25 Thread GitBox


kszucs commented on pull request #12262:
URL: https://github.com/apache/arrow/pull/12262#issuecomment-1021824741


   @github-actions crossbow submit -g verify-rc-binaries --param release=7.0.0 
--param rc=8






[GitHub] [arrow-datafusion] xudong963 commented on a change in pull request #1674: feat: add join type for logical plan display

2022-01-25 Thread GitBox


xudong963 commented on a change in pull request #1674:
URL: https://github.com/apache/arrow-datafusion/pull/1674#discussion_r792283577



##
File path: datafusion/src/logical_plan/plan.rs
##
@@ -934,16 +934,30 @@ impl LogicalPlan {
 LogicalPlan::Join(Join {
 on: ref keys,
 join_constraint,
+join_type,
 ..
 }) => {
 let join_expr: Vec =
 keys.iter().map(|(l, r)| format!("{} = {}", l, 
r)).collect();
+let join_type = match join_type {

Review comment:
   I think so.








[GitHub] [arrow-datafusion] Ted-Jiang commented on a change in pull request #1665: Skip some path in list_file_with_suffix.

2022-01-25 Thread GitBox


Ted-Jiang commented on a change in pull request #1665:
URL: https://github.com/apache/arrow-datafusion/pull/1665#discussion_r792273983



##
File path: datafusion/src/datasource/object_store/mod.rs
##
@@ -159,6 +159,16 @@ pub trait ObjectStore: Sync + Send + Debug {
 /// Get object reader for one file
 fn file_reader(, file: SizedFile) -> Result>;
 }
+/// Checks if we should filter out this path name
+/// 
https://github.com/apache/spark/blob/5b91381b9bcea27b7be6d9cef852efbd6c23d98a/core/src/main/scala/org/apache/spark/util/HadoopFSUtils.scala#L358
+fn do_path_filter_with_suffix(path: , suffix: ) -> bool {
+let exclude = path.starts_with('_') && !path.contains('=')
+|| path.starts_with('.')
+|| path.ends_with("._COPYING_");

Review comment:
   Sometimes, when reading a parquet file, the suffix option is not passed.
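
For readers following the diff, here is a hedged, self-contained sketch of the exclude/include rule pieced together from the fragments quoted in this thread. The function name `should_skip_file` and the choice to inspect only the final path component are assumptions, not the PR's code. In Rust `&&` binds tighter than `||`, so the explicit parentheses below only spell out the grouping the diff's expression already has.

```rust
/// Returns true when a listed file should be skipped (hypothetical helper).
fn should_skip_file(path: &str) -> bool {
    // Assumption: the rule applies to the file name, not the full path.
    let name = path.rsplit('/').next().unwrap_or(path);

    // Hidden or temporary files. A leading '_' is tolerated when the name
    // contains '=', which partition-style names such as `_col=1` do.
    let exclude = (name.starts_with('_') && !name.contains('='))
        || name.starts_with('.')
        || name.ends_with("._COPYING_");

    // Metadata files are kept even though they start with '_'.
    let include =
        name.starts_with("_common_metadata") || name.starts_with("_metadata");

    exclude && !include
}
```

Note that this rule does not depend on a suffix at all, so it still applies when the caller lists files without a suffix option.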








[GitHub] [arrow-rs] HaoYang670 opened a new issue #1240: Get `Unknown configuration option `rust-version` when running the `rustfmt`

2022-01-25 Thread GitBox


HaoYang670 opened a new issue #1240:
URL: https://github.com/apache/arrow-rs/issues/1240


   **Describe the bug**
   On my desktop, when I run the command
   ```
   cargo +stable fmt --all -- --check
   ```
   I always get the warning
   ```
   Warning: Unknown configuration option `rust-version`
   ```
   When I run the command
   ```
   rustfmt --help=config
   ```
   I could not find a configuration option called `rust-version`.
   I am not sure whether it is a bug or a problem on my side.
   
   
   
   **To Reproduce**
   Run the command 
   ```
   cargo +stable fmt --all -- --check
   ```
   
   **Expected behavior**
   Remove the warning.
   
   **Additional context**
   [Developer's guide to Arrow 
Rust.](https://github.com/apache/arrow-rs/blob/794929835dd32f2c2e6897d7f606606e3e5aab29/CONTRIBUTING.md#:~:text=cargo%20%2Bstable%20fmt%20%2D%2Dall%20%2D%2D%20%2D%2Dcheck)
   






[GitHub] [arrow-datafusion] yahoNanJing edited a comment on issue #1675: Improvements to Ballista extensibility

2022-01-25 Thread GitBox


yahoNanJing edited a comment on issue #1675:
URL: 
https://github.com/apache/arrow-datafusion/issues/1675#issuecomment-1021799128


   Thanks @thinkharderdev for proposing these potential improvements.
   
   > Scans using custom object stores
   
   For this, our team has actually implemented support for HDFS. To avoid new 
object store registration, our workaround is to make the path self-describing 
through its scheme, like hdfs://localhost:15050//file.parquet. With the 
scheme, we know which kind of remote object store we need. 
   
   > User Defined logical plan extensions, physical plan extensions, scalar and 
aggregation functions
   
   Maybe it's better to introduce the substrait integration into the roadmap.
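
The scheme-based workaround for custom object stores can be sketched as follows. This is an illustration only: the `StoreKind` enum and `store_kind_for` function are hypothetical names, and treating a path without a scheme as local is an assumption rather than something stated in the comment.

```rust
/// Hypothetical classification of object stores by URI scheme.
#[derive(Debug, PartialEq)]
enum StoreKind {
    Local,
    Hdfs,
    Unknown(String),
}

/// Picks a store kind from the scheme embedded in the path itself,
/// so no separate registration step is needed to route the request.
fn store_kind_for(path: &str) -> StoreKind {
    match path.split_once("://") {
        Some(("hdfs", _)) => StoreKind::Hdfs,
        Some(("file", _)) => StoreKind::Local,
        Some((scheme, _)) => StoreKind::Unknown(scheme.to_string()),
        // Assumption: a bare path without a scheme refers to the local disk.
        None => StoreKind::Local,
    }
}
```

The trade-off of this design is that every path must carry its scheme, while schemes the dispatcher does not know still need an explicit fallback such as the `Unknown` variant here.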






[GitHub] [arrow-datafusion] yahoNanJing commented on issue #1675: Improvements to Ballista extensibility

2022-01-25 Thread GitBox


yahoNanJing commented on issue #1675:
URL: 
https://github.com/apache/arrow-datafusion/issues/1675#issuecomment-1021799128


   Thanks @thinkharderdev for proposing these potential improvements.
   > Scans using custom object stores
   For this, our team has actually implemented support for HDFS. To avoid new 
object store registration, our workaround is to make the path self-describing 
through its scheme, like hdfs://localhost:15050//file.parquet. With the 
scheme, we know which kind of remote object store we need. 
   
   > User Defined logical plan extensions, physical plan extensions, scalar and 
aggregation functions
   Maybe it's better to introduce the substrait integration into the roadmap.






[GitHub] [arrow-cookbook] jaimesalvador edited a comment on issue #129: Read encrypted parquet file from R

2022-01-25 Thread GitBox


jaimesalvador edited a comment on issue #129:
URL: https://github.com/apache/arrow-cookbook/issues/129#issuecomment-1021794420


   Hi @thisisnic, 
   I have some programs written in C++ that use ARROW/PARQUET in encrypted 
format, but in order to check the data stored in parquet files, I need a quick 
way to check it, so I thought R could be useful for me, but I can't find a 
way to pass the "key" to decrypt the data (in R).
   
   I think it is an unsupported feature (not supported yet)!
   
   I will open a ticket on JIRA
   
   Thanks for your replies.






[GitHub] [arrow-datafusion] yjshen commented on issue #1676: Regression: Merge assertion: `'assertion failed: i < self.len()'` in array_primitive.rs while merging non overlapping streams

2022-01-25 Thread GitBox


yjshen commented on issue #1676:
URL: 
https://github.com/apache/arrow-datafusion/issues/1676#issuecomment-1021794543


   It's caused by inserting an empty record batch, I think. The current sort 
will not be affected since empty batches are eliminated while inserting. 
Proposed fix in https://github.com/alamb/arrow-datafusion/pull/4
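
To illustrate why an empty batch can trip a merge, here is a generic k-way merge sketch. It is not DataFusion's implementation: plain `Vec<i64>` values stand in for record batches, and `merge_sorted_batches` is a hypothetical name. The point it shows is that seeding the merge with each batch's first element indexes out of bounds unless empty batches are skipped first.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Merges already-sorted batches into one sorted vector.
fn merge_sorted_batches(batches: &[Vec<i64>]) -> Vec<i64> {
    // Min-heap of (value, batch index, position within batch).
    let mut heap = BinaryHeap::new();
    for (idx, batch) in batches.iter().enumerate() {
        // Skip empty batches up front: reading `batch[0]` below would
        // panic with an index-out-of-bounds error for them.
        if batch.is_empty() {
            continue;
        }
        heap.push(Reverse((batch[0], idx, 0usize)));
    }

    let mut merged = Vec::new();
    while let Some(Reverse((value, idx, pos))) = heap.pop() {
        merged.push(value);
        let next = pos + 1;
        // Advance the cursor of the batch we just consumed from.
        if next < batches[idx].len() {
            heap.push(Reverse((batches[idx][next], idx, next)));
        }
    }
    merged
}
```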






[GitHub] [arrow-cookbook] jaimesalvador edited a comment on issue #129: Read encrypted parquet file from R

2022-01-25 Thread GitBox


jaimesalvador edited a comment on issue #129:
URL: https://github.com/apache/arrow-cookbook/issues/129#issuecomment-1021794420


   Hi @thisisnic, 
   I have some programs written in C++ that use ARROW/PARQUET in encrypted 
format, but in order to check the data stored in parquet files, I need a quick 
way to check it, so I thought R could be useful for me, but I can't find a 
way to pass the "key" to decrypt the data (in R).
   
   I think this feature is not supported yet!
   
   I will open a ticket on JIRA
   
   Thanks for your replies.






[GitHub] [arrow-cookbook] jaimesalvador commented on issue #129: Read encrypted parquet file from R

2022-01-25 Thread GitBox


jaimesalvador commented on issue #129:
URL: https://github.com/apache/arrow-cookbook/issues/129#issuecomment-1021794420


   Hi @thisisnic, 
   I have some programs written in C++ that use ARROW/PARQUET in encrypted 
format, but in order to check the data stored in parquet files, I need a quick 
way to check it, so I thought R could be useful for me, but I can't find a 
way to pass the "key" to decrypt the data (in R).
   
   I think this feature is not supported yet!
   
   I will open a ticket on JIRA
   
   Thanks for your replies.






[GitHub] [arrow-datafusion] liukun4515 commented on issue #1356: The framework about expression type coercion

2022-01-25 Thread GitBox


liukun4515 commented on issue #1356:
URL: 
https://github.com/apache/arrow-datafusion/issues/1356#issuecomment-1021780614


   > I think this issue is now closed and we are on our way to uniform type 
coercion logic ❤️
   
   Thank you for the reminder. 
   I will focus on the coercion logic when we run into issues.
   






[GitHub] [arrow] Ignalina commented on pull request #12149: ARROW-15331: [Go][Parquet] Add pqarrow package for direct Parquet <--> Arrow conversion

2022-01-25 Thread GitBox


Ignalina commented on pull request #12149:
URL: https://github.com/apache/arrow/pull/12149#issuecomment-1021778236


   Lovely work! The pqarrow module is very helpful for us beginners!






[GitHub] [arrow-datafusion] liukun4515 commented on issue #1661: thread 'tokio-runtime-worker' panicked at 'not implemented: Take not supported for data type Decimal(18, 4)

2022-01-25 Thread GitBox


liukun4515 commented on issue #1661:
URL: 
https://github.com/apache/arrow-datafusion/issues/1661#issuecomment-1021777881


   Thanks @alamb 
   Maybe we can close this ticket.






[GitHub] [arrow] ursabot edited a comment on pull request #12257: ARROW-15451: [C++] Fix build with C++17 and ARROW_GCS=ON

2022-01-25 Thread GitBox


ursabot edited a comment on pull request #12257:
URL: https://github.com/apache/arrow/pull/12257#issuecomment-1021552302


   Benchmark runs are scheduled for baseline = 
38d4d77aed15a82a723c06768ebd659a90138fc2 and contender = 
f70acb2bcc17b11dcdb372b6e8047ca5805ec2ed. 
f70acb2bcc17b11dcdb372b6e8047ca5805ec2ed is a master commit associated with 
this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] 
[ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/c2de1930597a49c5b31a14ebc33d0ead...b7312fa64122474f90926a9ee7c5bfc0/)
   [Scheduled] 
[ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/890b2685fe8c41d2ac4fb566d0918b81...bdbd5f6bebf2494d8ab961a82071245a/)
   [Finished :arrow_down:0.26% :arrow_up:0.09%] 
[ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/2b07e9d81ee64680bf264263e38c430f...bfabf432111f470295d371af2aa6b989/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only 
benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   






[GitHub] [arrow] ursabot edited a comment on pull request #12149: ARROW-15331: [Go][Parquet] Add pqarrow package for direct Parquet <--> Arrow conversion

2022-01-25 Thread GitBox


ursabot edited a comment on pull request #12149:
URL: https://github.com/apache/arrow/pull/12149#issuecomment-1021770422


   Benchmark runs are scheduled for baseline = 
345609b61674cedc1daa3f1bf5d6b64266b19b26 and contender = 
c5f400461f6d2be836d30df4626fed4d59107015. 
c5f400461f6d2be836d30df4626fed4d59107015 is a master commit associated with 
this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] 
[ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/77919f1fe9384f03b857f977a428e0dd...b6266e792e88462badd7a4bcea8bd43f/)
   [Scheduled] 
[ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/c604c4784e1147dc91533b54a002d5b4...a4ffc026754a469f821ccdfec52dddf8/)
   [Scheduled] 
[ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/cb3cee8911dc4de9b38960d52b0738ee...93e5a4880f23490595da95eb0c96e661/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only 
benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   






[GitHub] [arrow-rs] HaoYang670 commented on a change in pull request #1238: dyn compare for binary array

2022-01-25 Thread GitBox


HaoYang670 commented on a change in pull request #1238:
URL: https://github.com/apache/arrow-rs/pull/1238#discussion_r792254539



##
File path: arrow/src/compute/kernels/comparison.rs
##
@@ -3843,6 +3960,114 @@ mod tests {
 assert_eq!(neq_dyn_scalar(, 8).unwrap(), expected);
 }
 
+#[test]

Review comment:
   Sure, I will add them








[GitHub] [arrow] ursabot commented on pull request #12149: ARROW-15331: [Go][Parquet] Add pqarrow package for direct Parquet <--> Arrow conversion

2022-01-25 Thread GitBox


ursabot commented on pull request #12149:
URL: https://github.com/apache/arrow/pull/12149#issuecomment-1021770422


   Benchmark runs are scheduled for baseline = 
345609b61674cedc1daa3f1bf5d6b64266b19b26 and contender = 
c5f400461f6d2be836d30df4626fed4d59107015. 
c5f400461f6d2be836d30df4626fed4d59107015 is a master commit associated with 
this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Scheduled] 
[ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/77919f1fe9384f03b857f977a428e0dd...b6266e792e88462badd7a4bcea8bd43f/)
   [Scheduled] 
[ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/c604c4784e1147dc91533b54a002d5b4...a4ffc026754a469f821ccdfec52dddf8/)
   [Scheduled] 
[ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/cb3cee8911dc4de9b38960d52b0738ee...93e5a4880f23490595da95eb0c96e661/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only 
benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   






[GitHub] [arrow] zeroshade closed pull request #12149: ARROW-15331: [Go][Parquet] Add pqarrow package for direct Parquet <--> Arrow conversion

2022-01-25 Thread GitBox


zeroshade closed pull request #12149:
URL: https://github.com/apache/arrow/pull/12149


   






[GitHub] [arrow] vibhatha commented on pull request #12263: ARROW-15438: [Python] Flaky test test_write_dataset_max_open_files

2022-01-25 Thread GitBox


vibhatha commented on pull request #12263:
URL: https://github.com/apache/arrow/pull/12263#issuecomment-1021760779


   Thanks for looking into this @westonpace 






[GitHub] [arrow] westonpace commented on pull request #12263: ARROW-15438: [Python] Flaky test test_write_dataset_max_open_files

2022-01-25 Thread GitBox


westonpace commented on pull request #12263:
URL: https://github.com/apache/arrow/pull/12263#issuecomment-1021758382


   @kszucs Feel free to merge this if you want.  This should not block RC6 as 
it is mostly a flaky test (the threading thing has some practical implications 
but they are minor)






[GitHub] [arrow] github-actions[bot] commented on pull request #12262: [Release] Verify 7.0.0 RC8 [WIP]

2022-01-25 Thread GitBox


github-actions[bot] commented on pull request #12262:
URL: https://github.com/apache/arrow/pull/12262#issuecomment-1021756211


   Revision: 80c8fc8739c4425fba10e8f8a9871a41f9f7f1db
   
   Submitted crossbow builds: [ursacomputing/crossbow @ 
actions-1501](https://github.com/ursacomputing/crossbow/branches/all?query=actions-1501)
   
   |Task|Status|
   |----|------|
   |verify-rc-source-windows|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1501-github-verify-rc-source-windows)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1501-github-verify-rc-source-windows)|






[GitHub] [arrow] iajoiner commented on a change in pull request #12231: ARROW-14783: [C++][Python] Fix the write ORC in BytesIO issue

2022-01-25 Thread GitBox


iajoiner commented on a change in pull request #12231:
URL: https://github.com/apache/arrow/pull/12231#discussion_r792242670



##
File path: python/pyarrow/orc.py
##
@@ -330,7 +330,9 @@ def read_table(source, columns=None, filesystem=None):
 """
 
 
-def write_table(table, where, *, file_version='0.12',
+def write_table(table, where, *,
+close_file=False,

Review comment:
   This argument does serve a real ORC purpose (please see the tests in 
`test_orc.py`). It’s not the `filesystem` argument I removed.








[GitHub] [arrow] kszucs commented on pull request #12262: [Release] Verify 7.0.0 RC8 [WIP]

2022-01-25 Thread GitBox


kszucs commented on pull request #12262:
URL: https://github.com/apache/arrow/pull/12262#issuecomment-1021755472


   @github-actions crossbow submit verify-rc-source-windows --param 
release=7.0.0 --param rc=8






[GitHub] [arrow] ursabot edited a comment on pull request #12242: ARROW-15433: [Doc] Fix warnings when building

2022-01-25 Thread GitBox


ursabot edited a comment on pull request #12242:
URL: https://github.com/apache/arrow/pull/12242#issuecomment-1020983453


   Benchmark runs are scheduled for baseline = 
9fb6defaecd3df3bfd40519118ae248c85df16dd and contender = 
231d0a6b30e7017f0e07162b6716f3da49a674d0. 
231d0a6b30e7017f0e07162b6716f3da49a674d0 is a master commit associated with 
this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] 
[ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/2093f71d480c44ea868e64e3591db557...708a4484bf054ea39e277489f07f327f/)
   [Failed :arrow_down:0.45% :arrow_up:0.0%] 
[ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/7b7a6c2ff5c64c7aab7e4f38910b7d6e...01cfa1661c444ba0a1ed9f1ff691cec8/)
   [Finished :arrow_down:0.22% :arrow_up:0.04%] 
[ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/a8090f61e5464e149f9200bcd7161ee4...0e434b6c358342a3be2253a61c549839/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only 
benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   






[GitHub] [arrow] westonpace commented on pull request #12241: ARROW-15390: [Dev][C++][Doc] Document the GDB extension

2022-01-25 Thread GitBox


westonpace commented on pull request #12241:
URL: https://github.com/apache/arrow/pull/12241#issuecomment-1021746084


   Ok.  I'm fine with leaving versioning out until we introduce some kind of 
backwards incompatible change.






[GitHub] [arrow] westonpace commented on a change in pull request #12241: ARROW-15390: [Dev][C++][Doc] Document the GDB extension

2022-01-25 Thread GitBox


westonpace commented on a change in pull request #12241:
URL: https://github.com/apache/arrow/pull/12241#discussion_r792236288



##
File path: docs/source/cpp/gdb.rst
##
@@ -0,0 +1,167 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+.. default-domain:: cpp
+.. highlight:: console
+
+==========================
+Debugging code using Arrow
+==========================
+
+GDB extension for Arrow C++
+===========================
+
+By default, when asked to print the value of a C++ object,
+`GDB `_ displays the contents of its
+member variables.  However, for C++ objects this does not often yield
+a very useful output, as C++ classes tend to hide their implementation details
+behind methods and accessors.
+
+For example, here is how a :class:`arrow::Status` instance may be displayed
+by GDB::
+
+   $3 = {
+ > = {},
+ > = {},
+ members of arrow::Status:
+ state_ = 0x0
+   }
+
+and here is a :class:`arrow::Decimal128Scalar`::
+
+   $4 = (arrow::Decimal128Scalar) {
+ > = {
+= {
+  = {
+   > = {},
+   members of arrow::Scalar:
+   _vptr.Scalar = 0x76870e78 ,
+   type = std::shared_ptr (use count 1, weak count 0) 
= {
+ get() = 0x55ce58a0
+   },
+   is_valid = true
+ }, },
+   members of arrow::DecimalScalar:
+   value = {
+  = {
+   > = {
+ static kHighWordIndex = ,
+ static kBitWidth = 128,
+ static kByteWidth = 16,
+ static LittleEndianArray = ,
+ array_ = {
+   _M_elems = {[0] = 1234567, [1] = 0}
+ }
+   },
+   members of arrow::BasicDecimal128:
+   static kMaxPrecision = 38,
+   static kMaxScale = 38
+ }, }
+ }, }
+
+Fortunately, GDB also allows custom extensions to override the default printing
+for specific types.  We provide a
+`GDB extension `_
+written in Python that enables pretty-printing for common Arrow C++ classes,
+so as to enable a more productive debugging experience.  For example,
+here is how the aforementioned :class:`arrow::Status` instance will be
+displayed::
+
+   $5 = arrow::Status::OK()
+
+and here is the same :class:`arrow::Decimal128Scalar` instance as above::
+
+   $6 = arrow::Decimal128Scalar of value 123.4567 [precision=10, scale=4]
+
+
+Manual loading
+--------------
+
+To enable the GDB extension for Arrow, you can simply
+`download it `_
+somewhere on your computer and ``source`` it from the GDB prompt::
+
+   (gdb) source path/to/gdb_arrow.py
+
+You will have to ``source`` it on each new GDB session.  You might want to
+make this implicit by adding the ``source`` invocation in a
+`gdbinit `_ file.
+
+
+Automatic loading
+-----------------
+
+GDB provides a facility to automatically load scripts or extensions for each
+object file or library that is involved in a debugging session.  You will need
+to:
+
+1. Find out what the *auto-load* location(s) is/are for your GDB install.
+   This can be determined using ``show`` subcommands on the GDB prompt;
+   the answer will depend on the operating system.
+
+   Here is an example on Ubuntu::
+
+  (gdb) show auto-load scripts-directory
+  List of directories from which to load auto-loaded scripts is 
$debugdir:$datadir/auto-load.
+  (gdb) show data-directory
+  GDB's data directory is "/usr/share/gdb".
+  (gdb) show debug-file-directory
+  The directory where separate debug symbols are searched for is 
"/usr/lib/debug".
+
+   This tells you that the directories used for auto-loading are
+   ``$debugdir`` and ``$datadir/auto-load``, which expand to
+   ``/usr/lib/debug/`` and ``/usr/share/gdb/auto-load`` respectively.
+
+2. Find out the full path to the Arrow C++ DLL, *with all symlinks resolved*.
+   For example, you might have installed Arrow 7.0 in ``/usr/local`` and the
+   path to the Arrow C++ DLL could then be 
``/usr/local/lib/libarrow.so.700.0.0``.
+
+3. Determine the actual auto-load script path.  It is computed by *a)* 

[GitHub] [arrow] github-actions[bot] commented on pull request #12262: [Release] Verify 7.0.0 RC8 [WIP]

2022-01-25 Thread GitBox


github-actions[bot] commented on pull request #12262:
URL: https://github.com/apache/arrow/pull/12262#issuecomment-1021740775


   Revision: 400b5d989dd3a654bc1061d19a5ae3e95972e5eb
   
   Submitted crossbow builds: [ursacomputing/crossbow @ 
actions-1500](https://github.com/ursacomputing/crossbow/branches/all?query=actions-1500)
   
   |Task|Status|
   |----|------|
   |verify-rc-jars-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1500-github-verify-rc-jars-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1500-github-verify-rc-jars-amd64)|






[GitHub] [arrow] kou commented on pull request #12262: [Release] Verify 7.0.0 RC8 [WIP]

2022-01-25 Thread GitBox


kou commented on pull request #12262:
URL: https://github.com/apache/arrow/pull/12262#issuecomment-1021740218


   @github-actions crossbow submit verify-rc-jars-amd64 --param release=7.0.0 
--param rc=8






[GitHub] [arrow] github-actions[bot] commented on pull request #12264: ARROW-15457: [Packaging][deb] Specify CUDAToolkit_ROOT explicitly

2022-01-25 Thread GitBox


github-actions[bot] commented on pull request #12264:
URL: https://github.com/apache/arrow/pull/12264#issuecomment-1021737324


   Revision: ddb038a51f9ec7370f11e4af5fc23c435776ebbc
   
   Submitted crossbow builds: [ursacomputing/crossbow @ 
actions-1499](https://github.com/ursacomputing/crossbow/branches/all?query=actions-1499)
   
   |Task|Status|
   |----|------|
   |ubuntu-impish-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1499-github-ubuntu-impish-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1499-github-ubuntu-impish-amd64)|






[GitHub] [arrow] kou commented on pull request #12264: ARROW-15457: [Packaging][deb] Specify CUDAToolkit_ROOT explicitly

2022-01-25 Thread GitBox


kou commented on pull request #12264:
URL: https://github.com/apache/arrow/pull/12264#issuecomment-1021736759


   @github-actions crossbow submit ubuntu-impish-amd64






[GitHub] [arrow] github-actions[bot] commented on pull request #12264: ARROW-15457: [Packaging][deb] Specify CUDAToolkit_ROOT explicitly

2022-01-25 Thread GitBox


github-actions[bot] commented on pull request #12264:
URL: https://github.com/apache/arrow/pull/12264#issuecomment-1021736658


   https://issues.apache.org/jira/browse/ARROW-15457






[GitHub] [arrow] ursabot edited a comment on pull request #12253: ARROW-15442: [C++][Python] Skip GDB tests on a non-debug build

2022-01-25 Thread GitBox


ursabot edited a comment on pull request #12253:
URL: https://github.com/apache/arrow/pull/12253#issuecomment-1021462488


   Benchmark runs are scheduled for baseline = 
f6f494eae0719dd00da08aae02b2c39245f16ce3 and contender = 
38d4d77aed15a82a723c06768ebd659a90138fc2. 
38d4d77aed15a82a723c06768ebd659a90138fc2 is a master commit associated with 
this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] 
[ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/db60455d7e1a4e03b81d28f218791c7c...c2de1930597a49c5b31a14ebc33d0ead/)
   [Scheduled] 
[ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/481a53e8235d4f118240eb991b43808b...890b2685fe8c41d2ac4fb566d0918b81/)
   [Finished :arrow_down:0.52% :arrow_up:0.13%] 
[ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/0edcfbffe80949e9874136dfc51a9dab...2b07e9d81ee64680bf264263e38c430f/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only 
benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   






[GitHub] [arrow-rs] yordan-pavlov commented on pull request #1225: Improve MutableArrayData Null Handling (#1224) (#1230)

2022-01-25 Thread GitBox


yordan-pavlov commented on pull request #1225:
URL: https://github.com/apache/arrow-rs/pull/1225#issuecomment-1021703225


   @tustvold are you still seeing a 2x performance improvement in filter 
benchmarks after the latest changes?






[GitHub] [arrow-rs] yordan-pavlov commented on a change in pull request #1225: Improve MutableArrayData Null Handling (#1224) (#1230)

2022-01-25 Thread GitBox


yordan-pavlov commented on a change in pull request #1225:
URL: https://github.com/apache/arrow-rs/pull/1225#discussion_r792199565



##
File path: arrow/src/array/transform/mod.rs
##
@@ -377,9 +375,12 @@ impl<'a> MutableArrayData<'a> {
 /// returns a new [MutableArrayData] with capacity to `capacity` slots and 
specialized to create an
 /// [ArrayData] from multiple `arrays`.
 ///
-/// `use_nulls` is a flag used to optimize insertions. It should be 
`false` if the only source of nulls
-/// are the arrays themselves and `true` if the user plans to call 
[MutableArrayData::extend_nulls].
-/// In other words, if `use_nulls` is `false`, calling 
[MutableArrayData::extend_nulls] should not be used.
+/// `use_nulls` is a flag used to optimize insertions, if `use_nulls` is 
`true` a null bitmap
+/// will be created regardless of the contents of `arrays`, otherwise a 
null bitmap will
+/// be computed only if `arrays` contains nulls.
+///
+/// Code that plans to call [MutableArrayData::extend_nulls] MUST set 
`use_nulls` to `true`,
+/// in order to ensure that a null bitmap is computed.
 pub fn new(arrays: Vec<&'a ArrayData>, use_nulls: bool, capacity: usize) 
-> Self {

Review comment:
   having thought some more about this, wouldn't something like 
`compute_nulls` or `create_null_bitmap`  (instead of `use_nulls`) be a better 
name, because:
   (1) if it's `true`, then a null bitmap is always created, no matter whether 
any of the input arrays have a null bitmap
   (2) the documentation comment, I think, reads better as e.g. 
   ```
if `compute_nulls` is `true` a null bitmap will be created regardless of 
the contents of `arrays`
   ```
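
As a rough illustration of the semantics described in the doc comment, here is a toy Python model. It is not the arrow-rs implementation: `ToyMutableArray` and its methods are hypothetical names made up for this sketch. It assumes a bitmap is allocated when the flag is set or when any source array contains a null, and it treats extending with nulls without a bitmap as an error.

```python
class ToyMutableArray:
    """Toy stand-in for a growable array with an optional validity bitmap."""

    def __init__(self, arrays, use_nulls):
        self.arrays = arrays
        self.values = []
        has_nulls = any(v is None for arr in arrays for v in arr)
        # A bitmap is allocated if requested, or if any source holds a null.
        self.validity = [] if (use_nulls or has_nulls) else None

    def extend(self, index, start, end):
        # Copy a slice of one source array, tracking validity if enabled.
        chunk = self.arrays[index][start:end]
        self.values.extend(chunk)
        if self.validity is not None:
            self.validity.extend(v is not None for v in chunk)

    def extend_nulls(self, count):
        # Appending nulls only makes sense when a bitmap exists.
        if self.validity is None:
            raise ValueError("extend_nulls requires use_nulls=True")
        self.values.extend([None] * count)
        self.validity.extend([False] * count)


with_bitmap = ToyMutableArray([[1, 2, 3]], use_nulls=True)
with_bitmap.extend(0, 0, 2)
with_bitmap.extend_nulls(1)
print(with_bitmap.validity)  # [True, True, False]
```

In this model the flag answers "should a null bitmap be created up front?", which is the reading that a name like `create_null_bitmap` would make explicit.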








[GitHub] [arrow-datafusion] realno commented on issue #1675: Improvements to Ballista extensibility

2022-01-25 Thread GitBox


realno commented on issue #1675:
URL: 
https://github.com/apache/arrow-datafusion/issues/1675#issuecomment-1021690436


   This would be a great improvement. I will follow the design and PRs.






[GitHub] [arrow] kszucs closed pull request #12261: [Release] Verify 7.0.0 RC7 [WIP]

2022-01-25 Thread GitBox


kszucs closed pull request #12261:
URL: https://github.com/apache/arrow/pull/12261


   






[GitHub] [arrow] kszucs commented on pull request #12261: [Release] Verify 7.0.0 RC7 [WIP]

2022-01-25 Thread GitBox


kszucs commented on pull request #12261:
URL: https://github.com/apache/arrow/pull/12261#issuecomment-1021687899


   Closing in favor of https://github.com/apache/arrow/pull/12262






[GitHub] [arrow] kszucs commented on pull request #12263: ARROW-15438: [Python] Flaky test test_write_dataset_max_open_files

2022-01-25 Thread GitBox


kszucs commented on pull request #12263:
URL: https://github.com/apache/arrow/pull/12263#issuecomment-1021686025


   Thanks Weston! 
   
   @lidavidm could you please verify this locally?






[GitHub] [arrow] github-actions[bot] commented on pull request #12263: ARROW-15438: [Python] Flaky test test_write_dataset_max_open_files

2022-01-25 Thread GitBox


github-actions[bot] commented on pull request #12263:
URL: https://github.com/apache/arrow/pull/12263#issuecomment-1021684894


   https://issues.apache.org/jira/browse/ARROW-15438






[GitHub] [arrow] westonpace opened a new pull request #12263: ARROW-15438: [Python] Flaky test test_write_dataset_max_open_files

2022-01-25 Thread GitBox


westonpace opened a new pull request #12263:
URL: https://github.com/apache/arrow/pull/12263


   The test could fail when writing due to a race condition.  If the batches 
were delivered `ABC...` then by the time we need to close a file to 
make space we can close an already completed file (and so we won't have to open 
up a new one later) and we end up with 5 files for 5 partitions.
   
   Adding `use_threads=False` to the `write_dataset` call was not sufficient.  
The `arrow::dataset::FileSystemDataset::Write` method was always using the CPU 
executor for the exec plan.  In other scanner methods we base the CPU executor 
on the scan options (`nullptr` if `scan_options->use_threads` is `false`).  
Making both of these changes together seems to make the test reliably pass.
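
To illustrate why delivery order matters, here is a minimal Python simulation. It is not the actual dataset writer: it assumes the writer closes the least recently used file once `max_open_files` is reached, `count_files_created` is a hypothetical helper, and the partition labels are made up.

```python
from collections import OrderedDict


def count_files_created(batch_partitions, max_open_files):
    """Count files created when each batch is appended to its partition's file.

    When the open-file limit is hit, the least recently used file is closed,
    so a later batch for that partition has to start a new file.
    """
    open_files = OrderedDict()  # partition -> None, ordered by recency of use
    files_created = 0
    for partition in batch_partitions:
        if partition in open_files:
            open_files.move_to_end(partition)  # mark as most recently used
            continue
        if len(open_files) >= max_open_files:
            open_files.popitem(last=False)  # close the least recently used
        open_files[partition] = None
        files_created += 1
    return files_created


interleaved = list("ABCDEABCDE")  # every partition revisited after eviction
grouped = list("AABBCCDDEE")      # each partition finished before eviction

print(count_files_created(interleaved, max_open_files=3))  # 10
print(count_files_created(grouped, max_open_files=3))      # 5
```

With the grouped order, every file that gets closed was already complete, so the run ends with one file per partition. A test that expects more files than partitions therefore depends on the order in which batches arrive.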
   






[GitHub] [arrow] github-actions[bot] commented on pull request #12262: [Release] Verify 7.0.0 RC8 [WIP]

2022-01-25 Thread GitBox


github-actions[bot] commented on pull request #12262:
URL: https://github.com/apache/arrow/pull/12262#issuecomment-1021676881


   Revision: 400b5d989dd3a654bc1061d19a5ae3e95972e5eb
   
   Submitted crossbow builds: [ursacomputing/crossbow @ 
actions-1498](https://github.com/ursacomputing/crossbow/branches/all?query=actions-1498)
   
   |Task|Status|
   |----|------|
   |verify-rc-source-cpp-linux-almalinux-8-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1498-github-verify-rc-source-cpp-linux-almalinux-8-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1498-github-verify-rc-source-cpp-linux-almalinux-8-amd64)|
   |verify-rc-source-cpp-linux-ubuntu-18.04-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1498-github-verify-rc-source-cpp-linux-ubuntu-18.04-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1498-github-verify-rc-source-cpp-linux-ubuntu-18.04-amd64)|
   |verify-rc-source-cpp-linux-ubuntu-20.04-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1498-github-verify-rc-source-cpp-linux-ubuntu-20.04-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1498-github-verify-rc-source-cpp-linux-ubuntu-20.04-amd64)|
   |verify-rc-source-cpp-macos-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1498-github-verify-rc-source-cpp-macos-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1498-github-verify-rc-source-cpp-macos-amd64)|
   |verify-rc-source-cpp-macos-arm64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1498-github-verify-rc-source-cpp-macos-arm64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1498-github-verify-rc-source-cpp-macos-arm64)|
   |verify-rc-source-csharp-linux-almalinux-8-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1498-github-verify-rc-source-csharp-linux-almalinux-8-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1498-github-verify-rc-source-csharp-linux-almalinux-8-amd64)|
   |verify-rc-source-csharp-linux-ubuntu-18.04-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1498-github-verify-rc-source-csharp-linux-ubuntu-18.04-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1498-github-verify-rc-source-csharp-linux-ubuntu-18.04-amd64)|
   |verify-rc-source-csharp-linux-ubuntu-20.04-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1498-github-verify-rc-source-csharp-linux-ubuntu-20.04-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1498-github-verify-rc-source-csharp-linux-ubuntu-20.04-amd64)|
   |verify-rc-source-csharp-macos-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1498-github-verify-rc-source-csharp-macos-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1498-github-verify-rc-source-csharp-macos-amd64)|
   |verify-rc-source-csharp-macos-arm64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1498-github-verify-rc-source-csharp-macos-arm64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1498-github-verify-rc-source-csharp-macos-arm64)|
   |verify-rc-source-go-linux-almalinux-8-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1498-github-verify-rc-source-go-linux-almalinux-8-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1498-github-verify-rc-source-go-linux-almalinux-8-amd64)|
   |verify-rc-source-go-linux-ubuntu-18.04-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1498-github-verify-rc-source-go-linux-ubuntu-18.04-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1498-github-verify-rc-source-go-linux-ubuntu-18.04-amd64)|
   |verify-rc-source-go-linux-ubuntu-20.04-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1498-github-verify-rc-source-go-linux-ubuntu-20.04-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1498-github-verify-rc-source-go-linux-ubuntu-20.04-amd64)|
   |verify-rc-source-go-macos-amd64|[![Github 

[GitHub] [arrow] kszucs commented on pull request #12262: [Release] Verify 7.0.0 RC8 [WIP]

2022-01-25 Thread GitBox


kszucs commented on pull request #12262:
URL: https://github.com/apache/arrow/pull/12262#issuecomment-1021676214


   @github-actions crossbow submit --group verify-rc-source --param 
release=7.0.0 --param rc=8


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow-datafusion] thinkharderdev commented on issue #1273: Question: Is the Ballista project providing value to the overall DataFusion project?

2022-01-25 Thread GitBox


thinkharderdev commented on issue #1273:
URL: 
https://github.com/apache/arrow-datafusion/issues/1273#issuecomment-1021673135


   Late to the party here, but my team is very excited about the potential of 
Ballista and is interested in helping push the project forward.






[GitHub] [arrow] ursabot edited a comment on pull request #12247: ARROW-15439: [Release] Update .deb/.rpm changelogs after release

2022-01-25 Thread GitBox


ursabot edited a comment on pull request #12247:
URL: https://github.com/apache/arrow/pull/12247#issuecomment-1021376965


   Benchmark runs are scheduled for baseline = 
3fc90532d4353146c64b2575a36a00069c747968 and contender = 
f6f494eae0719dd00da08aae02b2c39245f16ce3. 
f6f494eae0719dd00da08aae02b2c39245f16ce3 is a master commit associated with 
this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] 
[ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/9f84c6a9ee3e48c6b5fb59bd37ff7b3f...db60455d7e1a4e03b81d28f218791c7c/)
   [Scheduled] 
[ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/2c78dc87183c44049683782b5e636aa1...481a53e8235d4f118240eb991b43808b/)
   [Finished :arrow_down:0.13% :arrow_up:0.0%] 
[ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/07da683c36c04f838623834e5312ca6b...0edcfbffe80949e9874136dfc51a9dab/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only 
benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   






[GitHub] [arrow-datafusion] andygrove commented on issue #1675: Improvements to Ballista extensibility

2022-01-25 Thread GitBox


andygrove commented on issue #1675:
URL: 
https://github.com/apache/arrow-datafusion/issues/1675#issuecomment-1021646194


   It may be useful to see how substrait is handling extensions as well - 
https://substrait.io/extensions/






[GitHub] [arrow-datafusion] alamb commented on issue #1356: The framework about expression type coercion

2022-01-25 Thread GitBox


alamb commented on issue #1356:
URL: 
https://github.com/apache/arrow-datafusion/issues/1356#issuecomment-1021642866


   I think this issue is now closed and we are on our way to uniform type 
coercion logic ❤️ 






[GitHub] [arrow-datafusion] alamb closed issue #1356: The framework about expression type coercion

2022-01-25 Thread GitBox


alamb closed issue #1356:
URL: https://github.com/apache/arrow-datafusion/issues/1356


   






[GitHub] [arrow-datafusion] alamb commented on issue #1505: Renaming Tests Discussion

2022-01-25 Thread GitBox


alamb commented on issue #1505:
URL: 
https://github.com/apache/arrow-datafusion/issues/1505#issuecomment-1021642464


   I think this is completed in  
https://github.com/apache/arrow-datafusion/pull/1491






[GitHub] [arrow-datafusion] alamb closed issue #1505: Renaming Tests Discussion

2022-01-25 Thread GitBox


alamb closed issue #1505:
URL: https://github.com/apache/arrow-datafusion/issues/1505


   






[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1680: Use NamedTempFile rather than `String` in DiskManager

2022-01-25 Thread GitBox


alamb commented on a change in pull request #1680:
URL: https://github.com/apache/arrow-datafusion/pull/1680#discussion_r792144774



##
File path: datafusion/src/physical_plan/sorts/sort.rs
##
@@ -301,17 +306,16 @@ async fn spill_partial_sorted_stream(
 }
 
 fn read_spill_as_stream(
-path: String,
+path: NamedTempFile,

Review comment:
   Since ownership of the `NamedTempFile` is passed into the task doing the 
reading, the temp file is cleaned up as soon as that task is done.
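
The ownership hand-off described in this comment can be sketched with the standard library alone. This is a minimal illustration, not the code from the PR: `SpillFile` is a hypothetical stand-in for `tempfile::NamedTempFile` (it deletes its file on drop), `read_spill_in_task` is an invented name, and a plain thread plays the role of the task doing the reading.

```rust
use std::fs::{self, File};
use std::io::{self, Read, Write};
use std::path::{Path, PathBuf};
use std::thread;

/// Hypothetical stand-in for `tempfile::NamedTempFile`: deletes the file on drop.
struct SpillFile {
    path: PathBuf,
}

impl SpillFile {
    /// Creates a file in the system temp directory and writes `contents` to it.
    fn create(name: &str, contents: &[u8]) -> io::Result<Self> {
        let path = std::env::temp_dir().join(format!("{}-{}", std::process::id(), name));
        File::create(&path)?.write_all(contents)?;
        Ok(SpillFile { path })
    }

    fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for SpillFile {
    fn drop(&mut self) {
        // Best-effort cleanup; errors are ignored in this sketch.
        let _ = fs::remove_file(&self.path);
    }
}

/// Reads the spill in a background thread that owns the file guard.
/// Returns the bytes read plus the path, so a caller can check the file is gone.
fn read_spill_in_task(spill: SpillFile) -> io::Result<(Vec<u8>, PathBuf)> {
    let path = spill.path().to_path_buf();
    let handle = thread::spawn(move || -> io::Result<Vec<u8>> {
        let mut buf = Vec::new();
        File::open(spill.path())?.read_to_end(&mut buf)?;
        Ok(buf)
        // `spill` is dropped here, when the task finishes, deleting the file.
    });
    let bytes = handle.join().expect("reader thread panicked")?;
    Ok((bytes, path))
}

fn main() -> io::Result<()> {
    let spill = SpillFile::create("sort-spill.tmp", b"sorted batch")?;
    let (bytes, path) = read_spill_in_task(spill)?;
    println!("read {} bytes; file still exists: {}", bytes.len(), path.exists());
    Ok(())
}
```

Because the guard is moved into the closure, no other code has to remember to delete the file; the cleanup is tied to the lifetime of the reader itself.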








[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1680: Use NamedTempFile rather than `String` in DiskManager

2022-01-25 Thread GitBox


alamb commented on a change in pull request #1680:
URL: https://github.com/apache/arrow-datafusion/pull/1680#discussion_r792144273



##
File path: datafusion/src/execution/disk_manager.rs
##
@@ -120,34 +116,15 @@ fn create_local_dirs(local_dirs: Vec) -> 
Result> {
 .collect()
 }
 
-fn get_file(file_name: &str, local_dirs: &[TempDir]) -> String {
-let mut hasher = DefaultHasher::new();
-file_name.hash(&mut hasher);
-let hash = hasher.finish();
-let dir = &local_dirs[hash.rem_euclid(local_dirs.len() as u64) as usize];
-let mut path = PathBuf::new();
-path.push(dir);
-path.push(file_name);
-path.to_str().unwrap().to_string()
-}
+fn create_tmp_file(local_dirs: &[TempDir]) -> Result<NamedTempFile> {

Review comment:
   tempfiles are now generated using `tempfile` rather than string 
manipulation








[GitHub] [arrow] github-actions[bot] commented on pull request #12261: [Release] Verify 7.0.0 RC7 [WIP]

2022-01-25 Thread GitBox


github-actions[bot] commented on pull request #12261:
URL: https://github.com/apache/arrow/pull/12261#issuecomment-1021636953


   Revision: cb0820c60a63fcb2150ece8acbd47b9ccc2c0979
   
   Submitted crossbow builds: [ursacomputing/crossbow @ 
actions-1497](https://github.com/ursacomputing/crossbow/branches/all?query=actions-1497)
   
   |Task|Status|
   |----|------|
   |verify-rc-source-cpp-linux-almalinux-8-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1497-github-verify-rc-source-cpp-linux-almalinux-8-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1497-github-verify-rc-source-cpp-linux-almalinux-8-amd64)|
   |verify-rc-source-cpp-linux-ubuntu-18.04-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1497-github-verify-rc-source-cpp-linux-ubuntu-18.04-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1497-github-verify-rc-source-cpp-linux-ubuntu-18.04-amd64)|
   |verify-rc-source-cpp-linux-ubuntu-20.04-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1497-github-verify-rc-source-cpp-linux-ubuntu-20.04-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1497-github-verify-rc-source-cpp-linux-ubuntu-20.04-amd64)|
   |verify-rc-source-cpp-macos-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1497-github-verify-rc-source-cpp-macos-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1497-github-verify-rc-source-cpp-macos-amd64)|
   |verify-rc-source-cpp-macos-arm64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1497-github-verify-rc-source-cpp-macos-arm64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1497-github-verify-rc-source-cpp-macos-arm64)|
   |verify-rc-source-csharp-linux-almalinux-8-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1497-github-verify-rc-source-csharp-linux-almalinux-8-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1497-github-verify-rc-source-csharp-linux-almalinux-8-amd64)|
   |verify-rc-source-csharp-linux-ubuntu-18.04-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1497-github-verify-rc-source-csharp-linux-ubuntu-18.04-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1497-github-verify-rc-source-csharp-linux-ubuntu-18.04-amd64)|
   |verify-rc-source-csharp-linux-ubuntu-20.04-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1497-github-verify-rc-source-csharp-linux-ubuntu-20.04-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1497-github-verify-rc-source-csharp-linux-ubuntu-20.04-amd64)|
   |verify-rc-source-csharp-macos-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1497-github-verify-rc-source-csharp-macos-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1497-github-verify-rc-source-csharp-macos-amd64)|
   |verify-rc-source-csharp-macos-arm64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1497-github-verify-rc-source-csharp-macos-arm64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1497-github-verify-rc-source-csharp-macos-arm64)|
   |verify-rc-source-go-linux-almalinux-8-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1497-github-verify-rc-source-go-linux-almalinux-8-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1497-github-verify-rc-source-go-linux-almalinux-8-amd64)|
   |verify-rc-source-go-linux-ubuntu-18.04-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1497-github-verify-rc-source-go-linux-ubuntu-18.04-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1497-github-verify-rc-source-go-linux-ubuntu-18.04-amd64)|
   |verify-rc-source-go-linux-ubuntu-20.04-amd64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1497-github-verify-rc-source-go-linux-ubuntu-20.04-amd64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1497-github-verify-rc-source-go-linux-ubuntu-20.04-amd64)|
   |verify-rc-source-go-macos-amd64|[![Github 

[GitHub] [arrow-datafusion] alamb commented on pull request #1680: Use NamedTempFile rather than `String` in DiskManager

2022-01-25 Thread GitBox


alamb commented on pull request #1680:
URL: 
https://github.com/apache/arrow-datafusion/pull/1680#issuecomment-1021636850


   cc @yjshen 






[GitHub] [arrow-datafusion] alamb opened a new pull request #1680: Use NamedTempFile rather than `String` in DiskManager

2022-01-25 Thread GitBox


alamb opened a new pull request #1680:
URL: https://github.com/apache/arrow-datafusion/pull/1680


   # Which issue does this PR close?
   
   Closes https://github.com/apache/arrow-datafusion/issues/1679
   
   # Rationale for this change
   1. Using `String` for temporary files leaves the files around longer than 
necessary and would cause trouble with a long-lived `DiskManager` across plans.
   2. Using the existing `tempfile` crate in Rust is likely to work better 
across operating systems than DataFusion-specific tempfile creation logic.
   
   # What changes are included in this PR?
   1. `DiskManager` passes out `NamedTempFile`s rather than `String`s (when 
these are dropped they also clean up the temp file immediately)
   2. Update callers to use `NamedTempFile`s
   
   # Are there any user-facing changes?
   No
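
The change described above can be illustrated with a small, std-only sketch. Everything here is an assumption made for illustration: `TmpFile` is a hypothetical guard playing the role of `NamedTempFile`, and this `DiskManager` is a heavily simplified stand-in, not DataFusion's type. The point is only that each file disappears when its guard is dropped, while the manager stays alive.

```rust
use std::fs::{self, File};
use std::io;
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical guard playing the role of `NamedTempFile`: removes its file on drop.
struct TmpFile {
    path: PathBuf,
}

impl TmpFile {
    fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for TmpFile {
    fn drop(&mut self) {
        // Best-effort cleanup; errors are ignored in this sketch.
        let _ = fs::remove_file(&self.path);
    }
}

/// Hypothetical, much-simplified manager that hands out guards instead of path strings.
struct DiskManager {
    dir: PathBuf,
    next_id: AtomicUsize,
}

impl DiskManager {
    fn new(dir: PathBuf) -> io::Result<Self> {
        fs::create_dir_all(&dir)?;
        Ok(DiskManager { dir, next_id: AtomicUsize::new(0) })
    }

    /// Creates an empty file with a unique name and returns the guard that owns it.
    fn create_tmp_file(&self) -> io::Result<TmpFile> {
        let id = self.next_id.fetch_add(1, Ordering::Relaxed);
        let path = self.dir.join(format!("spill-{}.tmp", id));
        File::create(&path)?;
        Ok(TmpFile { path })
    }
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join(format!("dm-sketch-{}", std::process::id()));
    let manager = DiskManager::new(dir.clone())?;
    let file = manager.create_tmp_file()?;
    let path = file.path().to_path_buf();
    println!("exists while held: {}", path.exists());
    drop(file);
    println!("exists after drop: {}", path.exists());
    fs::remove_dir_all(&dir)?;
    Ok(())
}
```

Returning a guard rather than a `String` means the type system carries the cleanup obligation, so a long-lived manager shared across plans does not accumulate spill files.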
   






[GitHub] [arrow] kszucs closed pull request #12235: [Release] Verify 7.0.0 RC6 [WIP]

2022-01-25 Thread GitBox


kszucs closed pull request #12235:
URL: https://github.com/apache/arrow/pull/12235


   






[GitHub] [arrow] kszucs commented on pull request #12235: [Release] Verify 7.0.0 RC6 [WIP]

2022-01-25 Thread GitBox


kszucs commented on pull request #12235:
URL: https://github.com/apache/arrow/pull/12235#issuecomment-1021636139


   Closing in favor of https://github.com/apache/arrow/pull/12261






[GitHub] [arrow] github-actions[bot] commented on pull request #12261: [Release] Verify 7.0.0 RC7 [WIP]

2022-01-25 Thread GitBox


github-actions[bot] commented on pull request #12261:
URL: https://github.com/apache/arrow/pull/12261#issuecomment-1021635843


   
   
   Thanks for opening a pull request!
   
   If this is not a [minor 
PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes), 
could you open an issue for this pull request on JIRA? 
https://issues.apache.org/jira/browse/ARROW
   
   Opening JIRAs ahead of time contributes to the 
[Openness](http://theapacheway.com/open/#:~:text=Openness%20allows%20new%20users%20the,must%20happen%20in%20the%20open.)
 of the Apache Arrow project.
   
   Then could you also rename the pull request title in the following format?
   
   ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}
   
   or
   
   MINOR: [${COMPONENT}] ${SUMMARY}
   
   See also:
   
 * [Other pull requests](https://github.com/apache/arrow/pulls/)
 * [Contribution Guidelines - How to contribute 
patches](https://arrow.apache.org/docs/developers/contributing.html#how-to-contribute-patches)
   






[GitHub] [arrow] kszucs commented on pull request #12261: [Release] Verify 7.0.0 RC7 [WIP]

2022-01-25 Thread GitBox


kszucs commented on pull request #12261:
URL: https://github.com/apache/arrow/pull/12261#issuecomment-1021636007


   @github-actions crossbow submit --group verify-rc-source --param 
release=7.0.0 --param rc=7






[GitHub] [arrow-datafusion] alamb merged pull request #1668: Improve configuration and resource use of `MemoryManager` and `DiskManager`

2022-01-25 Thread GitBox


alamb merged pull request #1668:
URL: https://github.com/apache/arrow-datafusion/pull/1668


   






[GitHub] [arrow-datafusion] alamb closed issue #1636: Provide RuntimeEnv to ExecutionContext

2022-01-25 Thread GitBox


alamb closed issue #1636:
URL: https://github.com/apache/arrow-datafusion/issues/1636


   






[GitHub] [arrow-datafusion] alamb opened a new issue #1679: DiskManager keeps temporary files around until the manager itself is dropped

2022-01-25 Thread GitBox


alamb opened a new issue #1679:
URL: https://github.com/apache/arrow-datafusion/issues/1679


   **Describe the bug**
   
   The `DiskManager` passes out `String`s rather than `TempFile`s. The 
tempfiles are eventually cleaned up, but only when the `DiskManager` is 
dropped, rather than when use of the actual temp file is complete.
   
   This both leaves temporary files around longer than necessary and would 
cause trouble with a long-lived `DiskManager` across plans.
   
   **To Reproduce**
   Make a shared `DiskManager` and run queries that spill to disk. The files 
will not be cleaned up until the `DiskManager` is dropped 
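
The reproduction steps can be mimicked with a std-only sketch. The names here (`StringDiskManager`, `run_query`) are hypothetical and the manager is a simplified imitation of the reported behavior, not DataFusion's implementation: it owns a directory, hands out plain path strings, and removes everything only in its own `Drop`.

```rust
use std::fs::{self, File};
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical, simplified manager that mimics the reported behavior:
/// it owns a directory and hands out plain path strings.
struct StringDiskManager {
    dir: PathBuf,
    created: usize,
}

impl StringDiskManager {
    fn new(dir: PathBuf) -> io::Result<Self> {
        fs::create_dir_all(&dir)?;
        Ok(StringDiskManager { dir, created: 0 })
    }

    /// Creates a spill file and returns only its path; nothing ties the
    /// file's lifetime to the caller.
    fn create_tmp_file(&mut self) -> io::Result<String> {
        let path = self.dir.join(format!("spill-{}.tmp", self.created));
        self.created += 1;
        File::create(&path)?;
        Ok(path.to_string_lossy().into_owned())
    }
}

impl Drop for StringDiskManager {
    fn drop(&mut self) {
        // Cleanup happens only here, when the whole manager goes away.
        let _ = fs::remove_dir_all(&self.dir);
    }
}

/// Simulates one query that spills to disk and then finishes with the file.
fn run_query(manager: &mut StringDiskManager) -> io::Result<String> {
    let path = manager.create_tmp_file()?;
    // ... the query would write and read spill data here ...
    Ok(path) // dropping this String later deletes nothing on disk
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join(format!("dm-leak-{}", std::process::id()));
    let mut manager = StringDiskManager::new(dir)?;
    let first = run_query(&mut manager)?;
    let second = run_query(&mut manager)?;
    println!(
        "after queries: {} {}",
        Path::new(&first).exists(),
        Path::new(&second).exists()
    );
    drop(manager);
    println!(
        "after manager drop: {} {}",
        Path::new(&first).exists(),
        Path::new(&second).exists()
    );
    Ok(())
}
```

With a shared manager, every finished query leaves its spill file behind until the manager itself is dropped, which is the accumulation the report describes.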
   
   **Expected behavior**
   When the temp file is no longer in use, it should be dropped 
   
   **Additional context**
   Noticed while working on 
https://github.com/apache/arrow-datafusion/pull/1668 I noticed that the 






[GitHub] [arrow] djnavarro commented on pull request #12244: ARROW-14807: [R] Implement bindings for lubridate am and pm

2022-01-25 Thread GitBox


djnavarro commented on pull request #12244:
URL: https://github.com/apache/arrow/pull/12244#issuecomment-1021601402


   Yep, no worries! I'll look into this tomorrow (it's a public holiday here 
today)






[GitHub] [arrow] xhochy closed pull request #12259: WIP: Test patch for ARROW-15444

2022-01-25 Thread GitBox


xhochy closed pull request #12259:
URL: https://github.com/apache/arrow/pull/12259


   






[GitHub] [arrow] github-actions[bot] commented on pull request #12260: ARROW-15454: [Python] Try to make CSV cancellation test more robust

2022-01-25 Thread GitBox


github-actions[bot] commented on pull request #12260:
URL: https://github.com/apache/arrow/pull/12260#issuecomment-1021598926


   Revision: 9525671caa3d0bb5b2913733253783c1fe1525d2
   
   Submitted crossbow builds: [ursacomputing/crossbow @ 
actions-1496](https://github.com/ursacomputing/crossbow/branches/all?query=actions-1496)
   
   |Task|Status|
   |----|------|
   |wheel-macos-big-sur-cp38-arm64|[![Github 
Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-1496-github-wheel-macos-big-sur-cp38-arm64)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-1496-github-wheel-macos-big-sur-cp38-arm64)|





