[jira] [Assigned] (ARROW-17850) [Java] Upgrade netty-codec-http dependencies

2022-09-27 Thread David Dali Susanibar Arce (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Dali Susanibar Arce reassigned ARROW-17850:
-

Assignee: David Dali Susanibar Arce

> [Java] Upgrade netty-codec-http dependencies
> 
>
> Key: ARROW-17850
> URL: https://issues.apache.org/jira/browse/ARROW-17850
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Java
>Affects Versions: 9.0.0
>Reporter: Hui Yu
>Assignee: David Dali Susanibar Arce
>Priority: Major
>
> [CVE-2022-24823](https://github.com/advisories/GHSA-269q-hmxg-m83q) reports 
> a security vulnerability in *netty-codec-http*.
> The version of *netty-codec-http* on the master branch is currently 
> *4.1.72.Final*, which is unsafe.
> The ticket https://issues.apache.org/jira/browse/ARROW-16996 bumped 
> *netty-codec* to *4.1.78.Final*, but it did not bump *netty-codec-http*.
> Could you upgrade *netty-codec-http* as well?
>  
> Here is my current output of mvn dependency:tree:
> ```bash
> [INFO] +- org.apache.arrow:flight-core:jar:9.0.0:compile
> [INFO] |  +- io.grpc:grpc-netty:jar:1.47.0:compile
> [INFO] |  |  +- io.netty:netty-codec-http2:jar:4.1.72.Final:compile
> [INFO] |  |  |  \- io.netty:netty-codec-http:jar:4.1.72.Final:compile
> [INFO] |  |  +- io.netty:netty-handler-proxy:jar:4.1.72.Final:runtime
> [INFO] |  |  |  \- io.netty:netty-codec-socks:jar:4.1.72.Final:runtime
> [INFO] |  |  +- com.google.errorprone:error_prone_annotations:jar:2.10.0:compile
> [INFO] |  |  +- io.perfmark:perfmark-api:jar:0.25.0:runtime
> [INFO] |  |  \- io.netty:netty-transport-native-unix-common:jar:4.1.72.Final:compile
> [INFO] |  +- io.grpc:grpc-core:jar:1.47.0:compile
> [INFO] |  |  +- com.google.android:annotations:jar:4.1.1.4:runtime
> [INFO] |  |  \- org.codehaus.mojo:animal-sniffer-annotations:jar:1.19:runtime
> [INFO] |  +- io.grpc:grpc-context:jar:1.47.0:compile
> [INFO] |  +- io.grpc:grpc-protobuf:jar:1.47.0:compile
> [INFO] |  |  +- com.google.api.grpc:proto-google-common-protos:jar:2.0.1:compile
> [INFO] |  |  \- io.grpc:grpc-protobuf-lite:jar:1.47.0:compile
> [INFO] |  +- io.netty:netty-tcnative-boringssl-static:jar:2.0.53.Final:compile
> [INFO] |  |  +- io.netty:netty-tcnative-classes:jar:2.0.53.Final:compile
> [INFO] |  |  +- io.netty:netty-tcnative-boringssl-static:jar:linux-x86_64:2.0.53.Final:compile
> [INFO] |  |  +- io.netty:netty-tcnative-boringssl-static:jar:linux-aarch_64:2.0.53.Final:compile
> [INFO] |  |  +- io.netty:netty-tcnative-boringssl-static:jar:osx-x86_64:2.0.53.Final:compile
> [INFO] |  |  +- io.netty:netty-tcnative-boringssl-static:jar:osx-aarch_64:2.0.53.Final:compile
> [INFO] |  |  \- io.netty:netty-tcnative-boringssl-static:jar:windows-x86_64:2.0.53.Final:compile
> [INFO] |  +- io.netty:netty-handler:jar:4.1.78.Final:compile
> [INFO] |  |  +- io.netty:netty-resolver:jar:4.1.78.Final:compile
> [INFO] |  |  \- io.netty:netty-codec:jar:4.1.78.Final:compile
> [INFO] |  +- io.netty:netty-transport:jar:4.1.78.Final:compile
> [INFO] |  +- com.google.guava:guava:jar:30.1.1-jre:compile
> [INFO] |  |  +- com.google.guava:failureaccess:jar:1.0.1:compile
> [INFO] |  |  +- com.google.guava:listenablefuture:jar:9999.0-empty-to-avoid-conflict-with-guava:compile
> [INFO] |  |  +- org.checkerframework:checker-qual:jar:3.8.0:compile
> [INFO] |  |  \- com.google.j2objc:j2objc-annotations:jar:1.3:compile
> [INFO] |  +- io.grpc:grpc-stub:jar:1.47.0:compile
> [INFO] |  +- com.google.protobuf:protobuf-java:jar:3.21.2:compile
> [INFO] |  +- io.grpc:grpc-api:jar:1.47.0:compile
> [INFO] |  \- javax.annotation:javax.annotation-api:jar:1.3.2:compile
> ```



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ARROW-17865) [Java] Deprecate Plasma JNI bindings

2022-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-17865:
---
Labels: pull-request-available  (was: )

> [Java] Deprecate Plasma JNI bindings
> 
>
> Key: ARROW-17865
> URL: https://issues.apache.org/jira/browse/ARROW-17865
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java
>Reporter: Antoine Pitrou
>Assignee: David Dali Susanibar Arce
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 10.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Assigned] (ARROW-17865) [Java] Deprecate Plasma JNI bindings

2022-09-27 Thread David Dali Susanibar Arce (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Dali Susanibar Arce reassigned ARROW-17865:
-

Assignee: David Dali Susanibar Arce

> [Java] Deprecate Plasma JNI bindings
> 
>
> Key: ARROW-17865
> URL: https://issues.apache.org/jira/browse/ARROW-17865
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java
>Reporter: Antoine Pitrou
>Assignee: David Dali Susanibar Arce
>Priority: Blocker
> Fix For: 10.0.0
>
>






[jira] [Commented] (ARROW-17687) [C++] ScanningStress test is flaky in CI

2022-09-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/ARROW-17687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610349#comment-17610349
 ] 

Percy Camilo Triveño Aucahuasi commented on ARROW-17687:


I got this [^backtrace.log.cpp].

It seems we are moving the unique_locker and then trying to lock an invalid mutex.

Also, I was able to hit another issue, this time a deadlock, using these values:
  constexpr int kNumIters = 1;
  constexpr int kNumFragments = 10;
  constexpr int kBatchesPerFragment = 10;
  constexpr int kNumConcurrentTasks = 2;
I'll keep investigating where these errors come from. So far I have been able to 
reduce the test and still reproduce the issue using these values:
  constexpr int kNumIters = 1;
  constexpr int kNumFragments = 2;
  constexpr int kBatchesPerFragment = 1;
  constexpr int kNumConcurrentTasks = 1;
Given that we can use C++17 now, I'll try to use the new std::scoped_lock 
instead of the other lockers (in the places where it makes sense to do so).

> [C++] ScanningStress test is flaky in CI
> 
>
> Key: ARROW-17687
> URL: https://issues.apache.org/jira/browse/ARROW-17687
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Weston Pace
>Assignee: Percy Camilo Triveño Aucahuasi
>Priority: Major
> Attachments: backtrace.log.cpp
>
>
> There is at least one nightly failure: 
> https://github.com/ursacomputing/crossbow/actions/runs/3033965241/jobs/4882574634





[jira] [Comment Edited] (ARROW-17687) [C++] ScanningStress test is flaky in CI

2022-09-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/ARROW-17687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610349#comment-17610349
 ] 

Percy Camilo Triveño Aucahuasi edited comment on ARROW-17687 at 9/28/22 4:56 AM:
-

I got this [^backtrace.log.cpp].

It seems we are moving the unique_locker and then trying to lock an invalid mutex.

Also, I was able to hit another issue, this time a deadlock, using these values:
{code:cpp}
constexpr int kNumIters = 1;
constexpr int kNumFragments = 10;
constexpr int kBatchesPerFragment = 10;
constexpr int kNumConcurrentTasks = 2;{code}
I'll keep investigating where these errors come from. So far I have been able to 
reduce the test and still reproduce the issue using these values:
{code:cpp}
constexpr int kNumIters = 1;
constexpr int kNumFragments = 2;
constexpr int kBatchesPerFragment = 1;
constexpr int kNumConcurrentTasks = 1;{code}
Given that we can use C++17 now, I'll try to use the new std::scoped_lock 
instead of the other lockers (in the places where it makes sense to do so).
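
To illustrate the two points in this comment, here is a minimal, self-contained sketch. It is not code from the Arrow scanner. It assumes that "unique_locker" refers to std::unique_lock, and the names MovedFromUniqueLockThrows, Counter, and Transfer are hypothetical. The first function shows that a moved-from std::unique_lock no longer has an associated mutex, so calling lock() on it throws std::system_error. The second shows std::scoped_lock, which cannot be moved and which acquires several mutexes with a deadlock-avoidance algorithm.

```cpp
#include <mutex>
#include <system_error>
#include <utility>

// Returns true if locking a moved-from std::unique_lock throws.
// After the move, `first` has no associated mutex, so lock() fails
// with std::system_error instead of locking anything.
bool MovedFromUniqueLockThrows() {
  std::mutex m;
  std::unique_lock<std::mutex> first(m);
  std::unique_lock<std::mutex> second(std::move(first));  // `second` now owns the lock
  try {
    first.lock();  // invalid: `first` refers to no mutex
  } catch (const std::system_error&) {
    return true;
  }
  return false;
}

// Hypothetical shared state guarded by its own mutex.
struct Counter {
  std::mutex mutex;
  int value = 0;
};

// Moves `amount` between two distinct counters. std::scoped_lock locks
// both mutexes without risking a lock-order deadlock, and it is neither
// copyable nor movable, so it cannot be left in a moved-from state.
void Transfer(Counter& from, Counter& to, int amount) {
  std::scoped_lock lock(from.mutex, to.mutex);
  from.value -= amount;
  to.value += amount;
}
```

Note that Transfer must be called with two different Counter objects; passing the same object twice would try to lock one std::mutex twice, which is undefined behavior.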


was (Author: aucahuasi):
I got this [^backtrace.log.cpp].

It seems we are moving the unique_locker and then trying to lock an invalid mutex.

Also, I was able to hit another issue, this time a deadlock, using these values:
  constexpr int kNumIters = 1;
  constexpr int kNumFragments = 10;
  constexpr int kBatchesPerFragment = 10;
  constexpr int kNumConcurrentTasks = 2;
I'll keep investigating where these errors come from. So far I have been able to 
reduce the test and still reproduce the issue using these values:
  constexpr int kNumIters = 1;
  constexpr int kNumFragments = 2;
  constexpr int kBatchesPerFragment = 1;
  constexpr int kNumConcurrentTasks = 1;
Given that we can use C++17 now, I'll try to use the new std::scoped_lock 
instead of the other lockers (in the places where it makes sense to do so).

> [C++] ScanningStress test is flaky in CI
> 
>
> Key: ARROW-17687
> URL: https://issues.apache.org/jira/browse/ARROW-17687
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Weston Pace
>Assignee: Percy Camilo Triveño Aucahuasi
>Priority: Major
> Attachments: backtrace.log.cpp
>
>
> There is at least one nightly failure: 
> https://github.com/ursacomputing/crossbow/actions/runs/3033965241/jobs/4882574634





[jira] [Updated] (ARROW-17687) [C++] ScanningStress test is flaky in CI

2022-09-27 Thread Jira


 [ 
https://issues.apache.org/jira/browse/ARROW-17687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Percy Camilo Triveño Aucahuasi updated ARROW-17687:
---
Attachment: backtrace.log.cpp

> [C++] ScanningStress test is flaky in CI
> 
>
> Key: ARROW-17687
> URL: https://issues.apache.org/jira/browse/ARROW-17687
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Weston Pace
>Assignee: Percy Camilo Triveño Aucahuasi
>Priority: Major
> Attachments: backtrace.log.cpp
>
>
> There is at least one nightly failure: 
> https://github.com/ursacomputing/crossbow/actions/runs/3033965241/jobs/4882574634





[jira] [Closed] (ARROW-17154) [C++] Change cmake project name from arrow_python to pyarrow_cpp

2022-09-27 Thread Alenka Frim (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alenka Frim closed ARROW-17154.
---
Resolution: Not A Problem

> [C++] Change cmake project name from arrow_python to pyarrow_cpp
> 
>
> Key: ARROW-17154
> URL: https://issues.apache.org/jira/browse/ARROW-17154
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: C++
>Reporter: Alenka Frim
>Assignee: Alenka Frim
>Priority: Major
> Fix For: 10.0.0
>
>
> See discussion 
> https://github.com/apache/arrow/pull/13311#discussion_r926198302





[jira] [Resolved] (ARROW-17427) [Java] Add windows build script that produces DLLs

2022-09-27 Thread Kouhei Sutou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou resolved ARROW-17427.
--
Fix Version/s: 10.0.0
   Resolution: Fixed

Issue resolved by pull request 14203
[https://github.com/apache/arrow/pull/14203]

> [Java] Add windows build script that produces DLLs
> --
>
> Key: ARROW-17427
> URL: https://issues.apache.org/jira/browse/ARROW-17427
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Documentation, Java
>Reporter: David Dali Susanibar Arce
>Assignee: Kouhei Sutou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 10.0.0
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> Improve support for Java developers who wish to use a release version of Arrow 
> on Microsoft Windows





[jira] [Assigned] (ARROW-17869) [Java][Gandiva] ProjectorTest.testStringOutput is failed

2022-09-27 Thread Jin Shang (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jin Shang reassigned ARROW-17869:
-

Assignee: Jin Shang

> [Java][Gandiva] ProjectorTest.testStringOutput is failed
> 
>
> Key: ARROW-17869
> URL: https://issues.apache.org/jira/browse/ARROW-17869
> Project: Apache Arrow
>  Issue Type: Bug
>Reporter: Kouhei Sutou
>Assignee: Jin Shang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> https://github.com/ursacomputing/crossbow/actions/runs/3134225521/jobs/5089370615#step:7:69073
> {noformat}
> Error:  Tests run: 43, Failures: 0, Errors: 1, Skipped: 1, Time elapsed: 
> 30.152 s <<< FAILURE! - in org.apache.arrow.gandiva.evaluator.ProjectorTest
> Error:  org.apache.arrow.gandiva.evaluator.ProjectorTest.testStringOutput  
> Time elapsed: 0.124 s  <<< ERROR!
> reserve not implemented
>   at 
> org.apache.arrow.gandiva.evaluator.JniWrapper.evaluateProjector(Native Method)
>   at 
> org.apache.arrow.gandiva.evaluator.Projector.evaluate(Projector.java:345)
>   at 
> org.apache.arrow.gandiva.evaluator.Projector.evaluate(Projector.java:213)
>   at 
> org.apache.arrow.gandiva.evaluator.ProjectorTest.testStringOutput(ProjectorTest.java:605)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
>   at 
> org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42)
>   at 
> org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80)
>   at 
> org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:147)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:127)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:90)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:55)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:102)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:54)
>   at 
> org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:114)
>   at 
> org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:86)
>   at 
> org.junit.platform.launcher.core.DefaultLauncherSession$DelegatingLauncher.execute(DefaultLauncherSession.java:86)
>   at 
> org.apache.maven.surefire.junitplatform.LazyLauncher.execute(LazyLauncher.java:55)
>   at 
> org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.lambda$execute$1(JUnitPlatformProvider.java:234)
>   at 

[jira] [Updated] (ARROW-17847) [C++] Support unquoted decimal in JSON parser

2022-09-27 Thread Jin Shang (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jin Shang updated ARROW-17847:
--
Description: 
-Add an option to parse decimal as unquoted numbers in JSON-

Support both quoted and unquoted decimal in JSON parser automatically.
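
As a rough illustration of what accepting both forms means, here is a minimal sketch. It is not Arrow's JSON parser or its API: DecimalTextFromJsonToken is a hypothetical helper, and it assumes the raw token text is already available. It accepts either a quoted token such as "12.34" or a bare number such as 12.34 and returns the decimal text in both cases. Exponents and precision/scale checks are deliberately left out.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: returns the decimal text of a JSON token that is
// either a quoted string ("12.34") or an unquoted number (12.34).
// Returns std::nullopt when the token is not a plain decimal.
std::optional<std::string> DecimalTextFromJsonToken(std::string_view token) {
  // Strip one pair of surrounding double quotes, if present.
  if (token.size() >= 2 && token.front() == '"' && token.back() == '"') {
    token = token.substr(1, token.size() - 2);
  }
  if (token.empty()) return std::nullopt;

  std::size_t i = 0;
  if (token[i] == '-' || token[i] == '+') ++i;  // optional sign

  std::size_t digits = 0;
  bool seen_point = false;
  for (; i < token.size(); ++i) {
    const char c = token[i];
    if (c >= '0' && c <= '9') {
      ++digits;
    } else if (c == '.' && !seen_point) {
      seen_point = true;  // at most one decimal point
    } else {
      return std::nullopt;  // anything else is not a plain decimal
    }
  }
  if (digits == 0) return std::nullopt;  // e.g. "-" or "."
  return std::string(token);
}
```

With this helper both spellings produce the same text, so the conversion to a decimal value can be shared between them.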

  was:Add an option to parse decimal as unquoted numbers in JSON


> [C++] Support unquoted decimal in JSON parser
> -
>
> Key: ARROW-17847
> URL: https://issues.apache.org/jira/browse/ARROW-17847
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Affects Versions: 9.0.0
>Reporter: Jin Shang
>Assignee: Jin Shang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> -Add an option to parse decimal as unquoted numbers in JSON-
> Support both quoted and unquoted decimal in JSON parser automatically.





[jira] [Updated] (ARROW-17869) [Java][Gandiva] ProjectorTest.testStringOutput is failed

2022-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-17869:
---
Labels: pull-request-available  (was: )

> [Java][Gandiva] ProjectorTest.testStringOutput is failed
> 
>
> Key: ARROW-17869
> URL: https://issues.apache.org/jira/browse/ARROW-17869
> Project: Apache Arrow
>  Issue Type: Bug
>Reporter: Kouhei Sutou
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://github.com/ursacomputing/crossbow/actions/runs/3134225521/jobs/5089370615#step:7:69073
> {noformat}
> Error:  Tests run: 43, Failures: 0, Errors: 1, Skipped: 1, Time elapsed: 
> 30.152 s <<< FAILURE! - in org.apache.arrow.gandiva.evaluator.ProjectorTest
> Error:  org.apache.arrow.gandiva.evaluator.ProjectorTest.testStringOutput  
> Time elapsed: 0.124 s  <<< ERROR!
> reserve not implemented
>   at 
> org.apache.arrow.gandiva.evaluator.JniWrapper.evaluateProjector(Native Method)
>   at 
> org.apache.arrow.gandiva.evaluator.Projector.evaluate(Projector.java:345)
>   at 
> org.apache.arrow.gandiva.evaluator.Projector.evaluate(Projector.java:213)
>   at 
> org.apache.arrow.gandiva.evaluator.ProjectorTest.testStringOutput(ProjectorTest.java:605)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
>   at 
> org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42)
>   at 
> org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80)
>   at 
> org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:147)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:127)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:90)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:55)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:102)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:54)
>   at 
> org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:114)
>   at 
> org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:86)
>   at 
> org.junit.platform.launcher.core.DefaultLauncherSession$DelegatingLauncher.execute(DefaultLauncherSession.java:86)
>   at 
> org.apache.maven.surefire.junitplatform.LazyLauncher.execute(LazyLauncher.java:55)
>   at 
> org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.lambda$execute$1(JUnitPlatformProvider.java:234)
>   at 

[jira] [Commented] (ARROW-17869) [Java][Gandiva] ProjectorTest.testStringOutput is failed

2022-09-27 Thread Jin Shang (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610327#comment-17610327
 ] 

Jin Shang commented on ARROW-17869:
---

Yeah, it's related to ARROW-17824. I'll see what I can do.

> [Java][Gandiva] ProjectorTest.testStringOutput is failed
> 
>
> Key: ARROW-17869
> URL: https://issues.apache.org/jira/browse/ARROW-17869
> Project: Apache Arrow
>  Issue Type: Bug
>Reporter: Kouhei Sutou
>Priority: Major
>
> https://github.com/ursacomputing/crossbow/actions/runs/3134225521/jobs/5089370615#step:7:69073
> {noformat}
> Error:  Tests run: 43, Failures: 0, Errors: 1, Skipped: 1, Time elapsed: 
> 30.152 s <<< FAILURE! - in org.apache.arrow.gandiva.evaluator.ProjectorTest
> Error:  org.apache.arrow.gandiva.evaluator.ProjectorTest.testStringOutput  
> Time elapsed: 0.124 s  <<< ERROR!
> reserve not implemented
>   at 
> org.apache.arrow.gandiva.evaluator.JniWrapper.evaluateProjector(Native Method)
>   at 
> org.apache.arrow.gandiva.evaluator.Projector.evaluate(Projector.java:345)
>   at 
> org.apache.arrow.gandiva.evaluator.Projector.evaluate(Projector.java:213)
>   at 
> org.apache.arrow.gandiva.evaluator.ProjectorTest.testStringOutput(ProjectorTest.java:605)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
>   at 
> org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42)
>   at 
> org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80)
>   at 
> org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:147)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:127)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:90)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:55)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:102)
>   at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:54)
>   at 
> org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:114)
>   at 
> org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:86)
>   at 
> org.junit.platform.launcher.core.DefaultLauncherSession$DelegatingLauncher.execute(DefaultLauncherSession.java:86)
>   at 
> org.apache.maven.surefire.junitplatform.LazyLauncher.execute(LazyLauncher.java:55)
>   at 
> org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.lambda$execute$1(JUnitPlatformProvider.java:234)
>   at java.util.Iterator.forEachRemaining(Iterator.java:116)
>   at 
> 

[jira] [Resolved] (ARROW-17773) [CI][C++] sccache error on Travis-CI Arm64 build

2022-09-27 Thread Yibo Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yibo Cai resolved ARROW-17773.
--
Resolution: Fixed

Issue resolved by pull request 14201
[https://github.com/apache/arrow/pull/14201]

> [CI][C++] sccache error on Travis-CI Arm64 build
> 
>
> Key: ARROW-17773
> URL: https://issues.apache.org/jira/browse/ARROW-17773
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Continuous Integration
>Reporter: Antoine Pitrou
>Assignee: Yibo Cai
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 10.0.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> See https://app.travis-ci.com/github/apache/arrow/jobs/583166213#L3382
> {code}
> + command -v sccache
> + echo '=== sccache stats after the build ==='
> === sccache stats after the build ===
> + sccache --show-stats
> /arrow/ci/scripts/cpp_build.sh: line 183: /usr/local/bin/sccache: cannot execute binary file: Exec format error
> ERROR: 126
> {code}





[jira] [Updated] (ARROW-17753) [C++][Python][Docs] Provide instructions for "fixing" build environment issues

2022-09-27 Thread Kouhei Sutou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou updated ARROW-17753:
-
Summary: [C++][Python][Docs] Provide instructions for "fixing" build 
environment issues  (was: [C++][Python] Provide instructions for "fixing" build 
environment issues)

> [C++][Python][Docs] Provide instructions for "fixing" build environment issues
> --
>
> Key: ARROW-17753
> URL: https://issues.apache.org/jira/browse/ARROW-17753
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Python
>Reporter: Alenka Frim
>Assignee: Anja Boskovic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 10.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Due to bigger changes in the build workflow for Arrow C++ coming up in the 
> 10.0.0 release, failures when building the libraries are quite common. The 
> errors we bump into are similar to:
> {code:java}
> CMake Error at build/dist/lib/cmake/ArrowPython/ArrowPythonConfig.cmake:61 (arrow_keep_backward_compatibility):
>   Unknown CMake command "arrow_keep_backward_compatibility".
> Call Stack (most recent call first):
>   CMakeLists.txt:240 (find_package)
> {code}
> or
> {code:java}
> -- Found Python3Alt: /Users/alenkafrim/repos/pyarrow-dev-9/bin/python
> CMake Error at /opt/homebrew/Cellar/cmake/3.24.1/share/cmake/Modules/CMakeFindDependencyMacro.cmake:47 (find_package):
>   By not providing "FindArrow.cmake" in CMAKE_MODULE_PATH this project has
>   asked CMake to find a package configuration file provided by "Arrow", but
>   CMake did not find one.
>   Could not find a package configuration file provided by "Arrow" with any of
>   the following names:
>     ArrowConfig.cmake
>     arrow-config.cmake
>   Add the installation prefix of "Arrow" to CMAKE_PREFIX_PATH or set
>   "Arrow_DIR" to a directory containing one of the above files.  If "Arrow"
>   provides a separate development package or SDK, be sure it has been
>   installed.
> Call Stack (most recent call first):
>   build/dist/lib/cmake/ArrowPython/ArrowPythonConfig.cmake:54 (find_dependency)
>   CMakeLists.txt:240 (find_package)
> {code}
> Connected issues:
>  - https://issues.apache.org/jira/browse/ARROW-17577
>  - https://issues.apache.org/jira/browse/ARROW-17575





[jira] [Updated] (ARROW-17753) [C++][Python][Docs] Provide instructions for "fixing" build environment issues

2022-09-27 Thread Kouhei Sutou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou updated ARROW-17753:
-
Component/s: Documentation

> [C++][Python][Docs] Provide instructions for "fixing" build environment issues
> --
>
> Key: ARROW-17753
> URL: https://issues.apache.org/jira/browse/ARROW-17753
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Documentation, Python
>Reporter: Alenka Frim
>Assignee: Anja Boskovic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 10.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Due to bigger changes in the build workflow for Arrow C++ coming up in the 
> 10.0.0 release, failures when building the libraries are quite common. The 
> errors we bump into are similar to:
> {code:java}
> CMake Error at build/dist/lib/cmake/ArrowPython/ArrowPythonConfig.cmake:61 (arrow_keep_backward_compatibility):
>   Unknown CMake command "arrow_keep_backward_compatibility".
> Call Stack (most recent call first):
>   CMakeLists.txt:240 (find_package)
> {code}
> or
> {code:java}
> -- Found Python3Alt: /Users/alenkafrim/repos/pyarrow-dev-9/bin/python  
> CMake Error at 
> /opt/homebrew/Cellar/cmake/3.24.1/share/cmake/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package):
>   By not providing "FindArrow.cmake" in CMAKE_MODULE_PATH this project has
>   asked CMake to find a package configuration file provided by "Arrow", but
>   CMake did not find one.
>   Could not find a package configuration file provided by "Arrow" with any of
>   the following names:
> ArrowConfig.cmake
> arrow-config.cmake
>   Add the installation prefix of "Arrow" to CMAKE_PREFIX_PATH or set
>   "Arrow_DIR" to a directory containing one of the above files.  If "Arrow"
>   provides a separate development package or SDK, be sure it has been
>   installed.
> Call Stack (most recent call first):
>   build/dist/lib/cmake/ArrowPython/ArrowPythonConfig.cmake:54 
> (find_dependency)
>   CMakeLists.txt:240 (find_package)
> {code}
> Connected issues:
>  - https://issues.apache.org/jira/browse/ARROW-17577
>  - https://issues.apache.org/jira/browse/ARROW-17575





[jira] [Updated] (ARROW-17753) [C++][Python] Provide instructions for "fixing" build environment issues

2022-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-17753:
---
Labels: pull-request-available  (was: )

> [C++][Python] Provide instructions for "fixing" build environment issues
> 
>
> Key: ARROW-17753
> URL: https://issues.apache.org/jira/browse/ARROW-17753
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Python
>Reporter: Alenka Frim
>Assignee: Anja Boskovic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 10.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Due to bigger changes in the build workflow for Arrow C++ coming up in the 
> 10.0.0 release, failures when building the libraries are quite common. The 
> errors we bump into are similar to:
> {code:java}
> CMake Error at 
> build/dist/lib/cmake/ArrowPython/ArrowPythonConfig.cmake:61 
> (arrow_keep_backward_compatibility):
>   Unknown CMake command "arrow_keep_backward_compatibility".
> Call Stack (most recent call first):
>   CMakeLists.txt:240 (find_package)
> {code}
> or
> {code:java}
> -- Found Python3Alt: /Users/alenkafrim/repos/pyarrow-dev-9/bin/python  
> CMake Error at 
> /opt/homebrew/Cellar/cmake/3.24.1/share/cmake/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package):
>   By not providing "FindArrow.cmake" in CMAKE_MODULE_PATH this project has
>   asked CMake to find a package configuration file provided by "Arrow", but
>   CMake did not find one.
>   Could not find a package configuration file provided by "Arrow" with any of
>   the following names:
> ArrowConfig.cmake
> arrow-config.cmake
>   Add the installation prefix of "Arrow" to CMAKE_PREFIX_PATH or set
>   "Arrow_DIR" to a directory containing one of the above files.  If "Arrow"
>   provides a separate development package or SDK, be sure it has been
>   installed.
> Call Stack (most recent call first):
>   build/dist/lib/cmake/ArrowPython/ArrowPythonConfig.cmake:54 
> (find_dependency)
>   CMakeLists.txt:240 (find_package)
> {code}
> Connected issues:
>  - https://issues.apache.org/jira/browse/ARROW-17577
>  - https://issues.apache.org/jira/browse/ARROW-17575





[jira] [Commented] (ARROW-16947) [C++] Remove boost dependency with thrift

2022-09-27 Thread Kouhei Sutou (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610298#comment-17610298
 ] 

Kouhei Sutou commented on ARROW-16947:
--

FYI: https://github.com/apache/thrift/pull/2630 was merged, so we don't need 
Boost with system Thrift 0.17.0 or later.

Removing the Boost dependency for bundled Thrift needs more work, because we 
still need Boost to build Thrift itself. But the Thrift developers also want to 
remove the build-time Boost dependency: 
https://github.com/apache/thrift/pull/2630#issuecomment-1242712937

{quote}
Thanks to you! If you have more such changes, please keep them coming!
{quote}

So contributions from us that remove the build-time Boost dependency will be 
welcome.

> [C++] Remove boost dependency with thrift
> -
>
> Key: ARROW-16947
> URL: https://issues.apache.org/jira/browse/ARROW-16947
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Neal Richardson
>Assignee: Kouhei Sutou
>Priority: Major
>  Labels: good-second-issue
>
> [~kou] (re-)added this dependency in ARROW-16721: 
> https://github.com/apache/arrow/pull/13292/files#r890849903. But looking at 
> thrift/transport/TBufferTransports.h, the header we include that uses boost, 
> the class we use from it doesn't seem to require boost itself. So maybe we 
> can pull the class definition out that we need and inline/vendor it, so that 
> we can drop the need for that header, and thus drop the need for boost with 
> thrift.





[jira] [Updated] (ARROW-17862) [C/GLib] Deprecate Plasma C/GLib bindings

2022-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-17862:
---
Labels: pull-request-available  (was: )

> [C/GLib] Deprecate Plasma C/GLib bindings
> -
>
> Key: ARROW-17862
> URL: https://issues.apache.org/jira/browse/ARROW-17862
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: GLib
>Reporter: Antoine Pitrou
>Assignee: Kouhei Sutou
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 10.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (ARROW-17864) [Ruby] Deprecate Plasma Ruby bindings

2022-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-17864:
---
Labels: pull-request-available  (was: )

> [Ruby] Deprecate Plasma Ruby bindings
> -
>
> Key: ARROW-17864
> URL: https://issues.apache.org/jira/browse/ARROW-17864
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Ruby
>Reporter: Antoine Pitrou
>Assignee: Kouhei Sutou
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 10.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (ARROW-17871) [Go] Implement Initial Scalar Binary Arithmetic Infrastructure

2022-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-17871:
---
Labels: pull-request-available  (was: )

> [Go] Implement Initial Scalar Binary Arithmetic Infrastructure
> --
>
> Key: ARROW-17871
> URL: https://issues.apache.org/jira/browse/ARROW-17871
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Go
>Reporter: Matthew Topol
>Assignee: Matthew Topol
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Uses add, add_checked, sub, and sub_checked as the initial implementation, 
> only for integral types and float32/float64.





[jira] [Assigned] (ARROW-17871) [Go] Implement Initial Scalar Binary Arithmetic Infrastructure

2022-09-27 Thread Matthew Topol (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Topol reassigned ARROW-17871:
-

Assignee: Matthew Topol

> [Go] Implement Initial Scalar Binary Arithmetic Infrastructure
> --
>
> Key: ARROW-17871
> URL: https://issues.apache.org/jira/browse/ARROW-17871
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Go
>Reporter: Matthew Topol
>Assignee: Matthew Topol
>Priority: Major
>
> Uses add, add_checked, sub, and sub_checked as the initial implementation, 
> only for integral types and float32/float64.





[jira] [Created] (ARROW-17871) [Go] Implement Initial Scalar Binary Arithmetic Infrastructure

2022-09-27 Thread Matthew Topol (Jira)
Matthew Topol created ARROW-17871:
-

 Summary: [Go] Implement Initial Scalar Binary Arithmetic 
Infrastructure
 Key: ARROW-17871
 URL: https://issues.apache.org/jira/browse/ARROW-17871
 Project: Apache Arrow
  Issue Type: Sub-task
  Components: Go
Reporter: Matthew Topol


Uses add, add_checked, sub, and sub_checked as the initial implementation, only 
for integral types and float32/float64.
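
To illustrate the difference between the plain and the checked kernels named above, here is a minimal Python sketch for int64 addition. It assumes the Go kernels follow the semantics documented for the Arrow C++ compute functions (add wraps around on integer overflow, add_checked reports an error); the issue itself does not spell this out. The names add_wrapping and add_checked below are illustrative helpers, not the Go API.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def add_wrapping(a: int, b: int) -> int:
    """Add two int64 values, wrapping around on overflow (two's complement)."""
    total = (a + b) & 0xFFFFFFFFFFFFFFFF  # keep only the low 64 bits
    # Reinterpret the 64-bit pattern as a signed value.
    return total - 2**64 if total > INT64_MAX else total


def add_checked(a: int, b: int) -> int:
    """Add two int64 values, raising OverflowError if the result does not fit."""
    total = a + b
    if not INT64_MIN <= total <= INT64_MAX:
        raise OverflowError(f"int64 overflow: {a} + {b}")
    return total
```

Subtraction follows the same pattern with a - b in place of a + b. For float32/float64 the two variants would presumably behave the same, since floating-point overflow produces infinity rather than wrapping.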





[jira] [Created] (ARROW-17870) [Go] Add Scalar Binary Arithmetic

2022-09-27 Thread Matthew Topol (Jira)
Matthew Topol created ARROW-17870:
-

 Summary: [Go] Add Scalar Binary Arithmetic
 Key: ARROW-17870
 URL: https://issues.apache.org/jira/browse/ARROW-17870
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Go
Reporter: Matthew Topol
Assignee: Matthew Topol








[jira] [Created] (ARROW-17869) [Java][Gandiva] ProjectorTest.testStringOutput is failed

2022-09-27 Thread Kouhei Sutou (Jira)
Kouhei Sutou created ARROW-17869:


 Summary: [Java][Gandiva] ProjectorTest.testStringOutput is failed
 Key: ARROW-17869
 URL: https://issues.apache.org/jira/browse/ARROW-17869
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Kouhei Sutou


https://github.com/ursacomputing/crossbow/actions/runs/3134225521/jobs/5089370615#step:7:69073

{noformat}
Error:  Tests run: 43, Failures: 0, Errors: 1, Skipped: 1, Time elapsed: 30.152 
s <<< FAILURE! - in org.apache.arrow.gandiva.evaluator.ProjectorTest
Error:  org.apache.arrow.gandiva.evaluator.ProjectorTest.testStringOutput  Time 
elapsed: 0.124 s  <<< ERROR!
reserve not implemented
at 
org.apache.arrow.gandiva.evaluator.JniWrapper.evaluateProjector(Native Method)
at 
org.apache.arrow.gandiva.evaluator.Projector.evaluate(Projector.java:345)
at 
org.apache.arrow.gandiva.evaluator.Projector.evaluate(Projector.java:213)
at 
org.apache.arrow.gandiva.evaluator.ProjectorTest.testStringOutput(ProjectorTest.java:605)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
at 
org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42)
at 
org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80)
at 
org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:147)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:127)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:90)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:55)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:102)
at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:54)
at 
org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:114)
at 
org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:86)
at 
org.junit.platform.launcher.core.DefaultLauncherSession$DelegatingLauncher.execute(DefaultLauncherSession.java:86)
at 
org.apache.maven.surefire.junitplatform.LazyLauncher.execute(LazyLauncher.java:55)
at 
org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.lambda$execute$1(JUnitPlatformProvider.java:234)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at 
org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.execute(JUnitPlatformProvider.java:228)
at 
org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invokeAllTests(JUnitPlatformProvider.java:175)
at 
org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invoke(JUnitPlatformProvider.java:131)
at 

[jira] [Updated] (ARROW-17550) [C++][CI] MinGW builds shouldn't compile grpcio

2022-09-27 Thread Kouhei Sutou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou updated ARROW-17550:
-
Description: 
MinGW builds currently compile the GCS testbench and grpcio for MinGW.
When the compiled MinGW wheel is not in cache, compiling takes a very long time 
(\*). But Win32 and Win64 binary wheels are available on PyPI.

This is pointless: the GCS testbench could simply run with the system Python 
instead of the msys2 Python, and always use the binaries from PyPI.

(\*) see for example https://github.com/pitrou/arrow/runs/8071607360 where 
installing the GCS testbench took 18 minutes


  was:
MinGW builds currently compile the GCS testbench and grpcio for MinGW.
When the compiled MinGW wheel is not in cache, compiling takes a very long time 
(*). But Win32 and Win64 binary wheels are available on PyPI.

This is pointless: the GCS testbench could simply run with the system Python 
instead of the msys2 Python, and always use the binaries from PyPI.

(*) see for example https://github.com/pitrou/arrow/runs/8071607360 where 
installing the GCS testbench took 18 minutes



> [C++][CI] MinGW builds shouldn't compile grpcio
> ---
>
> Key: ARROW-17550
> URL: https://issues.apache.org/jira/browse/ARROW-17550
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Continuous Integration
>Reporter: Antoine Pitrou
>Assignee: Kouhei Sutou
>Priority: Major
> Fix For: 10.0.0
>
>
> MinGW builds currently compile the GCS testbench and grpcio for MinGW.
> When the compiled MinGW wheel is not in cache, compiling takes a very long 
> time (\*). But Win32 and Win64 binary wheels are available on PyPI.
> This is pointless: the GCS testbench could simply run with the system Python 
> instead of the msys2 Python, and always use the binaries from PyPI.
> (\*) see for example https://github.com/pitrou/arrow/runs/8071607360 where 
> installing the GCS testbench took 18 minutes





[jira] [Commented] (ARROW-17550) [C++][CI] MinGW builds shouldn't compile grpcio

2022-09-27 Thread Kouhei Sutou (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610239#comment-17610239
 ] 

Kouhei Sutou commented on ARROW-17550:
--

OK. I'll do it.

> [C++][CI] MinGW builds shouldn't compile grpcio
> ---
>
> Key: ARROW-17550
> URL: https://issues.apache.org/jira/browse/ARROW-17550
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Continuous Integration
>Reporter: Antoine Pitrou
>Assignee: Kouhei Sutou
>Priority: Major
> Fix For: 10.0.0
>
>
> MinGW builds currently compile the GCS testbench and grpcio for MinGW.
> When the compiled MinGW wheel is not in cache, compiling takes a very long 
> time (\*). But Win32 and Win64 binary wheels are available on PyPI.
> This is pointless: the GCS testbench could simply run with the system Python 
> instead of the msys2 Python, and always use the binaries from PyPI.
> (\*) see for example https://github.com/pitrou/arrow/runs/8071607360 where 
> installing the GCS testbench took 18 minutes





[jira] [Created] (ARROW-17868) [C++][Python] Keep and deprecate ARROW_PYTHON CMake option for backward compatibility

2022-09-27 Thread Kouhei Sutou (Jira)
Kouhei Sutou created ARROW-17868:


 Summary: [C++][Python] Keep and deprecate ARROW_PYTHON CMake 
option for backward compatibility
 Key: ARROW-17868
 URL: https://issues.apache.org/jira/browse/ARROW-17868
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, Python
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou


ARROW-6858 removed the {{ARROW_PYTHON}} CMake option because ARROW-16340 moved 
{{cpp/src/arrow/python/}} to {{python/pyarrow/src/}}, but that broke backward 
compatibility: users who pass {{-DARROW_PYTHON=ON}} now need to specify 
{{-DARROW_CSV=ON}}, {{-DARROW_DATASET=ON}} and so on manually.

See also: https://github.com/apache/arrow/pull/14224#discussion_r981399130

{quote}
FWIW this broke my local development because of no longer including those 
(although I should probably start using presets ..)

Now, it's probably fine to remove this now Python C++ has moved, but we do 
assume that some C++ modules are built on the pyarrow side (eg we assume that 
CSV is always built, while with the above change you need to ensure manually 
that this is done in your cmake call).
In any case we should update the documentation at 
https://arrow.apache.org/docs/dev/developers/python.html#build-and-test to 
indicate that there are a few components required to be able to build pyarrow.
{quote}

Eventually we can remove the {{ARROW_PYTHON}} CMake option, but we should 
provide a deprecation period before removing it.

We should also mention that {{ARROW_PYTHON}} is deprecated in our documentation 
( https://arrow.apache.org/docs/dev/developers/python.html#build-and-test ).
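
The intended behaviour of a kept-but-deprecated option can be modelled roughly as follows. This is only a Python sketch of the logic, not the actual change (which would live in the CMake files). The list of implied options contains just the two named in this issue; the full list is not given here, and the helper name expand_deprecated_options is hypothetical.

```python
import warnings

# Only these two options are named in the issue ("and so on" elides the rest).
IMPLIED_BY_ARROW_PYTHON = ("ARROW_CSV", "ARROW_DATASET")


def expand_deprecated_options(options: dict) -> dict:
    """Return a copy of `options` with the deprecated ARROW_PYTHON flag expanded.

    Options the user set explicitly are left untouched.
    """
    expanded = dict(options)
    if expanded.pop("ARROW_PYTHON", "OFF").upper() == "ON":
        warnings.warn(
            "ARROW_PYTHON is deprecated; enable the required components directly",
            DeprecationWarning,
            stacklevel=2,
        )
        for name in IMPLIED_BY_ARROW_PYTHON:
            # setdefault keeps an explicit user choice such as ARROW_CSV=OFF.
            expanded.setdefault(name, "ON")
    return expanded
```

Calling expand_deprecated_options({"ARROW_PYTHON": "ON"}) would yield the two implied options set to ON and emit a DeprecationWarning, which is the kind of deprecation period described above.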





[jira] [Created] (ARROW-17867) [C++][FlightRPC] Expose bulk parameter binding in Flight SQL client

2022-09-27 Thread David Li (Jira)
David Li created ARROW-17867:


 Summary: [C++][FlightRPC] Expose bulk parameter binding in Flight 
SQL client
 Key: ARROW-17867
 URL: https://issues.apache.org/jira/browse/ARROW-17867
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: David Li
Assignee: David Li


Also fix various issues noticed as part of ARROW-17661





[jira] [Commented] (ARROW-17154) [C++] Change cmake project name from arrow_python to pyarrow_cpp

2022-09-27 Thread Kouhei Sutou (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610231#comment-17610231
 ] 

Kouhei Sutou commented on ARROW-17154:
--

We can close this.
I'll unify {{python/CMakeLists.txt}} and {{python/pyarrow/src/CMakeLists.txt}} 
in ARROW-17838. That removes the CMake project name from 
{{python/pyarrow/src/CMakeLists.txt}}, so we don't need to rename it.

> [C++] Change cmake project name from arrow_python to pyarrow_cpp
> 
>
> Key: ARROW-17154
> URL: https://issues.apache.org/jira/browse/ARROW-17154
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: C++
>Reporter: Alenka Frim
>Assignee: Alenka Frim
>Priority: Major
> Fix For: 10.0.0
>
>
> See discussion 
> https://github.com/apache/arrow/pull/13311#discussion_r926198302





[jira] [Resolved] (ARROW-17795) [C++][R] Using ARROW_ZSTD_USE_SHARED fails

2022-09-27 Thread Kouhei Sutou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou resolved ARROW-17795.
--
Resolution: Fixed

Issue resolved by pull request 14202
[https://github.com/apache/arrow/pull/14202]

> [C++][R] Using ARROW_ZSTD_USE_SHARED fails
> --
>
> Key: ARROW-17795
> URL: https://issues.apache.org/jira/browse/ARROW-17795
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, R
>Reporter: Jacob Wujciak-Jens
>Assignee: Kouhei Sutou
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 10.0.0
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> See zulip discussion 
> [here|https://ursalabs.zulipchat.com/#narrow/stream/180245-dev/topic/zstd.20cmake.20changes]
> Changes to the zstd find module cause a failure when ARROW_ZSTD_USE_SHARED is 
> used





[jira] [Updated] (ARROW-17866) [Python] List child array invalid

2022-09-27 Thread Kouhei Sutou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou updated ARROW-17866:
-
Component/s: Python

> [Python] List child array invalid
> -
>
> Key: ARROW-17866
> URL: https://issues.apache.org/jira/browse/ARROW-17866
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 9.0.0
>Reporter: Sean Conroy
>Priority: Major
>
> This issue happens for all the versions of pyarrow I checked (9.0.0, 7.0.0, 
> 6.0.0, 6.0.1).
> Running on Windows 11.
> {code:java}
> log.to_feather(log_fname)
> Traceback (most recent call last):
>   File "G:\My 
> Drive\ds-atcore-etl\venv\lib\site-packages\IPython\core\interactiveshell.py", 
> line 3444, in run_code
>     exec(code_obj, self.user_global_ns, self.user_ns)
>   File "", line 1, in 
>     log.to_feather(log_fname)
>   File "G:\My 
> Drive\ds-atcore-etl\venv\lib\site-packages\pandas\util\_decorators.py", line 
> 207, in wrapper
>     return func(*args, **kwargs)
>   File "G:\My 
> Drive\ds-atcore-etl\venv\lib\site-packages\pandas\core\frame.py", line 2519, 
> in to_feather
>     to_feather(self, path, **kwargs)
>   File "G:\My 
> Drive\ds-atcore-etl\venv\lib\site-packages\pandas\io\feather_format.py", line 
> 87, in to_feather
>     feather.write_feather(df, handles.handle, **kwargs)
>   File "G:\My Drive\ds-atcore-etl\venv\lib\site-packages\pyarrow\feather.py", 
> line 164, in write_feather
>     table = Table.from_pandas(df, preserve_index=preserve_index)
>   File "pyarrow\table.pxi", line 3495, in pyarrow.lib.Table.from_pandas
>   File "pyarrow\table.pxi", line 3597, in pyarrow.lib.Table.from_arrays
>   File "pyarrow\table.pxi", line 2793, in pyarrow.lib.Table.validate
>   File "pyarrow\error.pxi", line 100, in pyarrow.lib.check_status
> pyarrow.lib.ArrowInvalid: Column 13: In chunk 0: Invalid: List child array 
> invalid: Invalid: Struct child array #0 has length smaller than expected for 
> struct array (67186731 < 67186732) {code}





[jira] [Updated] (ARROW-17866) [Python] List child array invalid

2022-09-27 Thread Kouhei Sutou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou updated ARROW-17866:
-
Summary: [Python] List child array invalid  (was: List child array invalid)

> [Python] List child array invalid
> -
>
> Key: ARROW-17866
> URL: https://issues.apache.org/jira/browse/ARROW-17866
> Project: Apache Arrow
>  Issue Type: Bug
>Affects Versions: 9.0.0
>Reporter: Sean Conroy
>Priority: Major
>
> This issue happens for all the versions of pyarrow I checked (9.0.0, 7.0.0, 
> 6.0.0, 6.0.1).
> Running on Windows 11.
> {code:java}
> log.to_feather(log_fname)
> Traceback (most recent call last):
>   File "G:\My 
> Drive\ds-atcore-etl\venv\lib\site-packages\IPython\core\interactiveshell.py", 
> line 3444, in run_code
>     exec(code_obj, self.user_global_ns, self.user_ns)
>   File "", line 1, in 
>     log.to_feather(log_fname)
>   File "G:\My 
> Drive\ds-atcore-etl\venv\lib\site-packages\pandas\util\_decorators.py", line 
> 207, in wrapper
>     return func(*args, **kwargs)
>   File "G:\My 
> Drive\ds-atcore-etl\venv\lib\site-packages\pandas\core\frame.py", line 2519, 
> in to_feather
>     to_feather(self, path, **kwargs)
>   File "G:\My 
> Drive\ds-atcore-etl\venv\lib\site-packages\pandas\io\feather_format.py", line 
> 87, in to_feather
>     feather.write_feather(df, handles.handle, **kwargs)
>   File "G:\My Drive\ds-atcore-etl\venv\lib\site-packages\pyarrow\feather.py", 
> line 164, in write_feather
>     table = Table.from_pandas(df, preserve_index=preserve_index)
>   File "pyarrow\table.pxi", line 3495, in pyarrow.lib.Table.from_pandas
>   File "pyarrow\table.pxi", line 3597, in pyarrow.lib.Table.from_arrays
>   File "pyarrow\table.pxi", line 2793, in pyarrow.lib.Table.validate
>   File "pyarrow\error.pxi", line 100, in pyarrow.lib.check_status
> pyarrow.lib.ArrowInvalid: Column 13: In chunk 0: Invalid: List child array 
> invalid: Invalid: Struct child array #0 has length smaller than expected for 
> struct array (67186731 < 67186732) {code}





[jira] [Assigned] (ARROW-17862) [C/GLib] Deprecate Plasma C/GLib bindings

2022-09-27 Thread Kouhei Sutou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou reassigned ARROW-17862:


Assignee: Kouhei Sutou

> [C/GLib] Deprecate Plasma C/GLib bindings
> -
>
> Key: ARROW-17862
> URL: https://issues.apache.org/jira/browse/ARROW-17862
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: GLib
>Reporter: Antoine Pitrou
>Assignee: Kouhei Sutou
>Priority: Blocker
> Fix For: 10.0.0
>
>






[jira] [Assigned] (ARROW-17864) [Ruby] Deprecate Plasma Ruby bindings

2022-09-27 Thread Kouhei Sutou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou reassigned ARROW-17864:


Assignee: Kouhei Sutou

> [Ruby] Deprecate Plasma Ruby bindings
> -
>
> Key: ARROW-17864
> URL: https://issues.apache.org/jira/browse/ARROW-17864
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Ruby
>Reporter: Antoine Pitrou
>Assignee: Kouhei Sutou
>Priority: Blocker
> Fix For: 10.0.0
>
>






[jira] [Updated] (ARROW-17855) [R] Simultaneous read-write operations causing file corruption.

2022-09-27 Thread Kouhei Sutou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou updated ARROW-17855:
-
Component/s: R

> [R] Simultaneous read-write operations causing file corruption.
> ---
>
> Key: ARROW-17855
> URL: https://issues.apache.org/jira/browse/ARROW-17855
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: R
>Reporter: N Gautam Animesh
>Priority: Major
>
> Use case: I was trying to read and write an Arrow file simultaneously, which 
> gave me an error and is leading to file corruption. I am currently using the 
> read_feather and write_feather functions to save it as a .arrow file. Please 
> let me know if there is any guidance on this or any other way to avoid it. 
> [Error: Invalid: Not an Arrow file]
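
One common way to keep readers from ever seeing a half-written file is to write to a temporary file in the same directory and then rename it over the target. The sketch below shows that general pattern in Python with the standard library only; it is not an Arrow or R API, the helper name write_atomically is made up for illustration, and it assumes the serialized file contents are already available as bytes.

```python
import os
import tempfile


def write_atomically(path: str, data: bytes) -> None:
    """Write `data` to `path` so that readers see the old or the new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # swap the finished file into place
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

On POSIX systems the rename is atomic. On Windows, os.replace can fail if another process still has the target file open, so readers and writers may need additional coordination there.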





[jira] [Updated] (ARROW-17855) [R] Simultaneous read-write operations causing file corruption.

2022-09-27 Thread Kouhei Sutou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou updated ARROW-17855:
-
Summary: [R] Simultaneous read-write operations causing file corruption.  
(was: Simultaneous read-write operations causing file corruption.)

> [R] Simultaneous read-write operations causing file corruption.
> ---
>
> Key: ARROW-17855
> URL: https://issues.apache.org/jira/browse/ARROW-17855
> Project: Apache Arrow
>  Issue Type: Improvement
>Reporter: N Gautam Animesh
>Priority: Major
>
> Use case: I was trying to read and write an Arrow file simultaneously, which 
> gave me an error and is leading to file corruption. I am currently using the 
> read_feather and write_feather functions to save it as a .arrow file. Please 
> let me know if there is any guidance on this or any other way to avoid it. 
> [Error: Invalid: Not an Arrow file]





[jira] [Updated] (ARROW-17854) [CI][Developer] Host preview docs on S3

2022-09-27 Thread Kouhei Sutou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou updated ARROW-17854:
-
Summary: [CI][Developer] Host preview docs on S3  (was: [CI][Developer] 
Hoste preview docs on S3)

> [CI][Developer] Host preview docs on S3
> ---
>
> Key: ARROW-17854
> URL: https://issues.apache.org/jira/browse/ARROW-17854
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Continuous Integration, Developer Tools
>Reporter: Jacob Wujciak-Jens
>Assignee: Jacob Wujciak-Jens
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 10.0.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Hosting on GitHub Pages as implemented in [ARROW-12958] is unsustainable due 
> to the size of the Arrow docs (~200 MB).





[jira] [Commented] (ARROW-17753) [C++][Python] Provide instructions for "fixing" build environment issues

2022-09-27 Thread Anja Boskovic (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610214#comment-17610214
 ] 

Anja Boskovic commented on ARROW-17753:
---

Omg, bless you Joris.

Thank you! =) I will open a PR.

> [C++][Python] Provide instructions for "fixing" build environment issues
> 
>
> Key: ARROW-17753
> URL: https://issues.apache.org/jira/browse/ARROW-17753
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Python
>Reporter: Alenka Frim
>Assignee: Alenka Frim
>Priority: Major
> Fix For: 10.0.0
>
>
> Due to bigger changes in the build workflow for Arrow C++ coming up in the 
> 10.0.0 release, failures when building the libraries are quite common. The 
> errors we bump into are similar to:
> {code:java}
> CMake Error at 
> build/dist/lib/cmake/ArrowPython/ArrowPythonConfig.cmake:61 
> (arrow_keep_backward_compatibility):
>   Unknown CMake command "arrow_keep_backward_compatibility".
> Call Stack (most recent call first):
>   CMakeLists.txt:240 (find_package)
> {code}
> or
> {code:java}
> -- Found Python3Alt: /Users/alenkafrim/repos/pyarrow-dev-9/bin/python  
> CMake Error at 
> /opt/homebrew/Cellar/cmake/3.24.1/share/cmake/Modules/CMakeFindDependencyMacro.cmake:47
>  (find_package):
>   By not providing "FindArrow.cmake" in CMAKE_MODULE_PATH this project has
>   asked CMake to find a package configuration file provided by "Arrow", but
>   CMake did not find one.
>   Could not find a package configuration file provided by "Arrow" with any of
>   the following names:
> ArrowConfig.cmake
> arrow-config.cmake
>   Add the installation prefix of "Arrow" to CMAKE_PREFIX_PATH or set
>   "Arrow_DIR" to a directory containing one of the above files.  If "Arrow"
>   provides a separate development package or SDK, be sure it has been
>   installed.
> Call Stack (most recent call first):
>   build/dist/lib/cmake/ArrowPython/ArrowPythonConfig.cmake:54 
> (find_dependency)
>   CMakeLists.txt:240 (find_package)
> {code}
> Connected issues:
>  - https://issues.apache.org/jira/browse/ARROW-17577
>  - https://issues.apache.org/jira/browse/ARROW-17575





[jira] [Assigned] (ARROW-17753) [C++][Python] Provide instructions for "fixing" build environment issues

2022-09-27 Thread Anja Boskovic (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anja Boskovic reassigned ARROW-17753:
-

Assignee: Anja Boskovic  (was: Alenka Frim)

> [C++][Python] Provide instructions for "fixing" build environment issues
> 
>
> Key: ARROW-17753
> URL: https://issues.apache.org/jira/browse/ARROW-17753
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Python
>Reporter: Alenka Frim
>Assignee: Anja Boskovic
>Priority: Major
> Fix For: 10.0.0
>
>





[jira] [Updated] (ARROW-17857) [C++] Table::CombineChunksToBatch segfaults on empty tables

2022-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-17857:
---
Labels: pull-request-available  (was: )

> [C++] Table::CombineChunksToBatch segfaults on empty tables
> ---
>
> Key: ARROW-17857
> URL: https://issues.apache.org/jira/browse/ARROW-17857
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: David Li
>Assignee: David Li
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There can be 0 chunks in a ChunkedArray
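The failure mode in the title can be illustrated outside of Arrow. The following is a minimal sketch, not the Arrow C++ implementation: plain Python lists stand in for chunks, and `combine_chunks` is a hypothetical helper. It shows the guard a combine routine needs when a column has zero chunks, instead of unconditionally reading the first chunk.

```python
from typing import List, Sequence


def combine_chunks(chunks: Sequence[List[int]]) -> List[int]:
    """Concatenate all chunks of a column into one contiguous list.

    Hypothetical stand-in for a "combine chunks" routine; the lists play
    the role of the chunks of a chunked column.
    """
    # Guard: a chunked column may legitimately hold zero chunks, so
    # indexing chunks[0] without this check would fail.
    if len(chunks) == 0:
        return []
    # Fast path: a single chunk only needs to be copied.
    if len(chunks) == 1:
        return list(chunks[0])
    combined: List[int] = []
    for chunk in chunks:
        combined.extend(chunk)
    return combined
```

With the guard in place, an empty column yields an empty result rather than an out-of-bounds access.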





[jira] [Created] (ARROW-17866) List child array invalid

2022-09-27 Thread Sean Conroy (Jira)
Sean Conroy created ARROW-17866:
---

 Summary: List child array invalid
 Key: ARROW-17866
 URL: https://issues.apache.org/jira/browse/ARROW-17866
 Project: Apache Arrow
  Issue Type: Bug
Affects Versions: 9.0.0
Reporter: Sean Conroy


This issue happens for all the versions of pyarrow I checked (9.0.0, 7.0.0, 
6.0.0, 6.0.1).

Running on Windows 11.
{code:java}
log.to_feather(log_fname)
Traceback (most recent call last):
  File "G:\My Drive\ds-atcore-etl\venv\lib\site-packages\IPython\core\interactiveshell.py", line 3444, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 1, in 
    log.to_feather(log_fname)
  File "G:\My Drive\ds-atcore-etl\venv\lib\site-packages\pandas\util\_decorators.py", line 207, in wrapper
    return func(*args, **kwargs)
  File "G:\My Drive\ds-atcore-etl\venv\lib\site-packages\pandas\core\frame.py", line 2519, in to_feather
    to_feather(self, path, **kwargs)
  File "G:\My Drive\ds-atcore-etl\venv\lib\site-packages\pandas\io\feather_format.py", line 87, in to_feather
    feather.write_feather(df, handles.handle, **kwargs)
  File "G:\My Drive\ds-atcore-etl\venv\lib\site-packages\pyarrow\feather.py", line 164, in write_feather
    table = Table.from_pandas(df, preserve_index=preserve_index)
  File "pyarrow\table.pxi", line 3495, in pyarrow.lib.Table.from_pandas
  File "pyarrow\table.pxi", line 3597, in pyarrow.lib.Table.from_arrays
  File "pyarrow\table.pxi", line 2793, in pyarrow.lib.Table.validate
  File "pyarrow\error.pxi", line 100, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Column 13: In chunk 0: Invalid: List child array invalid: Invalid: Struct child array #0 has length smaller than expected for struct array (67186731 < 67186732)
{code}





[jira] [Assigned] (ARROW-17379) [C++][Docs] Create tutorial content for Acero

2022-09-27 Thread Kae Suarez (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kae Suarez reassigned ARROW-17379:
--

Assignee: Kae Suarez

> [C++][Docs] Create tutorial content for Acero
> -
>
> Key: ARROW-17379
> URL: https://issues.apache.org/jira/browse/ARROW-17379
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: C++, Documentation
>Reporter: Kae Suarez
>Assignee: Kae Suarez
>Priority: Major
> Fix For: 10.0.0
>
>
> As per 
> [https://docs.google.com/document/d/1IFk6m97JWZZzFC3UIlLf3sxnXgFoL-l89nqSGl8bE28/edit?usp=sharing],
>  create tutorial content to introduce users to Acero. Use the ExecPlan from 
> the top of the Acero documentation to do so, and link them to the rest of the 
> Acero docs afterwards.





[jira] [Closed] (ARROW-10528) Add ability to send arbitrary HTTP headers to C++/Python clients

2022-09-27 Thread David Li (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-10528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Li closed ARROW-10528.

Resolution: Done

> Add ability to send arbitrary HTTP headers to C++/Python clients
> 
>
> Key: ARROW-10528
> URL: https://issues.apache.org/jira/browse/ARROW-10528
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: C++, FlightRPC
>Reporter: James Duong
>Priority: Major
>






[jira] [Commented] (ARROW-17692) [R] Arrow Package Installation: undefined symbol error

2022-09-27 Thread Wayne Tu (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610189#comment-17610189
 ] 

Wayne Tu commented on ARROW-17692:
--

Thank you so much, Nicola and Kouhei! I will add the additional flags and try 
again to build the Arrow R package with S3 enabled.

> [R] Arrow Package Installation: undefined symbol error 
> ---
>
> Key: ARROW-17692
> URL: https://issues.apache.org/jira/browse/ARROW-17692
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: R
>Reporter: Wayne Tu
>Assignee: Nicola Crane
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Hi,
> I encountered "undefined symbol: _ZTIN3Aws4Auth22AWSCredentialsProviderE
> {noformat}
> Error: loading failed
> Execution halted
> ERROR: loading failed" errors when trying to install arrow under R 4.1.3 with 
> devtoolset-8 (gcc version 8.3.1).
> > Sys.getenv("LD_LIBRARY_PATH")
> [1] 
> "/usr/local/lib64:/usr/local/lib64/cmake:/lib64:/opt/rh/devtoolset-8/root/usr/lib64:/opt/rh/devtoolset-8/root/usr/lib/gcc/x86_64-redhat-linux/8:/opt/rh/devtoolset-8/root/usr/libexec/gcc/x86_64-redhat-linux/8:/opt/R/4.1.3/lib/R/lib:/usr/local/lib:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.322.b06-1.el7_9.x86_64/jre/lib/amd64/server"
> > Sys.getenv("PATH")
> [1] 
> "/apps/Python/3.9.12/bin:/usr/local/cmake-3.21.4-linux-x86_64/bin:/opt/rh/devtoolset-8/root/usr/bin:/apps/bin:/usr/local/bin:/bin:/usr/bin"
> > Sys.setenv("NOT_CRAN"=TRUE)
> > Sys.setenv("LIBARROW_BINARY" = FALSE)
> > Sys.setenv("ARROW_R_DEV" = TRUE)
> > Sys.setenv("ARROW_USE_PKG_CONFIG" = FALSE)
> > Sys.setenv(ARROW_S3 = "ON")
> > Sys.setenv(CMAKE = "/apps/cmake-3.21.4-linux-x86_64/bin/cmake")
> > sessionInfo()
> R version 4.1.3 (2022-03-10)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Red Hat Enterprise Linux Server 7.9 (Maipo)
> Matrix products: default
> BLAS/LAPACK: /usr/lib64/libopenblasp-r0.3.3.so
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
> loaded via a namespace (and not attached):
> [1] compiler_4.1.3
> > arrow::arrow_available()
> Error in loadNamespace(x) : there is no package called ‘arrow’
> > system("gcc -v")
> Using built-in specs.
> COLLECT_GCC=gcc
> COLLECT_LTO_WRAPPER=/opt/rh/devtoolset-8/root/usr/libexec/gcc/x86_64-redhat-linux/8/lto-wrapper
> Target: x86_64-redhat-linux
> Configured with: ../configure --enable-bootstrap 
> --enable-languages=c,c++,fortran,lto --prefix=/opt/rh/devtoolset-8/root/usr 
> --mandir=/opt/rh/devtoolset-8/root/usr/share/man 
> --infodir=/opt/rh/devtoolset-8/root/usr/share/info 
> --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared 
> --enable-threads=posix --enable-checking=release --enable-multilib 
> --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions 
> --enable-gnu-unique-object --enable-linker-build-id 
> --with-gcc-major-version-only --with-linker-hash-style=gnu 
> --with-default-libstdcxx-abi=gcc4-compatible --enable-plugin 
> --enable-initfini-array 
> --with-isl=/builddir/build/BUILD/gcc-8.3.1-20190311/obj-x86_64-redhat-linux/isl-install
>  --disable-libmpx --enable-gnu-indirect-function --with-tune=generic 
> --with-arch_32=x86-64 --build=x86_64-redhat-linux
> Thread model: posix
> gcc version 8.3.1 20190311 (Red Hat 8.3.1-3) (GCC)
>  
> > install.packages(mpkg, repos=NULL, type="source")
> ..
> ..
> ** building package indices
> ** installing vignettes
> ** testing if installed package can be loaded from temporary location
> Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath 
> = DLLpath, ...):
>  unable to load shared object 
> '/home/user1/R/x86_64-pc-linux-gnu/4.1.3/00LOCK-arrow/00new/arrow/libs/arrow.so':
>   
> /home/user1/R/x86_64-pc-linux-gnu/4.1.3/00LOCK-arrow/00new/arrow/libs/arrow.so:
>  undefined symbol: _ZTIN3Aws4Auth22AWSCredentialsProviderE
> Error: loading failed
> Execution halted
> ERROR: loading failed
> * removing ‘/home/user1/R/x86_64-pc-linux-gnu/4.1.3/arrow’
> Warning message:
> In install.packages(mpkg, repos = NULL, type = "source") :
>   installation of package 
> ‘/apps/tmp/RtmpEqJN3J/downloaded_packages/arrow_8.0.0.tar.gz’ had non-zero 
> exit status
> {noformat}
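The undefined symbol in the log above is a mangled C++ name. As a rough sketch (assuming the Itanium C++ ABI mangling used by GCC, and handling only the simple `_ZTIN...E` form of nested names seen here), the hypothetical helper `demangle_typeinfo` decodes it by reading the length-prefixed name components:

```python
import re


def demangle_typeinfo(symbol: str) -> str:
    """Decode a `_ZTIN<len><name>...E` symbol (typeinfo for a nested name).

    Illustrative only: covers the plain nested-name case, with no
    templates, substitutions, or qualifiers.
    """
    prefix = "_ZTIN"  # _Z = mangled name, TI = typeinfo, N = nested name
    if not (symbol.startswith(prefix) and symbol.endswith("E")):
        raise ValueError("unsupported symbol form: " + symbol)
    body = symbol[len(prefix):-1]  # drop the prefix and the closing 'E'
    parts = []
    pos = 0
    while pos < len(body):
        # Each component is a decimal length followed by that many characters.
        match = re.match(r"\d+", body[pos:])
        if match is None:
            raise ValueError("expected a length prefix at offset %d" % pos)
        length = int(match.group())
        start = pos + match.end()
        if start + length > len(body):
            raise ValueError("component runs past the end of the symbol")
        parts.append(body[start:start + length])
        pos = start + length
    return "typeinfo for " + "::".join(parts)


print(demangle_typeinfo("_ZTIN3Aws4Auth22AWSCredentialsProviderE"))
```

For the symbol in this report the result names a class in the `Aws::Auth` namespace, which suggests that the AWS SDK for C++ was not linked into `arrow.so`. In practice, the `c++filt` tool does the same decoding.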





[jira] [Commented] (ARROW-17550) [C++][CI] MinGW builds shouldn't compile grpcio

2022-09-27 Thread Antoine Pitrou (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610171#comment-17610171
 ] 

Antoine Pitrou commented on ARROW-17550:


[~kou] Would you like to prioritize this?

> [C++][CI] MinGW builds shouldn't compile grpcio
> ---
>
> Key: ARROW-17550
> URL: https://issues.apache.org/jira/browse/ARROW-17550
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Continuous Integration
>Reporter: Antoine Pitrou
>Assignee: Kouhei Sutou
>Priority: Major
> Fix For: 10.0.0
>
>
> MinGW builds currently compile the GCS testbench and grpcio for MinGW.
> When the compiled MinGW wheel is not in cache, compiling takes a very long 
> time (*). But Win32 and Win64 binary wheels are available on PyPI.
> This is pointless: the GCS testbench could simply run with the system Python 
> instead of the msys2 Python, and always use the binaries from PyPI.
> (*) see for example https://github.com/pitrou/arrow/runs/8071607360 where 
> installing the GCS testbench took 18 minutes





[jira] [Updated] (ARROW-17550) [C++][CI] MinGW builds shouldn't compile grpcio

2022-09-27 Thread Antoine Pitrou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antoine Pitrou updated ARROW-17550:
---
Fix Version/s: 10.0.0

> [C++][CI] MinGW builds shouldn't compile grpcio
> ---
>
> Key: ARROW-17550
> URL: https://issues.apache.org/jira/browse/ARROW-17550
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Continuous Integration
>Reporter: Antoine Pitrou
>Assignee: Kouhei Sutou
>Priority: Major
> Fix For: 10.0.0
>
>
> MinGW builds currently compile the GCS testbench and grpcio for MinGW.
> When the compiled MinGW wheel is not in cache, compiling takes a very long 
> time (*). But Win32 and Win64 binary wheels are available on PyPI.
> This is pointless: the GCS testbench could simply run with the system Python 
> instead of the msys2 Python, and always use the binaries from PyPI.
> (*) see for example https://github.com/pitrou/arrow/runs/8071607360 where 
> installing the GCS testbench took 18 minutes





[jira] [Commented] (ARROW-17855) Simultaneous read-write operations causing file corruption.

2022-09-27 Thread N Gautam Animesh (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610157#comment-17610157
 ] 

N Gautam Animesh commented on ARROW-17855:
--

I am running both operations (read_feather() and write_feather()) 
simultaneously, which is causing the file to get corrupted.

Reader:
{code:java}
library(arrow)

df <- read_feather("test.arrow")
{code}
Writer:
{code:java}
library(arrow)
df <- data.frame(mtcars)
write_feather(df, "test.arrow")
{code}
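One common way to keep a concurrent reader from seeing a half-written file is to write to a temporary file and then rename it over the target. The sketch below is an assumption about a possible workaround, not something confirmed in this thread. It uses only the Python standard library; `write_atomically` is a hypothetical helper and the payload is opaque bytes rather than real Feather data.

```python
import os
import tempfile


def write_atomically(path: str, payload: bytes) -> None:
    """Write payload to path so that readers see the old or the new file.

    Hypothetical helper: the data goes to a temporary file in the same
    directory first, and is then moved over the target in a single step.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the final rename cannot be a single atomic step.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        # os.replace overwrites the destination; on POSIX this is atomic.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

On Windows the final replace can fail while another process holds the target open, so the writer may need to retry. The same pattern should carry over to R by writing to a temporary path and then calling `file.rename()`.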

> Simultaneous read-write operations causing file corruption.
> ---
>
> Key: ARROW-17855
> URL: https://issues.apache.org/jira/browse/ARROW-17855
> Project: Apache Arrow
>  Issue Type: Improvement
>Reporter: N Gautam Animesh
>Priority: Major
>
> Use case: I was trying to read and write an Arrow file simultaneously, which 
> gave me an error and is leading to file corruption. I am currently using the 
> read_feather and write_feather functions to save the data as a .arrow file. Do 
> let me know if there is anything in this regard, or any other way to avoid this. 
> [Error: Invalid: Not an Arrow file]





[jira] [Commented] (ARROW-17864) [Ruby] Deprecate Plasma Ruby bindings

2022-09-27 Thread Antoine Pitrou (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610154#comment-17610154
 ] 

Antoine Pitrou commented on ARROW-17864:


[~kou]

> [Ruby] Deprecate Plasma Ruby bindings
> -
>
> Key: ARROW-17864
> URL: https://issues.apache.org/jira/browse/ARROW-17864
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Ruby
>Reporter: Antoine Pitrou
>Priority: Blocker
> Fix For: 10.0.0
>
>






[jira] [Commented] (ARROW-17865) [Java] Deprecate Plasma JNI bindings

2022-09-27 Thread Antoine Pitrou (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610155#comment-17610155
 ] 

Antoine Pitrou commented on ARROW-17865:


[~dsusanibara] [~ljw1001]

> [Java] Deprecate Plasma JNI bindings
> 
>
> Key: ARROW-17865
> URL: https://issues.apache.org/jira/browse/ARROW-17865
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java
>Reporter: Antoine Pitrou
>Priority: Blocker
> Fix For: 10.0.0
>
>






[jira] [Commented] (ARROW-17863) [Python] Deprecate Plasma Python bindings

2022-09-27 Thread Antoine Pitrou (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610153#comment-17610153
 ] 

Antoine Pitrou commented on ARROW-17863:


[~jorisvandenbossche]


> [Python] Deprecate Plasma Python bindings
> -
>
> Key: ARROW-17863
> URL: https://issues.apache.org/jira/browse/ARROW-17863
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Python
>Reporter: Antoine Pitrou
>Priority: Blocker
> Fix For: 10.0.0
>
>






[jira] [Assigned] (ARROW-17861) [C++] Deprecate Plasma C++

2022-09-27 Thread Antoine Pitrou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antoine Pitrou reassigned ARROW-17861:
--

Assignee: Antoine Pitrou

> [C++] Deprecate Plasma C++
> --
>
> Key: ARROW-17861
> URL: https://issues.apache.org/jira/browse/ARROW-17861
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: C++, C++ - Plasma
>Reporter: Antoine Pitrou
>Assignee: Antoine Pitrou
>Priority: Blocker
> Fix For: 10.0.0
>
>






[jira] [Assigned] (ARROW-13028) [C++] CSV add convert option to attempt 32bit number inferences

2022-09-27 Thread Todd Farmer (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Farmer reassigned ARROW-13028:
---

Assignee: (was: Nate Clark)

> [C++] CSV add convert option to attempt 32bit number inferences
> ---
>
> Key: ARROW-13028
> URL: https://issues.apache.org/jira/browse/ARROW-13028
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Nate Clark
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When types are inferred by the CSV reader, numbers are always 64-bit. For large 
> data sets it could be better to use 32-bit types to reduce overall memory use. To 
> do this it would be useful to add an option to ConvertOptions to try 32-bit 
> numbers before 64-bit. By default this option would be disabled.
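The proposed behaviour can be sketched as follows. This is only an illustration of "try 32-bit before 64-bit" for integer columns, not the Arrow CSV reader: `infer_integer_type` and its `try_32bit` flag are hypothetical names, and the option defaults to off as the issue suggests.

```python
from typing import Iterable

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def infer_integer_type(values: Iterable[str], try_32bit: bool = False) -> str:
    """Return "int32" or "int64" for a column of integer strings.

    Hypothetical sketch: with try_32bit disabled (the default), integers
    are always inferred as 64-bit, matching the current behaviour
    described in the issue.
    """
    parsed = [int(v) for v in values]
    if not all(INT64_MIN <= v <= INT64_MAX for v in parsed):
        raise ValueError("value does not fit in 64 bits")
    # Only narrow the type when every value in the column fits in 32 bits.
    if try_32bit and all(INT32_MIN <= v <= INT32_MAX for v in parsed):
        return "int32"
    return "int64"
```

A column such as `["1", "2", "3"]` would be inferred as `int64` by default and as `int32` with the flag enabled, while a column containing `"4294967296"` stays `int64` either way.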





[jira] [Commented] (ARROW-17862) [C/GLib] Deprecate Plasma C/GLib bindings

2022-09-27 Thread Antoine Pitrou (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610152#comment-17610152
 ] 

Antoine Pitrou commented on ARROW-17862:


[~kou]


> [C/GLib] Deprecate Plasma C/GLib bindings
> -
>
> Key: ARROW-17862
> URL: https://issues.apache.org/jira/browse/ARROW-17862
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: GLib
>Reporter: Antoine Pitrou
>Priority: Blocker
> Fix For: 10.0.0
>
>






[jira] [Commented] (ARROW-14289) [C++] Change Scanner::Head to return a RecordBatchReader

2022-09-27 Thread Todd Farmer (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-14289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610150#comment-17610150
 ] 

Todd Farmer commented on ARROW-14289:
-

This issue was last updated over 90 days ago, which may be an indication it is 
no longer being actively worked. To better reflect the current state, the issue 
is being unassigned per [project 
policy|https://arrow.apache.org/docs/dev/developers/bug_reports.html#issue-assignment].
 Please feel free to re-take assignment of the issue if it is being actively 
worked, or if you plan to start that work soon.

> [C++] Change Scanner::Head to return a RecordBatchReader
> 
>
> Key: ARROW-14289
> URL: https://issues.apache.org/jira/browse/ARROW-14289
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, R
>Reporter: Neal Richardson
>Assignee: Weston Pace
>Priority: Major
>
> Following ARROW-9731 and ARROW-13893. This would make it more natural to work 
> with ExecPlans that return a RecordBatchReader when you Run them. 
> Alternatively, we could move the business to RecordBatchReader::Head.





[jira] [Updated] (ARROW-17865) [Java] Deprecate Plasma JNI bindings

2022-09-27 Thread Antoine Pitrou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antoine Pitrou updated ARROW-17865:
---
Component/s: Java
 (was: Ruby)

> [Java] Deprecate Plasma JNI bindings
> 
>
> Key: ARROW-17865
> URL: https://issues.apache.org/jira/browse/ARROW-17865
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java
>Reporter: Antoine Pitrou
>Priority: Blocker
> Fix For: 10.0.0
>
>






[jira] [Assigned] (ARROW-15459) [C++] Unable to build Arrow C++ on osx arm64 inside conda env because of Invalid configuration `arm64-apple-darwin20.0.0': machine `arm64-apple' not recognized and arro

2022-09-27 Thread Todd Farmer (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-15459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Farmer reassigned ARROW-15459:
---

Assignee: (was: Elena Henderson)

> [C++] Unable to build Arrow C++ on osx arm64 inside conda env because of 
> Invalid configuration `arm64-apple-darwin20.0.0': machine `arm64-apple' not 
> recognized and arrow/cpp/arm64-apple-darwin20.0.0-ar: No such file or 
> directory
> 
>
> Key: ARROW-15459
> URL: https://issues.apache.org/jira/browse/ARROW-15459
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Elena Henderson
>Priority: Major
>  Labels: osx-arm64
> Attachments: logs
>
>
> Steps to reproduce this issue on osx arm64:
> {code:bash}
> git clone https://github.com/apache/arrow.git
> cd arrow/cpp
> brew update && brew install node && brew bundle --file=Brewfile
> cd ..
> mamba create -y -n arrow-commit -c conda-forge \
>   --file ci/conda_env_unix.txt \
>   --file ci/conda_env_cpp.txt \
>   --file ci/conda_env_python.txt \
>   compilers \
>   python=3.8 \
>   pandas \
>   aws-sdk-cpp \
>   r
> mamba activate arrow-commit
> pip install -r python/requirements-build.txt -r python/requirements-test.txt
> export ARROW_BUILD_TESTS=OFF
> export ARROW_BUILD_TYPE=release
> export ARROW_DEPENDENCY_SOURCE=AUTO
> export ARROW_DATASET=ON
> export ARROW_DEFAULT_MEMORY_POOL=mimalloc
> export ARROW_ENABLE_UNSAFE_MEMORY_ACCESS=true
> export ARROW_ENABLE_NULL_CHECK_FOR_GET=false
> export ARROW_FLIGHT=OFF
> export ARROW_GANDIVA=OFF
> export ARROW_HDFS=ON
> export ARROW_HOME=$CONDA_PREFIX
> export ARROW_INSTALL_NAME_RPATH=OFF
> export ARROW_MIMALLOC=ON
> export ARROW_NO_DEPRECATED_API=ON
> export ARROW_ORC=ON
> export ARROW_PARQUET=ON
> export ARROW_PLASMA=ON
> export ARROW_PYTHON=ON
> export ARROW_S3=ON
> export ARROW_USE_ASAN=OFF
> export ARROW_USE_CCACHE=ON
> export ARROW_USE_UBSAN=OFF
> export ARROW_WITH_BROTLI=ON
> export ARROW_WITH_BZ2=ON
> export ARROW_WITH_LZ4=ON
> export ARROW_WITH_SNAPPY=ON
> export ARROW_WITH_ZLIB=ON
> export ARROW_WITH_ZSTD=ON
> export GTest_SOURCE=BUNDLED
> export ORC_SOURCE=BUNDLED
> export PARQUET_BUILD_EXAMPLES=ON
> export PARQUET_BUILD_EXECUTABLES=ON
> export PYTHON=python
> export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH
> ci/scripts/cpp_build.sh $(pwd) $(pwd) 
> {code}
>  
> Error (full logs are attached):
> {code:java}
> ...
> checking size of void *... 8
> checking size of int... 4
> checking size of long... 8
> checking size of long long... 8
> checking size of intmax_t... 8
> checking build system type... 
> -- stderr output is:
> Invalid configuration `arm64-apple-darwin20.0.0': machine `arm64-apple' not 
> recognized
> configure: error: /bin/sh build-aux/config.sub arm64-apple-darwin20.0.0 failed
> CMake Error at 
> /Users/voltrondata/arrow/cpp/jemalloc_ep-prefix/src/jemalloc_ep-stamp/jemalloc_ep-configure-RELEASE.cmake:47
>  (message):
>   Stopping after outputting logs.
> [31/380] Performing configure step for 'orc_ep'
> ninja: build stopped: subcommand failed. {code}
>  
>  





[jira] [Commented] (ARROW-15459) [C++] Unable to build Arrow C++ on osx arm64 inside conda env because of Invalid configuration `arm64-apple-darwin20.0.0': machine `arm64-apple' not recognized and arr

2022-09-27 Thread Todd Farmer (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610151#comment-17610151
 ] 

Todd Farmer commented on ARROW-15459:
-

This issue was last updated over 90 days ago, which may be an indication it is 
no longer being actively worked. To better reflect the current state, the issue 
is being unassigned per [project 
policy|https://arrow.apache.org/docs/dev/developers/bug_reports.html#issue-assignment].
 Please feel free to re-take assignment of the issue if it is being actively 
worked, or if you plan to start that work soon.

> [C++] Unable to build Arrow C++ on osx arm64 inside conda env because of 
> Invalid configuration `arm64-apple-darwin20.0.0': machine `arm64-apple' not 
> recognized and arrow/cpp/arm64-apple-darwin20.0.0-ar: No such file or 
> directory
> 
>
> Key: ARROW-15459
> URL: https://issues.apache.org/jira/browse/ARROW-15459
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Elena Henderson
>Assignee: Elena Henderson
>Priority: Major
>  Labels: osx-arm64
> Attachments: logs
>
>





[jira] [Created] (ARROW-17865) [Java] Deprecate Plasma JNI bindings

2022-09-27 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-17865:
--

 Summary: [Java] Deprecate Plasma JNI bindings
 Key: ARROW-17865
 URL: https://issues.apache.org/jira/browse/ARROW-17865
 Project: Apache Arrow
  Issue Type: Sub-task
  Components: Ruby
Reporter: Antoine Pitrou
 Fix For: 10.0.0








[jira] [Updated] (ARROW-17864) [Ruby] Deprecate Plasma Ruby bindings

2022-09-27 Thread Antoine Pitrou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antoine Pitrou updated ARROW-17864:
---
Component/s: Ruby
 (was: Python)

> [Ruby] Deprecate Plasma Ruby bindings
> -
>
> Key: ARROW-17864
> URL: https://issues.apache.org/jira/browse/ARROW-17864
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Ruby
>Reporter: Antoine Pitrou
>Priority: Blocker
> Fix For: 10.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (ARROW-14289) [C++] Change Scanner::Head to return a RecordBatchReader

2022-09-27 Thread Todd Farmer (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-14289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Farmer reassigned ARROW-14289:
---

Assignee: (was: Weston Pace)

> [C++] Change Scanner::Head to return a RecordBatchReader
> 
>
> Key: ARROW-14289
> URL: https://issues.apache.org/jira/browse/ARROW-14289
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, R
>Reporter: Neal Richardson
>Priority: Major
>
> Following ARROW-9731 and ARROW-13893. This would make it more natural to work 
> with ExecPlans that return a RecordBatchReader when you Run them. 
> Alternatively, we could move the business to RecordBatchReader::Head.





[jira] [Commented] (ARROW-13028) [C++] CSV add convert option to attempt 32bit number inferences

2022-09-27 Thread Todd Farmer (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610149#comment-17610149
 ] 

Todd Farmer commented on ARROW-13028:
-

This issue was last updated over 90 days ago, which may be an indication it is 
no longer being actively worked. To better reflect the current state, the issue 
is being unassigned per [project 
policy|https://arrow.apache.org/docs/dev/developers/bug_reports.html#issue-assignment].
 Please feel free to re-take assignment of the issue if it is being actively 
worked, or if you plan to start that work soon.

> [C++] CSV add convert option to attempt 32bit number inferences
> ---
>
> Key: ARROW-13028
> URL: https://issues.apache.org/jira/browse/ARROW-13028
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Nate Clark
>Assignee: Nate Clark
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When types are being inferred by CSV the numbers are always 64 bit. For large 
> data sets it could be better to use 32 bit types to save over all memory. To 
> do this it would be useful to add an option to ConvertOptions to try 32 bit 
> numbers before 64 bit. By default this option would be disabled.





[jira] [Created] (ARROW-17864) [Ruby] Deprecate Plasma Ruby bindings

2022-09-27 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-17864:
--

 Summary: [Ruby] Deprecate Plasma Ruby bindings
 Key: ARROW-17864
 URL: https://issues.apache.org/jira/browse/ARROW-17864
 Project: Apache Arrow
  Issue Type: Sub-task
  Components: Python
Reporter: Antoine Pitrou
 Fix For: 10.0.0








[jira] [Created] (ARROW-17863) [Python] Deprecate Plasma Python bindings

2022-09-27 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-17863:
--

 Summary: [Python] Deprecate Plasma Python bindings
 Key: ARROW-17863
 URL: https://issues.apache.org/jira/browse/ARROW-17863
 Project: Apache Arrow
  Issue Type: Sub-task
  Components: GLib
Reporter: Antoine Pitrou
 Fix For: 10.0.0








[jira] [Updated] (ARROW-17862) [C/GLib] Deprecate Plasma C/GLib bindings

2022-09-27 Thread Antoine Pitrou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antoine Pitrou updated ARROW-17862:
---
Component/s: GLib
 (was: C++)
 (was: C++ - Plasma)

> [C/GLib] Deprecate Plasma C/GLib bindings
> -
>
> Key: ARROW-17862
> URL: https://issues.apache.org/jira/browse/ARROW-17862
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: GLib
>Reporter: Antoine Pitrou
>Priority: Blocker
> Fix For: 10.0.0
>
>






[jira] [Created] (ARROW-17862) [C/GLib] Deprecate Plasma C/GLib bindings

2022-09-27 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-17862:
--

 Summary: [C/GLib] Deprecate Plasma C/GLib bindings
 Key: ARROW-17862
 URL: https://issues.apache.org/jira/browse/ARROW-17862
 Project: Apache Arrow
  Issue Type: Sub-task
  Components: C++, C++ - Plasma
Reporter: Antoine Pitrou
 Fix For: 10.0.0








[jira] [Updated] (ARROW-17863) [Python] Deprecate Plasma Python bindings

2022-09-27 Thread Antoine Pitrou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antoine Pitrou updated ARROW-17863:
---
Component/s: Python
 (was: GLib)

> [Python] Deprecate Plasma Python bindings
> -
>
> Key: ARROW-17863
> URL: https://issues.apache.org/jira/browse/ARROW-17863
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Python
>Reporter: Antoine Pitrou
>Priority: Blocker
> Fix For: 10.0.0
>
>






[jira] [Created] (ARROW-17861) [C++] Deprecate Plasma C++

2022-09-27 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-17861:
--

 Summary: [C++] Deprecate Plasma C++
 Key: ARROW-17861
 URL: https://issues.apache.org/jira/browse/ARROW-17861
 Project: Apache Arrow
  Issue Type: Task
  Components: C++, C++ - Plasma
Reporter: Antoine Pitrou
 Fix For: 10.0.0








[jira] [Updated] (ARROW-17861) [C++] Deprecate Plasma C++

2022-09-27 Thread Antoine Pitrou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antoine Pitrou updated ARROW-17861:
---
Parent: ARROW-17860
Issue Type: Sub-task  (was: Task)

> [C++] Deprecate Plasma C++
> --
>
> Key: ARROW-17861
> URL: https://issues.apache.org/jira/browse/ARROW-17861
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: C++, C++ - Plasma
>Reporter: Antoine Pitrou
>Priority: Blocker
> Fix For: 10.0.0
>
>






[jira] [Updated] (ARROW-17860) [Plasma] Deprecate Plasma

2022-09-27 Thread Antoine Pitrou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antoine Pitrou updated ARROW-17860:
---
Component/s: C++
 GLib
 Java
 Python
 Ruby

> [Plasma] Deprecate Plasma
> -
>
> Key: ARROW-17860
> URL: https://issues.apache.org/jira/browse/ARROW-17860
> Project: Apache Arrow
>  Issue Type: Task
>  Components: C++, C++ - Plasma, GLib, Java, Python, Ruby
>Reporter: Antoine Pitrou
>Priority: Blocker
> Fix For: 10.0.0
>
>
> See discussion at 
> https://lists.apache.org/thread/nw232k2lzmg9kcl8ts475m9ybl34j81p





[jira] [Created] (ARROW-17860) [Plasma] Deprecate Plasma

2022-09-27 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-17860:
--

 Summary: [Plasma] Deprecate Plasma
 Key: ARROW-17860
 URL: https://issues.apache.org/jira/browse/ARROW-17860
 Project: Apache Arrow
  Issue Type: Task
  Components: C++ - Plasma
Reporter: Antoine Pitrou
 Fix For: 10.0.0


See discussion at 
https://lists.apache.org/thread/nw232k2lzmg9kcl8ts475m9ybl34j81p





[jira] [Comment Edited] (ARROW-17771) [Python] Python does not finds the DLLs correctly on Windows

2022-09-27 Thread Alenka Frim (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610127#comment-17610127
 ] 

Alenka Frim edited comment on ARROW-17771 at 9/27/22 4:37 PM:
--

I did some testing on my local Windows machine and this is what I found:
 * With Python 3.8, without setting {{CONDA_DLL_SEARCH_MODIFICATION=1}} and 
running
{code:java}
pip install -e .
{code}
instead of making an inplace build there is no difference, the {*}arrow_python 
lib is not found{*}. I had to add {{CONDA_DLL_SEARCH_MODIFICATION}} env var to 
make it work.
https://gist.github.com/AlenkaF/bb12bbe2051db5854e10747e13ca5b91

 * With *Python 3.10* the inplace build works and there is no need to set 
{{CONDA_DLL_SEARCH_MODIFICATION}} as the libs are found correctly.
https://gist.github.com/AlenkaF/0dc8bc837472f83731dd6fa22fce558b

I still do not understand why this is happening and why 
{{CONDA_DLL_SEARCH_MODIFICATION}} is needed, as we do not use 
{{os.add_dll_directory}} in the project. But I am happy to see that the issue is 
resolved with Python 3.10.

I propose keeping the {{CONDA_DLL_SEARCH_MODIFICATION}} env var set to 1 in 
case older versions of Python are used. What I can do is to add a check for 
Python version in setup.py and set the {{CONDA_DLL_SEARCH_MODIFICATION}} env 
var automatically for older versions of Python.


was (Author: alenkaf):
I did some testing on my local Windows machine and this is what I found:

 * With Python 3.8, without setting {{CONDA_DLL_SEARCH_MODIFICATION=1}} and 
running
{code:java}
pip install -e .
{code}
instead of making an inplace build there is no difference, the *arrow_python 
lib is not found*. I had to add {{CONDA_DLL_SEARCH_MODIFICATION}} env var to 
make it work.

 * With *Python 3.10* the inplace build works and there is no need to set 
{{CONDA_DLL_SEARCH_MODIFICATION}} as the libs are found correctly.

I still do not understand why this is happening and why 
{{CONDA_DLL_SEARCH_MODIFICATION}} is needed, as we do not use 
{{os.add_dll_directory}} in the project. But I am happy to see that the issue is 
resolved with Python 3.10.

I propose keeping the {{CONDA_DLL_SEARCH_MODIFICATION}} env var set to 1 in 
case older versions of Python are used. What I can do is to add a check for 
Python version in setup.py and set the {{CONDA_DLL_SEARCH_MODIFICATION}} env 
var automatically for older versions of Python.

> [Python] Python does not finds the DLLs correctly on Windows
> 
>
> Key: ARROW-17771
> URL: https://issues.apache.org/jira/browse/ARROW-17771
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Alenka Frim
>Priority: Critical
> Fix For: 10.0.0
>
>
> It seems that after the Python refactoring PR 
> [https://github.com/apache/arrow/pull/13311] (ARROW-16340) Python is unable 
> to find {{arrow_python}} even though the library is imported into the 
> correct directory.
> Currently this issue is worked around by setting an additional environment variable:
> {code:}
> CONDA_DLL_SEARCH_MODIFICATION_ENABLE=1
> {code}
> We need to investigate further why this error is happening after the 
> [refactoring|https://github.com/apache/arrow/pull/13311] and make sure Python 
> is able to find the libraries on Windows without the additional env vars 
> being specified.





[jira] [Commented] (ARROW-17855) Simultaneous read-write operations causing file corruption.

2022-09-27 Thread Nicola Crane (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610147#comment-17610147
 ] 

Nicola Crane commented on ARROW-17855:
--

Do you have a code example?

> Simultaneous read-write operations causing file corruption.
> ---
>
> Key: ARROW-17855
> URL: https://issues.apache.org/jira/browse/ARROW-17855
> Project: Apache Arrow
>  Issue Type: Improvement
>Reporter: N Gautam Animesh
>Priority: Major
>
> Use case: I was trying to read and write an Arrow file simultaneously, which 
> produced an error and leads to file corruption. I am currently using the 
> read_feather and write_feather functions to save it as a .arrow file. Please 
> let me know whether there is any way to avoid this. 
> [Error: Invalid: Not an Arrow file]





[jira] [Updated] (ARROW-17859) [C++] Use self-pipe in signal-receiving StopSource

2022-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-17859:
---
Labels: pull-request-available  (was: )

> [C++] Use self-pipe in signal-receiving StopSource
> --
>
> Key: ARROW-17859
> URL: https://issues.apache.org/jira/browse/ARROW-17859
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Antoine Pitrou
>Assignee: Antoine Pitrou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 10.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The signal-receiving StopSource currently uses elaborate hacks to request the 
> StopSource from a signal handler. Instead we should just use a SelfPipe and 
> send signals to a worker thread.





[jira] [Created] (ARROW-17859) [C++] Use self-pipe in signal-receiving StopSource

2022-09-27 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-17859:
--

 Summary: [C++] Use self-pipe in signal-receiving StopSource
 Key: ARROW-17859
 URL: https://issues.apache.org/jira/browse/ARROW-17859
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Antoine Pitrou
Assignee: Antoine Pitrou
 Fix For: 10.0.0


The signal-receiving StopSource currently uses elaborate hacks to request the 
StopSource from a signal handler. Instead we should just use a SelfPipe and 
send signals to a worker thread.
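
The self-pipe approach described above can be illustrated with a minimal Python sketch. This is not the Arrow C++ implementation: the class name {{SelfPipeStopSource}} and its methods are hypothetical, and Python already defers signal handlers to the main thread, so the example only shows the general shape (the handler writes one byte to a pipe, a worker thread reads it and requests the stop).

```python
import os
import signal
import threading


class SelfPipeStopSource:
    """Hypothetical stand-in for a signal-receiving stop source."""

    def __init__(self, signum=signal.SIGINT):
        self.stop_requested = threading.Event()
        self._signum = signum
        # The "self-pipe": the handler writes to one end, the worker reads the other.
        self._read_fd, self._write_fd = os.pipe()
        self._worker = threading.Thread(target=self._wait_for_signal, daemon=True)
        self._worker.start()
        # signal.signal() must be called from the main thread.
        self._old_handler = signal.signal(signum, self._handle_signal)

    def _handle_signal(self, signum, frame):
        # Do the bare minimum in the handler: write a single byte.
        os.write(self._write_fd, bytes([signum]))

    def _wait_for_signal(self):
        # Blocks until a byte arrives (signal received) or the pipe is closed (EOF).
        data = os.read(self._read_fd, 1)
        if data:
            self.stop_requested.set()

    def close(self):
        # Restore the previous handler, then unblock the worker via EOF.
        signal.signal(self._signum, self._old_handler)
        os.close(self._write_fd)
        self._worker.join(timeout=1)
        os.close(self._read_fd)
```

Code that wants to cancel work would poll or wait on {{stop_requested}} instead of doing anything non-trivial inside the signal handler.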






[jira] [Commented] (ARROW-17836) [C++] Allow specifying of alignment in MemoryPool's allocations

2022-09-27 Thread Sasha Krassovsky (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610139#comment-17610139
 ] 

Sasha Krassovsky commented on ARROW-17836:
--

I added it so that I could have MakeRandomBatches specify alignment 
requirements so that I could test individual components of the spilling PR. 

> [C++] Allow specifying of alignment in MemoryPool's allocations 
> 
>
> Key: ARROW-17836
> URL: https://issues.apache.org/jira/browse/ARROW-17836
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Sasha Krassovsky
>Assignee: Sasha Krassovsky
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> For spilling, I need to create buffers that are 512-byte aligned. The task is 
> to augment MemoryPool to allow for specifying alignment explicitly when 
> allocating (but keep the default the same).





[jira] [Commented] (ARROW-17836) [C++] Allow specifying of alignment in MemoryPool's allocations

2022-09-27 Thread Antoine Pitrou (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610138#comment-17610138
 ] 

Antoine Pitrou commented on ARROW-17836:


Ok, question: is adding alignment to ArrayBuilder needed for spilling, or is it 
sufficient to add support in MemoryPool, BufferBuilder and PoolBuffer?

> [C++] Allow specifying of alignment in MemoryPool's allocations 
> 
>
> Key: ARROW-17836
> URL: https://issues.apache.org/jira/browse/ARROW-17836
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Sasha Krassovsky
>Assignee: Sasha Krassovsky
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> For spilling, I need to create buffers that are 512-byte aligned. The task is 
> to augment MemoryPool to allow for specifying alignment explicitly when 
> allocating (but keep the default the same).





[jira] [Commented] (ARROW-17836) [C++] Allow specifying of alignment in MemoryPool's allocations

2022-09-27 Thread Weston Pace (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610137#comment-17610137
 ] 

Weston Pace commented on ARROW-17836:
-

Yes, the direct I/O PR offers a generic filesystem interface and thus has to 
potentially memcpy / buffer incoming data to satisfy alignment.  The difference 
with spilling is that we are already doing a memcpy higher up in the chain.  
When we spill, we take the data we need to spill and partition it.

For example, if we want to add spill to sorting, and we pretend we are sorting 
by date, and we've accumulated too much data we might then partition into 
decade sized buckets and persist to disk.  Then, once all the data has arrived, 
we can process a single decade at a time (with the hope that one decade of data 
is small enough to fit in memory).

That's a rough description, and there are corner cases, but the point is we 
already have to do a memcpy in order to handle the partitioning (partitioning 
is unfortunately a rather row-oriented operation) and so we want to go ahead 
and satisfy the alignment requirement at that point.  This way, when we are 
ready to spill, we don't have to worry about alignment and we can just use 
direct I/O without any extra memcpy.
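
The decade-bucket example above can be sketched roughly as follows. This is only an illustration in Python, not Arrow code: the function names are hypothetical, and an in-memory dict stands in for the partitions that would actually be persisted to disk.

```python
from collections import defaultdict
from datetime import date


def partition_by_decade(rows):
    """Copy each row into a decade-sized bucket (the row-oriented partitioning step)."""
    buckets = defaultdict(list)
    for row in rows:
        decade = row.year // 10 * 10
        buckets[decade].append(row)
    return dict(buckets)


def sort_by_decade_buckets(rows):
    """Sort by processing one bucket at a time, in decade order.

    Assumption: each bucket would be spilled to disk and read back here;
    the dict is an in-memory placeholder for that storage.
    """
    buckets = partition_by_decade(rows)
    result = []
    for decade in sorted(buckets):
        # Only one decade of data needs to be held in memory at a time.
        result.extend(sorted(buckets[decade]))
    return result
```

Because the buckets cover disjoint, ordered date ranges, concatenating the per-bucket results in decade order gives the same answer as sorting everything at once.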

> [C++] Allow specifying of alignment in MemoryPool's allocations 
> 
>
> Key: ARROW-17836
> URL: https://issues.apache.org/jira/browse/ARROW-17836
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Sasha Krassovsky
>Assignee: Sasha Krassovsky
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> For spilling, I need to create buffers that are 512-byte aligned. The task is 
> to augment MemoryPool to allow for specifying alignment explicitly when 
> allocating (but keep the default the same).





[jira] [Commented] (ARROW-17836) [C++] Allow specifying of alignment in MemoryPool's allocations

2022-09-27 Thread Antoine Pitrou (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610136#comment-17610136
 ] 

Antoine Pitrou commented on ARROW-17836:


I see, thanks.

> [C++] Allow specifying of alignment in MemoryPool's allocations 
> 
>
> Key: ARROW-17836
> URL: https://issues.apache.org/jira/browse/ARROW-17836
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Sasha Krassovsky
>Assignee: Sasha Krassovsky
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> For spilling, I need to create buffers that are 512-byte aligned. The task is 
> to augment MemoryPool to allow for specifying alignment explicitly when 
> allocating (but keep the default the same).





[jira] [Commented] (ARROW-17836) [C++] Allow specifying of alignment in MemoryPool's allocations

2022-09-27 Thread Sasha Krassovsky (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610135#comment-17610135
 ] 

Sasha Krassovsky commented on ARROW-17836:
--

That Direct IO PR could probably reuse this functionality, but the motivation 
for this PR is for spilling in the hash join

> [C++] Allow specifying of alignment in MemoryPool's allocations 
> 
>
> Key: ARROW-17836
> URL: https://issues.apache.org/jira/browse/ARROW-17836
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Sasha Krassovsky
>Assignee: Sasha Krassovsky
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> For spilling, I need to create buffers that are 512-byte aligned. The task is 
> to augment MemoryPool to allow for specifying alignment explicitly when 
> allocating (but keep the default the same).





[jira] [Commented] (ARROW-17836) [C++] Allow specifying of alignment in MemoryPool's allocations

2022-09-27 Thread Sasha Krassovsky (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610134#comment-17610134
 ] 

Sasha Krassovsky commented on ARROW-17836:
--

Hi yes sorry about that, that was a typo. I definitely meant bytes, I’ve 
updated the description. 

> [C++] Allow specifying of alignment in MemoryPool's allocations 
> 
>
> Key: ARROW-17836
> URL: https://issues.apache.org/jira/browse/ARROW-17836
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Sasha Krassovsky
>Assignee: Sasha Krassovsky
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> For spilling, I need to create buffers that are 512-byte aligned. The task is 
> to augment MemoryPool to allow for specifying alignment explicitly when 
> allocating (but keep the default the same).





[jira] [Commented] (ARROW-17836) [C++] Allow specifying of alignment in MemoryPool's allocations

2022-09-27 Thread Antoine Pitrou (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610133#comment-17610133
 ] 

Antoine Pitrou commented on ARROW-17836:


Well, the direct IO PR handles alignment internally. Is there a plan to handle 
it differently?

> [C++] Allow specifying of alignment in MemoryPool's allocations 
> 
>
> Key: ARROW-17836
> URL: https://issues.apache.org/jira/browse/ARROW-17836
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Sasha Krassovsky
>Assignee: Sasha Krassovsky
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> For spilling, I need to create buffers that are 512-byte aligned. The task is 
> to augment MemoryPool to allow for specifying alignment explicitly when 
> allocating (but keep the default the same).





[jira] [Updated] (ARROW-17852) [python] `dtype` of `Categorical` category columns are not preserved

2022-09-27 Thread Weston Pace (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weston Pace updated ARROW-17852:

Summary: [python] `dtype` of `Categorical` category columns are not 
preserved  (was: `dtype` of `Categorical` category columns are not preserved)

> [python] `dtype` of `Categorical` category columns are not preserved
> 
>
> Key: ARROW-17852
> URL: https://issues.apache.org/jira/browse/ARROW-17852
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 9.0.0
>Reporter: Ryan Ballard
>Priority: Major
>  Labels: categorical, pandas, pyarrow
>
> Hi there,
> First time submitting an issue here so apologies if there's anything I've 
> missed.
> I see the bug below, whereby the {{dtype}} of the categories themselves 
> (within a {{pd.Categorical}}) is not preserved on a round trip via pyarrow. 
> Hopefully the snippet below demonstrates the issue.
> This causes an issue because the dtypes need to be the same in 
> order for the categories to be considered the same (so they can then be 
> concatenated, for example).
> The current workaround is to store as a plain {{pd.StringDtype()}} and then 
> convert to {{pd.Categorical}} in memory with Pandas (which infers from the 
> underlying type, but in doing so sacrifices the disk savings of storing as a 
> dictionary).
> Using pyarrow 9.0.0 and pandas 1.4.4.
> Thanks
>  
> {{import pandas as pd}}
> {{import pyarrow as pa}}
>  
> {{# note, Categorical column B is constructed from `pd.StringDtype`}}
> {{df = pd.DataFrame({"A": ["a", "b", "c", "a"]}, dtype=pd.StringDtype())}}
> {{df["B"] = df["A"].astype("category")}}
> {{print(df["B"].cat.categories)}}
> {{# Index(['a', 'b', 'c'], dtype='string')}}
>  
> {{# however, this is downcast to `object` during a roundtrip}}
> {{print(pa.Table.from_pandas(df).to_pandas()["B"].cat.categories)}}
> {{# Index(['a', 'b', 'c'], dtype='object')}}
>  
>  





[jira] [Updated] (ARROW-17836) [C++] Allow specifying of alignment in MemoryPool's allocations

2022-09-27 Thread Sasha Krassovsky (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sasha Krassovsky updated ARROW-17836:
-
Description: For spilling, I need to create buffers that are 512-byte 
aligned. The task is to augment MemoryPool to allow for specifying alignment 
explicitly when allocating (but keep the default the same).  (was: For 
spilling, I need to create buffers that are 512-bit aligned. The task is to 
augment MemoryPool to allow for specifying alignment explicitly when allocating 
(but keep the default the same).)

> [C++] Allow specifying of alignment in MemoryPool's allocations 
> 
>
> Key: ARROW-17836
> URL: https://issues.apache.org/jira/browse/ARROW-17836
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Sasha Krassovsky
>Assignee: Sasha Krassovsky
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> For spilling, I need to create buffers that are 512-byte aligned. The task is 
> to augment MemoryPool to allow for specifying alignment explicitly when 
> allocating (but keep the default the same).





[jira] [Created] (ARROW-17858) [C++] Compilating warning in arrow/csv/parser.h

2022-09-27 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-17858:
--

 Summary: [C++] Compilating warning in arrow/csv/parser.h
 Key: ARROW-17858
 URL: https://issues.apache.org/jira/browse/ARROW-17858
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Reporter: Antoine Pitrou


Just got this locally:
{code}
[141/584] Building CXX object 
src/arrow/CMakeFiles/arrow_objlib.dir/csv/converter.cc.o
In file included from /home/antoine/arrow/dev/cpp/src/arrow/csv/converter.cc:32:
/home/antoine/arrow/dev/cpp/src/arrow/csv/parser.h: In member function 
'arrow::Status 
arrow::csv::detail::DataBatch::DecorateWithRowNumber(arrow::Status&&, int64_t, 
int32_t) const':
/home/antoine/arrow/dev/cpp/src/arrow/csv/parser.h:124:3: warning: control 
reaches end of non-void function [-Wreturn-type]
  124 |   }
  |   ^
{code}





[jira] [Updated] (ARROW-17858) [C++] Compilating warning in arrow/csv/parser.h

2022-09-27 Thread Antoine Pitrou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antoine Pitrou updated ARROW-17858:
---
Fix Version/s: 10.0.0

> [C++] Compilating warning in arrow/csv/parser.h
> ---
>
> Key: ARROW-17858
> URL: https://issues.apache.org/jira/browse/ARROW-17858
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Antoine Pitrou
>Priority: Trivial
> Fix For: 10.0.0
>
>
> Just got this locally:
> {code}
> [141/584] Building CXX object 
> src/arrow/CMakeFiles/arrow_objlib.dir/csv/converter.cc.o
> In file included from 
> /home/antoine/arrow/dev/cpp/src/arrow/csv/converter.cc:32:
> /home/antoine/arrow/dev/cpp/src/arrow/csv/parser.h: In member function 
> 'arrow::Status 
> arrow::csv::detail::DataBatch::DecorateWithRowNumber(arrow::Status&&, 
> int64_t, int32_t) const':
> /home/antoine/arrow/dev/cpp/src/arrow/csv/parser.h:124:3: warning: control 
> reaches end of non-void function [-Wreturn-type]
>   124 |   }
>   |   ^
> {code}





[jira] [Commented] (ARROW-17771) [Python] Python does not finds the DLLs correctly on Windows

2022-09-27 Thread Alenka Frim (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610127#comment-17610127
 ] 

Alenka Frim commented on ARROW-17771:
-

I did some testing on my local Windows machine and this is what I found:

 * With Python 3.8, without setting {{CONDA_DLL_SEARCH_MODIFICATION=1}} and 
running
{code:java}
pip install -e .
{code}
instead of making an inplace build there is no difference, the *arrow_python 
lib is not found*. I had to add {{CONDA_DLL_SEARCH_MODIFICATION}} env var to 
make it work.

 * With *Python 3.10* the inplace build works and there is no need to set 
{{CONDA_DLL_SEARCH_MODIFICATION}} as the libs are found correctly.

I still do not understand why this is happening and why 
{{CONDA_DLL_SEARCH_MODIFICATION}} is needed, as we do not use 
{{os.add_dll_directory}} in the project. But I am happy to see that the issue is 
resolved with Python 3.10.

I propose keeping the {{CONDA_DLL_SEARCH_MODIFICATION}} env var set to 1 in 
case older versions of Python are used. What I can do is to add a check for 
Python version in setup.py and set the {{CONDA_DLL_SEARCH_MODIFICATION}} env 
var automatically for older versions of Python.
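
A minimal sketch of the proposed version check might look like the following. It is an illustration only, not the actual setup.py change: the helper names are hypothetical, the variable name is taken from the issue description ({{CONDA_DLL_SEARCH_MODIFICATION_ENABLE}}), and the 3.10 cutoff is an assumption based on the two versions tested above (3.8 fails, 3.10 works; 3.9 was not tested).

```python
import os
import sys

# Name as written in the issue description; the comment above abbreviates it.
ENV_VAR = "CONDA_DLL_SEARCH_MODIFICATION_ENABLE"


def needs_dll_workaround(version_info=sys.version_info, platform=sys.platform):
    """Return True on Windows with a Python older than 3.10 (assumed cutoff)."""
    return platform == "win32" and tuple(version_info[:2]) < (3, 10)


def apply_dll_workaround(environ=None, version_info=sys.version_info,
                         platform=sys.platform):
    """Set the env var to "1" when the workaround is needed and it is not already set."""
    environ = os.environ if environ is None else environ
    if needs_dll_workaround(version_info, platform):
        # setdefault keeps any value the user has chosen explicitly.
        environ.setdefault(ENV_VAR, "1")
    return environ
```

Note that setting the variable this way only affects the current process and its children, so whether it helps at import time would still need to be verified.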

> [Python] Python does not finds the DLLs correctly on Windows
> 
>
> Key: ARROW-17771
> URL: https://issues.apache.org/jira/browse/ARROW-17771
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Alenka Frim
>Priority: Critical
> Fix For: 10.0.0
>
>
> It seems that after the Python refactoring PR 
> [https://github.com/apache/arrow/pull/13311] (ARROW-16340) Python is unable 
> to find {{arrow_python}} even though the library is imported into the 
> correct directory.
> Currently this issue is worked around by setting an additional environment variable:
> {code:}
> CONDA_DLL_SEARCH_MODIFICATION_ENABLE=1
> {code}
> We need to investigate further why this error is happening after the 
> [refactoring|https://github.com/apache/arrow/pull/13311] and make sure Python 
> is able to find the libraries on Windows without the additional env vars 
> being specified.





[jira] [Created] (ARROW-17857) [C++] Table::CombineChunksToBatch segfaults on empty tables

2022-09-27 Thread David Li (Jira)
David Li created ARROW-17857:


 Summary: [C++] Table::CombineChunksToBatch segfaults on empty 
tables
 Key: ARROW-17857
 URL: https://issues.apache.org/jira/browse/ARROW-17857
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: David Li
Assignee: David Li


There can be 0 chunks in a ChunkedArray





[jira] [Updated] (ARROW-17856) [CI][Archery] Add new Archery command to delete old branches and tags on crossbow repository

2022-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-17856:
---
Labels: pull-request-available  (was: )

> [CI][Archery] Add new Archery command to delete old branches and tags on 
> crossbow repository
> 
>
> Key: ARROW-17856
> URL: https://issues.apache.org/jira/browse/ARROW-17856
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Archery, Continuous Integration
>Reporter: Raúl Cumplido
>Assignee: Raúl Cumplido
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the past we were using a script from a gist to delete old branches on the 
> crossbow repository:
> [https://gist.github.com/antonio/4586456]
> Due to unclear licensing of the script we decided to remove it but we have to 
> add a new way of having the crossbow repository clean old branches.
>  





[jira] [Created] (ARROW-17856) [CI][Archery] Add new Archery command to delete old branches and tags on crossbow repository

2022-09-27 Thread Jira
Raúl Cumplido created ARROW-17856:
-

 Summary: [CI][Archery] Add new Archery command to delete old 
branches and tags on crossbow repository
 Key: ARROW-17856
 URL: https://issues.apache.org/jira/browse/ARROW-17856
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Archery, Continuous Integration
Reporter: Raúl Cumplido
Assignee: Raúl Cumplido


In the past we were using a script from a gist to delete old branches on the 
crossbow repository:

[https://gist.github.com/antonio/4586456]

Due to unclear licensing of the script we decided to remove it but we have to 
add a new way of having the crossbow repository clean old branches.

 





[jira] [Commented] (ARROW-17836) [C++] Allow specifying of alignment in MemoryPool's allocations

2022-09-27 Thread Weston Pace (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610098#comment-17610098
 ] 

Weston Pace commented on ARROW-17836:
-

I suspect this was a typo and meant to be 512 bytes.  The current spilling PR 
is using direct I/O and so I believe the motivation is the logical block size 
of the disk.
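
To make the 512-byte requirement concrete, here is a small Python sketch of the usual round-up arithmetic for power-of-two alignments. It does not use or reflect the MemoryPool API; {{align_up}} is a hypothetical helper, and 512 is the block size assumed in this discussion.

```python
def align_up(offset: int, alignment: int = 512) -> int:
    """Round offset up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    # Adding (alignment - 1) and masking off the low bits rounds up.
    return (offset + alignment - 1) & ~(alignment - 1)
```

For example, an offset of 513 rounds up to 1024, while an offset that is already a multiple of 512 is returned unchanged.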

> [C++] Allow specifying of alignment in MemoryPool's allocations 
> 
>
> Key: ARROW-17836
> URL: https://issues.apache.org/jira/browse/ARROW-17836
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Sasha Krassovsky
>Assignee: Sasha Krassovsky
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> For spilling, I need to create buffers that are 512-bit aligned. The task is 
> to augment MemoryPool to allow for specifying alignment explicitly when 
> allocating (but keep the default the same).





[jira] [Resolved] (ARROW-17760) [Go] Implement Take for Record Batches and Tables

2022-09-27 Thread Matthew Topol (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Topol resolved ARROW-17760.
---
Resolution: Resolved

> [Go] Implement Take for Record Batches and Tables
> -
>
> Key: ARROW-17760
> URL: https://issues.apache.org/jira/browse/ARROW-17760
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Go
>Reporter: Matthew Topol
>Assignee: Matthew Topol
>Priority: Major
>






[jira] [Updated] (ARROW-17669) [Go] Implement Filter and Take Functions

2022-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-17669:
---
Labels: pull-request-available  (was: )

> [Go] Implement Filter and Take Functions
> 
>
> Key: ARROW-17669
> URL: https://issues.apache.org/jira/browse/ARROW-17669
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Go
>Reporter: Matthew Topol
>Assignee: Matthew Topol
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (ARROW-17669) [Go] Implement Filter and Take Functions

2022-09-27 Thread Matthew Topol (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Topol resolved ARROW-17669.
---
Fix Version/s: 10.0.0
   Resolution: Fixed

Issue resolved by pull request 14214
[https://github.com/apache/arrow/pull/14214]

> [Go] Implement Filter and Take Functions
> 
>
> Key: ARROW-17669
> URL: https://issues.apache.org/jira/browse/ARROW-17669
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Go
>Reporter: Matthew Topol
>Assignee: Matthew Topol
>Priority: Major
>  Labels: pull-request-available
> Fix For: 10.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Created] (ARROW-17855) Simultaneous read-write operations causing file corruption.

2022-09-27 Thread N Gautam Animesh (Jira)
N Gautam Animesh created ARROW-17855:


 Summary: Simultaneous read-write operations causing file 
corruption.
 Key: ARROW-17855
 URL: https://issues.apache.org/jira/browse/ARROW-17855
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: N Gautam Animesh


Use case: I was trying to read and write an Arrow file simultaneously, which 
gave me the error below and led to file corruption. I am currently using the 
read_feather and write_feather functions and saving the data as a .arrow file. 
Please let me know if there is any guidance on this or any way to avoid it. 

[Error: Invalid: Not an Arrow file]
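
One common way to avoid readers seeing a half-written file is to write to a 
temporary file and then rename it over the target. The sketch below assumes 
(without confirmation from this report) that the error comes from a reader 
opening the file while it is still being written. `atomic_write` and 
`write_fn` are hypothetical names; `write_fn` stands for any callable that 
writes the data to the path it is given.

```python
import os
import tempfile


def atomic_write(path, write_fn):
    """Write via a temporary file in the same directory, then replace `path`.

    os.replace swaps the file in a single step on the same filesystem, so a
    reader sees either the old complete file or the new complete file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    os.close(fd)  # write_fn reopens the file by path
    try:
        write_fn(tmp_path)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temporary file and re-raise the error.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

The temporary file is created in the target's directory because a rename is 
only atomic within one filesystem. On Windows the replace step can fail while 
another process still has the target open.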





[jira] [Updated] (ARROW-17854) [CI][Developer] Host preview docs on S3

2022-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-17854:
---
Labels: pull-request-available  (was: )

> [CI][Developer] Host preview docs on S3
> 
>
> Key: ARROW-17854
> URL: https://issues.apache.org/jira/browse/ARROW-17854
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Continuous Integration, Developer Tools
>Reporter: Jacob Wujciak-Jens
>Assignee: Jacob Wujciak-Jens
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 10.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hosting on GitHub Pages as implemented in [ARROW-12958] is unsustainable due 
> to the size of the Arrow docs (~200 MB).





[jira] [Resolved] (ARROW-17770) [C++][Gandiva] Fix const correctness of Gandiva projector Evaluate

2022-09-27 Thread Antoine Pitrou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antoine Pitrou resolved ARROW-17770.

Resolution: Fixed

Issue resolved by pull request 14165
[https://github.com/apache/arrow/pull/14165]

> [C++][Gandiva] Fix const correctness of Gandiva projector Evaluate
> --
>
> Key: ARROW-17770
> URL: https://issues.apache.org/jira/browse/ARROW-17770
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++ - Gandiva
>Affects Versions: 9.0.0
>Reporter: Jin Shang
>Assignee: Jin Shang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 10.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I was trying to figure out the thread safety of Gandiva projector 
> evaluation, i.e., whether I can use a single Projector to evaluate multiple 
> inputs concurrently. I assumed it isn't safe because the Evaluate function is 
> not marked const. However, as far as I understand, the Evaluate function 
> merely executes a compiled function on the input, which doesn't modify the 
> projector's internal state, so it should be const.





[jira] [Created] (ARROW-17854) [CI][Developer] Host preview docs on S3

2022-09-27 Thread Jacob Wujciak-Jens (Jira)
Jacob Wujciak-Jens created ARROW-17854:
--

 Summary: [CI][Developer] Host preview docs on S3
 Key: ARROW-17854
 URL: https://issues.apache.org/jira/browse/ARROW-17854
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Continuous Integration, Developer Tools
Reporter: Jacob Wujciak-Jens
Assignee: Jacob Wujciak-Jens
 Fix For: 10.0.0


Hosting on GitHub Pages as implemented in [ARROW-12958] is unsustainable due to 
the size of the Arrow docs (~200 MB).





[jira] [Assigned] (ARROW-17477) [CI][Docs] Document Docs PR Preview

2022-09-27 Thread Jacob Wujciak-Jens (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacob Wujciak-Jens reassigned ARROW-17477:
--

Assignee: (was: Jacob Wujciak-Jens)

> [CI][Docs] Document Docs PR Preview
> ---
>
> Key: ARROW-17477
> URL: https://issues.apache.org/jira/browse/ARROW-17477
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Continuous Integration, Documentation
>Reporter: Jacob Wujciak-Jens
>Priority: Critical
> Fix For: 10.0.0
>
>
> Document the changes from [ARROW-12958] here: 
> https://arrow.apache.org/docs/developers/documentation.html





[jira] [Updated] (ARROW-17853) [Python][CI] Timeout in test_dataset.py::test_write_dataset_s3_put_only

2022-09-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-17853:
---
Labels: pull-request-available  (was: )

> [Python][CI] Timeout in test_dataset.py::test_write_dataset_s3_put_only
> ---
>
> Key: ARROW-17853
> URL: https://issues.apache.org/jira/browse/ARROW-17853
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Continuous Integration, Python
>Reporter: Antoine Pitrou
>Assignee: Weston Pace
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 10.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is very likely caused by the fix in ARROW-17614. It can be seen in 
> multiple CI runs and reproduced locally using Archery.





[jira] [Assigned] (ARROW-17831) [Python][Docs] PyArrow Architecture page outdated after moving pyarrow C++ code

2022-09-27 Thread Alenka Frim (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-17831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alenka Frim reassigned ARROW-17831:
---

Assignee: Alenka Frim

> [Python][Docs] PyArrow Architecture page outdated after moving pyarrow C++ 
> code
> ---
>
> Key: ARROW-17831
> URL: https://issues.apache.org/jira/browse/ARROW-17831
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Documentation, Python
>Reporter: Joris Van den Bossche
>Assignee: Alenka Frim
>Priority: Major
>
> This section is no longer up to date: 
> https://arrow.apache.org/docs/dev/python/getting_involved.html#pyarrow-architecture
> (it still mentions cpp/src/arrow/python)
> cc [~alenka]




