[jira] [Commented] (ARROW-10957) Expanding pyarrow buffer size more than 2GB for pandas_udf functions

2021-03-01 Thread Dmitry Kravchuk (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293387#comment-17293387 ] Dmitry Kravchuk commented on ARROW-10957: - [~emkornfield] okay, I've created thi

[jira] [Updated] (ARROW-11833) [C++] Vendored fast_float errors for emscripten (architecture flag missing)

2021-03-01 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-11833: --- Labels: pull-request-available (was: ) > [C++] Vendored fast_float errors for emscripten (a

[jira] [Created] (ARROW-11833) [C++] Vendored fast_float errors for emscripten (architecture flag missing)

2021-03-01 Thread Timothy Paine (Jira)
Timothy Paine created ARROW-11833: - Summary: [C++] Vendored fast_float errors for emscripten (architecture flag missing) Key: ARROW-11833 URL: https://issues.apache.org/jira/browse/ARROW-11833 Project

[jira] [Created] (ARROW-11832) [R] Handle conversion of extra nested struct column

2021-03-01 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-11832: --- Summary: [R] Handle conversion of extra nested struct column Key: ARROW-11832 URL: https://issues.apache.org/jira/browse/ARROW-11832 Project: Apache Arrow

[jira] [Resolved] (ARROW-10570) [R] Use Converter API to convert SEXP to Array/ChunkedArray

2021-03-01 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-10570. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 8650 [https

[jira] [Updated] (ARROW-11735) [R] Allow Parquet and Arrow Dataset to be optional components

2021-03-01 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-11735: --- Labels: pull-request-available (was: ) > [R] Allow Parquet and Arrow Dataset to be optional

[jira] [Updated] (ARROW-11735) [R] Allow Parquet and Arrow Dataset to be optional components

2021-03-01 Thread Ian Cook (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ian Cook updated ARROW-11735: - Summary: [R] Allow Parquet and Arrow Dataset to be optional components (was: [R] Allow parquet to be an

[jira] [Updated] (ARROW-11830) [C++] gRPC compilation tests occur every time

2021-03-01 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-11830: --- Labels: pull-request-available (was: ) > [C++] gRPC compilation tests occur every time > --

[jira] [Commented] (ARROW-11831) [R] vignette ‘dataset’ not found

2021-03-01 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293149#comment-17293149 ] Neal Richardson commented on ARROW-11831: - Works on my machine with a CRAN insta

[jira] [Updated] (ARROW-11831) [R] vignette ‘dataset’ not found

2021-03-01 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-11831: Summary: [R] vignette ‘dataset’ not found (was: vignette ‘dataset’ not found) > [R] vign

[jira] [Created] (ARROW-11831) vignette ‘dataset’ not found

2021-03-01 Thread Mauricio 'Pachá' Vargas Sepúlveda (Jira)
Mauricio 'Pachá' Vargas Sepúlveda created ARROW-11831: - Summary: vignette ‘dataset’ not found Key: ARROW-11831 URL: https://issues.apache.org/jira/browse/ARROW-11831 Project: Apache

[jira] [Assigned] (ARROW-11830) [C++] gRPC compilation tests occur every time

2021-03-01 Thread David Li (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Li reassigned ARROW-11830: Assignee: David Li > [C++] gRPC compilation tests occur every time >

[jira] [Commented] (ARROW-10957) Expanding pyarrow buffer size more than 2GB for pandas_udf functions

2021-03-01 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293035#comment-17293035 ] Micah Kornfield commented on ARROW-10957: - I think there might be some confusion

[jira] [Commented] (ARROW-11830) [C++] gRPC compilation tests occur every time

2021-03-01 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293032#comment-17293032 ] Antoine Pitrou commented on ARROW-11830: [~lidavidm] > [C++] gRPC compilation t

[jira] [Created] (ARROW-11830) [C++] gRPC compilation tests occur every time

2021-03-01 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-11830: -- Summary: [C++] gRPC compilation tests occur every time Key: ARROW-11830 URL: https://issues.apache.org/jira/browse/ARROW-11830 Project: Apache Arrow Issu

[jira] [Commented] (ARROW-10405) [C++] IsIn kernel should be able to lookup dictionary in string

2021-03-01 Thread Rok Mihevc (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293030#comment-17293030 ] Rok Mihevc commented on ARROW-10405: [~westonpace] indeed just removing pre-dispatch

[jira] [Commented] (ARROW-10553) [Rust] [Parquet] Panic when reading Parquet file produced with parquet-cpp

2021-03-01 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293029#comment-17293029 ] Micah Kornfield commented on ARROW-10553: - > So, would it not be better to treat

[jira] [Updated] (ARROW-11742) [Rust] [DataFusion] Add Expr::is_null and Expr::is_not_null functions

2021-03-01 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-11742: --- Labels: pull-request-available (was: ) > [Rust] [DataFusion] Add Expr::is_null and Expr::is

[jira] [Assigned] (ARROW-11787) [R] Implement write csv

2021-03-01 Thread Jonathan Keane (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Keane reassigned ARROW-11787: -- Assignee: Mauricio 'Pachá' Vargas Sepúlveda > [R] Implement write csv > -

[jira] [Updated] (ARROW-7001) [C++] Develop threading APIs to accommodate nested parallelism

2021-03-01 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-7001: -- Labels: pull-request-available (was: ) > [C++] Develop threading APIs to accommodate nested pa

[jira] [Commented] (ARROW-11582) [R] write_dataset "format" argument default and validation could be better

2021-03-01 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292994#comment-17292994 ] Neal Richardson commented on ARROW-11582: - Some notes: * For this issue, IMO we

[jira] [Updated] (ARROW-10405) [C++] IsIn kernel should be able to lookup dictionary in string

2021-03-01 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-10405: --- Labels: pull-request-available (was: ) > [C++] IsIn kernel should be able to lookup diction

[jira] [Commented] (ARROW-10903) [Rust] Implement FromIter<Option<Vec<u8>>> constructor for FixedSizeBinaryArray

2021-03-01 Thread Ivan Vankov (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292955#comment-17292955 ] Ivan Vankov commented on ARROW-10903: - I'm new to this project, but I'd take this on

[jira] [Comment Edited] (ARROW-11735) [R] Allow parquet to be an optional component like S3

2021-03-01 Thread Ian Cook (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291960#comment-17291960 ] Ian Cook edited comment on ARROW-11735 at 3/1/21, 2:49 PM: --- On

[jira] [Assigned] (ARROW-11742) [Rust] [DataFusion] Add Expr::is_null and Expr::is_not_null functions

2021-03-01 Thread Andrew Lamb (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Lamb reassigned ARROW-11742: --- Assignee: Nga Tran > [Rust] [DataFusion] Add Expr::is_null and Expr::is_not_null functions

[jira] [Commented] (ARROW-11629) [C++] Writing float32 values with "Dictionary Encoding" makes parquet files not readable for some tools

2021-03-01 Thread Matthias Rosenthaler (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292899#comment-17292899 ] Matthias Rosenthaler commented on ARROW-11629: -- Some news about this bug? C

[jira] [Issue Comment Deleted] (ARROW-11629) [C++] Writing float32 values with "Dictionary Encoding" makes parquet files not readable for some tools

2021-03-01 Thread Matthias Rosenthaler (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias Rosenthaler updated ARROW-11629: - Comment: was deleted (was: [~GPSnoopy], seems it makes no difference in file siz

[jira] [Updated] (ARROW-11629) [C++] Writing float32 values with "Dictionary Encoding" makes parquet files not readable for some tools

2021-03-01 Thread Matthias Rosenthaler (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias Rosenthaler updated ARROW-11629: - Description: If I try to read the attached csv file with pyarrow, changing the f

[jira] [Commented] (ARROW-10308) [Python] read_csv from python is slow on some work loads

2021-03-01 Thread Dror Speiser (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292895#comment-17292895 ] Dror Speiser commented on ARROW-10308: -- Yeah for sure; I went into the open registr

[jira] [Resolved] (ARROW-11801) [C++] Remove bad header guard in filesystem/type_fwd.h

2021-03-01 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-11801. Resolution: Fixed Issue resolved by pull request 9590 [https://github.com/apache/arrow/pul

[jira] [Resolved] (ARROW-11798) [Integration] Update testing submodule

2021-03-01 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-11798. Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 9587 [https:/

[jira] [Commented] (ARROW-10308) [Python] read_csv from python is slow on some work loads

2021-03-01 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292874#comment-17292874 ] Antoine Pitrou commented on ARROW-10308: [~drorspei] The data is very interestin

[jira] [Commented] (ARROW-10308) [Python] read_csv from python is slow on some work loads

2021-03-01 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292872#comment-17292872 ] Antoine Pitrou commented on ARROW-10308: "NUMA", as in "non-uniform memory acces

[jira] [Updated] (ARROW-11802) [Rust][DataFusion] Mixing of crossbeam channel and async tasks can lead to deadlock

2021-03-01 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-11802: --- Labels: pull-request-available (was: ) > [Rust][DataFusion] Mixing of crossbeam channel and

[jira] [Commented] (ARROW-9293) [R] Add chunk_size to Table$create()

2021-03-01 Thread Romain Francois (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292859#comment-17292859 ] Romain Francois commented on ARROW-9293: Assuming this comes after [https://githu

[jira] [Assigned] (ARROW-11802) [Rust][DataFusion] Mixing of crossbeam channel and async tasks can lead to deadlock

2021-03-01 Thread Andrew Lamb (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Lamb reassigned ARROW-11802: --- Assignee: Andrew Lamb > [Rust][DataFusion] Mixing of crossbeam channel and async tasks can

[jira] [Resolved] (ARROW-11825) [Rust][DataFusion] Add mimalloc as option to benchmarks

2021-03-01 Thread Andrew Lamb (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Lamb resolved ARROW-11825. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 9601 [https://githu

[jira] [Updated] (ARROW-11825) [Rust][DataFusion] Add mimalloc as option to benchmarks

2021-03-01 Thread Andrew Lamb (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Lamb updated ARROW-11825: Component/s: Rust - DataFusion > [Rust][DataFusion] Add mimalloc as option to benchmarks > ---

[jira] [Resolved] (ARROW-11819) [Rust] Add link to the doc

2021-03-01 Thread Andrew Lamb (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Lamb resolved ARROW-11819. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 9594 [https://githu

[jira] [Assigned] (ARROW-11819) [Rust] Add link to the doc

2021-03-01 Thread Andrew Lamb (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Lamb reassigned ARROW-11819: --- Assignee: Andrew Lamb > [Rust] Add link to the doc > -- > >

[jira] [Comment Edited] (ARROW-10957) Expanding pyarrow buffer size more than 2GB for pandas_udf functions

2021-03-01 Thread Dmitry Kravchuk (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292773#comment-17292773 ] Dmitry Kravchuk edited comment on ARROW-10957 at 3/1/21, 10:50 AM: ---

[jira] [Commented] (ARROW-10957) Expanding pyarrow buffer size more than 2GB for pandas_udf functions

2021-03-01 Thread Dmitry Kravchuk (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292773#comment-17292773 ] Dmitry Kravchuk commented on ARROW-10957: - [~emkornfield] btw I'm using spark 3.

[jira] [Updated] (ARROW-11567) [C++][Compute] Variance kernel has precision issue

2021-03-01 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-11567: --- Labels: pull-request-available (was: ) > [C++][Compute] Variance kernel has precision issue

[jira] [Comment Edited] (ARROW-10553) [Rust] [Parquet] Panic when reading Parquet file produced with parquet-cpp

2021-03-01 Thread Neville Dipale (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292766#comment-17292766 ] Neville Dipale edited comment on ARROW-10553 at 3/1/21, 9:28 AM: -

[jira] [Commented] (ARROW-10553) [Rust] [Parquet] Panic when reading Parquet file produced with parquet-cpp

2021-03-01 Thread Neville Dipale (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292766#comment-17292766 ] Neville Dipale commented on ARROW-10553: [~emkornfield] The Parquet version hasn

[jira] [Commented] (ARROW-11792) PyArrow unable to read file with large string values

2021-03-01 Thread Daniel Evans (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292758#comment-17292758 ] Daniel Evans commented on ARROW-11792: -- I've re-run the file generation over the we

[jira] [Commented] (ARROW-10957) Expanding pyarrow buffer size more than 2GB for pandas_udf functions

2021-03-01 Thread Dmitry Kravchuk (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292737#comment-17292737 ] Dmitry Kravchuk commented on ARROW-10957: - [~emkornfield] can help spark communi

[jira] [Commented] (ARROW-10553) [Rust] [Parquet] Panic when reading Parquet file produced with parquet-cpp

2021-03-01 Thread Neville Dipale (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292732#comment-17292732 ] Neville Dipale commented on ARROW-10553: Thanks for looking [~emkornfield]. The

[jira] [Commented] (ARROW-10957) Expanding pyarrow buffer size more than 2GB for pandas_udf functions

2021-03-01 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292720#comment-17292720 ] Micah Kornfield commented on ARROW-10957: - [~dishka_krauch] I looked at it looks

[jira] [Comment Edited] (ARROW-10957) Expanding pyarrow buffer size more than 2GB for pandas_udf functions

2021-03-01 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292720#comment-17292720 ] Micah Kornfield edited comment on ARROW-10957 at 3/1/21, 8:28 AM:

[jira] [Commented] (ARROW-10957) Expanding pyarrow buffer size more than 2GB for pandas_udf functions

2021-03-01 Thread Dmitry Kravchuk (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292705#comment-17292705 ] Dmitry Kravchuk commented on ARROW-10957: - [~emkornfield] [~fan_li_ya] Hello.

[jira] [Updated] (ARROW-10957) Expanding pyarrow buffer size more than 2GB for pandas_udf functions

2021-03-01 Thread Dmitry Kravchuk (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Kravchuk updated ARROW-10957: Attachment: spark3 env.png spark3 error.png python env.png