[jira] [Created] (ARROW-13257) [Java][Dataset] Allow passing empty columns for projection

2021-07-05 Thread Hongze Zhang (Jira)
Hongze Zhang created ARROW-13257: Summary: [Java][Dataset] Allow passing empty columns for projection Key: ARROW-13257 URL: https://issues.apache.org/jira/browse/ARROW-13257 Project: Apache Arrow

[jira] [Updated] (ARROW-13257) [Java][Dataset] Allow passing empty columns for projection

2021-07-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-13257: --- Labels: pull-request-available (was: ) > [Java][Dataset] Allow passing empty columns for pr

[jira] [Updated] (ARROW-13141) [C++][Python] HadoopFileSystem: automatically set CLASSPATH based on HADOOP_HOME env variable?

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-13141: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Python] HadoopFileSys

[jira] [Updated] (ARROW-10436) [Python][Dataset] Deprecate RowGroupInfo

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-10436: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [Python][Dataset] Deprecate

[jira] [Updated] (ARROW-13137) [C++][Documentation] Make in-table references consistent

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-13137: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Documentation] Make i

[jira] [Updated] (ARROW-12819) [CI] Include build log's url in the nightly crossbow report

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12819: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [CI] Include build log's ur

[jira] [Updated] (ARROW-12727) [C++][Compute] GroupBy: support more than 2^32 groups

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12727: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Compute] GroupBy: sup

[jira] [Updated] (ARROW-11259) [Python] Allow to create field reference to nested field

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-11259: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [Python] Allow to create fi

[jira] [Updated] (ARROW-12358) [C++][Python][R][Dataset] Control overwriting vs appending when writing to existing dataset

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12358: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Python][R][Dataset] C

[jira] [Updated] (ARROW-12213) [R] copy_files doesn't make it easy to copy a single file

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12213: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [R] copy_files doesn't make

[jira] [Updated] (ARROW-12992) [R] bindings for substr(), substring(), str_sub()

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12992: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [R] bindings for substr(),

[jira] [Updated] (ARROW-6407) [C++] Consolidate thirdparty bundle URLs, version bumping logic, etc

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-6407: - Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Consolidate thirdparty b

[jira] [Updated] (ARROW-6485) [Format][C++] Support the format of a COO sparse matrix that has separated row and column indices

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-6485: - Fix Version/s: (was: 5.0.0) 6.0.0 > [Format][C++] Support the forma

[jira] [Updated] (ARROW-12633) [C++] Query engine v0 umbrella issue

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12633: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Query engine v0 umbre

[jira] [Updated] (ARROW-12063) [C++] Add nulls position option to sort functions

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12063: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Add nulls position op

[jira] [Updated] (ARROW-11691) [Developer][CI] Provide a consolidated .env file for benchmark-relevant environment variables

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-11691: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [Developer][CI] Provide a c

[jira] [Updated] (ARROW-12535) Enable metadata writing in the ORCWriter

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12535: -- Fix Version/s: (was: 5.0.0) 6.0.0 > Enable metadata writing in

[jira] [Updated] (ARROW-7798) [R] Refactor R <-> Array conversion

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-7798: - Fix Version/s: (was: 5.0.0) 6.0.0 > [R] Refactor R <-> Array conve

[jira] [Updated] (ARROW-12315) [R] add max_partitions argument to write_dataset()

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12315: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [R] add max_partitions argu

[jira] [Updated] (ARROW-10671) [FlightRPC] Bearer Token refresh design with retry mechanism

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-10671: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [FlightRPC] Bearer Token re

[jira] [Updated] (ARROW-10848) [C++] CSV ISO-8601 date and timestamp short form

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-10848: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] CSV ISO-8601 date and

[jira] [Updated] (ARROW-4753) [C++] Extension types and layouts for text-optimized data structures

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-4753: - Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Extension types and layo

[jira] [Updated] (ARROW-9842) [C++] Explore alternative strategy for Compare kernel implementation for better performance

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-9842: - Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Explore alternative stra

[jira] [Updated] (ARROW-12723) [C++][Compute] GroupBy: add unittests for individual components of hash group by

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12723: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Compute] GroupBy: add

[jira] [Updated] (ARROW-9006) [C++] Use Cast kernels to implement Scalar::Parse and Scalar::CastTo

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-9006: - Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Use Cast kernels to impl

[jira] [Updated] (ARROW-8936) [C++] Parallelize execution of arrow::compute::ScalarFunction

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-8936: - Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Parallelize execution of

[jira] [Updated] (ARROW-10657) [CI] Continuous integration on Apple M1 architecture

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-10657: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [CI] Continuous integration

[jira] [Updated] (ARROW-12759) [C++][Compute] Wrap grouped aggregation in an ExecNode

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12759: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Compute] Wrap grouped

[jira] [Updated] (ARROW-12821) [CI] Include the first occurrence of a task failure in the nightly report

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12821: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [CI] Include the first occu

[jira] [Updated] (ARROW-12515) [Dev][Wiki][Release] Fix and update Windows RC verify script

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12515: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [Dev][Wiki][Release] Fix an

[jira] [Updated] (ARROW-11003) [C++][Dataset] Schema evolution in Dataset scanning

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-11003: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Dataset] Schema evolu

[jira] [Updated] (ARROW-10439) [C++][Dataset] Add max file size as a dataset writing option

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-10439: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Dataset] Add max file

[jira] [Updated] (ARROW-10435) [C++][Dataset][Python] Improve ParquetFileFragment serialization

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-10435: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Dataset][Python] Impr

[jira] [Updated] (ARROW-11465) [C++] Parquet file writer snapshot API and proper ColumnChunk.file_path utilization

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-11465: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Parquet file writer s

[jira] [Updated] (ARROW-10094) [Python][Doc] Update pandas doc

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-10094: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [Python][Doc] Update pandas

[jira] [Updated] (ARROW-13078) [R] Bindings for str_replace_na()

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-13078: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [R] Bindings for str_replac

[jira] [Updated] (ARROW-13227) [C++][Compute] Document ExecNode, ExecPlan

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-13227: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Compute] Document Exe

[jira] [Updated] (ARROW-12553) [Release] Add crossbow task to verify ARM64 wheels on travis

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12553: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [Release] Add crossbow task

[jira] [Updated] (ARROW-12053) [C++] Implement aggregate compute functions for decimal datatype

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12053: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Implement aggregate c

[jira] [Updated] (ARROW-13091) [Python] Add compression_level argument to IpcWriteOptions constructor

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-13091: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [Python] Add compression_le

[jira] [Updated] (ARROW-8228) [C++][Parquet] Support writing lists that have null elements that are non-empty.

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-8228: - Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Parquet] Support writing

[jira] [Updated] (ARROW-13118) [R] Improve handling of R scalars in some nse_funcs

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-13118: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [R] Improve handling of R s

[jira] [Updated] (ARROW-12726) [C++][Compute] GroupBy: add parallelism to hash group by

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12726: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Compute] GroupBy: add

[jira] [Updated] (ARROW-13087) [R] Expose Parquet ArrowReaderProperties::coerce_int96_timestamp_unit_

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-13087: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [R] Expose Parquet ArrowRea

[jira] [Updated] (ARROW-11878) [C++] Improve Converter API to support chunking

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-11878: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Improve Converter API

[jira] [Updated] (ARROW-12939) [R] Simplify RTask stop handling

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12939: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [R] Simplify RTask stop han

[jira] [Updated] (ARROW-12805) [Python] Use consistent memory_pool / pool keyword argument name

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12805: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [Python] Use consistent mem

[jira] [Updated] (ARROW-12203) [C++][Python] Switch default Parquet version to 2.0

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12203: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Python] Switch defaul

[jira] [Updated] (ARROW-9672) [Python][Parquet] Expose _filters_to_expression

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-9672: - Fix Version/s: (was: 5.0.0) 6.0.0 > [Python][Parquet] Expose _filt

[jira] [Updated] (ARROW-13067) [C++][Compute] Implement integer to decimal cast

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-13067: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Compute] Implement in

[jira] [Updated] (ARROW-12095) [CI][C++] Add nightly job to test offline build

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12095: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [CI][C++] Add nightly job t

[jira] [Updated] (ARROW-11980) [Python] Remove "experimental" status from Table.replace_schema_metadata

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-11980: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [Python] Remove "experiment

[jira] [Updated] (ARROW-12892) [C++][Dataset] Remove Dataset::partition_expression

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12892: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Dataset] Remove Datas

[jira] [Updated] (ARROW-12016) [C++] Implement array_sort_indices and sort_indices for BOOL type

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12016: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Implement array_sort_

[jira] [Updated] (ARROW-13055) [Format] Document "canonical extension type" and criteria

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-13055: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [Format] Document "canonica

[jira] [Updated] (ARROW-12698) [C++][Compute] Extend compute layer to support ternary scalar functions

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12698: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Compute] Extend compu

[jira] [Updated] (ARROW-12243) [C++] Datasets/Fragment/ScanOptions should be immutable

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12243: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Datasets/Fragment/Sca

[jira] [Updated] (ARROW-10317) [C++] Consider adding documentation for FunctionOption classes

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-10317: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Consider adding docum

[jira] [Updated] (ARROW-6607) [Python] Support for set/list columns when converting from Pandas

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-6607: - Fix Version/s: (was: 5.0.0) 6.0.0 > [Python] Support for set/list

[jira] [Updated] (ARROW-10222) [C++] Add FileSystem::MakeUri() to serialize file locations to URIs

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-10222: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Add FileSystem::MakeU

[jira] [Updated] (ARROW-9483) [C++] Reorganize testing headers

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-9483: - Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Reorganize testing heade

[jira] [Updated] (ARROW-10924) [C++] Validate temporal data in ValidateArrayFull

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-10924: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Validate temporal dat

[jira] [Updated] (ARROW-7283) Ensure dictionary IPC implementations match spec clarifications

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-7283: - Fix Version/s: (was: 5.0.0) 6.0.0 > Ensure dictionary IPC implemen

[jira] [Updated] (ARROW-12725) [C++][Compute] GroupBy: improve performance by encoding keys in row format only when they are inserted into hash table

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12725: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Compute] GroupBy: imp

[jira] [Updated] (ARROW-11424) [C++] Add more StructType and StructArray methods

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-11424: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Add more StructType a

[jira] [Updated] (ARROW-11296) [C++][Python] Add ReaderOptions for ORC

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-11296: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Python] Add ReaderOpt

[jira] [Updated] (ARROW-12641) [C++] Provide thread id accessors

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12641: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Provide thread id acc

[jira] [Updated] (ARROW-10142) [C++] RecordBatchStreamReader should use StreamDecoder

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-10142: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] RecordBatchStreamRead

[jira] [Updated] (ARROW-13074) [Python] Start with deprecating ParquetDataset custom attributes

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-13074: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [Python] Start with depreca

[jira] [Updated] (ARROW-12724) [C++] Add documentation for authoring compute kernels

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12724: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Add documentation for

[jira] [Updated] (ARROW-10209) [Python] support positional arguments for options in compute wrapper

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-10209: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [Python] support positional

[jira] [Updated] (ARROW-12526) [Python] Pre-generate pyarrow.compute members

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12526: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [Python] Pre-generate pyar

[jira] [Updated] (ARROW-7394) [C++][DataFrame] Implement zero-copy optimizations when performing Filter

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-7394: - Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][DataFrame] Implement zer

[jira] [Updated] (ARROW-9432) [C++/Python] Add option to Take kernel to interpret negative indices as indexing from the right

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-9432: - Fix Version/s: (was: 5.0.0) 6.0.0 > [C++/Python] Add option to Tak

[jira] [Updated] (ARROW-8655) [C++][Dataset][Python][R] Preserve partitioning information for a discovered Dataset

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-8655: - Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Dataset][Python][R] Pres

[jira] [Updated] (ARROW-9433) [C++/Python] Add option to Take kernel to interpret negative indices as NULL

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-9433: - Fix Version/s: (was: 5.0.0) 6.0.0 > [C++/Python] Add option to Tak

[jira] [Updated] (ARROW-12728) [C++][Compute] Aggregates: implement count distinct

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12728: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Compute] Aggregates:

[jira] [Updated] (ARROW-6940) [C++] Expose Message-level IPC metadata in both read and write interfaces

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-6940: - Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Expose Message-level IPC

[jira] [Updated] (ARROW-12091) [C++] Allow AddCallback/Then to take in an optional Executor.

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12091: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Allow AddCallback/The

[jira] [Updated] (ARROW-11841) [R][C++] Allow cancelling long-running commands

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-11841: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [R][C++] Allow cancelling l

[jira] [Updated] (ARROW-8470) [Python][R] Expose incremental write API for Feather files

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-8470: - Fix Version/s: (was: 5.0.0) 6.0.0 > [Python][R] Expose incremental

[jira] [Updated] (ARROW-11755) [R] Add tests from dplyr/test-mutate.r

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-11755: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [R] Add tests from dplyr/te

[jira] [Updated] (ARROW-11243) [C++] Parse time32 from string and infer in CSV reader

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-11243: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Parse time32 from str

[jira] [Updated] (ARROW-13086) [Python] Expose Parquet ArrowReaderProperties::coerce_int96_timestamp_unit_

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-13086: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [Python] Expose Parquet Arr

[jira] [Updated] (ARROW-10219) [C++] csv::TableReader column names, Read() arguments

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-10219: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] csv::TableReader colu

[jira] [Updated] (ARROW-9111) [Python] csv.read_csv progress bar

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-9111: - Fix Version/s: (was: 5.0.0) 6.0.0 > [Python] csv.read_csv progress

[jira] [Updated] (ARROW-11206) [C++][Dataset][Python] Consider hiding/renaming "project"

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-11206: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Dataset][Python] Cons

[jira] [Updated] (ARROW-12697) [C++][Compute] Generalize null value promotion for scalar arithmetic functions

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12697: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Compute] Generalize n

[jira] [Updated] (ARROW-12830) [C++][Compute] Make GroupBy optimizations work on Big Endian architecture

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12830: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Compute] Make GroupBy

[jira] [Updated] (ARROW-13238) [C++][Dataset][Compute] Substitute ExecPlan impl for dataset scans

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-13238: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Dataset][Compute] Sub

[jira] [Updated] (ARROW-9612) [Python] Automatically back on larger IO block size when JSON parsing fails

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-9612: - Fix Version/s: (was: 5.0.0) 6.0.0 > [Python] Automatically back on

[jira] [Updated] (ARROW-8991) [C++][Compute] Add scalar_hash function

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-8991: - Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Compute] Add scalar_hash

[jira] [Updated] (ARROW-8047) [Python][Documentation] Document migration from ParquetDataset to pyarrow.datasets

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-8047: - Fix Version/s: (was: 5.0.0) 6.0.0 > [Python][Documentation] Docume

[jira] [Updated] (ARROW-9434) [C++] Store type_code information in UnionScalar::value

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-9434: - Fix Version/s: (was: 5.0.0) 6.0.0 > [C++] Store type_code informat

[jira] [Updated] (ARROW-13094) [R] Bindings for sign()

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-13094: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [R] Bindings for sign() > -

[jira] [Updated] (ARROW-7579) [FlightRPC] Make Handshake optional

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-7579: - Fix Version/s: (was: 5.0.0) 6.0.0 > [FlightRPC] Make Handshake opt

[jira] [Updated] (ARROW-10254) [R] Revisit (ab)use of SubTreeFileSystem

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-10254: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [R] Revisit (ab)use of SubT

[jira] [Updated] (ARROW-12060) [Python] Enable calling compute functions on Expressions

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12060: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [Python] Enable calling com

[jira] [Updated] (ARROW-12264) [C++][Dataset] Handle NaNs correctly in Parquet predicate push-down

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12264: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [C++][Dataset] Handle NaNs

[jira] [Updated] (ARROW-12105) [R] Replace vars_select, vars_rename with eval_select, eval_rename

2021-07-05 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina updated ARROW-12105: -- Fix Version/s: (was: 5.0.0) 6.0.0 > [R] Replace vars_select, va
