[GitHub] [arrow] kszucs commented on a change in pull request #7519: ARROW-9017: [C++][Python] Refactor scalar bindings

2020-07-01 Thread GitBox
kszucs commented on a change in pull request #7519: URL: https://github.com/apache/arrow/pull/7519#discussion_r448551033 ## File path: python/pyarrow/tests/test_scalars.py ## @@ -16,427 +16,443 @@ # under the License. import datetime +import decimal import pytest -import u

[GitHub] [arrow] github-actions[bot] commented on pull request #7612: ARROW-7011: [C++] Implement casts from float/double to decimal

2020-07-01 Thread GitBox
github-actions[bot] commented on pull request #7612: URL: https://github.com/apache/arrow/pull/7612#issuecomment-652589053 https://issues.apache.org/jira/browse/ARROW-7011 This is an automated message from the Apache Git Serv

[GitHub] [arrow] kszucs commented on a change in pull request #7519: ARROW-9017: [C++][Python] Refactor scalar bindings

2020-07-01 Thread GitBox
kszucs commented on a change in pull request #7519: URL: https://github.com/apache/arrow/pull/7519#discussion_r448556316 ## File path: python/pyarrow/scalar.pxi ## @@ -16,1198 +16,745 @@ # under the License. -_NULL = NA = None +import collections cdef class Scalar:

[GitHub] [arrow] kszucs commented on a change in pull request #7519: ARROW-9017: [C++][Python] Refactor scalar bindings

2020-07-01 Thread GitBox
kszucs commented on a change in pull request #7519: URL: https://github.com/apache/arrow/pull/7519#discussion_r448559330 ## File path: python/pyarrow/tests/test_convert_builtin.py ## @@ -968,25 +968,31 @@ def test_sequence_timestamp_from_int_with_unit(): arr_s = pa.array(d

[GitHub] [arrow] kszucs commented on a change in pull request #7519: ARROW-9017: [C++][Python] Refactor scalar bindings

2020-07-01 Thread GitBox
kszucs commented on a change in pull request #7519: URL: https://github.com/apache/arrow/pull/7519#discussion_r448560164 ## File path: python/pyarrow/scalar.pxi ## @@ -16,1198 +16,748 @@ # under the License. -_NULL = NA = None +import collections cdef class Scalar:

[GitHub] [arrow] kszucs commented on a change in pull request #7519: ARROW-9017: [C++][Python] Refactor scalar bindings

2020-07-01 Thread GitBox
kszucs commented on a change in pull request #7519: URL: https://github.com/apache/arrow/pull/7519#discussion_r448560952 ## File path: python/pyarrow/scalar.pxi ## @@ -1217,21 +767,95 @@ cdef dict _scalar_classes = { _Type_INT16: Int16Scalar, _Type_INT32: Int32Scalar,

[GitHub] [arrow] nealrichardson commented on pull request #7611: ARROW-3308: [R] Convert R character vector with data exceeding 2GB to Large type

2020-07-01 Thread GitBox
nealrichardson commented on pull request #7611: URL: https://github.com/apache/arrow/pull/7611#issuecomment-652596005 The failed build is an OOM. Any recommendations for testing this? I could disable the test on CI and maybe that's fine since this code shouldn't be changing much, but skipp

[GitHub] [arrow] nevi-me opened a new pull request #7613: ARROW-8881: [Rust] Add large binary, string and list support

2020-07-01 Thread GitBox
nevi-me opened a new pull request #7613: URL: https://github.com/apache/arrow/pull/7613 Similar to other implementations, this creates binary, string and list arrays with `i64` offsets instead of `i32`. Behaviourally, everything's the same as the `i32` counterparts, except for the larger a

[GitHub] [arrow] github-actions[bot] commented on pull request #7613: ARROW-8881: [Rust] Add large binary, string and list support

2020-07-01 Thread GitBox
github-actions[bot] commented on pull request #7613: URL: https://github.com/apache/arrow/pull/7613#issuecomment-652606493 https://issues.apache.org/jira/browse/ARROW-8881

[GitHub] [arrow] nealrichardson commented on pull request #7613: ARROW-8881: [Rust] Add large binary, string and list support

2020-07-01 Thread GitBox
nealrichardson commented on pull request #7613: URL: https://github.com/apache/arrow/pull/7613#issuecomment-652613744 > I'll look at the relevant integration tests separately. Separately as in another commit to this branch, or is there a JIRA already for enabling these integration te

[GitHub] [arrow] nealrichardson edited a comment on pull request #7613: ARROW-8881: [Rust] Add large binary, string and list support

2020-07-01 Thread GitBox
nealrichardson edited a comment on pull request #7613: URL: https://github.com/apache/arrow/pull/7613#issuecomment-652613744 > I'll look at the relevant integration tests separately. Separately as in another commit to this branch, or is there a JIRA already for enabling these integra

[GitHub] [arrow] kszucs commented on a change in pull request #7519: ARROW-9017: [C++][Python] Refactor scalar bindings

2020-07-01 Thread GitBox
kszucs commented on a change in pull request #7519: URL: https://github.com/apache/arrow/pull/7519#discussion_r448589857 ## File path: python/pyarrow/scalar.pxi ## @@ -1217,21 +764,50 @@ cdef dict _scalar_classes = { _Type_INT16: Int16Scalar, _Type_INT32: Int32Scalar,

[GitHub] [arrow] kszucs commented on a change in pull request #7519: ARROW-9017: [C++][Python] Refactor scalar bindings

2020-07-01 Thread GitBox
kszucs commented on a change in pull request #7519: URL: https://github.com/apache/arrow/pull/7519#discussion_r448589973 ## File path: python/pyarrow/scalar.pxi ## @@ -1217,21 +764,50 @@ cdef dict _scalar_classes = { _Type_INT16: Int16Scalar, _Type_INT32: Int32Scalar,

[GitHub] [arrow] kou commented on pull request #7605: ARROW-9283: [Python] Expose build info

2020-07-01 Thread GitBox
kou commented on pull request #7605: URL: https://github.com/apache/arrow/pull/7605#issuecomment-652646475 How about using our SNAPSHOT version as the next version of pyarrow? ```diff diff --git a/python/setup.py b/python/setup.py index 4c264a2d7..bc3efee77 100755 --- a/pytho

[GitHub] [arrow] kszucs commented on a change in pull request #6316: ARROW-7717: [CI] Have nightly integration test for Spark's latest release

2020-07-01 Thread GitBox
kszucs commented on a change in pull request #6316: URL: https://github.com/apache/arrow/pull/6316#discussion_r448618127 ## File path: dev/tasks/tasks.yml ## @@ -1833,12 +1833,32 @@ tasks: HDFS: 2.9.2 run: conda-python-hdfs - test-conda-python-3.7-spark-maste

[GitHub] [arrow] kszucs commented on pull request #6316: ARROW-7717: [CI] Have nightly integration test for Spark's latest release

2020-07-01 Thread GitBox
kszucs commented on pull request #6316: URL: https://github.com/apache/arrow/pull/6316#issuecomment-652649719 @github-actions crossbow submit test-*spark*

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #7519: ARROW-9017: [C++][Python] Refactor scalar bindings

2020-07-01 Thread GitBox
jorisvandenbossche commented on a change in pull request #7519: URL: https://github.com/apache/arrow/pull/7519#discussion_r448619631 ## File path: python/pyarrow/scalar.pxi ## @@ -16,1198 +16,745 @@ # under the License. -_NULL = NA = None +import collections cdef clas

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #7519: ARROW-9017: [C++][Python] Refactor scalar bindings

2020-07-01 Thread GitBox
jorisvandenbossche commented on a change in pull request #7519: URL: https://github.com/apache/arrow/pull/7519#discussion_r448619823 ## File path: python/pyarrow/scalar.pxi ## @@ -16,1198 +16,748 @@ # under the License. -_NULL = NA = None +import collections cdef clas

[GitHub] [arrow] github-actions[bot] commented on pull request #6316: ARROW-7717: [CI] Have nightly integration test for Spark's latest release

2020-07-01 Thread GitBox
github-actions[bot] commented on pull request #6316: URL: https://github.com/apache/arrow/pull/6316#issuecomment-652650521 Revision: 55d941160d6cee2da24f951cee31928beed7c76d Submitted crossbow builds: [ursa-labs/crossbow @ actions-372](https://github.com/ursa-labs/crossbow/branches/a

[GitHub] [arrow] kszucs commented on a change in pull request #7519: ARROW-9017: [C++][Python] Refactor scalar bindings

2020-07-01 Thread GitBox
kszucs commented on a change in pull request #7519: URL: https://github.com/apache/arrow/pull/7519#discussion_r448620980 ## File path: python/pyarrow/scalar.pxi ## @@ -16,1198 +16,745 @@ # under the License. -_NULL = NA = None +import collections cdef class Scalar:

[GitHub] [arrow] nealrichardson opened a new pull request #7614: ARROW-8977: [R] Table$create with schema crashes with some dictionary index types

2020-07-01 Thread GitBox
nealrichardson opened a new pull request #7614: URL: https://github.com/apache/arrow/pull/7614

[GitHub] [arrow] github-actions[bot] commented on pull request #7614: ARROW-8977: [R] Table$create with schema crashes with some dictionary index types

2020-07-01 Thread GitBox
github-actions[bot] commented on pull request #7614: URL: https://github.com/apache/arrow/pull/7614#issuecomment-652699046 https://issues.apache.org/jira/browse/ARROW-8977

[GitHub] [arrow] kou opened a new pull request #7615: ARROW-9294: [GLib] Add GArrowFunction and related objects

2020-07-01 Thread GitBox
kou opened a new pull request #7615: URL: https://github.com/apache/arrow/pull/7615

[GitHub] [arrow] github-actions[bot] commented on pull request #7615: ARROW-9294: [GLib] Add GArrowFunction and related objects

2020-07-01 Thread GitBox
github-actions[bot] commented on pull request #7615: URL: https://github.com/apache/arrow/pull/7615#issuecomment-652728574 https://issues.apache.org/jira/browse/ARROW-9294

[GitHub] [arrow] liyafan82 merged pull request #7347: ARROW-8230: [Java] Remove netty dependency from arrow-memory

2020-07-01 Thread GitBox
liyafan82 merged pull request #7347: URL: https://github.com/apache/arrow/pull/7347

[GitHub] [arrow] emkornfield commented on pull request #7347: ARROW-8230: [Java] Remove netty dependency from arrow-memory

2020-07-01 Thread GitBox
emkornfield commented on pull request #7347: URL: https://github.com/apache/arrow/pull/7347#issuecomment-652738045 @liyafan82 if you aren't already please make sure you use the merge script under dev to merge PRs

[GitHub] [arrow] liyafan82 commented on pull request #7543: ARROW-9221: [Java] account for big-endian buffers in ArrowBuf.setBytes

2020-07-01 Thread GitBox
liyafan82 commented on pull request #7543: URL: https://github.com/apache/arrow/pull/7543#issuecomment-652737865 Seems a rebase is required.

[GitHub] [arrow] liyafan82 commented on pull request #7347: ARROW-8230: [Java] Remove netty dependency from arrow-memory

2020-07-01 Thread GitBox
liyafan82 commented on pull request #7347: URL: https://github.com/apache/arrow/pull/7347#issuecomment-652739238 > @liyafan82 if you aren't already please make sure you use the merge script under dev to merge PRs @emkornfield Thanks a lot for your kind reminder. I will use the script

[GitHub] [arrow] emkornfield commented on pull request #7604: ARROW-9223: [Python] Propagate Timzone information in pandas conversion

2020-07-01 Thread GitBox
emkornfield commented on pull request #7604: URL: https://github.com/apache/arrow/pull/7604#issuecomment-652764305 If the approach is agreeable it would be nice to get in before the next release. This potentially breaks List[Timestamp] because that now returns datetimes as well but I thin

[GitHub] [arrow] vagarwal77 opened a new issue #7616: Critical - PyArrow incompatibility with apache2/Django

2020-07-01 Thread GitBox
vagarwal77 opened a new issue #7616: URL: https://github.com/apache/arrow/issues/7616 I am using a Docker-based deployment on AWS EKS clusters, which works fine. The moment I added the pyarrow==0.17.1 library to the requirement file, my service stopped responding without showing any e

[GitHub] [arrow] emkornfield commented on pull request #7604: ARROW-9223: [Python] Propagate Timzone information in pandas conversion

2020-07-01 Thread GitBox
emkornfield commented on pull request #7604: URL: https://github.com/apache/arrow/pull/7604#issuecomment-652765985 One other concern. For timezone naive Timestamps, I'm not sure if we should be adjusting the datetime to reflect UTC instead of system time zone. thoughts?

[GitHub] [arrow] emkornfield edited a comment on pull request #7604: ARROW-9223: [Python] Propagate Timzone information in pandas conversion

2020-07-01 Thread GitBox
emkornfield edited a comment on pull request #7604: URL: https://github.com/apache/arrow/pull/7604#issuecomment-652765985 One other concern. For timezone naive Timestamps, I'm not sure if we should be adjusting the datetime to reflect UTC converted to system time zone. thoughts?

[GitHub] [arrow] vagarwal77 commented on issue #7616: Critical - PyArrow incompatibility with apache2/Django

2020-07-01 Thread GitBox
vagarwal77 commented on issue #7616: URL: https://github.com/apache/arrow/issues/7616#issuecomment-652768166 root@c3-dev-pe-dev1-b587b5574-pmnhb:/django/phai_web# /usr/sbin/apache2 -v Server version: Apache/2.4.38 (Debian) Server built: 2019-10-15T19:53:42 root@c3-dev-pe-dev1-

[GitHub] [arrow] houqp commented on pull request #7501: ARROW-9192: [Rust] run clippy to lint arrow crate in CI

2020-07-01 Thread GitBox
houqp commented on pull request #7501: URL: https://github.com/apache/arrow/pull/7501#issuecomment-652770694 @kszucs gentle ping, let me know what's the best way for me to help.

[GitHub] [arrow] zeevm commented on a change in pull request #7586: ARROW-9280: [Rust] [Parquet] Calculate page and column statistics

2020-07-01 Thread GitBox
zeevm commented on a change in pull request #7586: URL: https://github.com/apache/arrow/pull/7586#discussion_r448745197 ## File path: rust/parquet/src/column/writer.rs ## @@ -276,12 +372,60 @@ impl ColumnWriterImpl { &values[values_offset..], def_level

[GitHub] [arrow] zeevm commented on a change in pull request #7586: ARROW-9280: [Rust] [Parquet] Calculate page and column statistics

2020-07-01 Thread GitBox
zeevm commented on a change in pull request #7586: URL: https://github.com/apache/arrow/pull/7586#discussion_r448745612 ## File path: rust/parquet/src/column/writer.rs ## @@ -216,26 +278,26 @@ impl ColumnWriterImpl { def_levels_sink: vec![], rep_levels

[GitHub] [arrow] sunchao commented on a change in pull request #7610: ARROW-9290: [Rust] [Parquet] Add features to allow opting out of dependencies

2020-07-01 Thread GitBox
sunchao commented on a change in pull request #7610: URL: https://github.com/apache/arrow/pull/7610#discussion_r448553159 ## File path: rust/parquet/Cargo.toml ## @@ -29,20 +29,29 @@ build = "build.rs" edition = "2018" [dependencies] -parquet-format = "~2.6" +parquet-format

[GitHub] [arrow] sunchao commented on a change in pull request #7586: ARROW-9280: [Rust] [Parquet] Calculate page and column statistics

2020-07-01 Thread GitBox
sunchao commented on a change in pull request #7586: URL: https://github.com/apache/arrow/pull/7586#discussion_r448755889 ## File path: rust/parquet/src/column/writer.rs ## @@ -216,26 +278,26 @@ impl ColumnWriterImpl { def_levels_sink: vec![], rep_leve

[GitHub] [arrow] zeevm commented on a change in pull request #7586: ARROW-9280: [Rust] [Parquet] Calculate page and column statistics

2020-07-01 Thread GitBox
zeevm commented on a change in pull request #7586: URL: https://github.com/apache/arrow/pull/7586#discussion_r448757699 ## File path: rust/parquet/src/column/writer.rs ## @@ -216,26 +278,26 @@ impl ColumnWriterImpl { def_levels_sink: vec![], rep_levels

[GitHub] [arrow] zeevm commented on a change in pull request #7586: ARROW-9280: [Rust] [Parquet] Calculate page and column statistics

2020-07-01 Thread GitBox
zeevm commented on a change in pull request #7586: URL: https://github.com/apache/arrow/pull/7586#discussion_r448764844 ## File path: rust/parquet/src/column/writer.rs ## @@ -216,26 +278,26 @@ impl ColumnWriterImpl { def_levels_sink: vec![], rep_levels

[GitHub] [arrow] jorisvandenbossche commented on pull request #7604: ARROW-9223: [Python] Propagate Timzone information in pandas conversion

2020-07-01 Thread GitBox
jorisvandenbossche commented on pull request #7604: URL: https://github.com/apache/arrow/pull/7604#issuecomment-652814189 > For timezone naive Timestamps, I'm not sure if we should be adjusting the datetime to reflect UTC converted to system time zone. thoughts? I don't think we shou
