[GitHub] [arrow] praveenbingo closed pull request #5947: ARROW-7300: [C++][Gandiva] Implement functions to cast from strings to integers/floats

2020-05-19 Thread GitBox
praveenbingo closed pull request #5947: URL: https://github.com/apache/arrow/pull/5947 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [arrow] github-actions[bot] commented on pull request #7230: ARROW-8869: [Rust] [DataFusion] Add support for new scan nodes to type coercion rule

2020-05-19 Thread GitBox
github-actions[bot] commented on pull request #7230: URL: https://github.com/apache/arrow/pull/7230#issuecomment-631188650 https://issues.apache.org/jira/browse/ARROW-8869

[GitHub] [arrow] andygrove opened a new pull request #7230: ARROW-8869: [Rust] [DataFusion] Add support for new scan nodes to type coercion rule

2020-05-19 Thread GitBox
andygrove opened a new pull request #7230: URL: https://github.com/apache/arrow/pull/7230 Add support for new scan nodes to type coercion rule

[GitHub] [arrow] andygrove commented on pull request #7220: ARROW-8837: [Rust] Implement Null data type [WIP]

2020-05-19 Thread GitBox
andygrove commented on pull request #7220: URL: https://github.com/apache/arrow/pull/7220#issuecomment-631183569 Thanks @nevi-me that seems to have resolved the Null type issue.

[GitHub] [arrow] andygrove closed pull request #7219: ARROW-8854: [Rust] [Integration Testing] Standardize error handling

2020-05-19 Thread GitBox
andygrove closed pull request #7219: URL: https://github.com/apache/arrow/pull/7219

[GitHub] [arrow] andygrove closed pull request #7210: ARROW-8839: [Rust] [DataFusion] support CSV schema inference in logical plan

2020-05-19 Thread GitBox
andygrove closed pull request #7210: URL: https://github.com/apache/arrow/pull/7210

[GitHub] [arrow] houqp commented on pull request #7210: WIP: ARROW-8839: [Rust] [DataFusion] support CSV schema inference in logical plan

2020-05-19 Thread GitBox
houqp commented on pull request #7210: URL: https://github.com/apache/arrow/pull/7210#issuecomment-631143176 all feedback addressed, I will send separate PRs to add CSV read option struct and improve inference file read behavior to keep this one simple.

[GitHub] [arrow] nealrichardson commented on pull request #7170: Verify 0.17.1 release candidate [WIP]

2020-05-19 Thread GitBox
nealrichardson commented on pull request #7170: URL: https://github.com/apache/arrow/pull/7170#issuecomment-631142273 No need to merge this; closing.

[GitHub] [arrow] nealrichardson closed pull request #7170: Verify 0.17.1 release candidate [WIP]

2020-05-19 Thread GitBox
nealrichardson closed pull request #7170: URL: https://github.com/apache/arrow/pull/7170

[GitHub] [arrow] andygrove closed pull request #7061: ARROW-8629: [Rust] Eliminate indirection of zero sized allocations

2020-05-19 Thread GitBox
andygrove closed pull request #7061: URL: https://github.com/apache/arrow/pull/7061

[GitHub] [arrow] andygrove commented on a change in pull request #7210: ARROW-8839: [Rust] [DataFusion] support CSV schema inference in logical plan

2020-05-19 Thread GitBox
andygrove commented on a change in pull request #7210: URL: https://github.com/apache/arrow/pull/7210#discussion_r427655895 ## File path: rust/datafusion/src/execution/physical_plan/csv.rs ## @@ -71,15 +75,35 @@ impl CsvExec { /// Create a new execution plan for reading a

[GitHub] [arrow] andygrove closed pull request #6898: ARROW-8399: [Rust] Extend memory alignments to include other architectures

2020-05-19 Thread GitBox
andygrove closed pull request #6898: URL: https://github.com/apache/arrow/pull/6898

[GitHub] [arrow] andygrove commented on a change in pull request #7210: ARROW-8839: [Rust] [DataFusion] support CSV schema inference in logical plan

2020-05-19 Thread GitBox
andygrove commented on a change in pull request #7210: URL: https://github.com/apache/arrow/pull/7210#discussion_r427655072 ## File path: rust/arrow/src/csv/reader.rs ## @@ -87,19 +87,19 @@ fn infer_field_schema(string: &str) -> DataType { /// with `max_read_records` controlli

[GitHub] [arrow] vertexclique commented on pull request #6898: ARROW-8399: [Rust] Extend memory alignments to include other architectures

2020-05-19 Thread GitBox
vertexclique commented on pull request #6898: URL: https://github.com/apache/arrow/pull/6898#issuecomment-631132850 I have addressed the comments and formatting problems.

[GitHub] [arrow] vertexclique commented on pull request #7061: ARROW-8629: [Rust] Eliminate indirection of zero sized allocations

2020-05-19 Thread GitBox
vertexclique commented on pull request #7061: URL: https://github.com/apache/arrow/pull/7061#issuecomment-631132553 I have addressed the comments and formatting problems.

[GitHub] [arrow] kou closed pull request #7229: ARROW-8848: [Ruby][CI] Fix MSYS2 update error

2020-05-19 Thread GitBox
kou closed pull request #7229: URL: https://github.com/apache/arrow/pull/7229

[GitHub] [arrow] kou commented on pull request #7229: ARROW-8848: [Ruby][CI] Fix MSYS2 update error

2020-05-19 Thread GitBox
kou commented on pull request #7229: URL: https://github.com/apache/arrow/pull/7229#issuecomment-631106668 +1

[GitHub] [arrow] nealrichardson closed pull request #7218: ARROW-8852: [R] Post-0.17.1 adjustments

2020-05-19 Thread GitBox
nealrichardson closed pull request #7218: URL: https://github.com/apache/arrow/pull/7218

[GitHub] [arrow] nealrichardson closed pull request #7222: ARROW-8857: [CI] MinGW builds break on system upgrade

2020-05-19 Thread GitBox
nealrichardson closed pull request #7222: URL: https://github.com/apache/arrow/pull/7222

[GitHub] [arrow] kou closed pull request #7227: ARROW-8862: [C++] NumericBuilder should use MemoryPool passed to CTOR

2020-05-19 Thread GitBox
kou closed pull request #7227: URL: https://github.com/apache/arrow/pull/7227

[GitHub] [arrow] nealrichardson commented on pull request #7222: ARROW-8857: [CI] MinGW builds break on system upgrade

2020-05-19 Thread GitBox
nealrichardson commented on pull request #7222: URL: https://github.com/apache/arrow/pull/7222#issuecomment-631105961 I pulled that change into #7218 so I'll close this.

[GitHub] [arrow] github-actions[bot] commented on pull request #7229: ARROW-8848: [Ruby][CI] Fix MSYS2 update error

2020-05-19 Thread GitBox
github-actions[bot] commented on pull request #7229: URL: https://github.com/apache/arrow/pull/7229#issuecomment-631100806 https://issues.apache.org/jira/browse/ARROW-8848

[GitHub] [arrow] nealrichardson commented on pull request #7222: ARROW-8857: [CI] MinGW builds break on system upgrade

2020-05-19 Thread GitBox
nealrichardson commented on pull request #7222: URL: https://github.com/apache/arrow/pull/7222#issuecomment-631099352 I think I don't need to update pacman for the R jobs. Is there a reason I must?

[GitHub] [arrow] kou commented on pull request #7222: ARROW-8857: [CI] MinGW builds break on system upgrade

2020-05-19 Thread GitBox
kou commented on pull request #7222: URL: https://github.com/apache/arrow/pull/7222#issuecomment-631098032 Could you try updating `pacman` separately like https://github.com/apache/arrow/pull/7229/files#diff-f9a9850561dea0df3b66b2ff57181037 ?

[GitHub] [arrow] kou opened a new pull request #7229: ARROW-8848: [Ruby][CI] Fix MSYS2 update error

2020-05-19 Thread GitBox
kou opened a new pull request #7229: URL: https://github.com/apache/arrow/pull/7229 We need to avoid updating both pacman and msys2-runtime at once.

[GitHub] [arrow] nevi-me commented on pull request #7210: ARROW-8839: [Rust] [DataFusion] support CSV schema inference in logical plan

2020-05-19 Thread GitBox
nevi-me commented on pull request #7210: URL: https://github.com/apache/arrow/pull/7210#issuecomment-631089381 +1 to creating a struct for csv options

[GitHub] [arrow] nevi-me commented on a change in pull request #7210: ARROW-8839: [Rust] [DataFusion] support CSV schema inference in logical plan

2020-05-19 Thread GitBox
nevi-me commented on a change in pull request #7210: URL: https://github.com/apache/arrow/pull/7210#discussion_r427607065 ## File path: rust/datafusion/src/execution/physical_plan/csv.rs ## @@ -71,15 +75,35 @@ impl CsvExec { /// Create a new execution plan for reading a se

[GitHub] [arrow] houqp commented on a change in pull request #7210: ARROW-8839: [Rust] [DataFusion] support CSV schema inference in logical plan

2020-05-19 Thread GitBox
houqp commented on a change in pull request #7210: URL: https://github.com/apache/arrow/pull/7210#discussion_r427605959 ## File path: rust/arrow/src/csv/reader.rs ## @@ -87,19 +87,19 @@ fn infer_field_schema(string: &str) -> DataType { /// with `max_read_records` controlling t

[GitHub] [arrow] houqp commented on a change in pull request #7210: ARROW-8839: [Rust] [DataFusion] support CSV schema inference in logical plan

2020-05-19 Thread GitBox
houqp commented on a change in pull request #7210: URL: https://github.com/apache/arrow/pull/7210#discussion_r427605427 ## File path: rust/datafusion/src/execution/physical_plan/csv.rs ## @@ -71,15 +75,35 @@ impl CsvExec { /// Create a new execution plan for reading a set

[GitHub] [arrow] houqp commented on a change in pull request #7210: ARROW-8839: [Rust] [DataFusion] support CSV schema inference in logical plan

2020-05-19 Thread GitBox
houqp commented on a change in pull request #7210: URL: https://github.com/apache/arrow/pull/7210#discussion_r427602640 ## File path: rust/datafusion/src/execution/physical_plan/csv.rs ## @@ -71,15 +75,35 @@ impl CsvExec { /// Create a new execution plan for reading a set

[GitHub] [arrow] nevi-me commented on a change in pull request #7210: ARROW-8839: [Rust] [DataFusion] support CSV schema inference in logical plan

2020-05-19 Thread GitBox
nevi-me commented on a change in pull request #7210: URL: https://github.com/apache/arrow/pull/7210#discussion_r427600229 ## File path: rust/datafusion/src/execution/physical_plan/csv.rs ## @@ -71,15 +75,35 @@ impl CsvExec { /// Create a new execution plan for reading a se

[GitHub] [arrow] github-actions[bot] commented on pull request #7228: ARROW-8864: [R] Add methods to Table/RecordBatch for consistency with data.frame

2020-05-19 Thread GitBox
github-actions[bot] commented on pull request #7228: URL: https://github.com/apache/arrow/pull/7228#issuecomment-631080491 https://issues.apache.org/jira/browse/ARROW-8864

[GitHub] [arrow] nealrichardson opened a new pull request #7228: ARROW-8864: [R] Add methods to Table/RecordBatch for consistency with data.frame

2020-05-19 Thread GitBox
nealrichardson opened a new pull request #7228: URL: https://github.com/apache/arrow/pull/7228 as.list, row.names, dimnames (which yields colnames), plus list-like column extraction. See https://github.com/wesm/feather/pull/388

[GitHub] [arrow] nevi-me commented on a change in pull request #7204: ARROW-8831: [Rust] change simd_compare_op in comparison kernel to use bitmask SIMD operation to significantly improve performance

2020-05-19 Thread GitBox
nevi-me commented on a change in pull request #7204: URL: https://github.com/apache/arrow/pull/7204#discussion_r427593724 ## File path: rust/arrow/src/compute/kernels/comparison.rs ## @@ -210,6 +210,10 @@ where T: ArrowNumericType, F: Fn(T::Simd, T::Simd) -> T::SimdMa

[GitHub] [arrow] nevi-me commented on a change in pull request #7226: ARROW-8791: [Rust] Allow creation of StringDictionaryBuilder with an existing array of dictionary values

2020-05-19 Thread GitBox
nevi-me commented on a change in pull request #7226: URL: https://github.com/apache/arrow/pull/7226#discussion_r427587582 ## File path: rust/arrow/src/array/builder.rs ## @@ -1334,6 +1334,35 @@ where map: HashMap::new(), } } + +pub fn new_with_dic

[GitHub] [arrow] yordan-pavlov commented on a change in pull request #7204: ARROW-8831: [Rust] change simd_compare_op in comparison kernel to use bitmask SIMD operation to significantly improve perfor

2020-05-19 Thread GitBox
yordan-pavlov commented on a change in pull request #7204: URL: https://github.com/apache/arrow/pull/7204#discussion_r427592201 ## File path: rust/arrow/src/compute/kernels/comparison.rs ## @@ -210,6 +210,10 @@ where T: ArrowNumericType, F: Fn(T::Simd, T::Simd) -> T::

[GitHub] [arrow] nevi-me closed pull request #7204: ARROW-8831: [Rust] change simd_compare_op in comparison kernel to use bitmask SIMD operation to significantly improve performance

2020-05-19 Thread GitBox
nevi-me closed pull request #7204: URL: https://github.com/apache/arrow/pull/7204

[GitHub] [arrow] nevi-me commented on a change in pull request #7204: ARROW-8831: [Rust] change simd_compare_op in comparison kernel to use bitmask SIMD operation to significantly improve performance

2020-05-19 Thread GitBox
nevi-me commented on a change in pull request #7204: URL: https://github.com/apache/arrow/pull/7204#discussion_r427582826 ## File path: rust/arrow/src/compute/kernels/comparison.rs ## @@ -210,6 +210,10 @@ where T: ArrowNumericType, F: Fn(T::Simd, T::Simd) -> T::SimdMa

[GitHub] [arrow] nevi-me commented on pull request #6785: ARROW-8289: [Rust] Implement Arrow writer for Parquet [DRAFT]

2020-05-19 Thread GitBox
nevi-me commented on pull request #6785: URL: https://github.com/apache/arrow/pull/6785#issuecomment-631065724 I'll push the minor changes that I made here, to use the schema conversion. I got distracted by trying to figure out how to compute repetition levels to support nulls, but I can t

[GitHub] [arrow] jorisvandenbossche commented on pull request #7180: ARROW-8062: [C++][Dataset] Implement ParquetDatasetFactory

2020-05-19 Thread GitBox
jorisvandenbossche commented on pull request #7180: URL: https://github.com/apache/arrow/pull/7180#issuecomment-631059751 It seems this failure doesn't happen all the time for me. Running it a few times, I also see the FileNotFoundError, but only in about 1 out of 2 cases.

[GitHub] [arrow] github-actions[bot] commented on pull request #7227: ARROW-8862: [C++] NumericBuilder should use MemoryPool passed to CTOR

2020-05-19 Thread GitBox
github-actions[bot] commented on pull request #7227: URL: https://github.com/apache/arrow/pull/7227#issuecomment-631051118 https://issues.apache.org/jira/browse/ARROW-8862

[GitHub] [arrow] nevi-me commented on pull request #7205: ARROW-8782: [Rust] Add benchmark crate

2020-05-19 Thread GitBox
nevi-me commented on pull request #7205: URL: https://github.com/apache/arrow/pull/7205#issuecomment-631050794 Would the idea be to centralise benchmarks from the other crates?

[GitHub] [arrow] github-actions[bot] commented on pull request #7227: NumericBuilder - use MemoryPool passed to CTOR

2020-05-19 Thread GitBox
github-actions[bot] commented on pull request #7227: URL: https://github.com/apache/arrow/pull/7227#issuecomment-631042978 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then could

[GitHub] [arrow] sawatts opened a new pull request #7227: NumericBuilder - use MemoryPool passed to CTOR

2020-05-19 Thread GitBox
sawatts opened a new pull request #7227: URL: https://github.com/apache/arrow/pull/7227 `NumericBuilder` uses the `pool` (`MemoryPool*`) parameter to initialise the `ArrayBuilder` base class, but does not use it to initialise its own internal builder, `data_builder_` (`TypedBufferBuilder`)

[GitHub] [arrow] andygrove commented on pull request #7220: ARROW-8837: [Rust] Implement Null data type [WIP]

2020-05-19 Thread GitBox
andygrove commented on pull request #7220: URL: https://github.com/apache/arrow/pull/7220#issuecomment-631022770 Thanks @nevi-me I will re-run the tests tonight (in 5 hours or so) and let you know how it looks.

[GitHub] [arrow] nevi-me commented on pull request #7220: ARROW-8837: [Rust] Implement Null data type [WIP]

2020-05-19 Thread GitBox
nevi-me commented on pull request #7220: URL: https://github.com/apache/arrow/pull/7220#issuecomment-631020629 @andygrove I still can't run the tests locally, so I'm flying blind for now. I've pushed 2 commits adding `NullArray` support. I couldn't find an example of how a null array looks

[GitHub] [arrow] fsaintjacques commented on pull request #7180: ARROW-8062: [C++][Dataset] Implement ParquetDatasetFactory

2020-05-19 Thread GitBox
fsaintjacques commented on pull request #7180: URL: https://github.com/apache/arrow/pull/7180#issuecomment-631002289 I'm curious about the exception/segfault. If you can reproduce, feel free to share.

[GitHub] [arrow] github-actions[bot] commented on pull request #7226: ARROW-8791 Allow creation of StringDictionaryBuilder with an existing array of dictionary values

2020-05-19 Thread GitBox
github-actions[bot] commented on pull request #7226: URL: https://github.com/apache/arrow/pull/7226#issuecomment-630993571 https://issues.apache.org/jira/browse/ARROW-8791

[GitHub] [arrow] jorisvandenbossche commented on pull request #7180: ARROW-8062: [C++][Dataset] Implement ParquetDatasetFactory

2020-05-19 Thread GitBox
jorisvandenbossche commented on pull request #7180: URL: https://github.com/apache/arrow/pull/7180#issuecomment-630991816 > I don't get a segfault for the test you added, just a wrong exception being thrown. A FileNotFoundError sounds good (the ValueError I added in the tests was just

[GitHub] [arrow] jhorstmann opened a new pull request #7226: ARROW-8791 Allow creation of StringDictionaryBuilder with an existing array of dictionary values

2020-05-19 Thread GitBox
jhorstmann opened a new pull request #7226: URL: https://github.com/apache/arrow/pull/7226

[GitHub] [arrow] fsaintjacques commented on pull request #7180: ARROW-8062: [C++][Dataset] Implement ParquetDatasetFactory

2020-05-19 Thread GitBox
fsaintjacques commented on pull request #7180: URL: https://github.com/apache/arrow/pull/7180#issuecomment-630985072 I don't get a segfault for the test you added, just a wrong exception being thrown. ```python > raise IOError(errno, message) E FileNotFoundError: [Errno 2] Fa

[GitHub] [arrow] fsaintjacques commented on a change in pull request #7180: ARROW-8062: [C++][Dataset] Implement ParquetDatasetFactory

2020-05-19 Thread GitBox
fsaintjacques commented on a change in pull request #7180: URL: https://github.com/apache/arrow/pull/7180#discussion_r427493814 ## File path: cpp/src/arrow/dataset/file_parquet.cc ## @@ -380,77 +316,297 @@ Result ParquetFileFormat::ScanFile( return ScanFile(source, std::move

[GitHub] [arrow] fsaintjacques commented on a change in pull request #7180: ARROW-8062: [C++][Dataset] Implement ParquetDatasetFactory

2020-05-19 Thread GitBox
fsaintjacques commented on a change in pull request #7180: URL: https://github.com/apache/arrow/pull/7180#discussion_r427493504 ## File path: python/pyarrow/dataset.py ## @@ -35,11 +35,13 @@ Fragment, HivePartitioning, IpcFileFormat, +ParquetDatasetFactory,

[GitHub] [arrow] lidavidm commented on pull request #7225: ARROW-8847: [C++] Pass task hints in Executor API

2020-05-19 Thread GitBox
lidavidm commented on pull request #7225: URL: https://github.com/apache/arrow/pull/7225#issuecomment-630953755 > That requires passing the "consumer ID" somehow. Perhaps in the AsyncContext? Yeah, that would make sense.

[GitHub] [arrow] emkornfield commented on a change in pull request #7175: ARROW-8794: [C++] Expand performance coverage of parquet to arrow reading

2020-05-19 Thread GitBox
emkornfield commented on a change in pull request #7175: URL: https://github.com/apache/arrow/pull/7175#discussion_r427457149 ## File path: cpp/src/parquet/arrow/reader_writer_benchmark.cc ## @@ -95,15 +97,37 @@ void SetBytesProcessed(::benchmark::State& state) { state.SetBy

[GitHub] [arrow] github-actions[bot] commented on pull request #7225: ARROW-8847: [C++] Pass task hints in Executor API

2020-05-19 Thread GitBox
github-actions[bot] commented on pull request #7225: URL: https://github.com/apache/arrow/pull/7225#issuecomment-630943986 https://issues.apache.org/jira/browse/ARROW-8847

[GitHub] [arrow] pitrou commented on pull request #7225: ARROW-8847: [C++] Pass task hints in Executor API

2020-05-19 Thread GitBox
pitrou commented on pull request #7225: URL: https://github.com/apache/arrow/pull/7225#issuecomment-630943041 That requires passing the "consumer ID" somehow. Perhaps in the AsyncContext?

[GitHub] [arrow] pitrou opened a new pull request #7225: ARROW-8847: [C++] Pass task hints in Executor API

2020-05-19 Thread GitBox
pitrou opened a new pull request #7225: URL: https://github.com/apache/arrow/pull/7225

[GitHub] [arrow] hurricane1026 commented on a change in pull request #7202: ARROW-8825: [C++] Fix compilation for Wunused-paraemter flag

2020-05-19 Thread GitBox
hurricane1026 commented on a change in pull request #7202: URL: https://github.com/apache/arrow/pull/7202#discussion_r427398016 ## File path: cpp/src/arrow/array.h ## @@ -798,7 +798,12 @@ class ARROW_EXPORT FixedSizeListArray : public Array { i += data_->offset; retur

[GitHub] [arrow] github-actions[bot] commented on pull request #7224: ARROW-8858: [FlightRPC] ensure binary/multi-valued headers are properly exposed

2020-05-19 Thread GitBox
github-actions[bot] commented on pull request #7224: URL: https://github.com/apache/arrow/pull/7224#issuecomment-630880344 https://issues.apache.org/jira/browse/ARROW-8858

[GitHub] [arrow] lidavidm commented on pull request #7224: ARROW-8858: [FlightRPC] ensure binary/multi-valued headers are properly exposed

2020-05-19 Thread GitBox
lidavidm commented on pull request #7224: URL: https://github.com/apache/arrow/pull/7224#issuecomment-630877429 CC @rymurr, if you have time to take a look that would be much appreciated (since you've also worked on the metadata handling). grpc-java is very strict about using the proper me

[GitHub] [arrow] lidavidm opened a new pull request #7224: ARROW-8858: [FlightRPC] ensure binary/multi-valued headers are properly exposed

2020-05-19 Thread GitBox
lidavidm opened a new pull request #7224: URL: https://github.com/apache/arrow/pull/7224 This fixes a few things: - Sending/receiving text headers in Java - Iterating over all headers in Java (binary ones used to be filtered out) - Receiving binary headers in Python

[GitHub] [arrow] jacques-n commented on pull request #6433: ARROW-7495: [Java] Remove "empty" concept from ArrowBuf, replace with custom referencemanager

2020-05-19 Thread GitBox
jacques-n commented on pull request #6433: URL: https://github.com/apache/arrow/pull/6433#issuecomment-630874974 Looks mostly good. +1 As a follow-on, it'd be nice to include a test which ensures that the Netty Allocator returns an empty-behaving byte buffer when users allocate a zer

[GitHub] [arrow] alippai commented on pull request #6676: ARROW-8175: [Python] Setup type checking with mypy

2020-05-19 Thread GitBox
alippai commented on pull request #6676: URL: https://github.com/apache/arrow/pull/6676#issuecomment-630874504 > What's the status of mypy support of cython? (not very familiar with this) Are you looking for something like this? https://github.com/python/mypy/pull/8631

[GitHub] [arrow] itamarst commented on pull request #7169: ARROW-5359: [Python] Support loading non-nanosecond out-of-range timestamps

2020-05-19 Thread GitBox
itamarst commented on pull request #7169: URL: https://github.com/apache/arrow/pull/7169#issuecomment-630868204 OK, I've addressed the testing comments, so we're down to semantics and naming of the flag.

[GitHub] [arrow] pitrou commented on pull request #6810: [DO NOT MERGE] [Python] Reformat with black

2020-05-19 Thread GitBox
pitrou commented on pull request #6810: URL: https://github.com/apache/arrow/pull/6810#issuecomment-630864784 Superseded by PR #7215

[GitHub] [arrow] pitrou closed pull request #6810: [DO NOT MERGE] [Python] Reformat with black

2020-05-19 Thread GitBox
pitrou closed pull request #6810: URL: https://github.com/apache/arrow/pull/6810

[GitHub] [arrow] itamarst commented on pull request #7169: ARROW-5359: [Python] Support loading non-nanosecond out-of-range timestamps

2020-05-19 Thread GitBox
itamarst commented on pull request #7169: URL: https://github.com/apache/arrow/pull/7169#issuecomment-630861784 Actually, how about changing the keyword to "dont_force_nanosecond_timetamp" or something? That would address two of your concerns.
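
For context on what "out-of-range" means in the title of #7169: assuming timestamps are held as a signed 64-bit count of nanoseconds since the Unix epoch, only dates between roughly 1677 and 2262 are representable. The sketch below uses only the Python standard library to show those bounds; the names `NS_MIN`, `NS_MAX`, and `fits_in_int64_nanoseconds` are illustrative and are not part of pyarrow or of this pull request.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
INT64_MAX = 2**63 - 1  # largest signed 64-bit integer

# datetime has microsecond resolution, so the bounds are rounded inward
# to whole microseconds (assumption: int64 nanoseconds since the epoch).
NS_MAX = EPOCH + timedelta(microseconds=INT64_MAX // 1000)
NS_MIN = EPOCH - timedelta(microseconds=INT64_MAX // 1000)


def fits_in_int64_nanoseconds(value):
    """Hypothetical helper: True if `value` survives a nanosecond cast."""
    return NS_MIN <= value <= NS_MAX


print(NS_MIN.year, NS_MAX.year)
print(fits_in_int64_nanoseconds(datetime(2020, 5, 19)))
print(fits_in_int64_nanoseconds(datetime(1000, 1, 1)))
```

A date such as the year 1000 fits comfortably in a second- or millisecond-resolution timestamp but not in this nanosecond range, which appears to be the case the flag under discussion is meant to handle.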

[GitHub] [arrow] itamarst commented on pull request #7169: ARROW-5359: [Python] Support loading non-nanosecond out-of-range timestamps

2020-05-19 Thread GitBox
itamarst commented on pull request #7169: URL: https://github.com/apache/arrow/pull/7169#issuecomment-630861038 Ah yes, will do. Any thoughts on the remaining inline review comment re objects for nanosecond?

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #7169: ARROW-5359: [Python] Support loading non-nanosecond out-of-range timestamps

2020-05-19 Thread GitBox
jorisvandenbossche commented on a change in pull request #7169: URL: https://github.com/apache/arrow/pull/7169#discussion_r427351793 ## File path: python/pyarrow/array.pxi ## @@ -549,6 +550,11 @@ cdef class _PandasConvertible: Cast integers with nulls to objects

[GitHub] [arrow] jorisvandenbossche commented on pull request #7169: ARROW-5359: [Python] Support loading non-nanosecond out-of-range timestamps

2020-05-19 Thread GitBox
jorisvandenbossche commented on pull request #7169: URL: https://github.com/apache/arrow/pull/7169#issuecomment-630858966 @itamarst thanks for the updates! See also my non-inline questions/comments at https://github.com/apache/arrow/pull/7169#pullrequestreview-411737751

[GitHub] [arrow] itamarst commented on pull request #7169: ARROW-5359: [Python] Support loading non-nanosecond out-of-range timestamps

2020-05-19 Thread GitBox
itamarst commented on pull request #7169: URL: https://github.com/apache/arrow/pull/7169#issuecomment-630853662 @jorisvandenbossche (or any other reviewer): I have resolved two comments, and the remaining one needs some additional feedback from a reviewer. Thanks!

[GitHub] [arrow] jorisvandenbossche commented on pull request #6676: ARROW-8175: [Python] Setup type checking with mypy

2020-05-19 Thread GitBox
jorisvandenbossche commented on pull request #6676: URL: https://github.com/apache/arrow/pull/6676#issuecomment-630853533 What's the status of mypy support of cython? (not very familiar with this)

[GitHub] [arrow] itamarst commented on a change in pull request #7169: ARROW-5359: [Python] Support loading non-nanosecond out-of-range timestamps

2020-05-19 Thread GitBox
itamarst commented on a change in pull request #7169: URL: https://github.com/apache/arrow/pull/7169#discussion_r427341919 ## File path: python/pyarrow/pandas_compat.py ## @@ -699,6 +699,17 @@ def _reconstruct_block(item, columns=None, extension_columns=None): block_arr

[GitHub] [arrow] itamarst commented on a change in pull request #7169: ARROW-5359: [Python] Support loading non-nanosecond out-of-range timestamps

2020-05-19 Thread GitBox
itamarst commented on a change in pull request #7169: URL: https://github.com/apache/arrow/pull/7169#discussion_r427342127 ## File path: python/pyarrow/tests/test_pandas.py ## @@ -3941,3 +3945,28 @@ def test_metadata_compat_missing_field_name(): result = table.to_pandas()

[GitHub] [arrow] itamarst commented on a change in pull request #7169: ARROW-5359: [Python] Support loading non-nanosecond out-of-range timestamps

2020-05-19 Thread GitBox
itamarst commented on a change in pull request #7169: URL: https://github.com/apache/arrow/pull/7169#discussion_r427336072 ## File path: python/pyarrow/pandas_compat.py ## @@ -699,6 +699,17 @@ def _reconstruct_block(item, columns=None, extension_columns=None): block_arr

[GitHub] [arrow] itamarst commented on a change in pull request #7169: ARROW-5359: [Python] Support loading non-nanosecond out-of-range timestamps

2020-05-19 Thread GitBox
itamarst commented on a change in pull request #7169: URL: https://github.com/apache/arrow/pull/7169#discussion_r427331603 ## File path: python/pyarrow/array.pxi ## @@ -549,6 +550,11 @@ cdef class _PandasConvertible: Cast integers with nulls to objects dat

[GitHub] [arrow] nevi-me commented on pull request #7220: ARROW-8837: [Rust] Implement Null data type [WIP]

2020-05-19 Thread GitBox
nevi-me commented on pull request #7220: URL: https://github.com/apache/arrow/pull/7220#issuecomment-630823838 > At first glance it looks like you are just in the wrong directory. You need to run archery from the root of the repo. Then it would look for java/pom.xml and find it. > […](#

[GitHub] [arrow] andygrove commented on pull request #7220: ARROW-8837: [Rust] Implement Null data type [WIP]

2020-05-19 Thread GitBox
andygrove commented on pull request #7220: URL: https://github.com/apache/arrow/pull/7220#issuecomment-630821367 At first glance it looks like you are just in the wrong directory. You need to run archery from the root of the repo. Then it would look for java/pom.xml and find it.

[GitHub] [arrow] nevi-me edited a comment on pull request #7220: ARROW-8837: [Rust] Implement Null data type [WIP]

2020-05-19 Thread GitBox
nevi-me edited a comment on pull request #7220: URL: https://github.com/apache/arrow/pull/7220#issuecomment-630823838 I was in the root, but looks like I'm being too eager and trying to run the tests without first building the jar and exe, and there's an env that I hadn't set up ye

[GitHub] [arrow] nevi-me removed a comment on pull request #7220: ARROW-8837: [Rust] Implement Null data type [WIP]

2020-05-19 Thread GitBox
nevi-me removed a comment on pull request #7220: URL: https://github.com/apache/arrow/pull/7220#issuecomment-630819383 I keep getting the below on Windows. I'll try with Linux ```python Traceback (most recent call last): File "C:\ProgramData\Anaconda3\Scripts\archery-script.py

[GitHub] [arrow] nevi-me commented on pull request #7220: ARROW-8837: [Rust] Implement Null data type [WIP]

2020-05-19 Thread GitBox
nevi-me commented on pull request #7220: URL: https://github.com/apache/arrow/pull/7220#issuecomment-630819383 I keep getting the below on Windows. I'll try with Linux ```python Traceback (most recent call last): File "C:\ProgramData\Anaconda3\Scripts\archery-script.py", line

[GitHub] [arrow] andygrove closed pull request #7203: ARROW-8822: [Rust] [DataFusion] Add InMemoryScan to LogicalPlan

2020-05-19 Thread GitBox
andygrove closed pull request #7203: URL: https://github.com/apache/arrow/pull/7203

[GitHub] [arrow] andygrove commented on pull request #7203: ARROW-8822: [Rust] [DataFusion] Add InMemoryScan to LogicalPlan

2020-05-19 Thread GitBox
andygrove commented on pull request #7203: URL: https://github.com/apache/arrow/pull/7203#issuecomment-630814547 Since there is no feedback in the past two days, I'm going to go ahead and merge this one.

[GitHub] [arrow] pitrou commented on a change in pull request #7179: ARROW-8732: [C++] Add basic cancellation API

2020-05-19 Thread GitBox
pitrou commented on a change in pull request #7179: URL: https://github.com/apache/arrow/pull/7179#discussion_r427297417 ## File path: cpp/src/arrow/util/cancel.cc ## @@ -0,0 +1,130 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor licens

[GitHub] [arrow] xhochy commented on pull request #6676: ARROW-8175: [Python] Setup type checking with mypy

2020-05-19 Thread GitBox
xhochy commented on pull request #6676: URL: https://github.com/apache/arrow/pull/6676#issuecomment-630814495 > So ideally a follow-up PR would generate mypy check-able code from [this](https://github.com/apache/arrow/blob/1164079d5442c3910c18549bfcd2e68d4554b909/python/pyarrow/includes/lib

[GitHub] [arrow] andygrove commented on pull request #7220: ARROW-8837: [Rust] Implement Null data type [WIP]

2020-05-19 Thread GitBox
andygrove commented on pull request #7220: URL: https://github.com/apache/arrow/pull/7220#issuecomment-630806944 Thanks @nevi-me there is a README in the new integration-testing crate which pretty much just links to the integration testing docs [1]. I have been testing against the Java imp

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #7180: ARROW-8062: [C++][Dataset] Implement ParquetDatasetFactory

2020-05-19 Thread GitBox
jorisvandenbossche commented on a change in pull request #7180: URL: https://github.com/apache/arrow/pull/7180#discussion_r427163007 ## File path: cpp/src/arrow/dataset/file_parquet.h ## @@ -97,53 +103,167 @@ class ARROW_DS_EXPORT ParquetFileFormat : public FileFormat { Res

[GitHub] [arrow] andygrove commented on a change in pull request #7219: ARROW-8854: [Rust] [Integration Testing] Standardize error handling

2020-05-19 Thread GitBox
andygrove commented on a change in pull request #7219: URL: https://github.com/apache/arrow/pull/7219#discussion_r427282251 ## File path: rust/integration-testing/src/bin/arrow-json-integration-test.rs ## @@ -68,24 +69,19 @@ fn main() { .value_of("json") .expe

[GitHub] [arrow] nevi-me commented on pull request #7220: ARROW-8837: [Rust] Implement Null data type [WIP]

2020-05-19 Thread GitBox
nevi-me commented on pull request #7220: URL: https://github.com/apache/arrow/pull/7220#issuecomment-630798459 @andygrove how does one run the integration tests? I can complete this PR with null array + IPC support

[GitHub] [arrow] pitrou closed pull request #7213: ARROW-8841: [C++] Add benchmark and unittest for encoding::PLAIN spaced

2020-05-19 Thread GitBox
pitrou closed pull request #7213: URL: https://github.com/apache/arrow/pull/7213

[GitHub] [arrow] alippai edited a comment on pull request #6676: ARROW-8175: [Python] Setup type checking with mypy

2020-05-19 Thread GitBox
alippai edited a comment on pull request #6676: URL: https://github.com/apache/arrow/pull/6676#issuecomment-630748138 So ideally a follow-up PR would generate mypy check-able code from [this](https://github.com/apache/arrow/blob/1164079d5442c3910c18549bfcd2e68d4554b909/python/pyarrow/includ

[GitHub] [arrow] alippai commented on pull request #6676: ARROW-8175: [Python] Setup type checking with mypy

2020-05-19 Thread GitBox
alippai commented on pull request #6676: URL: https://github.com/apache/arrow/pull/6676#issuecomment-630748138 So ideally a follow-up PR would generate mypy check-able code from [this](https://github.com/apache/arrow/blob/1164079d5442c3910c18549bfcd2e68d4554b909/python/pyarrow/includes/liba

[GitHub] [arrow] jianxind commented on a change in pull request #7213: ARROW-8841: [C++] Add benchmark and unittest for encoding::PLAIN spaced

2020-05-19 Thread GitBox
jianxind commented on a change in pull request #7213: URL: https://github.com/apache/arrow/pull/7213#discussion_r427214775 ## File path: cpp/src/parquet/encoding_benchmark.cc ## @@ -25,6 +25,7 @@ #include "arrow/testing/util.h" #include "arrow/type.h" #include "arrow/util/by

[GitHub] [arrow] github-actions[bot] commented on pull request #7223: ARROW-8711: [Python] Expose timestamp_parsers in csv.ConvertOptions

2020-05-19 Thread GitBox
github-actions[bot] commented on pull request #7223: URL: https://github.com/apache/arrow/pull/7223#issuecomment-630740695 https://issues.apache.org/jira/browse/ARROW-8711

[GitHub] [arrow] pitrou opened a new pull request #7223: ARROW-8711: [Python] Expose timestamp_parsers in csv.ConvertOptions

2020-05-19 Thread GitBox
pitrou opened a new pull request #7223: URL: https://github.com/apache/arrow/pull/7223

[GitHub] [arrow] pitrou commented on a change in pull request #7213: ARROW-8841: [C++] Add benchmark and unittest for encoding::PLAIN spaced

2020-05-19 Thread GitBox
pitrou commented on a change in pull request #7213: URL: https://github.com/apache/arrow/pull/7213#discussion_r427200565 ## File path: cpp/src/parquet/encoding_benchmark.cc ## @@ -25,6 +25,7 @@ #include "arrow/testing/util.h" #include "arrow/type.h" #include "arrow/util/byte

[GitHub] [arrow] jianxind commented on a change in pull request #7213: ARROW-8841: [C++] Add benchmark and unittest for encoding::PLAIN spaced

2020-05-19 Thread GitBox
jianxind commented on a change in pull request #7213: URL: https://github.com/apache/arrow/pull/7213#discussion_r427192686 ## File path: cpp/src/parquet/encoding_benchmark.cc ## @@ -25,6 +25,7 @@ #include "arrow/testing/util.h" #include "arrow/type.h" #include "arrow/util/by

[GitHub] [arrow] xhochy commented on pull request #6676: ARROW-8175: [Python] Setup type checking with mypy

2020-05-19 Thread GitBox
xhochy commented on pull request #6676: URL: https://github.com/apache/arrow/pull/6676#issuecomment-630717054 Things that would help: * One of @wesm, @pitrou or @jorisvandenbossche should state their opinion on this ;) * We would need to have better support in Cython for type anno
