[jira] [Created] (ARROW-13977) [Format] Clarify leap seconds and leap days for interval type

2021-09-11 Thread QP Hou (Jira)
QP Hou created ARROW-13977:
--

 Summary: [Format] Clarify leap seconds and leap days for interval 
type
 Key: ARROW-13977
 URL: https://issues.apache.org/jira/browse/ARROW-13977
 Project: Apache Arrow
  Issue Type: Task
Reporter: QP Hou
Assignee: QP Hou


It's unclear how leap seconds and leap days should be handled for the interval 
type; we should clarify this in the spec.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-11721) json schema inference should return Schema type instead of SchemaRef

2021-02-20 Thread QP Hou (Jira)
QP Hou created ARROW-11721:
--

 Summary: json schema inference should return Schema type instead 
of SchemaRef
 Key: ARROW-11721
 URL: https://issues.apache.org/jira/browse/ARROW-11721
 Project: Apache Arrow
  Issue Type: Task
  Components: Rust
Reporter: QP Hou
Assignee: QP Hou








[jira] [Created] (ARROW-11719) Support merged schema for memory table

2021-02-20 Thread QP Hou (Jira)
QP Hou created ARROW-11719:
--

 Summary: Support merged schema for memory table
 Key: ARROW-11719
 URL: https://issues.apache.org/jira/browse/ARROW-11719
 Project: Apache Arrow
  Issue Type: Task
  Components: Rust - DataFusion
Reporter: QP Hou
Assignee: QP Hou


Memory table should support loading batches with compatible schemas instead of 
forcing all schemas to be the same.
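
One possible reading of "compatible schemas" is that batches may carry different sets of fields as long as any field they share has the same type. A minimal sketch of that merge rule follows; `Field` and `merge_schemas` are hypothetical stand-ins for illustration, not the Arrow or DataFusion types:

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for a schema field; not the arrow crate's `Field`.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: String,
}

// Merge several schemas into one. Fields are unioned by name, and a field
// that appears with two different types is reported as a conflict.
fn merge_schemas(schemas: &[Vec<Field>]) -> Result<Vec<Field>, String> {
    let mut merged: BTreeMap<String, Field> = BTreeMap::new();
    for schema in schemas {
        for field in schema {
            match merged.get(&field.name) {
                Some(existing) if existing.data_type != field.data_type => {
                    return Err(format!(
                        "field '{}' has conflicting types {} and {}",
                        field.name, existing.data_type, field.data_type
                    ));
                }
                // Same name and same type: nothing to add.
                Some(_) => {}
                None => {
                    merged.insert(field.name.clone(), field.clone());
                }
            }
        }
    }
    // BTreeMap iteration yields the fields sorted by name.
    Ok(merged.into_values().collect())
}
```

Sorting the merged fields by name is only a convenience of this sketch; a real implementation would more likely keep first-seen order.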





[jira] [Created] (ARROW-11708) Clean up Rust 2021 linting warning

2021-02-20 Thread QP Hou (Jira)
QP Hou created ARROW-11708:
--

 Summary: Clean up Rust 2021 linting warning
 Key: ARROW-11708
 URL: https://issues.apache.org/jira/browse/ARROW-11708
 Project: Apache Arrow
  Issue Type: Task
  Components: Rust
Reporter: QP Hou
Assignee: QP Hou








[jira] [Created] (ARROW-11707) Support CSV schema inference without seek

2021-02-20 Thread QP Hou (Jira)
QP Hou created ARROW-11707:
--

 Summary: Support CSV schema inference without seek
 Key: ARROW-11707
 URL: https://issues.apache.org/jira/browse/ARROW-11707
 Project: Apache Arrow
  Issue Type: Task
  Components: Rust
Reporter: QP Hou
Assignee: QP Hou








[jira] [Created] (ARROW-11542) [Rust] json reader should not crash when reading nested list

2021-02-06 Thread QP Hou (Jira)
QP Hou created ARROW-11542:
--

 Summary: [Rust] json reader should not crash when reading nested 
list
 Key: ARROW-11542
 URL: https://issues.apache.org/jira/browse/ARROW-11542
 Project: Apache Arrow
  Issue Type: Task
  Components: Rust
Reporter: QP Hou
Assignee: QP Hou








[jira] [Created] (ARROW-11491) support json schema inference for nested list and struct

2021-02-03 Thread QP Hou (Jira)
QP Hou created ARROW-11491:
--

 Summary: support json schema inference for nested list and struct
 Key: ARROW-11491
 URL: https://issues.apache.org/jira/browse/ARROW-11491
 Project: Apache Arrow
  Issue Type: Task
  Components: Rust
Reporter: QP Hou
Assignee: QP Hou








[jira] [Created] (ARROW-11435) Allow creating ParquetPartition from external crate

2021-01-30 Thread QP Hou (Jira)
QP Hou created ARROW-11435:
--

 Summary: Allow creating ParquetPartition from external crate
 Key: ARROW-11435
 URL: https://issues.apache.org/jira/browse/ARROW-11435
 Project: Apache Arrow
  Issue Type: Task
  Components: Rust - DataFusion
Reporter: QP Hou
Assignee: QP Hou


Without this functionality, it's not possible to implement a table provider in 
an external crate that targets the Parquet format, since ParquetExec takes a 
ParquetPartition as an argument.





[jira] [Created] (ARROW-11366) Support boolean literal in equality expression

2021-01-24 Thread QP Hou (Jira)
QP Hou created ARROW-11366:
--

 Summary: Support boolean literal in equality expression
 Key: ARROW-11366
 URL: https://issues.apache.org/jira/browse/ARROW-11366
 Project: Apache Arrow
  Issue Type: Task
  Components: Rust - DataFusion
Reporter: QP Hou
Assignee: QP Hou








[jira] [Created] (ARROW-11310) implement arrow JSON writer

2021-01-18 Thread QP Hou (Jira)
QP Hou created ARROW-11310:
--

 Summary: implement arrow JSON writer
 Key: ARROW-11310
 URL: https://issues.apache.org/jira/browse/ARROW-11310
 Project: Apache Arrow
  Issue Type: Task
  Components: Rust
Reporter: QP Hou
Assignee: QP Hou








[jira] [Created] (ARROW-11113) [Rust] support as_struct_array cast

2021-01-02 Thread QP Hou (Jira)
QP Hou created ARROW-11113:
--

 Summary: [Rust] support as_struct_array cast
 Key: ARROW-11113
 URL: https://issues.apache.org/jira/browse/ARROW-11113
 Project: Apache Arrow
  Issue Type: Bug
Reporter: QP Hou
Assignee: QP Hou








[jira] [Created] (ARROW-11110) [Rust] [Datafusion] context.table should not take a mutable self reference

2021-01-02 Thread QP Hou (Jira)
QP Hou created ARROW-11110:
--

 Summary: [Rust] [Datafusion] context.table should not take a 
mutable self reference
 Key: ARROW-11110
 URL: https://issues.apache.org/jira/browse/ARROW-11110
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust - DataFusion
Reporter: QP Hou
Assignee: QP Hou
 Fix For: 3.0.0








[jira] [Created] (ARROW-10876) [Rust] json reader should validate value type

2020-12-10 Thread QP Hou (Jira)
QP Hou created ARROW-10876:
--

 Summary: [Rust] json reader should validate value type
 Key: ARROW-10876
 URL: https://issues.apache.org/jira/browse/ARROW-10876
 Project: Apache Arrow
  Issue Type: Bug
Reporter: QP Hou


The JSON reader should return an error if a row is not a JSON object.





[jira] [Created] (ARROW-10875) simplify simd cfg check

2020-12-10 Thread QP Hou (Jira)
QP Hou created ARROW-10875:
--

 Summary: simplify simd cfg check
 Key: ARROW-10875
 URL: https://issues.apache.org/jira/browse/ARROW-10875
 Project: Apache Arrow
  Issue Type: Bug
  Components: Rust
Reporter: QP Hou
Assignee: QP Hou


Make the simd cfg check DRY for easier maintenance.





[jira] [Created] (ARROW-10842) [Rust] decouple IO from json schema inference code

2020-12-07 Thread QP Hou (Jira)
QP Hou created ARROW-10842:
--

 Summary: [Rust] decouple IO from json schema inference code
 Key: ARROW-10842
 URL: https://issues.apache.org/jira/browse/ARROW-10842
 Project: Apache Arrow
  Issue Type: Bug
  Components: Rust
Reporter: QP Hou
Assignee: QP Hou








[jira] [Created] (ARROW-10830) [Rust] json reader should not hard crash on invalid json

2020-12-06 Thread QP Hou (Jira)
QP Hou created ARROW-10830:
--

 Summary: [Rust] json reader should not hard crash on invalid json
 Key: ARROW-10830
 URL: https://issues.apache.org/jira/browse/ARROW-10830
 Project: Apache Arrow
  Issue Type: Bug
Reporter: QP Hou
Assignee: QP Hou








[jira] [Created] (ARROW-10822) [Rust] [Datafusion] support compiling datafusion with simd support

2020-12-05 Thread QP Hou (Jira)
QP Hou created ARROW-10822:
--

 Summary: [Rust] [Datafusion] support compiling datafusion with 
simd support
 Key: ARROW-10822
 URL: https://issues.apache.org/jira/browse/ARROW-10822
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Rust - DataFusion
Reporter: QP Hou








[jira] [Created] (ARROW-10821) [Rust] [Datafusion] implement negative expression

2020-12-05 Thread QP Hou (Jira)
QP Hou created ARROW-10821:
--

 Summary: [Rust] [Datafusion] implement negative expression
 Key: ARROW-10821
 URL: https://issues.apache.org/jira/browse/ARROW-10821
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Rust - DataFusion
Reporter: QP Hou
Assignee: QP Hou








[jira] [Created] (ARROW-10458) [Rust] [Datafusion] context.create_logical_plan should not take a mutable self reference

2020-11-01 Thread QP Hou (Jira)
QP Hou created ARROW-10458:
--

 Summary: [Rust] [Datafusion] context.create_logical_plan should 
not take a mutable self reference
 Key: ARROW-10458
 URL: https://issues.apache.org/jira/browse/ARROW-10458
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust - DataFusion
Reporter: QP Hou
Assignee: QP Hou








[jira] [Created] (ARROW-10454) [Rust][Datafusion] support creating ParquetExec from externally resolved file list and schema

2020-10-31 Thread QP Hou (Jira)
QP Hou created ARROW-10454:
--

 Summary: [Rust][Datafusion] support creating ParquetExec from 
externally resolved file list and schema
 Key: ARROW-10454
 URL: https://issues.apache.org/jira/browse/ARROW-10454
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust - DataFusion
Reporter: QP Hou
Assignee: QP Hou








[jira] [Created] (ARROW-9327) Fix all clippy errors for arrow crate

2020-07-05 Thread QP Hou (Jira)
QP Hou created ARROW-9327:
-

 Summary: Fix all clippy errors for arrow crate
 Key: ARROW-9327
 URL: https://issues.apache.org/jira/browse/ARROW-9327
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust
Reporter: QP Hou
Assignee: QP Hou








[jira] [Created] (ARROW-9192) [Rust] Enable clippy linting for arrow crate in CI pipeline

2020-06-19 Thread QP Hou (Jira)
QP Hou created ARROW-9192:
-

 Summary: [Rust] Enable clippy linting for arrow crate in CI 
pipeline 
 Key: ARROW-9192
 URL: https://issues.apache.org/jira/browse/ARROW-9192
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust
Reporter: QP Hou
Assignee: QP Hou








[jira] [Created] (ARROW-9184) [Rust][Datafusion] table scan without projection should return all columns

2020-06-18 Thread QP Hou (Jira)
QP Hou created ARROW-9184:
-

 Summary: [Rust][Datafusion] table scan without projection should 
return all columns
 Key: ARROW-9184
 URL: https://issues.apache.org/jira/browse/ARROW-9184
 Project: Apache Arrow
  Issue Type: Bug
Reporter: QP Hou
Assignee: QP Hou


Projection should be optional if the user already wants to fetch all columns.
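
A minimal sketch of an optional projection, assuming the projection is a list of column indices. The `project` function is hypothetical and only illustrates the `None` case, not the DataFusion table scan API:

```rust
// Select column names by index. `None` means "no projection" and is
// treated as selecting every column in its original order.
// Note: an out-of-range index panics in this sketch.
fn project(columns: &[&str], projection: Option<&[usize]>) -> Vec<String> {
    match projection {
        Some(indices) => indices.iter().map(|&i| columns[i].to_string()).collect(),
        None => columns.iter().map(|c| c.to_string()).collect(),
    }
}
```

With this shape a caller that wants every column passes `None` and does not have to build the full index list first.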





[jira] [Created] (ARROW-9158) [Rust][Datafusion] Projection physical plan compilation should preserve nullability

2020-06-17 Thread QP Hou (Jira)
QP Hou created ARROW-9158:
-

 Summary: [Rust][Datafusion] Projection physical plan compilation 
should preserve nullability
 Key: ARROW-9158
 URL: https://issues.apache.org/jira/browse/ARROW-9158
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: QP Hou
Assignee: QP Hou


When compiling a logical plan to a physical plan, field nullability should be 
preserved.
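
A minimal sketch of the intended behaviour, using a hypothetical, simplified `Field` type rather than the arrow crate's own. The output schema of a projection copies each selected input field, so its `nullable` flag carries over unchanged:

```rust
// Hypothetical, simplified field type; not the arrow crate's `Field`.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    nullable: bool,
}

// Build the output schema of a projection by cloning each selected input
// field, which keeps its `nullable` flag instead of resetting it.
// Note: an out-of-range index panics in this sketch.
fn project_schema(input: &[Field], indices: &[usize]) -> Vec<Field> {
    indices.iter().map(|&i| input[i].clone()).collect()
}
```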





[jira] [Created] (ARROW-9157) [Rust][Datafusion] execution context's create_physical_plan should take self as immutable reference

2020-06-16 Thread QP Hou (Jira)
QP Hou created ARROW-9157:
-

 Summary: [Rust][Datafusion] execution context's 
create_physical_plan should take self as immutable reference
 Key: ARROW-9157
 URL: https://issues.apache.org/jira/browse/ARROW-9157
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: QP Hou
Assignee: QP Hou


It does not mutate self, so a mutable reference is not necessary.
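
A minimal sketch of why the receiver type matters to callers. `Context` and its fields are hypothetical and much simpler than DataFusion's execution context:

```rust
// Hypothetical, simplified context; not DataFusion's execution context.
struct Context {
    batch_size: usize,
}

impl Context {
    // Only reads from `self`, so a shared reference is sufficient.
    fn create_physical_plan(&self, logical_plan: &str) -> String {
        format!("physical({}, batch_size={})", logical_plan, self.batch_size)
    }
}

// The caller only holds a shared borrow of the context. This would not
// compile if `create_physical_plan` required `&mut self`.
fn plan_two(ctx: &Context) -> (String, String) {
    (
        ctx.create_physical_plan("scan a"),
        ctx.create_physical_plan("scan b"),
    )
}
```

Taking `&self` also lets the context be shared across threads behind an `Arc` without a lock, provided its fields allow it.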





[jira] [Updated] (ARROW-9124) [Rust][Datafusion] DFParser should consume sql query as &str instead of String

2020-06-13 Thread QP Hou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-9124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

QP Hou updated ARROW-9124:
--
Summary: [Rust][Datafusion] DFParser should consume sql query as &str 
instead of String  (was: DFParser should consume sql query as &str instead of 
String)

> [Rust][Datafusion] DFParser should consume sql query as &str instead of String
> --
>
> Key: ARROW-9124
> URL: https://issues.apache.org/jira/browse/ARROW-9124
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Rust - DataFusion
>Reporter: QP Hou
>Assignee: QP Hou
>Priority: Minor
>
> It's more efficient to use &str instead of String





[jira] [Created] (ARROW-9124) DFParser should consume sql query as &str instead of String

2020-06-13 Thread QP Hou (Jira)
QP Hou created ARROW-9124:
-

 Summary: DFParser should consume sql query as &str instead of 
String
 Key: ARROW-9124
 URL: https://issues.apache.org/jira/browse/ARROW-9124
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: QP Hou
Assignee: QP Hou


It's more efficient to use &str instead of String
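
A minimal sketch of the difference between taking a borrowed `&str` and an owned `String`. The `Parser` type below is hypothetical and is not the real DFParser; it only shows that a borrowing parser avoids copying the query text:

```rust
// Hypothetical parser that borrows its input; not the real `DFParser`.
struct Parser<'a> {
    sql: &'a str,
}

impl<'a> Parser<'a> {
    // Borrowing avoids copying the query text into a new allocation.
    fn new(sql: &'a str) -> Self {
        Parser { sql }
    }

    // Split the query on whitespace; the tokens borrow from the input.
    fn tokens(&self) -> Vec<&'a str> {
        self.sql.split_whitespace().collect()
    }
}
```

A caller holding a `String` can pass `&query` and keep using `query` afterwards, whereas a constructor taking `String` forces either a move or a clone.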





[jira] [Updated] (ARROW-9124) DFParser should consume sql query as &str instead of String

2020-06-13 Thread QP Hou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-9124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

QP Hou updated ARROW-9124:
--
Component/s: Rust - DataFusion

> DFParser should consume sql query as &str instead of String
> ---
>
> Key: ARROW-9124
> URL: https://issues.apache.org/jira/browse/ARROW-9124
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Rust - DataFusion
>Reporter: QP Hou
>Assignee: QP Hou
>Priority: Minor
>
> It's more efficient to use &str instead of String





[jira] [Created] (ARROW-9057) Projection should work on InMemoryScan without error

2020-06-07 Thread QP Hou (Jira)
QP Hou created ARROW-9057:
-

 Summary: Projection should work on InMemoryScan without error
 Key: ARROW-9057
 URL: https://issues.apache.org/jira/browse/ARROW-9057
 Project: Apache Arrow
  Issue Type: Bug
  Components: Rust - DataFusion
Reporter: QP Hou
Assignee: QP Hou








[jira] [Commented] (ARROW-8824) [Rust] [DataFusion] Implement new SQL parser

2020-06-06 Thread QP Hou (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17127242#comment-17127242
 ] 

QP Hou commented on ARROW-8824:
---

+1 on rewriting a new dedicated parser for datafusion.

> [Rust] [DataFusion] Implement new SQL parser
> 
>
> Key: ARROW-8824
> URL: https://issues.apache.org/jira/browse/ARROW-8824
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Rust, Rust - DataFusion
>Reporter: Andy Grove
>Assignee: Andy Grove
>Priority: Major
> Fix For: 1.0.0
>
>
> We currently depend on the sqlparser crate that I originally created but has 
> moved on since the version we use and that project is aiming to support 
> multiple SQL dialects and I don't think it is appropriate for what we need in 
> DataFusion.
> I think it would be better to build a new SQL parser as part of the 
> DataFusion crate so that we can more easily maintain it, and it can use Arrow 
> as the native type system.
> Another option would be to try and donate the sqlparser 0.2.x code base but 
> there are a fair number of committers and it is probably easier just to 
> implement it from scratch (without referencing the existing code).
>  





[jira] [Created] (ARROW-9005) Support sort expression

2020-06-01 Thread QP Hou (Jira)
QP Hou created ARROW-9005:
-

 Summary: Support sort expression
 Key: ARROW-9005
 URL: https://issues.apache.org/jira/browse/ARROW-9005
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust - DataFusion
Reporter: QP Hou
Assignee: QP Hou








[jira] [Created] (ARROW-8931) [Rust] Support lexical sort in arrow compute kernel

2020-05-24 Thread QP Hou (Jira)
QP Hou created ARROW-8931:
-

 Summary: [Rust] Support lexical sort in arrow compute kernel
 Key: ARROW-8931
 URL: https://issues.apache.org/jira/browse/ARROW-8931
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust
Reporter: QP Hou
Assignee: QP Hou








[jira] [Created] (ARROW-8906) [Rust] Support reading multiple CSV files for schema inference

2020-05-22 Thread QP Hou (Jira)
QP Hou created ARROW-8906:
-

 Summary: [Rust] Support reading multiple CSV files for schema 
inference
 Key: ARROW-8906
 URL: https://issues.apache.org/jira/browse/ARROW-8906
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust
Reporter: QP Hou
Assignee: QP Hou








[jira] [Created] (ARROW-8877) [Rust] add CSV read option struct to simplify datafusion interface

2020-05-20 Thread QP Hou (Jira)
QP Hou created ARROW-8877:
-

 Summary: [Rust] add CSV read option struct to simplify datafusion 
interface
 Key: ARROW-8877
 URL: https://issues.apache.org/jira/browse/ARROW-8877
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust, Rust - DataFusion
Reporter: QP Hou
Assignee: QP Hou








[jira] [Created] (ARROW-8840) [Rust] datafusion ExecutionError should implement std::error::Error trait

2020-05-17 Thread QP Hou (Jira)
QP Hou created ARROW-8840:
-

 Summary: [Rust] datafusion ExecutionError should implement 
std::error::Error trait
 Key: ARROW-8840
 URL: https://issues.apache.org/jira/browse/ARROW-8840
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust - DataFusion
Reporter: QP Hou
Assignee: QP Hou
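
A minimal sketch of what implementing the trait involves. The `ExecutionError` enum below is a hypothetical, simplified stand-in (its variants are assumptions, not DataFusion's), and `lookup` is an illustrative helper:

```rust
use std::error::Error;
use std::fmt;

// Hypothetical, simplified error type; not DataFusion's `ExecutionError`.
#[derive(Debug)]
enum ExecutionError {
    InvalidColumn(String),
    General(String),
}

impl fmt::Display for ExecutionError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ExecutionError::InvalidColumn(name) => write!(f, "invalid column: {}", name),
            ExecutionError::General(msg) => write!(f, "execution error: {}", msg),
        }
    }
}

// `Debug` and `Display` are the only requirements; the trait's default
// methods are sufficient here.
impl Error for ExecutionError {}

// With the trait implemented, the error converts into `Box<dyn Error>`,
// so it composes with other error types in the same function.
fn lookup(column: &str) -> Result<usize, Box<dyn Error>> {
    if column.is_empty() {
        Err(ExecutionError::General("empty column name".to_string()).into())
    } else if column == "id" {
        Ok(0)
    } else {
        Err(ExecutionError::InvalidColumn(column.to_string()).into())
    }
}
```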








[jira] [Created] (ARROW-8839) [Rust] datafusion logical plan should support scanning csv without provided schema

2020-05-17 Thread QP Hou (Jira)
QP Hou created ARROW-8839:
-

 Summary: [Rust] datafusion logical plan should support scanning csv 
without provided schema
 Key: ARROW-8839
 URL: https://issues.apache.org/jira/browse/ARROW-8839
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust - DataFusion
Reporter: QP Hou
Assignee: QP Hou








[jira] [Updated] (ARROW-8821) [Rust] nested binary expression with Like, NotLike and Not operator results in type cast error

2020-05-16 Thread QP Hou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

QP Hou updated ARROW-8821:
--
Component/s: (was: Rust)
 Rust - DataFusion

> [Rust] nested binary expression with Like, NotLike and Not operator results 
> in type cast error
> --
>
> Key: ARROW-8821
> URL: https://issues.apache.org/jira/browse/ARROW-8821
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Rust - DataFusion
>Reporter: QP Hou
>Assignee: QP Hou
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Updated] (ARROW-8821) [Rust] nested binary expression with Like, NotLike and Not operator results in type cast error

2020-05-16 Thread QP Hou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

QP Hou updated ARROW-8821:
--
Component/s: Rust

> [Rust] nested binary expression with Like, NotLike and Not operator results 
> in type cast error
> --
>
> Key: ARROW-8821
> URL: https://issues.apache.org/jira/browse/ARROW-8821
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Rust
>Reporter: QP Hou
>Assignee: QP Hou
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Created] (ARROW-8821) [Rust] nested binary expression with Like, NotLike and Not operator results in type cast error

2020-05-16 Thread QP Hou (Jira)
QP Hou created ARROW-8821:
-

 Summary: [Rust] nested binary expression with Like, NotLike and 
Not operator results in type cast error
 Key: ARROW-8821
 URL: https://issues.apache.org/jira/browse/ARROW-8821
 Project: Apache Arrow
  Issue Type: Bug
Reporter: QP Hou
Assignee: QP Hou








[jira] [Created] (ARROW-8752) Remove unused hashmap

2020-05-09 Thread QP Hou (Jira)
QP Hou created ARROW-8752:
-

 Summary: Remove unused hashmap 
 Key: ARROW-8752
 URL: https://issues.apache.org/jira/browse/ARROW-8752
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: QP Hou
Assignee: QP Hou


Neither base_nodes nor base_nodes_set seems to be used at all in 
build_array_reader.





[jira] [Created] (ARROW-8751) [Rust] ParquetFileArrowReader should be able to read empty parquet file without error

2020-05-09 Thread QP Hou (Jira)
QP Hou created ARROW-8751:
-

 Summary: [Rust] ParquetFileArrowReader should be able to read 
empty parquet file without error
 Key: ARROW-8751
 URL: https://issues.apache.org/jira/browse/ARROW-8751
 Project: Apache Arrow
  Issue Type: New Feature
Reporter: QP Hou
Assignee: QP Hou


Spark sometimes writes Parquet files with zero row groups, and reading such a 
file with ParquetFileArrowReader results in an error.

It would be more convenient if ParquetFileArrowReader supported this edge case 
out of the box.





[jira] [Created] (ARROW-8744) [Rust] ParquetIterator's next method should be safe to call even after reached end of iteration

2020-05-08 Thread QP Hou (Jira)
QP Hou created ARROW-8744:
-

 Summary: [Rust] ParquetIterator's next method should be safe to 
call even after reached end of iteration
 Key: ARROW-8744
 URL: https://issues.apache.org/jira/browse/ARROW-8744
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: QP Hou
Assignee: QP Hou


Once the end of iteration is reached, calling next on ParquetIterator results 
in an error. This is inconvenient in two ways:
* when the iterator is shared between multiple threads, only one of the threads 
can terminate without error
* the sender for response_rx cannot terminate the iteration early and free up 
resources; instead, it always has to wait for a signal from request_tx before 
closing the connection
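
A minimal sketch of the desired behaviour, assuming a channel-backed iterator. `BatchIterator` and `spawn_batches` are hypothetical and far simpler than the real ParquetIterator; only the `response_rx` name is taken from the description above. Once the sender is gone, `next` keeps returning `None` instead of failing:

```rust
use std::sync::mpsc::{channel, Receiver};

// Hypothetical channel-backed iterator; not the real `ParquetIterator`.
struct BatchIterator {
    response_rx: Receiver<i32>,
    finished: bool,
}

impl Iterator for BatchIterator {
    type Item = i32;

    fn next(&mut self) -> Option<i32> {
        if self.finished {
            // Already exhausted: keep returning None instead of an error.
            return None;
        }
        match self.response_rx.recv() {
            Ok(batch) => Some(batch),
            // The sender hung up, which is treated as a normal end of
            // iteration rather than a failure.
            Err(_) => {
                self.finished = true;
                None
            }
        }
    }
}

// Produce `n` items from a background thread; the sender is dropped when
// the thread finishes, which closes the channel.
fn spawn_batches(n: i32) -> BatchIterator {
    let (tx, response_rx) = channel();
    std::thread::spawn(move || {
        for i in 0..n {
            if tx.send(i).is_err() {
                break; // receiver went away; stop early
            }
        }
    });
    BatchIterator { response_rx, finished: false }
}
```

Because a closed channel is treated as the end of the stream, the producer in this sketch can also stop early simply by dropping its sender.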





[jira] [Updated] (ARROW-8744) [Rust] ParquetIterator's next method should be safe to call even after reached end of iteration

2020-05-08 Thread QP Hou (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

QP Hou updated ARROW-8744:
--
Component/s: Rust - DataFusion
   Priority: Minor  (was: Major)

> [Rust] ParquetIterator's next method should be safe to call even after 
> reached end of iteration
> ---
>
> Key: ARROW-8744
> URL: https://issues.apache.org/jira/browse/ARROW-8744
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Rust - DataFusion
>Reporter: QP Hou
>Assignee: QP Hou
>Priority: Minor
>
> Once the end of iteration is reached, calling next on ParquetIterator results 
> in an error. This is inconvenient in two ways:
> * when the iterator is shared between multiple threads, only one of the 
> threads can terminate without error
> * the sender for response_rx cannot terminate the iteration early and free up 
> resources; instead, it always has to wait for a signal from request_tx before 
> closing the connection





[jira] [Created] (ARROW-8725) redundant directory walk in rust parquet datasource code

2020-05-06 Thread QP Hou (Jira)
QP Hou created ARROW-8725:
-

 Summary: redundant directory walk in rust parquet datasource code
 Key: ARROW-8725
 URL: https://issues.apache.org/jira/browse/ARROW-8725
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: QP Hou
Assignee: QP Hou


In the rust code base, `common::build_file_list` is called within 
`ParquetExec::try_new`, so there is no need to build the file list before 
calling `try_new`.





[jira] [Created] (ARROW-8552) [Rust] support column iteration for parquet row

2020-04-21 Thread QP Hou (Jira)
QP Hou created ARROW-8552:
-

 Summary: [Rust] support column iteration for parquet row
 Key: ARROW-8552
 URL: https://issues.apache.org/jira/browse/ARROW-8552
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust
Reporter: QP Hou


It would be useful to be able to iterate through all the columns in a parquet 
row.


