[jira] [Created] (ARROW-10794) Typescript Arrowjs Class 'RecordBatch' incorrectly extends base class 'StructVector'
vikash created ARROW-10794:

Summary: Typescript Arrowjs Class 'RecordBatch' incorrectly extends base class 'StructVector'
Key: ARROW-10794
URL: https://issues.apache.org/jira/browse/ARROW-10794
Project: Apache Arrow
Issue Type: Bug
Components: JavaScript
Affects Versions: 2.0.0
Reporter: vikash
Attachments: Screenshot_1.png

I am trying to use apache-arrow JS in an Angular project (TypeScript version 4.0.2), and the TypeScript compilation fails.

Steps to reproduce:

1) Install the Angular CLI: npm install -g @angular/cli
2) Create a new project: ng new my-app
3) Install Apache Arrow: npm install apache-arrow
4) Add the code below to app.component.ts:

```
import { Component } from '@angular/core';
import { Table } from 'apache-arrow';
import { readFileSync } from 'fs';

@Component({
  selector: 'app-root',
  templateUrl: './app.component.html',
  styleUrls: ['./app.component.css']
})
export class AppComponent {
  title = 'arrow-typescript';
  arrow = readFileSync('simple.arrow');
  table = Table.from([this.arrow]);
}
```

Running npm run build then fails with the errors below:

Error: node_modules/apache-arrow/recordbatch.d.ts:17:18 - error TS2430: Interface 'RecordBatch' incorrectly extends interface 'StructVector'.
  The types of 'slice(...).clone' are incompatible between these types.
    Type '(data: Data>, children?: AbstractVector[] | undefined) => RecordBatch' is not assignable to type ' = Struct>(data: Data, children?: AbstractVector[] | undefined) => VectorType'.
      Types of parameters 'data' and 'data' are incompatible.
        Type 'Data' is not assignable to type 'Data>'.
          Type 'R' is not assignable to type 'Struct'.
            Property 'dataTypes' is missing in type 'DataType' but required in type 'Struct'.

17 export interface RecordBatch' incorrectly extends base class 'StructVector'.
24 export declare class RecordBatch', but here has type 'Schema'.
236 schema: Schema; ~~
node_modules/apache-arrow/ipc/reader.d.ts:189:5
189 schema: Schema; ~~
'schema' was also declared here.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-10793) [Rust] [DataFusion] Decide on CAST behaviour for invalid inputs
Andy Grove created ARROW-10793:

Summary: [Rust] [DataFusion] Decide on CAST behaviour for invalid inputs
Key: ARROW-10793
URL: https://issues.apache.org/jira/browse/ARROW-10793
Project: Apache Arrow
Issue Type: Improvement
Components: Rust - DataFusion
Reporter: Andy Grove

This is a placeholder for now. See the discussion on [https://github.com/apache/arrow/pull/8794]

Briefly, the question is whether CAST should return null for invalid inputs or raise an error. Spark behaves differently depending on whether ANSI mode is enabled. I'm not yet sure whether this is a DataFusion-specific issue or a more general Arrow issue. It needs a discussion.
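To make the two options concrete, here is a minimal Rust sketch of a string-to-integer cast that can run in either mode. It uses only the standard library; `CastMode` and `cast_utf8_to_i32` are hypothetical names invented for illustration and are not part of Arrow or DataFusion.

```rust
/// Hypothetical switch between the two behaviours under discussion.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum CastMode {
    /// Invalid inputs become null (`None`).
    Permissive,
    /// The first invalid input aborts the whole cast with an error.
    Strict,
}

/// Casts nullable strings to nullable 32-bit integers.
/// Nulls in the input always stay null; only unparseable strings
/// are affected by `mode`.
pub fn cast_utf8_to_i32(
    values: &[Option<&str>],
    mode: CastMode,
) -> Result<Vec<Option<i32>>, String> {
    values
        .iter()
        .map(|v| match v {
            None => Ok(None),
            Some(s) => match s.parse::<i32>() {
                Ok(n) => Ok(Some(n)),
                // Permissive: swallow the parse failure and emit null.
                Err(_) if mode == CastMode::Permissive => Ok(None),
                // Strict: surface the failure to the caller.
                Err(e) => Err(format!("cannot cast {:?} to Int32: {}", s, e)),
            },
        })
        .collect()
}
```

With the input `[Some("1"), Some("abc"), None]`, the permissive mode would yield one valid value followed by two nulls, while the strict mode would return an error for `"abc"`. Which of the two should be the default is exactly what the issue leaves open.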
[jira] [Created] (ARROW-10792) [Rust] [CI] Modularize CI for faster and smaller builds
Jorge Leitão created ARROW-10792:

Summary: [Rust] [CI] Modularize CI for faster and smaller builds
Key: ARROW-10792
URL: https://issues.apache.org/jira/browse/ARROW-10792
Project: Apache Arrow
Issue Type: Improvement
Components: CI, Rust
Reporter: Jorge Leitão
Assignee: Jorge Leitão
[jira] [Created] (ARROW-10791) [Rust] StreamReader, read_dictionary duplicating schema info
Carol Nichols created ARROW-10791:

Summary: [Rust] StreamReader, read_dictionary duplicating schema info
Key: ARROW-10791
URL: https://issues.apache.org/jira/browse/ARROW-10791
Project: Apache Arrow
Issue Type: Bug
Components: Rust
Reporter: Carol Nichols
Assignee: Carol Nichols

The `read_dictionary` function takes both an ipc schema and a `Schema`, but it can get the information it needs from just the `Schema`. Then the `StreamReader` doesn't need to keep the ipc schema bytes around, because it also has the `Schema`.

The Flight integration test code needs to read dictionaries as well, and it seemed overly complex to need both kinds of schemas.

PR coming!
[jira] [Created] (ARROW-10790) [C++][Compute] Investigate ChunkedArray sort performance
Antoine Pitrou created ARROW-10790:

Summary: [C++][Compute] Investigate ChunkedArray sort performance
Key: ARROW-10790
URL: https://issues.apache.org/jira/browse/ARROW-10790
Project: Apache Arrow
Issue Type: Wish
Components: C++
Reporter: Antoine Pitrou

From our micro-benchmarks, it seems that sorting a ChunkedArray can be significantly slower than sorting a linear Array of the same size. Perhaps this can be improved.
[jira] [Created] (ARROW-10789) [Rust][DataFusion] Make TableProvider dynamically typed
Remi Dettai created ARROW-10789:

Summary: [Rust][DataFusion] Make TableProvider dynamically typed
Key: ARROW-10789
URL: https://issues.apache.org/jira/browse/ARROW-10789
Project: Apache Arrow
Issue Type: Improvement
Components: Rust, Rust - DataFusion
Reporter: Remi Dettai

The {{TableProvider}} trait can be used to provide custom data sources to the query plan. For use cases like plan serialization, it can be useful to be able to downcast to the concrete implementation, the same way it is done for the {{ExecutionPlan}} trait.
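One common way to get this kind of downcasting in Rust is an `as_any` method returning `&dyn Any`. The sketch below illustrates the pattern with the standard library only. The trait shown is a simplified stand-in, not DataFusion's actual `TableProvider` definition, and `CsvTable`, `MemTable`, and `csv_path` are hypothetical names.

```rust
use std::any::Any;

/// Simplified stand-in for a table provider trait (assumption: the real
/// trait has many more methods; only the downcasting hook matters here).
pub trait TableProvider {
    /// Exposes the concrete type behind the trait object.
    fn as_any(&self) -> &dyn Any;
}

/// Hypothetical provider backed by a CSV file.
pub struct CsvTable {
    pub path: String,
}

/// Hypothetical provider backed by in-memory rows.
pub struct MemTable {
    pub rows: usize,
}

impl TableProvider for CsvTable {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl TableProvider for MemTable {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// What a plan serializer might do: recover the concrete provider to read
/// fields that the trait itself does not expose.
pub fn csv_path(provider: &dyn TableProvider) -> Option<&str> {
    provider
        .as_any()
        .downcast_ref::<CsvTable>()
        .map(|table| table.path.as_str())
}
```

Calling `csv_path` on a boxed `CsvTable` would return its path, and on any other provider it would return `None`, so callers can fall back gracefully when they meet an implementation they do not know.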
[jira] [Created] (ARROW-10788) [C++] Make S3 recursive calls parallel
Antoine Pitrou created ARROW-10788:

Summary: [C++] Make S3 recursive calls parallel
Key: ARROW-10788
URL: https://issues.apache.org/jira/browse/ARROW-10788
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Antoine Pitrou
Assignee: Antoine Pitrou

Doing a recursive S3 directory walk using GetFileInfo(Selector) currently lists all encountered directories serially, waiting for the results of one directory listing (or portion thereof) before launching the next one. Instead, we should use the Async APIs provided by the AWS SDK to parallelize HTTP requests as much as possible.
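The difference between a serial and a parallel walk can be sketched independently of S3. The Rust example below is only an illustration of the idea, not the proposed C++ implementation: it uses scoped threads from the standard library instead of the AWS SDK's Async APIs, lists one directory level at a time, and replaces the HTTP listing request with a lookup in an in-memory map. `Listing` and `walk_parallel` are hypothetical names.

```rust
use std::collections::HashMap;
use std::thread;

/// Hypothetical stand-in for the object store: maps a directory to the
/// (files, subdirectories) found directly under it. A lookup here plays
/// the role of one listing request.
pub type Listing = HashMap<String, (Vec<String>, Vec<String>)>;

/// Walks the tree under `root`, issuing the listings for all directories
/// of the current level concurrently rather than one after another.
pub fn walk_parallel(store: &Listing, root: &str) -> Vec<String> {
    let mut files = Vec::new();
    let mut frontier = vec![root.to_string()];

    while !frontier.is_empty() {
        // One thread per pending directory; all requests are in flight
        // before any result is awaited.
        let results: Vec<(Vec<String>, Vec<String>)> = thread::scope(|s| {
            let handles: Vec<_> = frontier
                .iter()
                .map(|dir| s.spawn(move || store.get(dir).cloned().unwrap_or_default()))
                .collect();
            handles
                .into_iter()
                .map(|h| h.join().expect("listing thread panicked"))
                .collect()
        });

        // Collect files and queue the newly discovered subdirectories.
        frontier.clear();
        for (found_files, found_dirs) in results {
            files.extend(found_files);
            frontier.extend(found_dirs);
        }
    }

    files.sort();
    files
}
```

A level-by-level walk still waits for the slowest listing of each level before descending. A fully asynchronous version would launch each subdirectory listing as soon as its parent's result arrives, which is closer to what the issue asks for.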
[jira] [Created] (ARROW-10787) [C++][Flight] DoExchange doesn't support dictionary replacement
Antoine Pitrou created ARROW-10787:

Summary: [C++][Flight] DoExchange doesn't support dictionary replacement
Key: ARROW-10787
URL: https://issues.apache.org/jira/browse/ARROW-10787
Project: Apache Arrow
Issue Type: Bug
Components: C++, FlightRPC
Reporter: Antoine Pitrou

Looking at the server {{DoExchangeMessageWriter}} implementation, it seems to assume that the dictionary values for a given dictionary-encoded field will never change across record batches.