[jira] [Created] (ARROW-10794) Typescript Arrowjs Class 'RecordBatch' incorrectly extends base class 'StructVector'

2020-12-02 Thread vikash (Jira)
vikash created ARROW-10794:
--

 Summary: Typescript Arrowjs Class 'RecordBatch' incorrectly 
extends base class 'StructVector'
 Key: ARROW-10794
 URL: https://issues.apache.org/jira/browse/ARROW-10794
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: 2.0.0
Reporter: vikash
 Attachments: Screenshot_1.png

I am trying to use apache-arrow JS in Angular with TypeScript version 4.0.2,
and I have run into an issue where TypeScript fails to compile.

Steps to reproduce
-
1) Install the Angular CLI: npm install -g @angular/cli
2) Create a new project: ng new my-app
3) Install Apache Arrow: npm install apache-arrow
4) Add the code below to app.component.ts:
```
import { Component } from '@angular/core';
import { Table } from 'apache-arrow';
import { readFileSync } from 'fs';
@Component({
  selector: 'app-root',
  templateUrl: './app.component.html',
  styleUrls: ['./app.component.css']
})
export class AppComponent {
  title = 'arrow-typescript';
  arrow = readFileSync('simple.arrow');
  table = Table.from([this.arrow]);
}
```
 
But when I run npm run build, it fails with the error below:

Error: node_modules/apache-arrow/recordbatch.d.ts:17:18 - error TS2430: 
Interface 'RecordBatch' incorrectly extends interface 'StructVector'.
 The types of 'slice(...).clone' are incompatible between these types.
 Type '(data: Data>, children?: AbstractVector[] | undefined) => 
RecordBatch' is not assignable to type ' = 
Struct>(data: Data, children?: AbstractVector[] | undefined) => 
VectorType'.
 Types of parameters 'data' and 'data' are incompatible.
 Type 'Data' is not assignable to type 'Data>'.
 Type 'R' is not assignable to type 'Struct'.
 Property 'dataTypes' is missing in type 'DataType' but required in 
type 'Struct'.

17 export interface RecordBatch' incorrectly extends base class 'StructVector'.

24 export declare class RecordBatch', but here has type 'Schema'.

236 schema: Schema;
 ~~

node_modules/apache-arrow/ipc/reader.d.ts:189:5
 189 schema: Schema;
 ~~
 'schema' was also declared here.
 
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-10793) [Rust] [DataFusion] Decide on CAST behaviour for invalid inputs

2020-12-02 Thread Andy Grove (Jira)
Andy Grove created ARROW-10793:
--

 Summary: [Rust] [DataFusion] Decide on CAST behaviour for invalid 
inputs
 Key: ARROW-10793
 URL: https://issues.apache.org/jira/browse/ARROW-10793
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust - DataFusion
Reporter: Andy Grove


This is a placeholder for now. See discussion on 
[https://github.com/apache/arrow/pull/8794]

Briefly, the question is whether CAST should return null for invalid inputs 
or throw an error. Spark behaves differently depending on whether ANSI mode 
is enabled.

I'm not sure yet whether this is a DataFusion-specific issue or a more 
general Arrow one. It needs discussion.
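
The two candidate behaviours can be sketched as follows. This is only an
illustration of the choice under discussion, not DataFusion or Arrow code:
the names castUtf8ToInt32, parseInt32 and CastMode are hypothetical, and the
string-to-Int32 case is picked arbitrarily.

```typescript
// Hypothetical names throughout; this is not the DataFusion API.
type CastMode = "null_on_invalid" | "error_on_invalid";

const INT32_MIN = -2147483648;
const INT32_MAX = 2147483647;

// Parse one string as a 32-bit integer. Returns undefined when the text is
// not an optionally signed run of digits or is out of the Int32 range.
function parseInt32(text: string): number | undefined {
  if (!/^[+-]?\d+$/.test(text)) {
    return undefined;
  }
  const n = Number(text);
  return n >= INT32_MIN && n <= INT32_MAX ? n : undefined;
}

// Cast a column of nullable strings to nullable Int32 values.
// Null inputs stay null under both policies; only invalid inputs differ.
function castUtf8ToInt32(
  values: Array<string | null>,
  mode: CastMode
): Array<number | null> {
  const out: Array<number | null> = [];
  for (let i = 0; i < values.length; i++) {
    const v = values[i];
    if (v === null) {
      out.push(null);
      continue;
    }
    const parsed = parseInt32(v);
    if (parsed !== undefined) {
      out.push(parsed);
    } else if (mode === "null_on_invalid") {
      // Policy 1: an invalid input silently becomes null.
      out.push(null);
    } else {
      // Policy 2: an invalid input aborts the whole cast.
      throw new Error(`cannot cast '${v}' to Int32 (row ${i})`);
    }
  }
  return out;
}
```

With the first policy an invalid row is indistinguishable from a null input
in the result; with the second, one bad value fails the whole query. That
trade-off is what needs to be decided.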

 

 





[jira] [Created] (ARROW-10792) [Rust] [CI] Modularize CI for faster and smaller builds

2020-12-02 Thread Jorge Leitão (Jira)
Jorge Leitão created ARROW-10792:


 Summary: [Rust] [CI] Modularize CI for faster and smaller builds
 Key: ARROW-10792
 URL: https://issues.apache.org/jira/browse/ARROW-10792
 Project: Apache Arrow
  Issue Type: Improvement
  Components: CI, Rust
Reporter: Jorge Leitão
Assignee: Jorge Leitão








[jira] [Created] (ARROW-10791) [Rust] StreamReader, read_dictionary duplicating schema info

2020-12-02 Thread Carol Nichols (Jira)
Carol Nichols created ARROW-10791:
-

 Summary: [Rust] StreamReader, read_dictionary duplicating schema 
info
 Key: ARROW-10791
 URL: https://issues.apache.org/jira/browse/ARROW-10791
 Project: Apache Arrow
  Issue Type: Bug
  Components: Rust
Reporter: Carol Nichols
Assignee: Carol Nichols


The `read_dictionary` function takes both an ipc schema and a `Schema`, but it 
can get the information it needs from just the `Schema`. Then the 
`StreamReader` doesn't need to keep the ipc schema bytes around, because it 
also has the `Schema`.

The Flight integration test code needs to read dictionaries as well, and it 
seemed overly complex to need both kinds of schemas. PR coming!





[jira] [Created] (ARROW-10790) [C++][Compute] Investigate ChunkedArray sort performance

2020-12-02 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-10790:
--

 Summary: [C++][Compute] Investigate ChunkedArray sort performance
 Key: ARROW-10790
 URL: https://issues.apache.org/jira/browse/ARROW-10790
 Project: Apache Arrow
  Issue Type: Wish
  Components: C++
Reporter: Antoine Pitrou


From our micro-benchmarks, it seems that sorting a ChunkedArray can be 
significantly slower than sorting a linear Array of the same size. Perhaps 
this can be improved.





[jira] [Created] (ARROW-10789) [Rust][DataFusion] Make TableProvider dynamically typed

2020-12-02 Thread Remi Dettai (Jira)
Remi Dettai created ARROW-10789:
---

 Summary: [Rust][DataFusion] Make TableProvider dynamically typed
 Key: ARROW-10789
 URL: https://issues.apache.org/jira/browse/ARROW-10789
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust, Rust - DataFusion
Reporter: Remi Dettai


The {{TableProvider}} trait can be used to provide custom data sources to the 
query plan. For use cases like plan serialization, it can be useful to be able 
to downcast to the concrete implementation, the same way it is done for the 
{{ExecutionPlan}} trait.
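
As a loose analogy only, here is what "downcast to the concrete
implementation" buys a plan serializer. The sketch is in TypeScript, where a
runtime instanceof check plays the role of the downcast; the interface,
the CsvTable and MemTable classes, and serializeProvider are hypothetical
stand-ins and do not mirror DataFusion's actual types.

```typescript
// Hypothetical stand-ins; the real TableProvider is a Rust trait.
interface TableProvider {
  schema(): string[];
}

class CsvTable implements TableProvider {
  constructor(
    public readonly path: string,
    private readonly columns: string[]
  ) {}
  schema(): string[] {
    return this.columns;
  }
}

class MemTable implements TableProvider {
  constructor(private readonly columns: string[]) {}
  schema(): string[] {
    return this.columns;
  }
}

// The serializer only receives the abstract provider. To write out
// implementation-specific details (here, the CSV path) it has to recover
// the concrete type at runtime.
function serializeProvider(provider: TableProvider): string {
  if (provider instanceof CsvTable) {
    return JSON.stringify({
      kind: "csv",
      path: provider.path,
      columns: provider.schema(),
    });
  }
  if (provider instanceof MemTable) {
    return JSON.stringify({ kind: "memory", columns: provider.schema() });
  }
  throw new Error("unsupported table provider");
}
```

Without a way to get back to the concrete type, the serializer could only
see what the abstract interface exposes, which in this sketch is the schema
alone.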





[jira] [Created] (ARROW-10788) [C++] Make S3 recursive calls parallel

2020-12-02 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-10788:
--

 Summary: [C++] Make S3 recursive calls parallel
 Key: ARROW-10788
 URL: https://issues.apache.org/jira/browse/ARROW-10788
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Antoine Pitrou
Assignee: Antoine Pitrou


Doing a recursive S3 directory walk using GetFileInfo(Selector) currently lists 
all encountered directories serially, waiting for the results of one directory 
listing (or portion thereof) before launching the next one. Instead, we should 
use the Async APIs provided by the AWS SDK to parallelize HTTP requests as much 
as possible.
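
The difference between the two strategies can be sketched with a toy
directory tree. This is a TypeScript illustration of the concurrency pattern
only, not the C++ filesystem code or the AWS SDK: Listing, listDir,
walkSerial and walkParallel are hypothetical names, and listDir stands in
for one asynchronous listing request.

```typescript
// In-memory stand-in for a bucket: each directory maps to its entries.
// Entries ending in "/" are subdirectories. Everything here is hypothetical.
type Listing = { [dir: string]: string[] };

function isDir(entry: string): boolean {
  return entry.charAt(entry.length - 1) === "/";
}

// Simulates one asynchronous "list this directory" request.
function listDir(tree: Listing, dir: string): Promise<string[]> {
  return Promise.resolve(tree[dir] || []);
}

// Serial walk: each subdirectory listing is awaited before the next starts.
async function walkSerial(tree: Listing, dir: string): Promise<string[]> {
  const files: string[] = [];
  for (const entry of await listDir(tree, dir)) {
    const path = dir + entry;
    if (isDir(entry)) {
      for (const f of await walkSerial(tree, path)) {
        files.push(f);
      }
    } else {
      files.push(path);
    }
  }
  return files;
}

// Parallel walk: all subdirectory listings found at one level are started
// before any of them is awaited, so their requests can overlap.
async function walkParallel(tree: Listing, dir: string): Promise<string[]> {
  const files: string[] = [];
  const pending: Array<Promise<string[]>> = [];
  for (const entry of await listDir(tree, dir)) {
    const path = dir + entry;
    if (isDir(entry)) {
      pending.push(walkParallel(tree, path));
    } else {
      files.push(path);
    }
  }
  for (const nested of await Promise.all(pending)) {
    for (const f of nested) {
      files.push(f);
    }
  }
  return files;
}
```

Both walks return the same set of files, though not necessarily in the same
order. A real implementation would probably also need to bound the number of
requests in flight, which this sketch does not attempt.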





[jira] [Created] (ARROW-10787) [C++][Flight] DoExchange doesn't support dictionary replacement

2020-12-02 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-10787:
--

 Summary: [C++][Flight] DoExchange doesn't support dictionary 
replacement
 Key: ARROW-10787
 URL: https://issues.apache.org/jira/browse/ARROW-10787
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++, FlightRPC
Reporter: Antoine Pitrou


Looking at the server {{DoExchangeMessageWriter}} implementation, it seems to 
assume that the dictionary values for a given dictionary-encoded field will 
never change across record batches.
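
For illustration, a writer that does not make this assumption could track
the last dictionary it sent for each field and send a replacement whenever
the values differ. The sketch below is TypeScript with hypothetical message
shapes (StreamMessage, EncodedBatch, DictionaryTrackingWriter); it is not
the Flight implementation or the IPC wire format.

```typescript
// Hypothetical message and batch shapes, not the Flight or IPC format.
type StreamMessage =
  | { kind: "dictionary"; fieldId: number; values: string[] }
  | { kind: "batch"; indices: { [fieldId: number]: number[] } };

interface EncodedBatch {
  dictionaries: { [fieldId: number]: string[] };
  indices: { [fieldId: number]: number[] };
}

function sameValues(a: string[], b: string[]): boolean {
  if (a.length !== b.length) {
    return false;
  }
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) {
      return false;
    }
  }
  return true;
}

class DictionaryTrackingWriter {
  // Last dictionary values sent for each field id.
  private lastSent: { [fieldId: number]: string[] } = {};
  readonly out: StreamMessage[] = [];

  write(batch: EncodedBatch): void {
    for (const key of Object.keys(batch.dictionaries)) {
      const fieldId = Number(key);
      const values = batch.dictionaries[fieldId];
      const previous = this.lastSent[fieldId];
      // Send the dictionary the first time, and again whenever it changed.
      if (previous === undefined || !sameValues(previous, values)) {
        this.out.push({ kind: "dictionary", fieldId: fieldId, values: values });
        this.lastSent[fieldId] = values.slice();
      }
    }
    this.out.push({ kind: "batch", indices: batch.indices });
  }
}
```

A writer that only sends each dictionary once would emit the first
dictionary message and then batches only, so a reader would decode later
batches against stale values.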


