[ 
https://issues.apache.org/jira/browse/ARROW-6341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francois Saint-Jacques updated ARROW-6341:
------------------------------------------
    Description: 
The following classes should be accessible from Python:
 * class DataSource
 * class DataSourceDiscovery
 * class Dataset
 * classes ScanContext, ScanOptions, ScanTask
 * class ScannerBuilder
 * class Scanner

The end goal is to read a directory of Parquet files as a single stream. It 
should then be possible to re-implement 
[https://github.com/apache/arrow/pull/5720] in Python.

  was:
The following classes should be accessible from Python:

* class DataSource
* class DataFragment
* function DiscoverySource
* class ScanContext, ScanOptions, ScanTask
* class Dataset
* class ScannerBuilder
* class Scanner

The end result is reading a directory of Parquet files as a single stream.


> [Python] Implement low-level bindings for Dataset
> -------------------------------------------------
>
>                 Key: ARROW-6341
>                 URL: https://issues.apache.org/jira/browse/ARROW-6341
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Python
>            Reporter: Francois Saint-Jacques
>            Assignee: Krisztian Szucs
>            Priority: Major
>              Labels: dataset, pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The following classes should be accessible from Python:
>  * class DataSource
>  * class DataSourceDiscovery
>  * class Dataset
>  * classes ScanContext, ScanOptions, ScanTask
>  * class ScannerBuilder
>  * class Scanner
> The end goal is to read a directory of Parquet files as a single stream. 
> It should then be possible to re-implement 
> [https://github.com/apache/arrow/pull/5720] in Python.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
