Francois Saint-Jacques created ARROW-8065:
---------------------------------------------
Summary: [C++][Dataset] Untangle Dataset, Fragment and ScanOptions
Key: ARROW-8065
URL: https://issues.apache.org/jira/browse/ARROW-8065
Project: Apache Arrow
Issue Type: Improvement
Reporter: Francois Saint-Jacques

We should be able to list fragments without going through the Scanner/ScanOptions hoops. This exposes a flaw in the current API, which requires a ScanOptions to create a Fragment. It is also a problem for ARROW-7824: why do we need a ScanOptions (read manifest) to write record batches to a given path?

# Remove {{ScanOptions}} from Fragment's properties and move it into {{Fragment::Scan}} parameters.
# Remove {{ScanOptions}} from {{Dataset::GetFragments}}. If required, we can still provide an alternate signature, e.g. {{Dataset::GetFragments(std::shared_ptr<Expression> predicate)}}, for sub-tree pruning in FileSystemDataset.
# The Fragment constructor should take a schema (and store it as a property), usually extracted from the Dataset schema. Update the {{schema()}} method accordingly.
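
The three items above could be sketched roughly as follows. This is a self-contained illustration, not the Arrow code: {{Schema}}, {{Expression}}, {{ScanOptions}}, {{Fragment}} and {{Dataset}} here are simplified stand-ins, the path-based predicate and the string "scan tasks" are assumptions made only to keep the example small, and the real signatures may well differ.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins for illustration only; these are NOT the Arrow classes.
struct Schema {
  std::vector<std::string> field_names;
};

// Hypothetical predicate: decides from a fragment's path whether it may match.
using Expression = std::function<bool(const std::string& path)>;

struct ScanOptions {
  std::vector<std::string> projected_columns;
};

class Fragment {
 public:
  // Item 3: the schema is given at construction and stored as a property.
  Fragment(std::string path, std::shared_ptr<Schema> schema)
      : path_(std::move(path)), schema_(std::move(schema)) {
    assert(schema_ != nullptr);
  }

  const std::string& path() const { return path_; }
  const std::shared_ptr<Schema>& schema() const { return schema_; }

  // Item 1: ScanOptions is a parameter of Scan, not a Fragment property.
  // The returned strings are placeholders for real scan tasks.
  std::vector<std::string> Scan(const ScanOptions& options) const {
    std::vector<std::string> tasks;
    for (const auto& column : options.projected_columns) {
      tasks.push_back(path_ + ":" + column);
    }
    return tasks;
  }

 private:
  std::string path_;
  std::shared_ptr<Schema> schema_;
};

class Dataset {
 public:
  Dataset(std::shared_ptr<Schema> schema, std::vector<std::string> paths)
      : schema_(std::move(schema)), paths_(std::move(paths)) {}

  // Item 2: listing fragments needs no ScanOptions.
  std::vector<std::shared_ptr<Fragment>> GetFragments() const {
    return GetFragments(nullptr);
  }

  // Alternate signature: a null or empty predicate keeps every fragment,
  // otherwise fragments whose path fails the predicate are pruned.
  std::vector<std::shared_ptr<Fragment>> GetFragments(
      const std::shared_ptr<Expression>& predicate) const {
    std::vector<std::shared_ptr<Fragment>> fragments;
    for (const auto& path : paths_) {
      if (predicate && *predicate && !(*predicate)(path)) continue;
      // Each fragment inherits the dataset schema.
      fragments.push_back(std::make_shared<Fragment>(path, schema_));
    }
    return fragments;
  }

 private:
  std::shared_ptr<Schema> schema_;
  std::vector<std::string> paths_;
};
```

With this shape, a caller can list fragments with {{dataset.GetFragments()}} alone and only builds a {{ScanOptions}} when it calls {{fragment->Scan(options)}}. A writer (as in ARROW-7824) would never need one.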