[ https://issues.apache.org/jira/browse/ARROW-15409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640873#comment-17640873 ]
Apache Arrow JIRA Bot commented on ARROW-15409:
-----------------------------------------------

This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned per [project policy|https://arrow.apache.org/docs/dev/developers/bug_reports.html#issue-assignment]. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.

> [C++] The C++ API for writing datasets could be improved
> --------------------------------------------------------
>
>                 Key: ARROW-15409
>                 URL: https://issues.apache.org/jira/browse/ARROW-15409
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Weston Pace
>            Assignee: Alvin Chunga Mamani
>            Priority: Major
>              Labels: good-first-issue, pull-request-available
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> I was working on write dataset testing in the C++ API today and ran into a
> number of things that were not very intuitive. All of these are abstracted
> away / hidden by the Python / R interface, so this really only applies to
> anyone using the C++ API directly.
> * If no partitioning is specified, the write will segfault. Instead it
> should use a default (no-op) partitioning.
> * The min_rows_per_group option should probably default to something higher
> than 0.
> * It's not clear how to specify the format (you do it by creating a format,
> then setting the file write options, which sets the format privately).
> * There is no default for basename_template.
> * There is no default for filesystem (should be local filesystem).

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
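
For readers following the issue, the defaults requested in the bullet points can be sketched in isolation. This is only an illustration under stated assumptions, not Arrow's actual API. `WriteOptionsSketch`, `ResolvedOptions`, `ResolveDefaults`, and `kAssumedMinRowsPerGroup` are hypothetical names. The strings "none" and "local" stand in for real partitioning and filesystem objects. The value 1024 is arbitrary, since the issue only asks for "something higher than 0". The `part-{i}.<format>` template is likewise an assumed default.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-in for a dataset write-options struct.
// None of these names are Arrow's real API; they only mirror the issue's bullets.
struct WriteOptionsSketch {
  std::string base_dir;                            // required, no sensible default
  std::string format = "parquet";                  // format named directly on the options
  std::optional<std::string> partitioning;         // unset: use a no-op partitioning
  std::optional<std::string> filesystem;           // unset: use the local filesystem
  std::optional<std::string> basename_template;    // unset: derive one from the format
  std::optional<std::int64_t> min_rows_per_group;  // unset: use a value above 0
};

// Fully populated options that a writer could consume without null checks.
struct ResolvedOptions {
  std::string base_dir;
  std::string format;
  std::string partitioning;
  std::string filesystem;
  std::string basename_template;
  std::int64_t min_rows_per_group = 0;
};

// Illustrative value only; the issue does not name a number.
constexpr std::int64_t kAssumedMinRowsPerGroup = 1024;

// Fills every unset field with a default instead of leaving it empty,
// so a missing partitioning can no longer reach the writer as a null.
ResolvedOptions ResolveDefaults(const WriteOptionsSketch& in) {
  assert(!in.base_dir.empty() && "base_dir must be set by the caller");
  ResolvedOptions out;
  out.base_dir = in.base_dir;
  out.format = in.format;
  out.partitioning = in.partitioning.value_or("none");
  out.filesystem = in.filesystem.value_or("local");
  out.basename_template = in.basename_template.value_or("part-{i}." + in.format);
  out.min_rows_per_group = in.min_rows_per_group.value_or(kAssumedMinRowsPerGroup);
  return out;
}
```

With a resolution step like this, a caller would set only `base_dir` and pass the result of `ResolveDefaults` to the writer. Explicitly set fields are passed through unchanged.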