[jira] [Resolved] (ARROW-18341) [Doc][Python] Update note about bundling Arrow C++ on Windows
[ https://issues.apache.org/jira/browse/ARROW-18341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche resolved ARROW-18341. --- Resolution: Fixed Issue resolved by pull request 14660 [https://github.com/apache/arrow/pull/14660] > [Doc][Python] Update note about bundling Arrow C++ on Windows > - > > Key: ARROW-18341 > URL: https://issues.apache.org/jira/browse/ARROW-18341 > Project: Apache Arrow > Issue Type: Improvement > Components: Documentation, Python >Reporter: Alenka Frim >Assignee: Alenka Frim >Priority: Major > Labels: pull-request-available > Fix For: 11.0.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > There is a note on the Python development page under the Windows section about > bundling the Arrow C++ libraries with Python extensions: > [https://arrow.apache.org/docs/dev/developers/python.html#building-on-windows] > This note can be revised: > * If you are using conda, the fact that Arrow C++ libs are not bundled is > fine since conda will ensure those libs are found. > * If you are not using conda, you have to ensure those libs can be found: > either by updating {{PATH}} (every time before importing pyarrow), or > by bundling them (... using the {{PYARROW_BUNDLE_ARROW_CPP}} env variable > instead of {{--bundle-arrow-cpp}}). The caveat is that the bundled libs won't be > automatically updated when the arrow-cpp libs are rebuilt. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (ARROW-18225) [Python] write_metadata does not fully use **kwargs
[ https://issues.apache.org/jira/browse/ARROW-18225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche resolved ARROW-18225. --- Fix Version/s: 11.0.0 Resolution: Fixed Issue resolved by pull request 14574 [https://github.com/apache/arrow/pull/14574] > [Python] write_metadata does not fully use **kwargs > --- > > Key: ARROW-18225 > URL: https://issues.apache.org/jira/browse/ARROW-18225 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Reporter: François Chareyron >Assignee: Miles Granger >Priority: Major > Labels: pull-request-available > Fix For: 11.0.0 > > Time Spent: 3h 10m > Remaining Estimate: 0h > > When using {{write_metadata}}, {{kwargs}} can be used to pass a FileSystem to > a ParquetWriter. However, those {{kwargs}} are not passed to > {{read_metadata}} later on despite the function accepting a filesystem > argument. > This creates an error when trying to write metadata on a S3FileSystem for > example. > {code:python} > def write_metadata(schema, where, metadata_collector=None, **kwargs): > writer = ParquetWriter(where, schema, **kwargs) > writer.close() > if metadata_collector is not None: > metadata = read_metadata(where) # kwargs should be passed here > for m in metadata_collector: > metadata.append_row_groups(m) > metadata.write_metadata_file(where) # kwargs should be passed here > {code} > {code:python} > def read_metadata(where, memory_map=False, decryption_properties=None, > filesystem=None): > ...{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
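The forwarding problem described in ARROW-18225 can be illustrated with a minimal sketch. It is not pyarrow's code and not necessarily what pull request 14574 does: `ParquetWriter` and `read_metadata` below are simplified stand-ins that only record their arguments, the explicit `filesystem` parameter on `write_metadata` is an assumption made for illustration, and the final `write_metadata_file` step from the quoted code is left out.

```python
# Stand-in sketch: ParquetWriter and read_metadata below are simplified stubs
# that only record their arguments; they are NOT pyarrow's implementations.
calls = []


class ParquetWriter:
    def __init__(self, where, schema, filesystem=None, **options):
        calls.append(("ParquetWriter", where, filesystem))

    def close(self):
        pass


def read_metadata(where, memory_map=False, decryption_properties=None,
                  filesystem=None):
    calls.append(("read_metadata", where, filesystem))
    return {"row_groups": []}  # stand-in for a metadata object


def write_metadata(schema, where, metadata_collector=None, filesystem=None,
                   **kwargs):
    # Forward the filesystem to the writer, as the quoted code already does
    # through **kwargs.
    writer = ParquetWriter(where, schema, filesystem=filesystem, **kwargs)
    writer.close()

    if metadata_collector is not None:
        # The step the report says is missing: reuse the same filesystem
        # when reading the file back.
        metadata = read_metadata(where, filesystem=filesystem)
        for m in metadata_collector:
            metadata["row_groups"].append(m)
        return metadata
    return None


# A string stands in for a filesystem object such as an S3FileSystem.
result = write_metadata("schema", "bucket/_metadata",
                        metadata_collector=["rg0", "rg1"],
                        filesystem="fake-s3")
```

Naming `filesystem` explicitly, rather than leaving it inside `**kwargs`, is one way to make sure the same object reaches both the writer and the later read.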
[jira] [Resolved] (ARROW-18366) [Packaging][RPM][Gandiva] Failed to link on AlmaLinux 9
[ https://issues.apache.org/jira/browse/ARROW-18366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kouhei Sutou resolved ARROW-18366. -- Fix Version/s: 11.0.0 Resolution: Fixed Issue resolved by pull request 14680 [https://github.com/apache/arrow/pull/14680] > [Packaging][RPM][Gandiva] Failed to link on AlmaLinux 9 > > > Key: ARROW-18366 > URL: https://issues.apache.org/jira/browse/ARROW-18366 > Project: Apache Arrow > Issue Type: Improvement > Components: Packaging >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Major > Labels: pull-request-available > Fix For: 11.0.0 > > Time Spent: 1h > Remaining Estimate: 0h > > https://github.com/ursacomputing/crossbow/actions/runs/3502784911/jobs/5867407921#step:6:4748 > {noformat} > FAILED: gandiva-glib/Gandiva-1.0.gir > env > PKG_CONFIG_PATH=/usr/lib64/pkgconfig:/usr/share/pkgconfig:/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/meson-uninstalled > /usr/bin/g-ir-scanner --quiet --no-libtool --namespace=Gandiva > --nsversion=1.0 --warn-all --output gandiva-glib/Gandiva-1.0.gir > --c-include=gandiva-glib/gandiva-glib.h --warn-all > --include-uninstalled=./arrow-glib/Arrow-1.0.gir > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/gandiva-glib > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/gandiva-glib > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/. > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/. > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/../cpp/redhat-linux-build/src > > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/../cpp/redhat-linux-build/src > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/../cpp/src > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/../cpp/src > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/. > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/. 
> -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/../cpp/redhat-linux-build/src > > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/../cpp/redhat-linux-build/src > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/../cpp/src > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/../cpp/src > --filelist=/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/gandiva-glib/libgandiva-glib.so.1100.0.0.p/Gandiva_1.0_gir_filelist > --include=Arrow-1.0 --symbol-prefix=ggandiva --identifier-prefix=GGandiva > --pkg-export=gandiva-glib --cflags-begin > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/. > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/. > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/../cpp/redhat-linux-build/src > > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/../cpp/redhat-linux-build/src > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/../cpp/src > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/../cpp/src > -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include > -I/usr/include/sysprof-4 -I/usr/include/gobject-introspection-1.0 > --cflags-end > --add-include-path=/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/arrow-glib > --add-include-path=/usr/share/gir-1.0 > -L/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/gandiva-glib > --library gandiva-glib > -L/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/arrow-glib > -L/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/../../cpp/redhat-linux-build/release > --extra-library=gobject-2.0 --extra-library=glib-2.0 > --extra-library=girepository-1.0 --sources-top-dirs > /build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/ --sources-top-dirs > /build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/ --warn-error > /usr/bin/ld: > 
/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/../../cpp/redhat-linux-build/release/libgandiva.so.1100: > undefined reference to `std::__glibcxx_assert_fail(char const*, int, char > const*, char const*)' > collect2: error: ld returned 1 exit status > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ARROW-18366) [Packaging][RPM][Gandiva] Failed to link on AlmaLinux 9
[ https://issues.apache.org/jira/browse/ARROW-18366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kouhei Sutou updated ARROW-18366: - Fix Version/s: 10.0.2 > [Packaging][RPM][Gandiva] Failed to link on AlmaLinux 9 > > > Key: ARROW-18366 > URL: https://issues.apache.org/jira/browse/ARROW-18366 > Project: Apache Arrow > Issue Type: Improvement > Components: Packaging >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Major > Labels: pull-request-available > Fix For: 10.0.2, 11.0.0 > > Time Spent: 1h > Remaining Estimate: 0h > > https://github.com/ursacomputing/crossbow/actions/runs/3502784911/jobs/5867407921#step:6:4748 > {noformat} > FAILED: gandiva-glib/Gandiva-1.0.gir > env > PKG_CONFIG_PATH=/usr/lib64/pkgconfig:/usr/share/pkgconfig:/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/meson-uninstalled > /usr/bin/g-ir-scanner --quiet --no-libtool --namespace=Gandiva > --nsversion=1.0 --warn-all --output gandiva-glib/Gandiva-1.0.gir > --c-include=gandiva-glib/gandiva-glib.h --warn-all > --include-uninstalled=./arrow-glib/Arrow-1.0.gir > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/gandiva-glib > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/gandiva-glib > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/. > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/. > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/../cpp/redhat-linux-build/src > > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/../cpp/redhat-linux-build/src > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/../cpp/src > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/../cpp/src > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/. > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/. 
> -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/../cpp/redhat-linux-build/src > > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/../cpp/redhat-linux-build/src > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/../cpp/src > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/../cpp/src > --filelist=/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/gandiva-glib/libgandiva-glib.so.1100.0.0.p/Gandiva_1.0_gir_filelist > --include=Arrow-1.0 --symbol-prefix=ggandiva --identifier-prefix=GGandiva > --pkg-export=gandiva-glib --cflags-begin > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/. > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/. > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/../cpp/redhat-linux-build/src > > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/../cpp/redhat-linux-build/src > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/../cpp/src > -I/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/../cpp/src > -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include > -I/usr/include/sysprof-4 -I/usr/include/gobject-introspection-1.0 > --cflags-end > --add-include-path=/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/arrow-glib > --add-include-path=/usr/share/gir-1.0 > -L/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/gandiva-glib > --library gandiva-glib > -L/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/arrow-glib > -L/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/../../cpp/redhat-linux-build/release > --extra-library=gobject-2.0 --extra-library=glib-2.0 > --extra-library=girepository-1.0 --sources-top-dirs > /build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/ --sources-top-dirs > /build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/ --warn-error > /usr/bin/ld: > 
/build/rpmbuild/BUILD/apache-arrow-11.0.0.dev130/c_glib/build/../../cpp/redhat-linux-build/release/libgandiva.so.1100: > undefined reference to `std::__glibcxx_assert_fail(char const*, int, char > const*, char const*)' > collect2: error: ld returned 1 exit status > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (ARROW-18355) [R] support the quoted_na argument in open_dataset for CSVs by mapping it to CSVConvertOptions$strings_can_be_null
[ https://issues.apache.org/jira/browse/ARROW-18355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636936#comment-17636936 ] Nicola Crane edited comment on ARROW-18355 at 11/22/22 12:27 AM: - Nah, I don't think it's worth us spending time adding and then later removing it unless users are clamouring for it, which I don't see here. Thanks for looking into this! was (Author: thisisnic): Nah, I don't think it's worth us spending time adding and then later removing it unless users are clamouring for it, which I don't see here. > [R] support the quoted_na argument in open_dataset for CSVs by mapping it to > CSVConvertOptions$strings_can_be_null > -- > > Key: ARROW-18355 > URL: https://issues.apache.org/jira/browse/ARROW-18355 > Project: Apache Arrow > Issue Type: Sub-task > Components: R >Reporter: Nicola Crane >Assignee: Will Jones >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (ARROW-18355) [R] support the quoted_na argument in open_dataset for CSVs by mapping it to CSVConvertOptions$strings_can_be_null
[ https://issues.apache.org/jira/browse/ARROW-18355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicola Crane closed ARROW-18355. Resolution: Won't Fix > [R] support the quoted_na argument in open_dataset for CSVs by mapping it to > CSVConvertOptions$strings_can_be_null > -- > > Key: ARROW-18355 > URL: https://issues.apache.org/jira/browse/ARROW-18355 > Project: Apache Arrow > Issue Type: Sub-task > Components: R >Reporter: Nicola Crane >Assignee: Will Jones >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-18355) [R] support the quoted_na argument in open_dataset for CSVs by mapping it to CSVConvertOptions$strings_can_be_null
[ https://issues.apache.org/jira/browse/ARROW-18355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636936#comment-17636936 ] Nicola Crane commented on ARROW-18355: -- Nah, I don't think it's worth us spending time adding and then later removing it unless users are clamouring for it, which I don't see here. > [R] support the quoted_na argument in open_dataset for CSVs by mapping it to > CSVConvertOptions$strings_can_be_null > -- > > Key: ARROW-18355 > URL: https://issues.apache.org/jira/browse/ARROW-18355 > Project: Apache Arrow > Issue Type: Sub-task > Components: R >Reporter: Nicola Crane >Assignee: Will Jones >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (ARROW-15812) [R] Allow user to supply col_names argument when reading in a CSV dataset
[ https://issues.apache.org/jira/browse/ARROW-15812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Jones reassigned ARROW-15812: -- Assignee: Will Jones > [R] Allow user to supply col_names argument when reading in a CSV dataset > - > > Key: ARROW-15812 > URL: https://issues.apache.org/jira/browse/ARROW-15812 > Project: Apache Arrow > Issue Type: Sub-task > Components: R >Reporter: Nicola Crane >Assignee: Will Jones >Priority: Major > > Allow the user to supply the {{col_names}} argument from {{readr}} when > reading in a dataset. > This is already possible when reading in a single CSV file with > {{arrow::read_csv_arrow()}}, which uses the {{readr_to_csv_read_options}} function, > and so once the C++ functionality to autogenerate column names for Datasets > is implemented, we should hook up {{readr_to_csv_read_options}} in > {{csv_file_format_read_opts}} just like we have with > {{readr_to_csv_parse_options}} in {{csv_file_format_parse_options}}. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-15812) [R] Allow user to supply col_names argument when reading in a CSV dataset
[ https://issues.apache.org/jira/browse/ARROW-15812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636918#comment-17636918 ] Will Jones commented on ARROW-15812: Auto-generation of column names was added to Datasets in https://issues.apache.org/jira/browse/ARROW-16436 > [R] Allow user to supply col_names argument when reading in a CSV dataset > - > > Key: ARROW-15812 > URL: https://issues.apache.org/jira/browse/ARROW-15812 > Project: Apache Arrow > Issue Type: Sub-task > Components: R >Reporter: Nicola Crane >Priority: Major > > Allow the user to supply the {{col_names}} argument from {{readr}} when > reading in a dataset. > This is already possible when reading in a single CSV file with > {{arrow::read_csv_arrow()}}, which uses the {{readr_to_csv_read_options}} function, > and so once the C++ functionality to autogenerate column names for Datasets > is implemented, we should hook up {{readr_to_csv_read_options}} in > {{csv_file_format_read_opts}} just like we have with > {{readr_to_csv_parse_options}} in {{csv_file_format_parse_options}}. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ARROW-18378) MIGRATION: Disable issue reporting in ASF Jira
Todd Farmer created ARROW-18378: --- Summary: MIGRATION: Disable issue reporting in ASF Jira Key: ARROW-18378 URL: https://issues.apache.org/jira/browse/ARROW-18378 Project: Apache Arrow Issue Type: Task Reporter: Todd Farmer ARROW-18364 enabled issue reporting for Apache Arrow in GitHub issues. Even though existing Jira issues have not yet been migrated and are still being worked in the Jira system, we should assess disabling creation of new issues in ASF Jira, and instead pointing users to GitHub issues. This may benefit the project by reducing the need to monitor inflow in two discrete systems. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ARROW-18377) MIGRATION: Automate component labels from issue form content
Todd Farmer created ARROW-18377: --- Summary: MIGRATION: Automate component labels from issue form content Key: ARROW-18377 URL: https://issues.apache.org/jira/browse/ARROW-18377 Project: Apache Arrow Issue Type: Task Reporter: Todd Farmer ARROW-18364 added the ability to report issues in GitHub, and includes GitHub issue templates with a drop-down component(s) selector. These form elements drive the resulting issue markdown only, and cannot dynamically drive issue labels. Setting labels from the form content requires GitHub actions, which also have a few limitations. First, the issue form does not produce any structured data; it only produces the issue description markdown, so a parser is required. Second, ASF restricts GitHub actions to a selection of approved actions. It is likely that while community actions exist to generate structured data from issue forms, the Apache Arrow project will need to write its own parser and label application action. -- This message was sent by Atlassian Jira (v8.20.10#820010)
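The parser mentioned in ARROW-18377 could look roughly like the sketch below. It rests on several assumptions that are not stated in the issue: that the issue-form body renders each field as a "### <field label>" heading followed by the response, that a multi-select drop-down is rendered as a comma-separated line, and that the field is labelled "Component(s)". The function names and the shortened component list are hypothetical, and the "Component: <name>" label format follows the list in ARROW-18376.

```python
import re

# Illustrative subset of component names; the full list is in ARROW-18376.
KNOWN_COMPONENTS = {"C++", "Documentation", "Go", "Python", "R"}


def parse_issue_form(body):
    """Split an issue body into {heading: response text}.

    Assumes each form field appears as a '### <heading>' line followed by
    the user's response.
    """
    sections = {}
    current = None
    for line in body.splitlines():
        match = re.match(r"^###\s+(.*\S)\s*$", line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}


def component_labels(body, heading="Component(s)"):
    """Map the selected components to 'Component: <name>' label strings."""
    response = parse_issue_form(body).get(heading, "")
    names = [part.strip() for part in response.split(",")]
    # Ignore anything that is not a known component (e.g. an empty response).
    return [f"Component: {name}" for name in names if name in KNOWN_COMPONENTS]


example_body = (
    "### Describe the bug\n\n"
    "Something broke.\n\n"
    "### Component(s)\n\n"
    "Python, Documentation\n"
)
labels = component_labels(example_body)
```

Filtering against a fixed set of known components keeps free-text responses from turning into arbitrary labels. Applying the labels to the issue would still need a separate step, which is not sketched here.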
[jira] [Commented] (ARROW-18116) [R][Doc] correct paths for the read_parquet examples in cloud storage vignette
[ https://issues.apache.org/jira/browse/ARROW-18116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636887#comment-17636887 ] Sam Albers commented on ARROW-18116: This was fixed with https://issues.apache.org/jira/browse/ARROW-17448 so it can probably be closed as it is correct on the main docs now: https://arrow.apache.org/docs/r/articles/fs.html > [R][Doc] correct paths for the read_parquet examples in cloud storage vignette > -- > > Key: ARROW-18116 > URL: https://issues.apache.org/jira/browse/ARROW-18116 > Project: Apache Arrow > Issue Type: Bug > Components: Documentation, R >Reporter: Stephanie Hazlitt >Priority: Major > Labels: triaged > > {{The S3 file paths don't run:}} > {code:java} > > library(arrow) > > read_parquet(file = > > "s3://voltrondata-labs-datasets/nyc-taxi/year=2019/month=6/data.parquet") > Error in url(file, open = "rb") : URL scheme unsupported by this method{code} > {{It looks like the file names are `part-0.parquet` not `data.parquet`.}} > {{This runs:}} > {code:java} > read_parquet(file = > "s3://voltrondata-labs-datasets/nyc-taxi/year=2019/month=6/part-0.parquet"){code} > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (ARROW-18303) [Go] Missing tag for compute module
[ https://issues.apache.org/jira/browse/ARROW-18303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Topol resolved ARROW-18303. --- Fix Version/s: 11.0.0 Resolution: Fixed Issue resolved by pull request 14690 [https://github.com/apache/arrow/pull/14690] > [Go] Missing tag for compute module > --- > > Key: ARROW-18303 > URL: https://issues.apache.org/jira/browse/ARROW-18303 > Project: Apache Arrow > Issue Type: Improvement > Components: Go >Affects Versions: 10.0.0 >Reporter: Lilian Maurel >Assignee: Matthew Topol >Priority: Major > Labels: pull-request-available > Fix For: 11.0.0 > > Original Estimate: 1h > Time Spent: 0.5h > Remaining Estimate: 0.5h > > Since https://issues.apache.org/jira/browse/ARROW-17456 compute has been split > into a separate module. > > The import changes from github.com/apache/arrow/go/v9/arrow/compute to > github.com/apache/arrow/go/arrow/compute/v10 > > Tag go/arrow/compute/v10.0.0 must be created for go mod resolution > > Also in go.mod > the line module github.com/apache/arrow/go/v10/arrow/compute > must be changed to module github.com/apache/arrow/go/arrow/compute/v10 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ARROW-18376) MIGRATION: Add component labels to GitHub
Todd Farmer created ARROW-18376: --- Summary: MIGRATION: Add component labels to GitHub Key: ARROW-18376 URL: https://issues.apache.org/jira/browse/ARROW-18376 Project: Apache Arrow Issue Type: Task Reporter: Todd Farmer Similar to ARROW-18375, component labels have been established based on existing component values defined in ASF Jira. The following labels are needed: * Component: Archery * Component: Benchmarking * Component: C * Component: C# * Component: C++ * Component: C++ - Gandiva * Component: C++ - Plasma * Component: Continuous Integration * Component: Dart * Component: Developer Tools * Component: Documentation * Component: FlightRPC * Component: Format * Component: GLib * Component: Go * Component: GPU * Component: Integration * Component: Java * Component: JavaScript * Component: MATLAB * Component: Packaging * Component: Parquet * Component: Python * Component: R * Component: Ruby * Component: Swift * Component: Website * Component: Other -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ARROW-18375) MIGRATION: Enable GitHub issue type labels
Todd Farmer created ARROW-18375: --- Summary: MIGRATION: Enable GitHub issue type labels Key: ARROW-18375 URL: https://issues.apache.org/jira/browse/ARROW-18375 Project: Apache Arrow Issue Type: Task Reporter: Todd Farmer As part of enabling GitHub issue reporting, the following labels have been defined and need to be added to the repository label options. Without these labels added, [new issues|https://github.com/apache/arrow/issues/14692] do not get the issue template-defined issue type labels set properly. Labels: * Type: bug * Type: enhancement * Type: usage -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (ARROW-17610) [C++] Support additional source types in SourceNode
[ https://issues.apache.org/jira/browse/ARROW-17610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weston Pace resolved ARROW-17610. - Fix Version/s: 11.0.0 Resolution: Fixed Issue resolved by pull request 14207 [https://github.com/apache/arrow/pull/14207] > [C++] Support additional source types in SourceNode > --- > > Key: ARROW-17610 > URL: https://issues.apache.org/jira/browse/ARROW-17610 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Yaron Gvili >Assignee: Yaron Gvili >Priority: Major > Labels: pull-request-available > Fix For: 11.0.0 > > Time Spent: 5.5h > Remaining Estimate: 0h > > This issue will add support for `ArrayVector`, `ExecBatch`, and `RecordBatch` > sources in `SourceNode`. See [this > thread|https://lists.apache.org/thread/9l23c0w48ywx314klbyshz8ntyzgs1zw] for > context. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-18303) [Go] Missing tag for compute module
[ https://issues.apache.org/jira/browse/ARROW-18303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636826#comment-17636826 ] Matthew Topol commented on ARROW-18303: --- Instead of keeping the compute module as a separate module, the attached PR marks every file in the compute package and its sub-packages as only buildable in go1.18+ so that it can be maintained as part of the arrow module (rather than requiring separate git tags and an entirely separate module) without breaking compatibility for earlier Go versions. The module docs have been updated to make this explicit and state that the {{compute}} package requires go1.18, but the rest of the Arrow module maintains compatibility with Go 1.17+ > [Go] Missing tag for compute module > --- > > Key: ARROW-18303 > URL: https://issues.apache.org/jira/browse/ARROW-18303 > Project: Apache Arrow > Issue Type: Improvement > Components: Go >Affects Versions: 10.0.0 >Reporter: Lilian Maurel >Assignee: Matthew Topol >Priority: Major > Labels: pull-request-available > Original Estimate: 1h > Time Spent: 20m > Remaining Estimate: 40m > > Since https://issues.apache.org/jira/browse/ARROW-17456 compute has been split > into a separate module. > > The import changes from github.com/apache/arrow/go/v9/arrow/compute to > github.com/apache/arrow/go/arrow/compute/v10 > > Tag go/arrow/compute/v10.0.0 must be created for go mod resolution > > Also in go.mod > the line module github.com/apache/arrow/go/v10/arrow/compute > must be changed to module github.com/apache/arrow/go/arrow/compute/v10 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (ARROW-18374) [Go][CI][Benchmarks] Fix Go Bench Script after conbench change
[ https://issues.apache.org/jira/browse/ARROW-18374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Topol resolved ARROW-18374. --- Fix Version/s: 11.0.0 Resolution: Fixed Issue resolved by pull request 14689 [https://github.com/apache/arrow/pull/14689] > [Go][CI][Benchmarks] Fix Go Bench Script after conbench change > -- > > Key: ARROW-18374 > URL: https://issues.apache.org/jira/browse/ARROW-18374 > Project: Apache Arrow > Issue Type: Bug > Components: Benchmarking, Continuous Integration, Go >Reporter: Matthew Topol >Assignee: Matthew Topol >Priority: Major > Labels: pull-request-available > Fix For: 11.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Change [https://github.com/conbench/conbench/pull/417/files#] now requires > passing an explicit {{github=None}} argument to {{BenchmarkResult}} so that > it gets the github info from the locally cloned repo. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-18371) [C++] Expose *FromJSON helpers
[ https://issues.apache.org/jira/browse/ARROW-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636820#comment-17636820 ] Weston Pace commented on ARROW-18371: - {{MakeBasicBatches}} I agree is a definite no. The new source types being added in ARROW-17610 should remove any real need for {{BatchesWithSchema}} so I think that is a no too. I agree the random data generation could be quite useful. I think it would also be quite interesting to expose random data generation as an exec node but probably shouldn't add more work :) > [C++] Expose *FromJSON helpers > -- > > Key: ARROW-18371 > URL: https://issues.apache.org/jira/browse/ARROW-18371 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Rok Mihevc >Priority: Major > Labels: testing > > The {{ArrayFromJSON}}, {{ExecBatchFromJSON}} and {{RecordBatchFromJSON}} helper functions would be useful when > testing in projects that use Arrow. BatchesWithSchema and MakeBasicBatches > could be considered as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ARROW-18303) [Go] Missing tag for compute module
[ https://issues.apache.org/jira/browse/ARROW-18303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-18303: --- Labels: pull-request-available (was: ) > [Go] Missing tag for compute module > --- > > Key: ARROW-18303 > URL: https://issues.apache.org/jira/browse/ARROW-18303 > Project: Apache Arrow > Issue Type: Improvement > Components: Go >Affects Versions: 10.0.0 >Reporter: Lilian Maurel >Assignee: Matthew Topol >Priority: Major > Labels: pull-request-available > Original Estimate: 1h > Time Spent: 10m > Remaining Estimate: 50m > > Since https://issues.apache.org/jira/browse/ARROW-17456 compute has been split > into a separate module. > > The import changes from github.com/apache/arrow/go/v9/arrow/compute to > github.com/apache/arrow/go/arrow/compute/v10 > > Tag go/arrow/compute/v10.0.0 must be created for go mod resolution > > Also in go.mod > the line module github.com/apache/arrow/go/v10/arrow/compute > must be changed to module github.com/apache/arrow/go/arrow/compute/v10 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ARROW-18374) [Go][CI][Benchmarks] Fix Go Bench Script after conbench change
[ https://issues.apache.org/jira/browse/ARROW-18374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-18374: --- Labels: pull-request-available (was: ) > [Go][CI][Benchmarks] Fix Go Bench Script after conbench change > -- > > Key: ARROW-18374 > URL: https://issues.apache.org/jira/browse/ARROW-18374 > Project: Apache Arrow > Issue Type: Bug > Components: Benchmarking, Continuous Integration, Go >Reporter: Matthew Topol >Assignee: Matthew Topol >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Change [https://github.com/conbench/conbench/pull/417/files#] now requires > passing an explicit {{github=None}} argument to {{BenchmarkResult}} so that > it gets the github info from the locally cloned repo. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ARROW-18374) [Go][CI][Benchmarks] Fix Go Bench Script after conbench change
Matthew Topol created ARROW-18374: - Summary: [Go][CI][Benchmarks] Fix Go Bench Script after conbench change Key: ARROW-18374 URL: https://issues.apache.org/jira/browse/ARROW-18374 Project: Apache Arrow Issue Type: Bug Components: Benchmarking, Continuous Integration, Go Reporter: Matthew Topol Assignee: Matthew Topol Change [https://github.com/conbench/conbench/pull/417/files#] now requires passing an explicit {{github=None}} argument to {{BenchmarkResult}} so that it gets the github info from the locally cloned repo. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (ARROW-18343) [C++] AllocateBitmap() with out parameter is declared but not defined
[ https://issues.apache.org/jira/browse/ARROW-18343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-18343. Fix Version/s: 11.0.0 Resolution: Fixed Issue resolved by pull request 14657 [https://github.com/apache/arrow/pull/14657] > [C++] AllocateBitmap() with out parameter is declared but not defined > - > > Key: ARROW-18343 > URL: https://issues.apache.org/jira/browse/ARROW-18343 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: Jin Shang >Assignee: Jin Shang >Priority: Major > Labels: pull-request-available > Fix For: 11.0.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > [This variant of > AllocateBitmap|https://github.com/apache/arrow/blob/master/cpp/src/arrow/buffer.h#L483] > is declared but not defined in buffer.cc. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (ARROW-17504) [C++] Adding Fetch Node ToProto
[ https://issues.apache.org/jira/browse/ARROW-17504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Arrow JIRA Bot reassigned ARROW-17504: - Assignee: (was: Vibhatha Lakmal Abeykoon) > [C++] Adding Fetch Node ToProto > --- > > Key: ARROW-17504 > URL: https://issues.apache.org/jira/browse/ARROW-17504 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Vibhatha Lakmal Abeykoon >Priority: Major > > Add serialization of the Fetch declaration for roundtrip testing. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-17502) [C++] Fetch Node Substrait Integration
[ https://issues.apache.org/jira/browse/ARROW-17502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636797#comment-17636797 ] Apache Arrow JIRA Bot commented on ARROW-17502: --- This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned per [project policy|https://arrow.apache.org/docs/dev/developers/bug_reports.html#issue-assignment]. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon. > [C++] Fetch Node Substrait Integration > -- > > Key: ARROW-17502 > URL: https://issues.apache.org/jira/browse/ARROW-17502 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Vibhatha Lakmal Abeykoon >Assignee: Vibhatha Lakmal Abeykoon >Priority: Major > > Fetch Node is a newly added node (WIP at the moment[1]). After finalizing the > Fetch node creation, this needs to be integrated with Substrait. > > [1].[Fetch Node Creation |https://issues.apache.org/jira/browse/ARROW-17190] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-17504) [C++] Adding Fetch Node ToProto
[ https://issues.apache.org/jira/browse/ARROW-17504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636796#comment-17636796 ] Apache Arrow JIRA Bot commented on ARROW-17504: --- This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned per [project policy|https://arrow.apache.org/docs/dev/developers/bug_reports.html#issue-assignment]. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon. > [C++] Adding Fetch Node ToProto > --- > > Key: ARROW-17504 > URL: https://issues.apache.org/jira/browse/ARROW-17504 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Vibhatha Lakmal Abeykoon >Assignee: Vibhatha Lakmal Abeykoon >Priority: Major > > For roundtrip testing adding the Fetch declaration serialization. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (ARROW-17502) [C++] Fetch Node Substrait Integration
[ https://issues.apache.org/jira/browse/ARROW-17502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Arrow JIRA Bot reassigned ARROW-17502: - Assignee: (was: Vibhatha Lakmal Abeykoon) > [C++] Fetch Node Substrait Integration > -- > > Key: ARROW-17502 > URL: https://issues.apache.org/jira/browse/ARROW-17502 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Vibhatha Lakmal Abeykoon >Priority: Major > > Fetch Node is a newly added node (WIP at the moment[1]). After finalizing the > Fetch node creation, this needs to be integrated with Substrait. > > [1].[Fetch Node Creation |https://issues.apache.org/jira/browse/ARROW-17190] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-18371) [C++] Expose *FromJSON helpers
[ https://issues.apache.org/jira/browse/ARROW-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636794#comment-17636794 ] Antoine Pitrou commented on ARROW-18371: > I assume the comment is regarding BatchesWithSchema and MakeBasicBatches. Yes, this is what I meant. Sorry for the imprecision. > [C++] Expose *FromJSON helpers > -- > > Key: ARROW-18371 > URL: https://issues.apache.org/jira/browse/ARROW-18371 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Rok Mihevc >Priority: Major > Labels: testing > > {Array,{{Exec,Record}Batch}FromJSON helper functions would be useful when > testing in projects that use Arrow. BatchesWithSchema and MakeBasicBatches > could be considered as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ARROW-18373) MIGRATION: Enable multiple component selection in issue templates
[ https://issues.apache.org/jira/browse/ARROW-18373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-18373: --- Labels: pull-request-available (was: ) > MIGRATION: Enable multiple component selection in issue templates > - > > Key: ARROW-18373 > URL: https://issues.apache.org/jira/browse/ARROW-18373 > Project: Apache Arrow > Issue Type: Task >Reporter: Todd Farmer >Assignee: Todd Farmer >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Per comments in [this merged PR|https://github.com/apache/arrow/pull/14675], > we would like to enable selection of multiple components when reporting > issues via GitHub issues. > Additionally, we may want to add the needed Apache license to the issue > templates and remove the exclusion rules from rat_exclude_files.txt. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (ARROW-18371) [C++] Expose *FromJSON helpers
[ https://issues.apache.org/jira/browse/ARROW-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636755#comment-17636755 ] Rok Mihevc edited comment on ARROW-18371 at 11/21/22 4:10 PM: -- I assume the comment is regarding BatchesWithSchema and MakeBasicBatches. was (Author: rokm): I assume the comment regarding BatchesWithSchema and MakeBasicBatches. > [C++] Expose *FromJSON helpers > -- > > Key: ARROW-18371 > URL: https://issues.apache.org/jira/browse/ARROW-18371 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Rok Mihevc >Priority: Major > Labels: testing > > {Array,{{Exec,Record}Batch}FromJSON helper functions would be useful when > testing in projects that use Arrow. BatchesWithSchema and MakeBasicBatches > could be considered as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-18371) [C++] Expose *FromJSON helpers
[ https://issues.apache.org/jira/browse/ARROW-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636755#comment-17636755 ] Rok Mihevc commented on ARROW-18371: I assume the comment regarding BatchesWithSchema and MakeBasicBatches. > [C++] Expose *FromJSON helpers > -- > > Key: ARROW-18371 > URL: https://issues.apache.org/jira/browse/ARROW-18371 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Rok Mihevc >Priority: Major > Labels: testing > > {Array,{{Exec,Record}Batch}FromJSON helper functions would be useful when > testing in projects that use Arrow. BatchesWithSchema and MakeBasicBatches > could be considered as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-18371) [C++] Expose *FromJSON helpers
[ https://issues.apache.org/jira/browse/ARROW-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636754#comment-17636754 ] Li Jin commented on ARROW-18371: > Definitely not. These are functions generating ad hoc data tailored for > specific tests, with little consistency. To clarify, do you know the \{Array,{{Exec,Record}Batch}FromJSON or BatchesWithSchema/MakeBasicBatches > [C++] Expose *FromJSON helpers > -- > > Key: ARROW-18371 > URL: https://issues.apache.org/jira/browse/ARROW-18371 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Rok Mihevc >Priority: Major > Labels: testing > > {Array,{{Exec,Record}Batch}FromJSON helper functions would be useful when > testing in projects that use Arrow. BatchesWithSchema and MakeBasicBatches > could be considered as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (ARROW-18371) [C++] Expose *FromJSON helpers
[ https://issues.apache.org/jira/browse/ARROW-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636754#comment-17636754 ] Li Jin edited comment on ARROW-18371 at 11/21/22 4:08 PM: -- > Definitely not. These are functions generating ad hoc data tailored for > specific tests, with little consistency. To clarify, do you mean the \{Array,{{Exec,Record}Batch}FromJSON or BatchesWithSchema/MakeBasicBatches was (Author: icexelloss): > Definitely not. These are functions generating ad hoc data tailored for > specific tests, with little consistency. To clarify, do you know the \{Array,{{Exec,Record}Batch}FromJSON or BatchesWithSchema/MakeBasicBatches > [C++] Expose *FromJSON helpers > -- > > Key: ARROW-18371 > URL: https://issues.apache.org/jira/browse/ARROW-18371 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Rok Mihevc >Priority: Major > Labels: testing > > {Array,{{Exec,Record}Batch}FromJSON helper functions would be useful when > testing in projects that use Arrow. BatchesWithSchema and MakeBasicBatches > could be considered as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-18371) [C++] Expose *FromJSON helpers
[ https://issues.apache.org/jira/browse/ARROW-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636747#comment-17636747 ] Antoine Pitrou commented on ARROW-18371: Definitely not. These are functions generating ad hoc data tailored for specific tests, with little consistency. We could expose the Random generation class, though, possibly together with some API cleanup. > [C++] Expose *FromJSON helpers > -- > > Key: ARROW-18371 > URL: https://issues.apache.org/jira/browse/ARROW-18371 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Rok Mihevc >Priority: Major > Labels: testing > > {Array,{{Exec,Record}Batch}FromJSON helper functions would be useful when > testing in projects that use Arrow. BatchesWithSchema and MakeBasicBatches > could be considered as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (ARROW-18110) [Go] Scalar Comparisons
[ https://issues.apache.org/jira/browse/ARROW-18110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Topol resolved ARROW-18110. --- Fix Version/s: 11.0.0 Resolution: Fixed Issue resolved by pull request 14669 [https://github.com/apache/arrow/pull/14669] > [Go] Scalar Comparisons > --- > > Key: ARROW-18110 > URL: https://issues.apache.org/jira/browse/ARROW-18110 > Project: Apache Arrow > Issue Type: Sub-task > Components: Go >Reporter: Matthew Topol >Assignee: Matthew Topol >Priority: Major > Labels: pull-request-available > Fix For: 11.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (ARROW-18371) [C++] Expose *FromJSON helpers
[ https://issues.apache.org/jira/browse/ARROW-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636736#comment-17636736 ] Rok Mihevc commented on ARROW-18371: *FromJSON functions seem clear cut. How about also adding BatchesWithSchema and MakeBasicBatches. [~apitrou] [~westonpace] > [C++] Expose *FromJSON helpers > -- > > Key: ARROW-18371 > URL: https://issues.apache.org/jira/browse/ARROW-18371 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Rok Mihevc >Priority: Major > Labels: testing > > {Array,{{Exec,Record}Batch}FromJSON helper functions would be useful when > testing in projects that use Arrow. BatchesWithSchema and MakeBasicBatches > could be considered as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (ARROW-18373) MIGRATION: Enable multiple component selection in issue templates
[ https://issues.apache.org/jira/browse/ARROW-18373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Farmer reassigned ARROW-18373: --- Assignee: Todd Farmer > MIGRATION: Enable multiple component selection in issue templates > - > > Key: ARROW-18373 > URL: https://issues.apache.org/jira/browse/ARROW-18373 > Project: Apache Arrow > Issue Type: Task >Reporter: Todd Farmer >Assignee: Todd Farmer >Priority: Major > > Per comments in [this merged PR|https://github.com/apache/arrow/pull/14675], > we would like to enable selection of multiple components when reporting > issues via GitHub issues. > Additionally, we may want to add the needed Apache license to the issue > templates and remove the exclusion rules from rat_exclude_files.txt. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (ARROW-18373) MIGRATION: Enable multiple component selection in issue templates
Todd Farmer created ARROW-18373: --- Summary: MIGRATION: Enable multiple component selection in issue templates Key: ARROW-18373 URL: https://issues.apache.org/jira/browse/ARROW-18373 Project: Apache Arrow Issue Type: Task Reporter: Todd Farmer Per comments in [this merged PR|https://github.com/apache/arrow/pull/14675], we would like to enable selection of multiple components when reporting issues via GitHub issues. Additionally, we may want to add the needed Apache license to the issue templates and remove the exclusion rules from rat_exclude_files.txt. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (ARROW-18323) MIGRATION TEST ISSUE #2
[ https://issues.apache.org/jira/browse/ARROW-18323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicola Crane resolved ARROW-18323. -- Fix Version/s: 11.0.0 Resolution: Fixed Issue resolved by pull request 14675 [https://github.com/apache/arrow/pull/14675] > MIGRATION TEST ISSUE #2 > --- > > Key: ARROW-18323 > URL: https://issues.apache.org/jira/browse/ARROW-18323 > Project: Apache Arrow > Issue Type: Task >Reporter: Todd Farmer >Assignee: Todd Farmer >Priority: Major > Labels: pull-request-available > Fix For: 11.0.0 > > Time Spent: 1h > Remaining Estimate: 0h > > This issue was created to help test migration-related process and tooling. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (ARROW-18363) [Docs] Include warning when viewing old docs (redirecting to stable/dev docs)
[ https://issues.apache.org/jira/browse/ARROW-18363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alenka Frim reassigned ARROW-18363: --- Assignee: Alenka Frim > [Docs] Include warning when viewing old docs (redirecting to stable/dev docs) > - > > Key: ARROW-18363 > URL: https://issues.apache.org/jira/browse/ARROW-18363 > Project: Apache Arrow > Issue Type: Improvement > Components: Documentation >Reporter: Joris Van den Bossche >Assignee: Alenka Frim >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Now that we have versioned docs, we also have the old versions of the developers > docs (e.g. > https://arrow.apache.org/docs/9.0/developers/guide/communication.html). Those > might be outdated (e.g. regarding communication channels, build instructions, > etc.), and typically when contributing to / developing with the latest Arrow, one > should _always_ check the latest dev version of the contributing docs. > We could add a warning box pointing this out and linking to the dev docs. > This would be similar to how some projects warn about viewing old docs in > general and point to the stable docs (e.g. https://mne.tools/1.1/index.html or > https://scikit-learn.org/1.0/user_guide.html). In this case we could have a > custom box on pages under /developers that points to the dev docs instead of > the stable docs. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (ARROW-18363) [Docs] Include warning when viewing old docs (redirecting to stable/dev docs)
[ https://issues.apache.org/jira/browse/ARROW-18363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-18363: --- Labels: pull-request-available (was: ) > [Docs] Include warning when viewing old docs (redirecting to stable/dev docs) > - > > Key: ARROW-18363 > URL: https://issues.apache.org/jira/browse/ARROW-18363 > Project: Apache Arrow > Issue Type: Improvement > Components: Documentation >Reporter: Joris Van den Bossche >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Now that we have versioned docs, we also have the old versions of the developers > docs (e.g. > https://arrow.apache.org/docs/9.0/developers/guide/communication.html). Those > might be outdated (e.g. regarding communication channels, build instructions, > etc.), and typically when contributing to / developing with the latest Arrow, one > should _always_ check the latest dev version of the contributing docs. > We could add a warning box pointing this out and linking to the dev docs. > This would be similar to how some projects warn about viewing old docs in > general and point to the stable docs (e.g. https://mne.tools/1.1/index.html or > https://scikit-learn.org/1.0/user_guide.html). In this case we could have a > custom box on pages under /developers that points to the dev docs instead of > the stable docs. -- This message was sent by Atlassian Jira (v8.20.10#820010)
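The routing rule proposed in ARROW-18363 (old /developers pages point to the dev docs, other old pages point to the stable docs) can be sketched as a small path-mapping function. This is only an illustration, not the change made in the pull request: the function name `banner_target` is hypothetical, and it assumes that versioned pages live under /docs/<version>/, dev pages under /docs/dev/, and stable pages directly under /docs/.

```python
def banner_target(path):
    """Return the path a warning box could link to, or None if no box is needed.

    Hypothetical helper; the URL layout (/docs/<version>/..., /docs/dev/...,
    stable docs directly under /docs/) is an assumption for this sketch.
    """
    parts = path.strip("/").split("/")
    # Expect docs/<version>/<rest...>; anything shorter gets no banner.
    if len(parts) < 3 or parts[0] != "docs":
        return None
    version, rest = parts[1], parts[2:]
    # Only numbered versions such as "9.0" count as old docs here;
    # "dev" and unversioned (stable) pages are left alone.
    if not version[:1].isdigit():
        return None
    if rest[0] == "developers":
        # Contributor docs should always be read in their latest dev version.
        return "/docs/dev/" + "/".join(rest)
    # Everything else points at the stable docs.
    return "/docs/" + "/".join(rest)


print(banner_target("/docs/9.0/developers/guide/communication.html"))
```

Keeping the decision in one pure function makes the two cases from the issue (dev docs for /developers, stable docs otherwise) easy to check in isolation.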
[jira] [Created] (ARROW-18372) [R] "Error in `collect()`: ! Invalid: negative malloc size" after large computation returning one cell
Lucas Mation created ARROW-18372: Summary: [R] "Error in `collect()`: ! Invalid: negative malloc size" after large computation returning one cell Key: ARROW-18372 URL: https://issues.apache.org/jira/browse/ARROW-18372 Project: Apache Arrow Issue Type: Bug Components: R Affects Versions: 10.0.0 Reporter: Lucas Mation

I have a large Parquet dataset (900 million rows, 40 columns), subdivided into folders for each year. I was trying to calculate how many unique combinations of id1+id2+id3+id4 there are in the dataset. Notice that the "collected" dataset is supposed to be only one row and one cell, containing the count (I've confirmed this by subsetting the dataset ("%>% head(10^6)") before computing the count, and it works). That is why the error below is so weird.

```
fa <- 'myparteq folder' #huge
va <- open_dataset(fa)
tic()
d <- va %>% head(10^6) %>% count(id1,id2,id3,id4) %>% count %>% collect
toc()
Error in `collect()`:
! Invalid: negative malloc size
Run `rlang::last_error()` to see where the error occurred.
> rlang::last_error()
Error in `collect()`:
! Invalid: negative malloc size
---
Backtrace:
 1. ... %>% collect
 3. arrow:::collect.arrow_dplyr_query(.)
Run `rlang::last_trace()` to see the full context.
> rlang::last_trace()
Error in `collect()`:
! Invalid: negative malloc size
---
Backtrace:
 x
 1. +-... %>% collect
 2. +-dplyr::collect(.)
 3. \-arrow:::collect.arrow_dplyr_query(.)
 4. \-base::tryCatch(...)
 5. \-base (local) tryCatchList(expr, classes, parentenv, handlers)
 6. \-base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
 7. \-value[[3L]](cond)
 8. \-arrow:::augment_io_error_msg(e, call, schema = x$.data$schema)
 9. \-rlang::abort(msg, call = call)
```

I am running this on a Windows server with 512 GB of RAM.
sessionInfo() R version 4.2.1 (2022-06-23 ucrt) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows Server 2012 R2 x64 (build 9600) Matrix products: default locale: [1] LC_COLLATE=Portuguese_Brazil.1252 LC_CTYPE=Portuguese_Brazil.1252 LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C [5] LC_TIME=Portuguese_Brazil.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] arrow_10.0.0 data.table_1.14.4 forcats_0.5.2 dplyr_1.0.10 purrr_0.3.5 readr_2.1.3 tidyr_1.2.1 tibble_3.1.8 [9] ggplot2_3.3.6 tidyverse_1.3.2 gt_0.7.0 xtable_1.8-4 ggthemes_4.2.4 collapse_1.8.6 pryr_0.1.5 janitor_2.1.0 [17] tictoc_1.1 lubridate_1.8.0 stringr_1.4.1 readxl_1.4.1 loaded via a namespace (and not attached): [1] Rcpp_1.0.9 assertthat_0.2.1 digest_0.6.30 utf8_1.2.2 R6_2.5.1 cellranger_1.1.0 backports_1.4.1 [8] reprex_2.0.2 httr_1.4.4 pillar_1.8.1 rlang_1.0.6 googlesheets4_1.0.1 rstudioapi_0.14 googledrive_2.0.0 [15] bit_4.0.4 munsell_0.5.0 broom_1.0.1 compiler_4.2.1 modelr_0.1.9 pkgconfig_2.0.3 htmltools_0.5.3 [22] tidyselect_1.2.0 codetools_0.2-18 fansi_1.0.3 crayon_1.5.2 tzdb_0.3.0 dbplyr_2.2.1 withr_2.5.0 [29] grid_4.2.1 jsonlite_1.8.3 gtable_0.3.1 lifecycle_1.0.3 DBI_1.1.3 magrittr_2.0.3 scales_1.2.1 [36] cli_3.4.1 stringi_1.7.8 fs_1.5.2 snakecase_0.11.0 xml2_1.3.3 ellipsis_0.3.2 generics_0.1.3 [43] vctrs_0.5.0 tools_4.2.1 bit64_4.0.5 glue_1.6.2 hms_1.1.2 parallel_4.2.1 fastmap_1.1.0 [50] colorspace_2.0-3 gargle_1.2.1 rvest_1.0.3 haven_2.5.1 arrow_info() Arrow package version: 10.0.0 Capabilities: dataset TRUE substrait FALSE parquet TRUE json TRUE s3 TRUE gcs TRUE utf8proc TRUE re2 TRUE snappy TRUE gzip TRUE brotli TRUE zstd TRUE lz4 TRUE lz4_frame TRUE lzo FALSE bz2 TRUE jemalloc FALSE mimalloc TRUE Arrow options(): arrow.use_threads FALSE Memory: Allocator mimalloc Current 74.82 Gb Max 97.75 Gb Runtime: SIMD Level avx2 Detected SIMD Level avx2 Build: C++ Library Version 10.0.0 C++ Compiler GNU C++ Compiler Version 
10.3.0 Git ID aa7118b6e5f49b354fa8a93d9cf363c9ebe9a3f0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
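For readers less familiar with the dplyr pipeline in ARROW-18372, the query `count(id1,id2,id3,id4) %>% count` groups rows by the four ID columns and then counts the groups. The sketch below mirrors that logic in plain Python on a few made-up rows; the function name and data are hypothetical, and it says nothing about what causes the reported error. It does show that, although the final result is a single number, the intermediate step has to hold one entry per distinct combination.

```python
from collections import Counter

# Hypothetical toy rows standing in for the dataset; only the column
# names (id1..id4) come from the report.
rows = [
    {"id1": 1, "id2": "a", "id3": 10, "id4": "x"},
    {"id1": 1, "id2": "a", "id3": 10, "id4": "x"},  # repeats the row above
    {"id1": 1, "id2": "b", "id3": 10, "id4": "x"},
    {"id1": 2, "id2": "a", "id3": 11, "id4": "y"},
]


def count_unique_combinations(rows, keys):
    """Group rows by the given keys, then count how many groups exist."""
    # First count(): one entry per distinct key combination.
    groups = Counter(tuple(row[k] for k in keys) for row in rows)
    # Second count(): the number of groups, a single value.
    return len(groups)


n_unique = count_unique_combinations(rows, ["id1", "id2", "id3", "id4"])
print(n_unique)
```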
[jira] [Created] (ARROW-18371) [C++] Expose *FromJSON helpers
Rok Mihevc created ARROW-18371: -- Summary: [C++] Expose *FromJSON helpers Key: ARROW-18371 URL: https://issues.apache.org/jira/browse/ARROW-18371 Project: Apache Arrow Issue Type: New Feature Components: C++ Reporter: Rok Mihevc {Array,{{Exec,Record}Batch}FromJSON helper functions would be useful when testing in projects that use Arrow. BatchesWithSchema and MakeBasicBatches could be considered as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
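To illustrate why *FromJSON-style helpers are convenient in tests, here is a stdlib-only Python sketch of the general idea: build a typed, nullable column from a compact JSON literal and fail early on a type mismatch. The function `array_from_json` below is a hypothetical stand-in and is not Arrow's C++ API; its signature and behavior are assumptions made for this example.

```python
import json


def array_from_json(py_type, json_text):
    """Parse a JSON array into a list of py_type values, allowing nulls.

    Hypothetical stand-in for a test helper; not part of Arrow.
    """
    values = json.loads(json_text)
    if not isinstance(values, list):
        raise ValueError("expected a JSON array")
    for v in values:
        # Exact type check so that, for example, true is rejected for int.
        if v is not None and type(v) is not py_type:
            raise TypeError(f"{v!r} is not of type {py_type.__name__}")
    return values


# A test can then state its input data in one readable line.
ints = array_from_json(int, "[1, null, 3]")
print(ints)
```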