[ https://issues.apache.org/jira/browse/ARROW-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Neal Richardson updated ARROW-6407:
-----------------------------------
    Description:
This will prevent issues like ARROW-6406

After ARROW-8266 ensures every _ep has at least two URLs this problem becomes harder since we have a few classes of URL which don't necessarily overlap: preferred (highest priority during build configuration), canonical (whence new tarballs should be downloaded when versions get bumped), and backup. It's difficult to represent this in a txt file.

A script should be available for automatically bumping versions (a generalization of upload-boost.sh):
- update versions.txt, including checksums
- run download_dependencies.sh to get fresh archives
- run (or inline) trim-boost.sh to trim the boost archive
- update ursalabs-managed archives: https://github.com/ursa-labs/thirdparty/releases/tag/latest

Checksums:
If a checksum is not provided to ExternalProject_Add it may re-download a tarball even if that's not necessary (any time the generated build must be modified, IIUC). Ideally we should provide a checksum for all _eps (and not just Thrift) to cut down on unnecessary network access when building bundled.

ARROW-8222 introduces a subtlety: we now default to a trimmed boost archive which contains only the 10% which we need, only falling back to a full boost archive when that fails to download. This is faster but not equivalent to the full boost archive so we can't provide a single checksum which matches both. This will probably just mean that those cases are extra slow

  was:
This will prevent issues like ARROW-6406

After ARROW-8266 ensures every _ep has at least two URLs this problem becomes harder since we have a few classes of URL which don't necessarily overlap: preferred (highest priority during build configuration), canonical (whence new tarballs should be downloaded when versions get bumped), and backup. It's difficult to represent this in a txt file.

A script should be available for automatically bumping versions (a generalization of upload-boost.sh):
- update versions.txt, including checksums
- run download_dependencies.sh to get fresh archives
- run (or inline) trim-boost.sh to trim the boost archive
- update ursalabs-managed archives in https://dl.bintray.com/ursalabs/ and https://github.com/ursa-labs/thirdparty/releases/tag/latest

Checksums:
If a checksum is not provided to ExternalProject_Add it may re-download a tarball even if that's not necessary (any time the generated build must be modified, IIUC). Ideally we should provide a checksum for all _eps (and not just Thrift) to cut down on unnecessary network access when building bundled.

ARROW-8222 introduces a subtlety: we now default to a trimmed boost archive which contains only the 10% which we need, only falling back to a full boost archive when that fails to download. This is faster but not equivalent to the full boost archive so we can't provide a single checksum which matches both. This will probably just mean that those cases are extra slow


> [C++] Consolidate thirdparty bundle URLs, version bumping logic, etc
> --------------------------------------------------------------------
>
>                 Key: ARROW-6407
>                 URL: https://issues.apache.org/jira/browse/ARROW-6407
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>    Affects Versions: 0.16.0
>            Reporter: Wes McKinney
>            Priority: Major
>             Fix For: 4.0.0
>
>
> This will prevent issues like ARROW-6406
> After ARROW-8266 ensures every _ep has at least two URLs this problem becomes
> harder since we have a few classes of URL which don't necessarily overlap:
> preferred (highest priority during build configuration), canonical (whence
> new tarballs should be downloaded when versions get bumped), and backup. It's
> difficult to represent this in a txt file.
> A script should be available for automatically bumping versions (a
> generalization of upload-boost.sh):
> - update versions.txt, including checksums
> - run download_dependencies.sh to get fresh archives
> - run (or inline) trim-boost.sh to trim the boost archive
> - update ursalabs-managed archives:
> https://github.com/ursa-labs/thirdparty/releases/tag/latest
> Checksums:
> If a checksum is not provided to ExternalProject_Add it may re-download a
> tarball even if that's not necessary (any time the generated build must be
> modified, IIUC). Ideally we should provide a checksum for all _eps (and not
> just Thrift) to cut down on unnecessary network access when building bundled.
> ARROW-8222 introduces a subtlety: we now default to a trimmed boost archive
> which contains only the 10% which we need, only falling back to a full boost
> archive when that fails to download. This is faster but not equivalent to the
> full boost archive so we can't provide a single checksum which matches both.
> This will probably just mean that those cases are extra slow



--
This message was sent by Atlassian Jira (v8.3.4#803005)
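
For the "update versions.txt, including checksums" step of the proposed bump script, a minimal sketch in Python might look like the following. The KEY=value layout, the `_VERSION` and `_SHA256` key suffixes, and the helper names `sha256_hex` and `bump_entry` are all assumptions made for illustration; they are not taken from the actual versions.txt or from any existing Arrow script.

```python
import hashlib
import re


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of an archive's bytes."""
    return hashlib.sha256(data).hexdigest()


def bump_entry(text: str, name: str, version: str, checksum: str) -> str:
    """Set the NAME_VERSION and NAME_SHA256 lines, appending any that are missing.

    The key naming here is hypothetical, not the real versions.txt schema.
    """
    entries = ((f"{name}_VERSION", version), (f"{name}_SHA256", checksum))
    for key, value in entries:
        pattern = re.compile(rf"^{re.escape(key)}=.*$", flags=re.MULTILINE)
        line = f"{key}={value}"
        if pattern.search(text):
            # Replace the existing line in place; a callable avoids
            # backslash interpretation in the replacement string.
            text = pattern.sub(lambda _match, line=line: line, text)
        else:
            # Append a new line, making sure the file stays newline-terminated.
            if text and not text.endswith("\n"):
                text += "\n"
            text += line + "\n"
    return text
```

In a real script the bytes passed to `sha256_hex` would come from the freshly downloaded tarball, and the resulting digest is what one would presumably hand to ExternalProject_Add so that an already-downloaded archive can be verified instead of fetched again.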