[jira] [Updated] (ARROW-16878) [R] Move Windows GCS dependency building upstream

2022-06-25 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-16878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson updated ARROW-16878:

Summary: [R] Move Windows GCS dependency building upstream  (was: [R] Add 
GCS to Windows builds)

> [R] Move Windows GCS dependency building upstream
> -------------------------------------------------
>
> Key: ARROW-16878
> URL: https://issues.apache.org/jira/browse/ARROW-16878
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Packaging, R
> Reporter: Neal Richardson
> Priority: Major
> Fix For: 9.0.0
>
>
> On ARROW-16510, I made some progress on this but had to back out the changes. 
> There is no google-cloud-cpp in https://github.com/msys2/MINGW-packages, so 
> either we'd have to make one up for rtools-packages, or we use the bundled 
> google-cloud-cpp in our cmake and see if we can put as many of its 
> dependencies in rtools-packages to ease the build. 
> https://github.com/msys2/MINGW-packages/tree/master/mingw-w64-nlohmann-json 
> exists in MINGW-packages and could be brought over, but I don't think it's a 
> big deal if it is bundled.
> https://github.com/msys2/MINGW-packages/blob/master/mingw-w64-abseil-cpp/PKGBUILD
>  exists and could be brought over, but note that it uses C++17. 
> It may be that we have to bump arrow up to C++17 for this to work anyway, 
> judging from the undefined symbols errors I got (see below). 
> https://github.com/msys2/MINGW-packages/pull/11758 and 
> https://github.com/apache/arrow/pull/13407 suggest maybe so. That should work 
> ok for rtools >= 40, but we'll see what other problems it brings.
> In case it's relevant, 
> https://github.com/r-windows/rtools-packages/blob/master/mingw-w64-grpc/PKGBUILD
>  exists in rtools-packages already, and grpc depends on abseil, but note how 
> it handles abseil. It doesn't use all of the same parts of abseil that 
> google-cloud-cpp does, so maybe that's fine there but not here?
> Here's the diff I backed out of ARROW-16510, and below that are the 
> undefined-symbol messages from the build failure. There's also a libcurl 
> problem that I don't understand, since I did add it as a dependency.
> {code}
> diff --git a/ci/scripts/PKGBUILD b/ci/scripts/PKGBUILD
> index b9b0194f5c8c..566ec881e404 100644
> --- a/ci/scripts/PKGBUILD
> +++ b/ci/scripts/PKGBUILD
> @@ -25,6 +25,7 @@ arch=("any")
>  url="https://arrow.apache.org/"
>  license=("Apache-2.0")
>  depends=("${MINGW_PACKAGE_PREFIX}-aws-sdk-cpp"
> + "${MINGW_PACKAGE_PREFIX}-curl" # for google-cloud-cpp bundled build
>   "${MINGW_PACKAGE_PREFIX}-libutf8proc"
>   "${MINGW_PACKAGE_PREFIX}-re2"
>   "${MINGW_PACKAGE_PREFIX}-thrift"
> @@ -79,11 +80,13 @@ build() {
>  export PATH="/C/Rtools${MINGW_PREFIX/mingw/mingw_}/bin:$PATH"
>  export CPPFLAGS="${CPPFLAGS} -I${MINGW_PREFIX}/include"
>  export LIBS="-L${MINGW_PREFIX}/libs"
> +export ARROW_GCS=OFF
>  export ARROW_S3=OFF
>  export ARROW_WITH_RE2=OFF
>  # Without this, some dataset functionality segfaults
>  export CMAKE_UNITY_BUILD=ON
>else
> +export ARROW_GCS=ON
>  export ARROW_S3=ON
>  export ARROW_WITH_RE2=ON
>  # Without this, some compute functionality segfaults in tests
> @@ -101,6 +104,7 @@ build() {
>  -DARROW_CSV=ON \
>  -DARROW_DATASET=ON \
>  -DARROW_FILESYSTEM=ON \
> +-DARROW_GCS="${ARROW_GCS}" \
>  -DARROW_HDFS=OFF \
>  -DARROW_JEMALLOC=OFF \
>  -DARROW_JSON=ON \
> diff --git a/ci/scripts/r_windows_build.sh b/ci/scripts/r_windows_build.sh
> index 89d5737a09bd..3334eab8663a 100755
> --- a/ci/scripts/r_windows_build.sh
> +++ b/ci/scripts/r_windows_build.sh
> @@ -87,7 +87,7 @@ if [ -d mingw64/lib/ ]; then
>    # These may be from https://dl.bintray.com/rtools/backports/
>    cp $MSYS_LIB_DIR/mingw64/lib/lib{thrift,snappy}.a $DST_DIR/${RWINLIB_LIB_DIR}/x64
>    # These are from https://dl.bintray.com/rtools/mingw{32,64}/
> -  cp $MSYS_LIB_DIR/mingw64/lib/lib{zstd,lz4,brotli*,crypto,utf8proc,re2,aws*}.a $DST_DIR/lib/x64
> +  cp $MSYS_LIB_DIR/mingw64/lib/lib{zstd,lz4,brotli*,crypto,curl,ss*,utf8proc,re2,aws*}.a $DST_DIR/lib/x64
>  fi
>  
>  # Same for the 32-bit versions
> @@ -97,7 +97,7 @@ if [ -d mingw32/lib/ ]; then
>    mkdir -p $DST_DIR/lib/i386
>    mv mingw32/lib/*.a $DST_DIR/${RWINLIB_LIB_DIR}/i386
>    cp $MSYS_LIB_DIR/mingw32/lib/lib{thrift,snappy}.a $DST_DIR/${RWINLIB_LIB_DIR}/i386
> -  cp $MSYS_LIB_DIR/mingw32/lib/lib{zstd,lz4,brotli*,crypto,utf8proc,re2,aws*}.a $DST_DIR/lib/i386
> +  cp $MSYS_LIB_DIR/mingw32/lib/lib{zstd,lz4,brotli*,crypto,curl,ss*,utf8proc,re2,aws*}.a $DST_DIR/lib/i386
>  fi
>  
>  # Do the same also for ucrt64
> @@ -105,7 +105,7 @@ if [ -d ucrt64/lib/ ]; then
>ls $MSYS_LIB_DIR/ucrt64/lib/
>mkdir -p $DST_DIR/lib/x64-ucrt
>mv ucrt64/lib/*.a $DS

[jira] [Updated] (ARROW-16878) [R] Move Windows GCS dependency building upstream

2022-06-25 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-16878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson updated ARROW-16878:

Description: 
On ARROW-16510, I added the GCS filesystem to the arrow PKGBUILD, bundling it 
in the arrow build. A better solution would be to put google-cloud-cpp in 
rtools-packages so we don't have to build it every time. 

There is no google-cloud-cpp in https://github.com/msys2/MINGW-packages, so 
either we'd have to make one up for rtools-packages, or we use the bundled 
google-cloud-cpp in our cmake and see if we can put as many of its dependencies 
in rtools-packages to ease the build. 

https://github.com/msys2/MINGW-packages/tree/master/mingw-w64-nlohmann-json 
exists in MINGW-packages and could be brought over, but I don't think it's a 
big deal if it is bundled.

https://github.com/msys2/MINGW-packages/blob/master/mingw-w64-abseil-cpp/PKGBUILD
 exists and could be brought over, but note that it uses C++17. That doesn't 
seem to be a hard requirement, at least for what we're using, since we're 
building it with C++11.


  was:
On ARROW-16510, I made some progress on this but had to back out the changes. 
There is no google-cloud-cpp in https://github.com/msys2/MINGW-packages, so 
either we'd have to make one up for rtools-packages, or we use the bundled 
google-cloud-cpp in our cmake and see if we can put as many of its dependencies 
in rtools-packages to ease the build. 

https://github.com/msys2/MINGW-packages/tree/master/mingw-w64-nlohmann-json 
exists in MINGW-packages and could be brought over, but I don't think it's a 
big deal if it is bundled.

https://github.com/msys2/MINGW-packages/blob/master/mingw-w64-abseil-cpp/PKGBUILD
 exists and could be brought over, but note that it uses C++17. 

It may be that we have to bump arrow up to C++17 for this to work anyway, 
judging from the undefined symbols errors I got (see below). 
https://github.com/msys2/MINGW-packages/pull/11758 and 
https://github.com/apache/arrow/pull/13407 suggest maybe so. That should work 
ok for rtools >= 40, but we'll see what other problems it brings.

In case it's relevant, 
https://github.com/r-windows/rtools-packages/blob/master/mingw-w64-grpc/PKGBUILD
 exists in rtools-packages already, and grpc depends on abseil, but note how it 
handles abseil. It doesn't use all of the same parts of abseil that 
google-cloud-cpp does, so maybe that's fine there but not here?

Here's the diff I backed out of ARROW-16510, and below that are the 
undefined-symbol messages from the build failure. There's also a libcurl 
problem that I don't understand, since I did add it as a dependency.

{code}
diff --git a/ci/scripts/PKGBUILD b/ci/scripts/PKGBUILD
index b9b0194f5c8c..566ec881e404 100644
--- a/ci/scripts/PKGBUILD
+++ b/ci/scripts/PKGBUILD
@@ -25,6 +25,7 @@ arch=("any")
 url="https://arrow.apache.org/"
 license=("Apache-2.0")
 depends=("${MINGW_PACKAGE_PREFIX}-aws-sdk-cpp"
+ "${MINGW_PACKAGE_PREFIX}-curl" # for google-cloud-cpp bundled build
  "${MINGW_PACKAGE_PREFIX}-libutf8proc"
  "${MINGW_PACKAGE_PREFIX}-re2"
  "${MINGW_PACKAGE_PREFIX}-thrift"
@@ -79,11 +80,13 @@ build() {
 export PATH="/C/Rtools${MINGW_PREFIX/mingw/mingw_}/bin:$PATH"
 export CPPFLAGS="${CPPFLAGS} -I${MINGW_PREFIX}/include"
 export LIBS="-L${MINGW_PREFIX}/libs"
+export ARROW_GCS=OFF
 export ARROW_S3=OFF
 export ARROW_WITH_RE2=OFF
 # Without this, some dataset functionality segfaults
 export CMAKE_UNITY_BUILD=ON
   else
+export ARROW_GCS=ON
 export ARROW_S3=ON
 export ARROW_WITH_RE2=ON
 # Without this, some compute functionality segfaults in tests
@@ -101,6 +104,7 @@ build() {
 -DARROW_CSV=ON \
 -DARROW_DATASET=ON \
 -DARROW_FILESYSTEM=ON \
+-DARROW_GCS="${ARROW_GCS}" \
 -DARROW_HDFS=OFF \
 -DARROW_JEMALLOC=OFF \
 -DARROW_JSON=ON \
diff --git a/ci/scripts/r_windows_build.sh b/ci/scripts/r_windows_build.sh
index 89d5737a09bd..3334eab8663a 100755
--- a/ci/scripts/r_windows_build.sh
+++ b/ci/scripts/r_windows_build.sh
@@ -87,7 +87,7 @@ if [ -d mingw64/lib/ ]; then
   # These may be from https://dl.bintray.com/rtools/backports/
   cp $MSYS_LIB_DIR/mingw64/lib/lib{thrift,snappy}.a $DST_DIR/${RWINLIB_LIB_DIR}/x64
   # These are from https://dl.bintray.com/rtools/mingw{32,64}/
-  cp $MSYS_LIB_DIR/mingw64/lib/lib{zstd,lz4,brotli*,crypto,utf8proc,re2,aws*}.a $DST_DIR/lib/x64
+  cp $MSYS_LIB_DIR/mingw64/lib/lib{zstd,lz4,brotli*,crypto,curl,ss*,utf8proc,re2,aws*}.a $DST_DIR/lib/x64
 fi
 
 # Same for the 32-bit versions
@@ -97,7 +97,7 @@ if [ -d mingw32/lib/ ]; then
   mkdir -p $DST_DIR/lib/i386
   mv mingw32/lib/*.a $DST_DIR/${RWINLIB_LIB_DIR}/i386
   cp $MSYS_LIB_DIR/mingw32/lib/lib{thrift,snappy}.a $DST_DIR/${RWINLIB_LIB_DIR}/i386
-  cp $MSYS_LIB_DIR/mingw32/lib/lib{zstd,lz4,brotli*,crypto,utf8proc,re2,aws*}.a $DST_DIR/lib/i386
+  cp 
$MSYS_LIB_DIR/mingw32/lib/lib

[jira] [Updated] (ARROW-16878) [R] Move Windows GCS dependency building upstream

2022-06-25 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-16878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson updated ARROW-16878:

Description: 
On ARROW-16510, I added the GCS filesystem to the arrow PKGBUILD, bundling it 
in the arrow build. A better solution would be to put google-cloud-cpp in 
rtools-packages so we don't have to build it every time. 

There is no google-cloud-cpp in https://github.com/msys2/MINGW-packages, so 
either we'd have to make one up for rtools-packages, or we use the bundled 
google-cloud-cpp in our cmake and see if we can put as many of its dependencies 
in rtools-packages to ease the build. Either way, we'd want to start by adding 
its dependencies.

https://github.com/msys2/MINGW-packages/tree/master/mingw-w64-nlohmann-json 
exists in MINGW-packages and could be brought over, but I don't think it's a 
big deal if it is bundled.

https://github.com/msys2/MINGW-packages/blob/master/mingw-w64-abseil-cpp/PKGBUILD
 exists and could be brought over, but note that it uses C++17. That doesn't 
seem to be a hard requirement, at least for what we're using, since we're 
building it with C++11.


  was:
On ARROW-16510, I added the GCS filesystem to the arrow PKGBUILD, bundling it 
in the arrow build. A better solution would be to put google-cloud-cpp in 
rtools-packages so we don't have to build it every time. 

There is no google-cloud-cpp in https://github.com/msys2/MINGW-packages, so 
either we'd have to make one up for rtools-packages, or we use the bundled 
google-cloud-cpp in our cmake and see if we can put as many of its dependencies 
in rtools-packages to ease the build. 

https://github.com/msys2/MINGW-packages/tree/master/mingw-w64-nlohmann-json 
exists in MINGW-packages and could be brought over, but I don't think it's a 
big deal if it is bundled.

https://github.com/msys2/MINGW-packages/blob/master/mingw-w64-abseil-cpp/PKGBUILD
 exists and could be brought over, but note that it uses C++17. That doesn't 
seem to be a hard requirement, at least for what we're using, since we're 
building it with C++11.



> [R] Move Windows GCS dependency building upstream
> -------------------------------------------------
>
> Key: ARROW-16878
> URL: https://issues.apache.org/jira/browse/ARROW-16878
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Packaging, R
> Reporter: Neal Richardson
> Priority: Major
> Fix For: 9.0.0
>
>
> On ARROW-16510, I added the GCS filesystem to the arrow PKGBUILD, bundling it 
> in the arrow build. A better solution would be to put google-cloud-cpp in 
> rtools-packages so we don't have to build it every time. 
> There is no google-cloud-cpp in https://github.com/msys2/MINGW-packages, so 
> either we'd have to make one up for rtools-packages, or we use the bundled 
> google-cloud-cpp in our cmake and see if we can put as many of its 
> dependencies in rtools-packages to ease the build. Either way, we'd want to 
> start by adding its dependencies.
> https://github.com/msys2/MINGW-packages/tree/master/mingw-w64-nlohmann-json 
> exists in MINGW-packages and could be brought over, but I don't think it's a 
> big deal if it is bundled.
> https://github.com/msys2/MINGW-packages/blob/master/mingw-w64-abseil-cpp/PKGBUILD
>  exists and could be brought over, but note that it uses C++17. That doesn't 
> seem to be a hard requirement, at least for what we're using, since we're 
> building it with C++11.
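
For anyone comparing the C++ standard across candidate PKGBUILDs before bringing them over, a rough shell sketch follows. The function name pkgbuild_cxx_standard is hypothetical, and it assumes the standard is passed to CMake as a -DCMAKE_CXX_STANDARD=NN flag. A PKGBUILD that sets it some other way (for example through CXXFLAGS) would be reported as "unset", so treat the result as a hint rather than a definitive answer.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Arrow, MSYS2, or rtools-packages):
# report the C++ standard that a PKGBUILD passes to CMake.
# Assumption: the standard appears as -DCMAKE_CXX_STANDARD=NN.
# Prints the number (e.g. 17), or "unset" if no such flag is found.
pkgbuild_cxx_standard() {
  local file="$1" std
  # Take the first matching flag; -DCMAKE_CXX_STANDARD_REQUIRED=ON does not match.
  std=$(sed -n 's/.*-DCMAKE_CXX_STANDARD=\([0-9][0-9]*\).*/\1/p' "$file" | head -n 1)
  echo "${std:-unset}"
}
```

Usage would look like `pkgbuild_cxx_standard mingw-w64-abseil-cpp/PKGBUILD`, run from a checkout of MINGW-packages or rtools-packages.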



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (ARROW-16878) [R] Move Windows GCS dependency building upstream

2022-06-30 Thread Jonathan Keane (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-16878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Keane updated ARROW-16878:
---
Fix Version/s: (was: 9.0.0)



