[jira] [Created] (ARROW-10386) spatial (sf geometry) data does not roundtrip correctly (attributes lost)
Petr Bouchal created ARROW-10386:

Summary: spatial (sf geometry) data does not roundtrip correctly (attributes lost)
Key: ARROW-10386
URL: https://issues.apache.org/jira/browse/ARROW-10386
Project: Apache Arrow
Issue Type: Bug
Components: R
Affects Versions: 2.0.0
Environment: Mac OS 10.15.7, R 4.0.2, arrow 2.0, sf 0.9-6
Reporter: Petr Bouchal

Hi all - thanks for the improvement addressed in ARROW-9271.

In arrow 2.0, spatial data (class sf) now retains metadata at the column level, but it still does not roundtrip correctly because metadata (attributes) are lost at the level of individual elements of the list-columns (at least I think that is the problem, as that is where I can see changes in the metadata).

Is this something that is addressable? See the reprex below for what happens and which attributes exist at the element level.

FWIW, a workaround for spatial data using sf would be to convert the geometry to WKT before writing it out (sf::st_as_text()). It might be useful to note this somewhere in the docs.

This is using arrow 2.0 and sf 0.9-6.
Reproducible example:

``` r
library(arrow)
#>
#> Attaching package: 'arrow'
#> The following object is masked from 'package:utils':
#>
#>     timestamp
library(sf)
#> Linking to GEOS 3.8.1, GDAL 3.1.1, PROJ 6.3.1

fname <- system.file("shape/nc.shp", package="sf")
df_spatial <- st_read(fname)
#> Reading layer `nc' from data source `/Users/petr/Library/R/4.0/library/sf/shape/nc.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 100 features and 14 fields
#> geometry type:  MULTIPOLYGON
#> dimension:      XY
#> bbox:           xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
#> geographic CRS: NAD27

write_parquet(df_spatial, "spatial.parquet")
roundtripped <- read_parquet("spatial.parquet")

roundtripped
#> Simple feature collection with 100 features and 14 fields
#> geometry type:  MULTIPOLYGON
#> dimension:      arrow_list
#> bbox:           xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
#> geographic CRS: NAD27
#> First 10 features:
#> Error in vapply(lst, class, rep(NA_character_, 3)): values must be length 3,
#>  but FUN(X[[1]]) result is length 1

attributes(roundtripped$geometry[[1]])
#> $class
#> [1] "arrow_list"    "vctrs_list_of" "vctrs_vctr"    "list"
#>
#> $ptype
#> [0]>

attributes(df_spatial$geometry[[1]])
#> $class
#> [1] "XY"           "MULTIPOLYGON" "sfg"
```

Created on 2020-10-24 by the [reprex package](https://reprex.tidyverse.org) (v0.3.0)

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-10385) [C++][Gandiva] Add support for LLVM 11
Kouhei Sutou created ARROW-10385:

Summary: [C++][Gandiva] Add support for LLVM 11
Key: ARROW-10385
URL: https://issues.apache.org/jira/browse/ARROW-10385
Project: Apache Arrow
Issue Type: Improvement
Components: C++ - Gandiva
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou
[jira] [Created] (ARROW-10384) [c++] Fix typos and spelling
Kazuaki Ishizaki created ARROW-10384:

Summary: [c++] Fix typos and spelling
Key: ARROW-10384
URL: https://issues.apache.org/jira/browse/ARROW-10384
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Affects Versions: 3.0.0
Reporter: Kazuaki Ishizaki

Fix typos under the {{cpp}} directory.
[jira] [Created] (ARROW-10383) [docs] Fix typos and spelling
Kazuaki Ishizaki created ARROW-10383:

Summary: [docs] Fix typos and spelling
Key: ARROW-10383
URL: https://issues.apache.org/jira/browse/ARROW-10383
Project: Apache Arrow
Issue Type: Improvement
Components: Documentation
Affects Versions: 3.0.0
Reporter: Kazuaki Ishizaki

Fix typos under the {{docs}} directory.
[jira] [Created] (ARROW-10382) [rust] Fix typos and spelling
Kazuaki Ishizaki created ARROW-10382:

Summary: [rust] Fix typos and spelling
Key: ARROW-10382
URL: https://issues.apache.org/jira/browse/ARROW-10382
Project: Apache Arrow
Issue Type: Improvement
Components: Rust
Affects Versions: 3.0.0
Reporter: Kazuaki Ishizaki

Fix typos under the {{rust}} directory.
[jira] [Created] (ARROW-10381) [Rust] Generalize Arrow to support MergeSort
Jorge Leitão created ARROW-10381:

Summary: [Rust] Generalize Arrow to support MergeSort
Key: ARROW-10381
URL: https://issues.apache.org/jira/browse/ARROW-10381
Project: Apache Arrow
Issue Type: Improvement
Components: Rust
Affects Versions: 3.0.0
Reporter: Jorge Leitão
Assignee: Jorge Leitão

Currently, the sorting code is centered around creating an array that can be sorted. This is useful for intra-array comparison, but it does not allow things like `merge-sort`, which requires comparing values across two arrays (of the same data type).

The goal of this issue is to generalize the current code to support both.
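To illustrate the kind of cross-array comparison that ARROW-10381 describes, here is a minimal sketch of the merge step of a merge sort. It is not the Arrow implementation: plain Rust slices stand in for Arrow arrays, and the function name `merge_sorted` and its comparator parameter are hypothetical, chosen only for this example.

```rust
use std::cmp::Ordering;

/// Merges two already-sorted slices into one sorted Vec.
/// `cmp` compares an element of `left` with an element of `right`,
/// i.e. a comparison *between* two arrays rather than within one.
/// NOTE: hypothetical helper; plain slices stand in for Arrow arrays.
fn merge_sorted<T: Clone>(
    left: &[T],
    right: &[T],
    cmp: impl Fn(&T, &T) -> Ordering,
) -> Vec<T> {
    let mut out = Vec::with_capacity(left.len() + right.len());
    let (mut i, mut j) = (0, 0);
    while i < left.len() && j < right.len() {
        // Take from the left on ties so the merge stays stable.
        if cmp(&left[i], &right[j]) != Ordering::Greater {
            out.push(left[i].clone());
            i += 1;
        } else {
            out.push(right[j].clone());
            j += 1;
        }
    }
    // At most one of the two inputs still has elements; append them.
    out.extend_from_slice(&left[i..]);
    out.extend_from_slice(&right[j..]);
    out
}

fn main() {
    let a = [1, 4, 9];
    let b = [2, 4, 10, 11];
    let merged = merge_sorted(&a, &b, |x, y| x.cmp(y));
    println!("{:?}", merged);
}
```

The point of the sketch is that the comparator takes one value from each input, which an API built only around sorting a single array does not expose.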