SHIMA Tatsuya created ARROW-17417:
-------------------------------------

             Summary: [R] Implement typeof() in Arrow dplyr queries
                 Key: ARROW-17417
                 URL: https://issues.apache.org/jira/browse/ARROW-17417
             Project: Apache Arrow
          Issue Type: Improvement
          Components: R
    Affects Versions: 9.0.0
            Reporter: SHIMA Tatsuya


Currently, calling typeof() in an Arrow dplyr query is useless because it always returns the string {{environment}}.
With dbplyr, backends such as DuckDB have their own typeof function, so it works there as is.


{code:r}
dplyr::starwars |>
  dplyr::transmute(x = typeof(films))
#> # A tibble: 87 × 1
#>    x
#>    <chr>
#>  1 list
#>  2 list
#>  3 list
#>  4 list
#>  5 list
#>  6 list
#>  7 list
#>  8 list
#>  9 list
#> 10 list
#> # … with 77 more rows
#> # ℹ Use `print(n = ...)` to see more rows

dplyr::starwars |>
  arrow::to_duckdb() |>
  dplyr::transmute(x = typeof(films)) |>
  dplyr::collect()
#> # A tibble: 87 × 1
#>    x
#>    <chr>
#>  1 VARCHAR[]
#>  2 VARCHAR[]
#>  3 VARCHAR[]
#>  4 VARCHAR[]
#>  5 VARCHAR[]
#>  6 VARCHAR[]
#>  7 VARCHAR[]
#>  8 VARCHAR[]
#>  9 VARCHAR[]
#> 10 VARCHAR[]
#> # … with 77 more rows
#> # ℹ Use `print(n = ...)` to see more rows

dplyr::starwars |>
  arrow::arrow_table() |>
  dplyr::transmute(x = typeof(films)) |>
  dplyr::collect()
#> # A tibble: 87 × 1
#>    x
#>    <chr>
#>  1 environment
#>  2 environment
#>  3 environment
#>  4 environment
#>  5 environment
#>  6 environment
#>  7 environment
#>  8 environment
#>  9 environment
#> 10 environment
#> # … with 77 more rows
#> # ℹ Use `print(n = ...)` to see more rows
{code}
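
A likely explanation for the {{environment}} result, stated here as an assumption rather than something confirmed in this report: since typeof() has no Arrow translation, base R's typeof() is called on the object that stands in for the column inside the query, which is an arrow Expression. Expression objects are R6 objects, and R6 objects are environments. The minimal sketch below builds such a field reference by hand with {{Expression$field_ref()}}; the column name {{films}} is only reused for illustration.

{code:r}
# Assumption: inside an Arrow dplyr query, a column name such as `films`
# is bound to an arrow Expression (a field reference), not to column data.
films_ref <- arrow::Expression$field_ref("films")

# Expression is an R6 class, and R6 objects are environments,
# so base R's typeof() reports "environment" regardless of the column type.
inherits(films_ref, "Expression")
typeof(films_ref)
#> [1] "environment"
{code}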

I would expect typeof() to return the Arrow data type of the column, as follows.

{code:r}
dplyr::starwars |>
  arrow::arrow_table() |>
  dplyr::transmute(x = arrow::infer_type(films)$ToString()) |>
  dplyr::collect()
#> # A tibble: 87 × 1
#>    x
#>    <chr>
#>  1 list<item: string>
#>  2 list<item: string>
#>  3 list<item: string>
#>  4 list<item: string>
#>  5 list<item: string>
#>  6 list<item: string>
#>  7 list<item: string>
#>  8 list<item: string>
#>  9 list<item: string>
#> 10 list<item: string>
#> # … with 77 more rows
#> # ℹ Use `print(n = ...)` to see more rows
{code}
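
Until typeof() is translated, the same type string can be read from the table's schema outside the dplyr pipeline. This is only a sketch of a possible workaround, not something proposed in this issue; it assumes the column is looked up by name with {{GetFieldByName()}} and that the type is printed with {{ToString()}}.

{code:r}
# Convert the tibble to an Arrow Table, as in the examples above.
tbl <- arrow::arrow_table(dplyr::starwars)

# Look up the field for the `films` column in the schema and
# render its Arrow data type as a string.
films_type <- tbl$schema$GetFieldByName("films")$type$ToString()
films_type
{code}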



--
This message was sent by Atlassian Jira
(v8.20.10#820010)