thisisnic commented on issue #41474:
URL: https://github.com/apache/arrow/issues/41474#issuecomment-2093376936

   Thanks for reporting this @rafapereirabr! It looks like this isn't a bug,
but a difference in how data.table and tibble print these kinds of
numbers. I ran your example, passing the results to `dput()` to print
the underlying objects, and the `vinc` values were identical in all three cases:
   
   ``` r
   library(arrow)
   #> 
   #> Attaching package: 'arrow'
   #> The following object is masked from 'package:utils':
   #> 
   #>     timestamp
   library(dplyr)
   #> 
   #> Attaching package: 'dplyr'
   #> The following objects are masked from 'package:stats':
   #> 
   #>     filter, lag
   #> The following objects are masked from 'package:base':
   #> 
   #>     intersect, setdiff, setequal, union
   library(data.table)
   #> 
   #> Attaching package: 'data.table'
   #> The following objects are masked from 'package:dplyr':
   #> 
   #>     between, first, last
   options(scipen = 9999)
   
   # create sample data
   df <- data.frame(tipo = 1,
                    id = 23232308,
                    vinc = 99930010917577)
   
   df$tipo <- as.integer(df$tipo)
   df$id <- as.numeric(df$id)
   df$vinc <- bit64::as.integer64(df$vinc)
   fwrite(df, 'test.csv')
   
   # check csv
   data.table::fread('test.csv') %>%
     dput()
   #> structure(list(tipo = 1L, id = 23232308L, vinc = 
structure(0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000493719853829155,
 class = "integer64")), row.names = c(NA, 
   #> -1L), class = c("data.table", "data.frame"), .internal.selfref = 
<pointer: 0x576171e05370>)
   
   
   
   test <- arrow::open_csv_dataset('test.csv')
   dplyr::collect(test) %>% 
     dput()
   #> structure(list(tipo = 1L, id = 23232308L, vinc = 
structure(0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000493719853829155,
 class = "integer64")), class = c("tbl_df", 
   #> "tbl", "data.frame"), row.names = c(NA, -1L))
   
   
   arrow::write_parquet(df, 'test.parquet')
   test2 <- arrow::open_dataset('test.parquet')
   collect(test2) %>% dput()
   #> structure(list(tipo = 1L, id = 23232308, vinc = 
structure(0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000493719853829155,
 class = "integer64")), row.names = c(NA, 
   #> -1L), class = c("tbl_df", "tbl", "data.frame"))
   ```
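
The tiny `0.000…` number in the `dput()` output above is expected: bit64 keeps the 64 bits of an `integer64` inside an ordinary double vector, and `dput()` writes that raw double out. A minimal sketch of this, assuming only that bit64 is installed (the names `vinc`, `payload`, `vinc_chr` and `same` are just for illustration):

``` r
library(bit64)

# Build the value from a string so no precision question arises
vinc <- as.integer64("99930010917577")

# integer64 stores its 64 bits in a double vector; dropping the class
# exposes that raw double, which is what dput() prints
payload <- unclass(vinc)
is.double(payload)

# bit64's own methods decode the payload back to the integer value
vinc_chr <- as.character(vinc)
vinc_chr

# Compare values with bit64's `==` method rather than by printed output
same <- vinc == as.integer64(99930010917577)
same
```

Comparing with `==` (or converting with `as.character()`) sidesteps any difference in how data.table and tibble choose to print the column.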
   
   <sup>Created on 2024-05-03 with [reprex v2.1.0](https://reprex.tidyverse.org)</sup>
   


