thisisnic commented on code in PR #14514:
URL: https://github.com/apache/arrow/pull/14514#discussion_r1012447084


##########
r/vignettes/data_objects.Rmd:
##########
@@ -0,0 +1,206 @@
+---
+title: "Data objects"
+description: > 
+  Learn about Scalar, Array, Table, and Dataset objects in `arrow` 
+  (among others), how they relate to each other, as well as their 
+  relationships to familiar R objects like data frames and vectors 
+output: rmarkdown::html_vignette
+---
+
+This article describes the various data object types supplied by `arrow`, and documents how these objects are structured. 
+
+```{r include=FALSE}
+library(arrow, warn.conflicts = FALSE)
+```
+
+The `arrow` package supplies several object classes that are used to represent data. `RecordBatch`, `Table`, and `Dataset` objects are two-dimensional rectangular data structures used to store tabular data. For columnar, one-dimensional data, the `Array` and `ChunkedArray` classes are provided. Finally, `Scalar` objects represent individual values. The table below summarizes these objects and shows how you can create new instances using the [`R6`](https://r6.r-lib.org/) class object, as well as convenience functions that provide the same functionality in a more traditional R-like fashion:
+
+| Dim | Class          | How to create an instance                     | Convenience function                          |
+| --- | -------------- | ----------------------------------------------| --------------------------------------------- |
+| 0   | `Scalar`       | `Scalar$create(value, type)`                  |                                               |
+| 1   | `Array`        | `Array$create(vector, type)`                  |                                               |
+| 1   | `ChunkedArray` | `ChunkedArray$create(..., type)`              | `chunked_array(..., type)`                    |
+| 2   | `RecordBatch`  | `RecordBatch$create(...)`                     | `record_batch(...)`                           |
+| 2   | `Table`        | `Table$create(...)`                           | `arrow_table(...)`                            |
+| 2   | `Dataset`      | `Dataset$create(sources, schema)`             | `open_dataset(sources, schema)`               |
+  
+Later in the article we'll look at each of these in more detail.
+
+For now we note that each of these object classes corresponds to a class of the same name in the underlying Arrow C++ library. It is also worth mentioning that the `arrow` package also defines classes that do not exist in the C++ library including:
+
+* `ArrowDatum`: inherited by `Scalar`, `Array`, and `ChunkedArray`
+* `ArrowTabular`: inherited by `RecordBatch` and `Table`
+* `ArrowObject`: inherited by all Arrow objects

Review Comment:
   Sorry, I was reading it as new content, as I hadn't looked at the getting 
started page in ages! Honestly, I'd just err on the side of your own 
judgment in cases like this; I agree this section belongs in one of the dev 
vignettes.



