westonpace commented on a change in pull request #63:
URL: https://github.com/apache/arrow-cookbook/pull/63#discussion_r701570334



##########
File path: python/source/create.rst
##########
@@ -7,6 +7,68 @@ Tensors and all other Arrow entities.
 
 .. contents::
 
+Creating Arrays
+===============
+
+Arrow keeps data in contiguous arrays optimised for memory footprint
+and SIMD analyses. In Python it's possible to build a :class:`pyarrow.Array`
+starting from a Python ``list`` (or sequence types in general),
+``numpy`` arrays and ``pandas`` Series.
+
+.. testcode::
+
+    import pyarrow as pa
+
+    array = pa.array([1, 2, 3, 4, 5])
+
+.. testcode::
+
+    print(array)
+
+.. testoutput::
+
+    [
+      1,
+      2,
+      3,
+      4,
+      5
+    ]
+
+Arrays can also provide a ``mask`` to specify which values should

Review comment:
       I don't know if it's worth mentioning, but the `mask` must be a NumPy 
array (e.g. a typical Python list won't work).

##########
File path: python/source/create.rst
##########
@@ -7,6 +7,68 @@ Tensors and all other Arrow entities.
 
 .. contents::
 
+Creating Arrays
+===============
+
+Arrow keeps data in contiguous arrays optimised for memory footprint
+and SIMD analyses. In Python it's possible to build a :class:`pyarrow.Array`
+starting from a Python ``list`` (or sequence types in general),
+``numpy`` arrays and ``pandas`` Series.
+
+.. testcode::
+
+    import pyarrow as pa
+
+    array = pa.array([1, 2, 3, 4, 5])
+
+.. testcode::
+
+    print(array)
+
+.. testoutput::
+
+    [
+      1,
+      2,
+      3,
+      4,
+      5
+    ]
+
+Arrays can also provide a ``mask`` to specify which values should
+be considered null:
+
+.. testcode::
+
+    import numpy as np
+
+    array = pa.array([1, 2, 3, 4, 5], 
+                     mask=np.array([True, False, True, False, True]))
+
+    print(array)
+
+.. testoutput::
+
+    [
+      null,
+      2,
+      null,
+      4,
+      null
+    ]
+
+When building arrays from ``numpy`` or ``pandas``, Arrow will leverage
+optimized code paths that rely on the internal in-memory representation
+of the data used by ``numpy`` and ``pandas``:
+
+.. testcode::
+
+    import numpy as np
+    import pandas

Review comment:
       If you are going to do `import numpy as np` and `import pyarrow as pa`, 
you should probably do `import pandas as pd` for consistency.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

