aosingh opened a new issue, #478:
URL: https://github.com/apache/arrow-nanoarrow/issues/478
Thanks to the Arrow community for developing this lightweight wrapper.
I am planning to add Apache Arrow support to one of the projects I am
working on. The aim is to use nanoarrow to export tabular data
in Arrow format.
Users will have access to a function `to_arrow()`:
```python
import nanoarrow as na


def gen_name():
    for i in range(100):
        yield "John Doe"


def gen_age():
    for i in range(100):
        yield 34


def to_arrow():
    # One nanoarrow array per column, each built incrementally from a generator
    results = [
        na.c_array(gen_name(), na.string()),
        na.c_array(gen_age(), na.int64()),
    ]
    return results
```
Users of the library can optionally install `pyarrow` and `pandas` to work
with the exported data. And the export works fine!
```python
import pyarrow as pa
parray = pa.Table.from_arrays(to_arrow(), names=["name", "age"])
print(parray.to_pandas())
```
```text
        name  age
0   John Doe   34
1   John Doe   34
2   John Doe   34
3   John Doe   34
4   John Doe   34
..       ...  ...
95  John Doe   34
96  John Doe   34
97  John Doe   34
98  John Doe   34
99  John Doe   34

[100 rows x 2 columns]
```
Adding a third field `timestamp` to the above list raises an error:
```python
import datetime


def gen_timestamp():
    for i in range(100):
        yield datetime.datetime.now().timestamp()


result = [na.c_array(gen_name(), na.string()),
          na.c_array(gen_age(), na.int64()),
          na.c_array(gen_timestamp(), na.timestamp("s"))]
parray = pa.Table.from_arrays(result, names=["name", "age", "timestamp"])
print(parray.to_pandas())
```
Error:
```
Traceback (most recent call last):
  File "/Users/as/nanoarrow/simple.py", line 28, in <module>
    na.c_array(gen_timestamp(), na.timestamp("s"))]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/as/nanoarrow/arrow-nanoarrow/python/src/nanoarrow/c_array.py", line 131, in c_array
    raise ValueError(
ValueError: An error occurred whilst converting generator to nanoarrow.c_array:
Can't build array of type timestamp from iterable
```
I understand that the source of the error is the
[mapping](https://github.com/apache/arrow-nanoarrow/blob/main/python/src/nanoarrow/c_array.py#L531C1-L552)
maintained for each data type.
How can I add support for incrementally building arrays of more data types?
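
In the meantime, one possible workaround might look like the sketch below. It relies on the fact that Arrow stores timestamps as 64-bit integers, so the column is built as `int64` (which already works from a generator, as shown above) and then cast to `timestamp[s]` on the `pyarrow` side. The helper names `gen_epoch_seconds` and `to_timestamp_column` are made up for illustration, and the sketch assumes a `pyarrow` version that can consume nanoarrow arrays directly through `pa.array()`:

```python
import datetime


def gen_epoch_seconds(n=100):
    """Yield whole seconds since the Unix epoch as plain ints (hypothetical helper)."""
    for _ in range(n):
        now = datetime.datetime.now(datetime.timezone.utc)
        # int() drops the fractional part so the value fits an int64 builder
        yield int(now.timestamp())


def to_timestamp_column(values):
    """Build int64 storage with nanoarrow, then cast to timestamp[s] in pyarrow.

    Hypothetical helper: the imports are local so that the generator above
    stays usable when the optional dependencies are not installed.
    """
    import nanoarrow as na
    import pyarrow as pa

    storage = na.c_array(values, na.int64())
    # Assumption: pa.array() accepts the nanoarrow array and the
    # int64 -> timestamp cast is available in the installed pyarrow.
    return pa.array(storage).cast(pa.timestamp("s"))
```

With this, the third column would be created as `to_timestamp_column(gen_epoch_seconds())` instead of calling `na.c_array(..., na.timestamp("s"))`. It does not answer the question of extending the mapping itself; it only avoids the unsupported code path.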