jorisvandenbossche commented on code in PR #44126:
URL: https://github.com/apache/arrow/pull/44126#discussion_r1768997230
##########
python/pyarrow/table.pxi:
##########
@@ -6259,6 +6259,56 @@ def concat_tables(tables, MemoryPool memory_pool=None,
str promote_options="none"
return pyarrow_wrap_table(c_result_table)
+def concat_recordbatches(recordbatches, MemoryPool memory_pool=None):
Review Comment:
```suggestion
def concat_record_batches(recordbatches, MemoryPool memory_pool=None):
```
? (to be consistent with `record_batch(..)`)
##########
python/pyarrow/table.pxi:
##########
@@ -6259,6 +6259,56 @@ def concat_tables(tables, MemoryPool memory_pool=None,
str promote_options="none"
return pyarrow_wrap_table(c_result_table)
+def concat_recordbatches(recordbatches, MemoryPool memory_pool=None):
+ """
+ Concatenate pyarrow.RecordBatch objects.
+
+ All recordbatches must share the same Schema,
+ the operation is guaranteed to be zero-copy.
+
+ Parameters
+ ----------
+ recordbatches : iterable of pyarrow.RecordBatch objects
+ Pyarrow recordbatches to concatenate into a single RecordBatch.
+ memory_pool : MemoryPool, default None
+ For memory allocations, if required, otherwise use default pool.
+
+ Examples
+ --------
+ >>> import pyarrow as pa
+ >>> t1 = pa.record_batch([
+ ... pa.array([2, 4, 5, 100]),
+ ... pa.array(["Flamingo", "Horse", "Brittle stars", "Centipede"])
+ ... ], names=['n_legs', 'animals'])
+ >>> t2 = pa.record_batch([
+ ... pa.array([2, 4]),
+ ... pa.array(["Parrot", "Dog"])
+ ... ], names=['n_legs', 'animals'])
+ >>> pa.concat_recordbatches([t1,t2])
Review Comment:
```suggestion
>>> pa.concat_recordbatches([t1, t2])
```
##########
python/pyarrow/table.pxi:
##########
@@ -6259,6 +6259,56 @@ def concat_tables(tables, MemoryPool memory_pool=None,
str promote_options="none"
return pyarrow_wrap_table(c_result_table)
+def concat_recordbatches(recordbatches, MemoryPool memory_pool=None):
+ """
+ Concatenate pyarrow.RecordBatch objects.
+
+ All recordbatches must share the same Schema,
+ the operation is guaranteed to be zero-copy.
Review Comment:
I suppose this method would never be zero-copy? (the result is again a
RecordBatch, so it actually needs to copy to concatenate the arrays)
##########
python/pyarrow/table.pxi:
##########
@@ -6259,6 +6259,56 @@ def concat_tables(tables, MemoryPool memory_pool=None,
str promote_options="none"
return pyarrow_wrap_table(c_result_table)
+def concat_recordbatches(recordbatches, MemoryPool memory_pool=None):
+ """
+ Concatenate pyarrow.RecordBatch objects.
+
+ All recordbatches must share the same Schema,
Review Comment:
The table version does not require this. That's a missing feature on the C++
side?
##########
python/pyarrow/table.pxi:
##########
@@ -6259,6 +6259,56 @@ def concat_tables(tables, MemoryPool memory_pool=None,
str promote_options="none"
return pyarrow_wrap_table(c_result_table)
+def concat_recordbatches(recordbatches, MemoryPool memory_pool=None):
+ """
+ Concatenate pyarrow.RecordBatch objects.
+
+ All recordbatches must share the same Schema,
+ the operation is guaranteed to be zero-copy.
+
+ Parameters
+ ----------
+ recordbatches : iterable of pyarrow.RecordBatch objects
+ Pyarrow recordbatches to concatenate into a single RecordBatch.
Review Comment:
```suggestion
Pyarrow record batches to concatenate into a single RecordBatch.
```