[
https://issues.apache.org/jira/browse/ARROW-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217214#comment-16217214
]
ASF GitHub Bot commented on ARROW-1720:
---------------------------------------
wesm closed pull request #1243: ARROW-1720: [Python] Implement bounds check in
chunk getter
URL: https://github.com/apache/arrow/pull/1243
This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:
diff --git a/python/pyarrow/table.pxi b/python/pyarrow/table.pxi
index dd42cf231..694fe9190 100644
--- a/python/pyarrow/table.pxi
+++ b/python/pyarrow/table.pxi
@@ -102,6 +102,10 @@ cdef class ChunkedArray:
         pyarrow.Array
         """
         self._check_nullptr()
+
+        if i >= self.num_chunks or i < 0:
+            raise IndexError('Chunk index out of range.')
+
         return pyarrow_wrap_array(self.chunked_array.chunk(i))
 
     def iterchunks(self):
diff --git a/python/pyarrow/tests/test_table.py b/python/pyarrow/tests/test_table.py
index 4a2868a3c..50190f597 100644
--- a/python/pyarrow/tests/test_table.py
+++ b/python/pyarrow/tests/test_table.py
@@ -211,6 +211,12 @@ def test_table_basics():
     for chunk in col.data.iterchunks():
         assert chunk is not None
 
+    with pytest.raises(IndexError):
+        col.data.chunk(-1)
+
+    with pytest.raises(IndexError):
+        col.data.chunk(col.data.num_chunks)
+
 
 def test_table_from_arrays_invalid_names():
     data = [
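The check added in the diff above can be illustrated outside Cython with a minimal pure-Python sketch. The class name ChunkedSequence and its list-backed storage are hypothetical stand-ins, not part of pyarrow; only the condition and the error message are taken from the patch.

```python
class ChunkedSequence:
    """Hypothetical pure-Python stand-in for a chunked container."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    @property
    def num_chunks(self):
        return len(self._chunks)

    def chunk(self, i):
        # Same condition as the patch: reject negative indices and
        # anything at or past num_chunks before touching the storage.
        if i >= self.num_chunks or i < 0:
            raise IndexError('Chunk index out of range.')
        return self._chunks[i]
```

Note that the explicit check also rejects negative indices, which a plain Python list would otherwise accept and resolve from the end.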
----------------------------------------------------------------
> [Python] Segmentation fault while trying to access an out-of-bound chunk
> ------------------------------------------------------------------------
>
> Key: ARROW-1720
> URL: https://issues.apache.org/jira/browse/ARROW-1720
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 0.7.1
> Environment: OS X, Python 3.6.3
> Reporter: Dorus Leliveld
> Priority: Minor
> Labels: pull-request-available
> Fix For: 0.8.0
>
>
> The following code segfaults: the column holds 5 chunks (valid indices 0 to 4), so chunk(5) is out of bounds.
> {code}
> import pyarrow as pa
> data = [
>     pa.array([1, 2, 3, 4]),
>     pa.array(['foo', 'bar', 'baz', None]),
>     pa.array([True, None, False, True])
> ]
> batch = pa.RecordBatch.from_arrays(data, ['f0', 'f1', 'f2'])
> batches = [batch] * 5
> table = pa.Table.from_batches(batches)
> c = table[0]
> c.data.chunk(5)
> {code}
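For releases without the fix, a caller-side guard might look like the sketch below. The helper name checked_chunk is hypothetical and not a pyarrow function; it only assumes the object passed in exposes num_chunks and chunk(i), as the ChunkedArray in the patch does.

```python
def checked_chunk(chunked, i):
    """Hypothetical helper: validate i, then delegate to chunked.chunk(i)."""
    n = chunked.num_chunks
    # Refuse the call up front so an unchecked native lookup is never reached.
    if not 0 <= i < n:
        raise IndexError(f'Chunk index {i} out of range for {n} chunks.')
    return chunked.chunk(i)
```

With the objects from the report, checked_chunk(c.data, 5) would be expected to raise IndexError rather than reach the unchecked call.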
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)