[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #7631: ARROW-8651: [Python][Dataset] Support pickling of Dataset objects

GitBox Fri, 03 Jul 2020 05:51:01 -0700


jorisvandenbossche commented on a change in pull request #7631:
URL: https://github.com/apache/arrow/pull/7631#discussion_r449567025




##########
File path: python/pyarrow/tests/test_dataset.py
##########
@@ -635,6 +635,37 @@ def test_make_fragment_from_buffer():
     assert pickled.to_table().equals(fragment.to_table())
 
 
+@pytest.mark.parquet
+def test_make_parquet_fragment_from_buffer():
+    import pyarrow.parquet as pq
+
+    table = pa.table([['a', 'b', 'c'],
+                      [12, 11, 10],
+                      ['dog', 'cat', 'rabbit']],
+                     names=['alpha', 'num', 'animal'])
+
+    out = pa.BufferOutputStream()
+    pq.write_table(table, out)
+
+    buffer = out.getvalue()
+
+    formats = [
+        ds.ParquetFileFormat(),
+        ds.ParquetFileFormat(
+            read_options=ds.ParquetReadOptions(
+                use_buffered_stream=True,
+                buffer_size=4096,

Review comment:
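       We probably need to use an option that actually alters the output to be 
able to catch a failure, e.g. `dictionary_columns`. The buffering options above 
leave the resulting table unchanged, so the test would still pass if they were 
dropped during pickling.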




