jorisvandenbossche commented on code in PR #41135:
URL: https://github.com/apache/arrow/pull/41135#discussion_r1587574083


##########
docs/source/python/install.rst:
##########
@@ -93,3 +100,41 @@ a custom path to the database from Python:
 
    >>> import pyarrow as pa
    >>> pa.set_timezone_db_path("custom_path")
+
+
+.. _python-conda-differences:
+
+Differences between conda-forge packages
+----------------------------------------
+
+PyArrow is packaged on `conda-forge <https://conda-forge.org/>`_ as three
+separate packages, each providing varying levels of functionality. This is in
+contrast to PyPi, where only a single PyArrow package is provided.
+
+The purpose of this split is to minimize the size of the installed package for
+most users (``pyarrow``), provide a smaller, minimal package for specialized use
+cases (``pyarrow-core``), while still providing a complete package for users who
+require it (``pyarrow-all``).
+
+The table below lists the functionality provided by each package and may be
+useful when deciding to use one package over another:
+
++------------+------------------------------+------------------------------+------------------------------+
+| Component  | pyarrow                      | pyarrow-core                 | pyarrow-all                  |
++============+==============================+==============================+==============================+
+| Core       | :fas:`check;sd-text-success` | :fas:`check;sd-text-success` | :fas:`check;sd-text-success` |
++------------+------------------------------+------------------------------+------------------------------+
+| Parquet    | :fas:`check;sd-text-success` |                              | :fas:`check;sd-text-success` |
++------------+------------------------------+------------------------------+------------------------------+
+| Datasets   | :fas:`check;sd-text-success` |                              | :fas:`check;sd-text-success` |
++------------+------------------------------+------------------------------+------------------------------+
+| Acero      | :fas:`check;sd-text-success` |                              | :fas:`check;sd-text-success` |
++------------+------------------------------+------------------------------+------------------------------+
+| Substrait  | :fas:`check;sd-text-success` |                              | :fas:`check;sd-text-success` |
++------------+------------------------------+------------------------------+------------------------------+
+| Flight     |                              |                              | :fas:`check;sd-text-success` |
++------------+------------------------------+------------------------------+------------------------------+
+| Flight SQL |                              |                              | :fas:`check;sd-text-success` |
++------------+------------------------------+------------------------------+------------------------------+
+| Gandiva    |                              |                              | :fas:`check;sd-text-success` |
++------------+------------------------------+------------------------------+------------------------------+

Review Comment:
   > I need to look at where the modules I mentioned above (JSON, etc) live because I think that should be clear.
   
   Right now, everything you mentioned (JSON, CSV, filesystems, ORC) is included in the main libarrow.so, so each can only be enabled or disabled when the library is built; none of them can be installed separately.
   For the filesystems, work is underway to build them as separate libraries, so we will have to update this doc section once that is done. For ORC, I am wondering whether we should open an issue about splitting it into a separate library (although liborc is not that big of a dependency, so it might not be super important).
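
To make the distinction concrete, here is a minimal sketch (not part of the PR) of how a user could check which optional pieces import in their environment. The helper name `available_components` and the `PYARROW_COMPONENTS` mapping are made up for illustration; the module paths are the usual PyArrow submodules, and whether each one imports depends on the installed package and on how libarrow was built.

```python
import importlib

# Illustrative mapping (an assumption, not from the PR): component name -> module path.
PYARROW_COMPONENTS = {
    "Parquet": "pyarrow.parquet",
    "Datasets": "pyarrow.dataset",
    "Acero": "pyarrow.acero",
    "Substrait": "pyarrow.substrait",
    "Flight": "pyarrow.flight",
    "Gandiva": "pyarrow.gandiva",
    # Built into libarrow itself rather than shipped as separate libraries:
    "CSV": "pyarrow.csv",
    "JSON": "pyarrow.json",
    "ORC": "pyarrow.orc",
    "Filesystems": "pyarrow.fs",
}


def available_components(components):
    """Return {name: bool}, True when the module at that path can be imported."""
    result = {}
    for name, module_path in components.items():
        try:
            importlib.import_module(module_path)
        except ImportError:
            # Covers a missing package as well as a missing compiled extension.
            result[name] = False
        else:
            result[name] = True
    return result


if __name__ == "__main__":
    for name, ok in available_components(PYARROW_COMPONENTS).items():
        print(f"{name:12s} {'yes' if ok else 'no'}")
```

This only reports whether an import succeeds; it does not say which conda-forge package provided the module.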



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
