Hi, yu zelin

Thank you for initiating this discussion.

I'm also working on this. My current plan is to build paimon-rust first, and 
then build paimon-python on top of it by exposing the paimon-rust API through 
PyO3.

PyO3 can build a native Python package without additional dependencies. This 
way, users can install paimon-python simply by running pip install paimon, 
without needing any extra setup for Java, Paimon, Flink or other components.

Are you interested in this direction?

Some context: the Iceberg community is also working on using iceberg-rust in 
PyIceberg directly: https://github.com/apache/iceberg-rust/pull/518

On Wed, Aug 7, 2024, at 19:24, yu zelin wrote:
> Hi devs,
>
> I'd like to introduce a Python SDK for Paimon (paimon-python). Python users
> can use it to access Paimon data more easily.
>
> In the first version, I would use py4j to wrap the Java SDK in Python
> code. Briefly speaking, py4j can start a JVM and load Java classes, so we
> can use it to call the Paimon table Java API and get the results in Python
> code. An example is flink-python:
> https://github.com/apache/flink/tree/master/flink-python
>
> I'd like to give a Paimon example:
> ```
> from typing import List
>
>
> # get_gateway() and CatalogContext are assumed to be defined elsewhere
> class FileStoreTable(object):
>
>     @classmethod
>     def create(cls, context: 'CatalogContext') -> 'FileStoreTable':
>         # gateway is built via py4j to access the JVM
>         gateway = get_gateway()
>         # use gateway.jvm to access Java classes
>         j_table = gateway.jvm.FileStoreTableFactory.create(
>             context.to_j_catalog_context())
>         return FileStoreTable(j_table)
>
>     def __init__(self, j_table):
>         self.__j_table = j_table
>
>     # wrap Java method
>     def primary_keys(self) -> List[str]:
>         return self.__j_table.primaryKeys()
> ```
> Then we can wrap the scan and read interfaces to read tables, and the write
> and commit interfaces to write tables, from Python.
>
> Looking forward to your suggestions.
>
> Best Regards,
> Zelin Yu

-- 
Xuanwo

https://xuanwo.io/
