ostinru opened a new issue, #74: URL: https://github.com/apache/cloudberry-pxf/issues/74
I am going to test that cloudberry-pxf works as well as greenplum-pxf with JDBC data sources (and update the docs).

Currently, Automation covers only PostgreSQL JDBC, because the PostgreSQL JDBC driver is included in the pxf build and we can use Cloudberry as a PostgreSQL instance. However, most issues we observe in production happen with other databases, usually Oracle and ClickHouse. Oracle uses unusual data types that don't match the PostgreSQL ones, and ClickHouse used to have issues with the huge volumes of data that users export.

Here I see a couple of issues:

1. Oracle licensing is not clear to me. I am not sure that we can run containerised Oracle in CI / on dev machines. I have seen that TestContainers provides[1] Oracle as one of the options, and it is stated that this container is used across different open-source projects[2]. Here I need guidance and best practices from @tuhaihe / Apache.
2. MS SQL Server also requires accepting a EULA before running a container with the database [3]. Is that OK?

And is it possible to download these docker images from "special network environments" (#63)?

### Test Design ideas

**Dependencies**: I am not sure that we want to provide "~~batteries~~ drivers included" with `cloudberry-pxf`, or even as `cloudberry-pxf-driveres`[4]. It would be an obligation to support different databases for ages (as we are doing with HBase). However, we can keep these drivers as test dependencies for `pxf-jdbc`.

**TestContainers**: I think that we can start the Cloudberry + PXF container (`ci/docker/pxf-cbdb-dev`) directly from Java code in TestContainers, on a network shared with the 3rd party databases. I know that this will run SLOW (slow Cloudberry compilation, PXF build, Hadoop start), but it seems to be a step in the right direction.
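With a shared network, PXF would reach each database by its network alias (for example one assigned through TestContainers' `withNetworkAliases`), so the test code would need to produce a PXF JDBC server config that points at that alias. Below is a minimal, stdlib-only sketch of that step. The class `JdbcSiteXml`, the `Db` enum, and the alias and database names are hypothetical and not part of cloudberry-pxf. The property names (`jdbc.driver`, `jdbc.url`, `jdbc.user`, `jdbc.password`) are the ones used in PXF's `jdbc-site.xml`, and the ports are each database's usual default, which a real setup may override.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: renders jdbc-site.xml for a database reachable by alias on a shared Docker network. */
final class JdbcSiteXml {

    /** Driver class, URL template (%s alias, %d port, %s database) and assumed default port per database. */
    enum Db {
        POSTGRESQL("org.postgresql.Driver", "jdbc:postgresql://%s:%d/%s", 5432),
        ORACLE("oracle.jdbc.OracleDriver", "jdbc:oracle:thin:@//%s:%d/%s", 1521),
        MSSQL("com.microsoft.sqlserver.jdbc.SQLServerDriver", "jdbc:sqlserver://%s:%d;databaseName=%s", 1433),
        CLICKHOUSE("com.clickhouse.jdbc.ClickHouseDriver", "jdbc:clickhouse://%s:%d/%s", 8123);

        final String driver;
        final String urlTemplate;
        final int port;

        Db(String driver, String urlTemplate, int port) {
            this.driver = driver;
            this.urlTemplate = urlTemplate;
            this.port = port;
        }

        /** Builds the JDBC URL that PXF would use from inside the shared network. */
        String url(String networkAlias, String database) {
            return String.format(urlTemplate, networkAlias, port, database);
        }
    }

    /** Renders a Hadoop-style configuration file with the four basic JDBC properties. */
    static String render(Db db, String networkAlias, String database, String user, String password) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("jdbc.driver", db.driver);
        props.put("jdbc.url", db.url(networkAlias, database));
        props.put("jdbc.user", user);
        props.put("jdbc.password", password);

        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<configuration>\n");
        for (Map.Entry<String, String> e : props.entrySet()) {
            xml.append("    <property>\n")
               .append("        <name>").append(e.getKey()).append("</name>\n")
               .append("        <value>").append(escape(e.getValue())).append("</value>\n")
               .append("    </property>\n");
        }
        return xml.append("</configuration>\n").toString();
    }

    /** Escapes the characters that would break XML element content. */
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

For instance, `JdbcSiteXml.render(JdbcSiteXml.Db.ORACLE, "oracle", "FREEPDB1", "pxf", "secret")` would produce a config whose `jdbc.url` is `jdbc:oracle:thin:@//oracle:1521/FREEPDB1`. The alias, service name, and credentials here are placeholders. The generated file could then be copied into the PXF container's server directory before the Automation tests run.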
```
+----------------------------------------------------------------+
| Host                                                           |
|  +-------------------------------+   +-----------------------+ |
|  | Docker                        |   | Docker                | |
|  | [Cloudberry] --> [PXF] ------------> [Database]           | |
|  |                               |   |                       | |
|  +-------------------------------+   +-----------------------+ |
|                                                                |
|  [Automation]                                                  |
+----------------------------------------------------------------+
```

@MisterRaindrop, any thoughts on this?

[1] https://java.testcontainers.org/modules/databases/oraclefree/
[2] https://github.com/gvenzl/oci-oracle-free?tab=readme-ov-file#users-of-these-images
[3] https://java.testcontainers.org/modules/databases/mssqlserver/
[4] https://github.com/open-gpdb/cloudberry-pxf/pull/6
