soumyakanti3578 commented on PR #5510: URL: https://github.com/apache/hive/pull/5510#issuecomment-2445455874
@deniskuzZ

> why not simply parameterize the db to run tests against?

The only option I saw was to add something like `--!qt:database:postgres:<script-to-run-in-postgres>` at the top of each q file. Did you have this in mind as well? I didn't go with that approach because I need to run all the TPC-DS queries on external tables, and there are two ways to do that:

1. Have a single q file, `test_all_tpcds_queries_postgres.q`, with `--!qt:database:postgres:q_test_tpcds_tables.postgres.sql` ([q_test_tpcds_tables.postgres.sql](https://github.com/apache/hive/pull/5510/files#diff-a05f632368b15e78cbf5c2f71aeb19278b867d94e6a370168263295602c61966)) at the top of the file to create all the TPC-DS tables in Postgres. We would then need the contents of [q_test_external_tpcds_tables_postgres.q](https://github.com/apache/hive/pull/5510/files#diff-a9cdca90dd8b5872a171faa1c815504f2078040df7328370482b754e6c32bcd4) in `test_all_tpcds_queries_postgres.q` to create all the external tables, followed by all the TPC-DS queries themselves. This would be a huge file, and the resulting q.out file would be very difficult to maintain.

2. Have an individual q file for each TPC-DS query. If we want to create only the required tables in Postgres, we would then need a separate `<script-to-run-in-postgres>` for each of them. (We could instead create all the TPC-DS tables for every q file, but that would be inefficient.) We would also still need to create the external tables in each q file. For example, for [cbo_query3.q](https://github.com/apache/hive/blob/57f720da75ba5c7416f303730c7a03fae3a08655/ql/src/test/queries/clientpositive/perf/cbo_query3.q), we would need to create 3 JDBC tables (date_dim, store_sales, item) in `<script-to-run-in-postgres-for-cbo_query3>`, and 3 external tables in `cbo_query3_postgres.q`. This could become very difficult to maintain too.
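To illustrate why approach 2 gets unwieldy, here is a rough sketch of what one such per-query file might look like. The file name, script name, connection URL, and credentials are all hypothetical; the table properties follow the usual Hive `JdbcStorageHandler` conventions:

```sql
-- Hypothetical cbo_query3_postgres.q (approach 2 sketch).
-- The directive below would run a per-query script that creates
-- date_dim, store_sales, and item in Postgres:
--!qt:database:postgres:script-for-cbo_query3.sql

-- One external JDBC table per Postgres table (date_dim shown;
-- store_sales and item would follow the same pattern):
CREATE EXTERNAL TABLE date_dim
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES (
  "hive.sql.database.type" = "POSTGRES",
  "hive.sql.jdbc.driver" = "org.postgresql.Driver",
  "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/qtestDB",
  "hive.sql.dbcp.username" = "qtestuser",
  "hive.sql.dbcp.password" = "qtestpassword",
  "hive.sql.table" = "date_dim"
);

-- ... the other two external tables, then the actual cbo_query3 query.
```

Multiplying this boilerplate across ~99 TPC-DS queries is the maintenance burden described above.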
With the new driver, we can instead configure [CliConfigs.java](https://github.com/apache/hive/pull/5510/files#diff-b27a4591a3b7f43fc1fb488aa1d4b2d1b55b67fb1dea63c2871827fc427a4021) to run one script that creates all the TPC-DS tables in Postgres once:

```java
setJdbcInitScript("q_test_tpcds_tables.postgres.sql");
```

and one file that creates all the external tables once:

```java
setExternalTablesForJdbcInitScript("q_test_external_tpcds_tables_postgres.q");
```

In [CoreJdbcCliDriver.java](https://github.com/apache/hive/pull/5510/files#diff-6275c227d594ccf98ef6bf8adaad5eb8915bc0bb406a139c7ee7c036991fcecf), we can launch the Docker container in `beforeClass()` before starting the tests, run all the tests, and then stop the container in the `shutdown()` method. I think this approach is easier to maintain. But please let me know what your thoughts are regarding this, and if you have something else in mind too! Thanks!

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
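Putting the two setters together, the wiring in `CliConfigs.java` might look roughly like the sketch below. The setter names come from this PR; the config class name, query/results directories, and the `AbstractCliConfig` constructor pattern are assumptions modeled on the existing configs in that file:

```java
// Sketch only: class name and directories are illustrative, not from the PR.
public static class TpcdsJdbcCliConfig extends AbstractCliConfig {
  public TpcdsJdbcCliConfig() {
    super(CoreJdbcCliDriver.class);
    try {
      setQueryDir("ql/src/test/queries/clientpositive/perf");
      // Run once per suite: create all TPC-DS tables in Postgres.
      setJdbcInitScript("q_test_tpcds_tables.postgres.sql");
      // Run once per suite: create the matching external JDBC tables in Hive.
      setExternalTablesForJdbcInitScript("q_test_external_tpcds_tables_postgres.q");
      setResultsDir("ql/src/test/results/clientpositive/perf");
    } catch (Exception e) {
      throw new RuntimeException("can't construct cliconfig", e);
    }
  }
}
```

The key design point is that both init scripts run once in `beforeClass()`, so every per-query q file stays free of setup boilerplate.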
