soumyakanti3578 commented on PR #5510:
URL: https://github.com/apache/hive/pull/5510#issuecomment-2445455874

   @deniskuzZ 
   
   > why not simply parameterize the db to run tests against?
   
   The only option I saw was to add something like `--!qt:database:postgres:<script-to-run-in-postgres>`. Did you have this in mind as well? I didn't go with that approach because I need to run all tpcds queries on external tables. There are 2 ways to do this:
   1. Have a single qfile - `test_all_tpcds_queries_postgres.q` with `--!qt:database:postgres:q_test_tpcds_tables.postgres.sql` ([q_test_tpcds_tables.postgres.sql](https://github.com/apache/hive/pull/5510/files#diff-a05f632368b15e78cbf5c2f71aeb19278b867d94e6a370168263295602c61966)) at the top of the file to create all tpcds tables in postgres. Then we would need the contents of [q_test_external_tpcds_tables_postgres.q](https://github.com/apache/hive/pull/5510/files#diff-a9cdca90dd8b5872a171faa1c815504f2078040df7328370482b754e6c32bcd4) in `test_all_tpcds_queries_postgres.q` to create all the external tables, and then all the tpcds queries themselves. This would be a huge file, and the resulting qout file could be very difficult to maintain too.
   2. The other approach I can think of is to have individual q files for each tpcds query, but then we would need a `<script-to-run-in-postgres>` for each of them if we want to create only the required tables in postgres. We could also just create all tpcds tables for every q file, but that would be inefficient. Either way, we would still need to create the external tables in each q file. For example, for [cbo_query3.q](https://github.com/apache/hive/blob/57f720da75ba5c7416f303730c7a03fae3a08655/ql/src/test/queries/clientpositive/perf/cbo_query3.q), we would need to create 3 jdbc tables (date_dim, store_sales, item) in `<script-to-run-in-postgres-for-cbo_query3>`, and 3 external tables in `cbo_query3_postgres.q`. This could also become very difficult to maintain.
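   To make the per-query overhead of option 2 concrete, a hypothetical `cbo_query3_postgres.q` might start roughly like this sketch. The directive syntax is the one mentioned above; the table properties follow the usual JdbcStorageHandler pattern, and the connection details (host, port, database, credentials) are illustrative assumptions, not values from this PR:

   ```sql
   -- header directive pointing at the per-query postgres setup script
   --!qt:database:postgres:<script-to-run-in-postgres-for-cbo_query3>

   -- one of the 3 external jdbc tables cbo_query3 needs (columns truncated);
   -- the same boilerplate would be repeated for store_sales and item
   CREATE EXTERNAL TABLE date_dim (d_date_sk int, d_date_id string, d_date date)
   STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
   TBLPROPERTIES (
     "hive.sql.database.type" = "POSTGRES",
     "hive.sql.jdbc.driver" = "org.postgresql.Driver",
     "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/qtestDB",
     "hive.sql.dbcp.username" = "qtestuser",
     "hive.sql.dbcp.password" = "qtestpassword",
     "hive.sql.table" = "date_dim"
   );
   ```

   Multiplying this boilerplate across ~100 tpcds queries is what makes option 2 hard to maintain.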
   
   With the new driver, we can configure 
[CliConfigs.java](https://github.com/apache/hive/pull/5510/files#diff-b27a4591a3b7f43fc1fb488aa1d4b2d1b55b67fb1dea63c2871827fc427a4021)
 to run 1 file to create all tpcds tables once with 
   ```
   setJdbcInitScript("q_test_tpcds_tables.postgres.sql");
   ```
   and run 1 file to create all external tables once with
   ```
   setExternalTablesForJdbcInitScript("q_test_external_tpcds_tables_postgres.q");
   ```
   and in [CoreJdbcCliDriver.java](https://github.com/apache/hive/pull/5510/files#diff-6275c227d594ccf98ef6bf8adaad5eb8915bc0bb406a139c7ee7c036991fcecf), we can launch the docker container in `beforeClass()` before the tests start, run all the tests, and then stop the container in the `shutdown()` method.
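   As a minimal sketch of that lifecycle (the class name, container name, and docker flags here are illustrative assumptions, not the actual `CoreJdbcCliDriver` code):

   ```java
   import java.util.List;

   // Hypothetical sketch of the container lifecycle described above.
   // beforeClass() would run startCommand() once before all tests, and
   // shutdown() would run stopCommand() once after them.
   public class PostgresContainerLifecycle {

       // docker command the driver could run to launch the database container
       static List<String> startCommand() {
           return List.of("docker", "run", "-d", "--name", "qtest-postgres",
                   "-e", "POSTGRES_PASSWORD=qtestpassword",
                   "-p", "5432:5432", "postgres");
       }

       // docker command to stop and remove the container when tests finish
       static List<String> stopCommand() {
           return List.of("docker", "rm", "-f", "qtest-postgres");
       }

       public static void main(String[] args) {
           // dry run: print the commands instead of executing them
           System.out.println(String.join(" ", startCommand()));
           System.out.println(String.join(" ", stopCommand()));
       }
   }
   ```

   The point is that the container setup and teardown live in one place in the driver, instead of being repeated per qfile.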
   
   I think this approach is easier to maintain.
   
   Please let me know what your thoughts are, or if you have something else in mind!
   
   Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to