[ https://issues.apache.org/jira/browse/HIVE-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476550#comment-13476550 ]
Carl Steinbach commented on HIVE-2935:
--------------------------------------

@Thejas: One of the problems we encountered with running qfile tests in parallel is that many of these tests create temporary tables and indexes, and several also modify the underlying default source tables. Both of these issues affect the output of the tests, and when you add in concurrency you end up with non-deterministic output that is impossible to validate with the diff-based verification scheme we have in place today.

QFileClient works around this problem by running each qfile test in its own DB/Schema. This solves the problem for the overwhelming majority of qfile tests that exist in Hive today. However, there are several notable drawbacks:

# We can't support tests that create new DB/Schemas, use the 'USE' command to switch databases, create catalog objects in separate DB/Schemas using fully qualified names, invoke the 'SHOW DATABASES' command, etc. Tests which fall into this category include ctas_uses_database_location.q, add_part_exist.q, etc. I added most of these tests to the test.beeline.positive.exclude list in build.properties, but clearly I missed some, and it also looks like there are some tests in that list that need to be removed. I'll update this soon.
# Index tests such as alter_index.q are also affected since the full name of an index catalog object is ${table_schema}__${table_name}_${index_name}__. We should be able to work around this specific problem by defining a substitution property for each test corresponding to the db name. I will file a separate subtask for this issue.
# This partitioning scheme allows us to test concurrent DDL/DML commands in separate namespaces, but doesn't provide any coverage for running concurrent DDL/DML in the same namespace. I don't think it's feasible to do this with the current set of qfiles, and propose instead that we create a separate test that concurrently runs a carefully selected subset of these qfiles in the same namespace.
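
To make the index-name drawback (the second item above) concrete, here is a rough Python sketch of how a per-test db name substitution could work. It is only an illustration, not the QFileClient code: the "qtest_" prefix, the "${db_name}" placeholder, and all three function names are assumptions made up for this example. Only the index name pattern ${table_schema}__${table_name}_${index_name}__ comes from the comment above.

```python
import re


def db_name_for_test(qfile_name: str) -> str:
    """Derive a per-test database name from a qfile name.

    The 'qtest_' prefix is a hypothetical naming convention.
    """
    stem = qfile_name[:-2] if qfile_name.endswith(".q") else qfile_name
    return "qtest_" + re.sub(r"[^A-Za-z0-9_]", "_", stem).lower()


def index_object_name(table_schema: str, table_name: str, index_name: str) -> str:
    """Build the full index catalog object name:
    ${table_schema}__${table_name}_${index_name}__
    """
    return f"{table_schema}__{table_name}_{index_name}__"


def expand_expected(expected_output: str, db_name: str,
                    placeholder: str = "${db_name}") -> str:
    """Replace the (hypothetical) placeholder in the expected output with the
    database the test actually ran in, so a plain diff can still be used."""
    return expected_output.replace(placeholder, db_name)


if __name__ == "__main__":
    db = db_name_for_test("alter_index.q")
    actual = index_object_name(db, "src", "src_index")
    expected = expand_expected("${db_name}__src_src_index__", db)
    print(actual == expected)
```

The point is that the expected output names the schema only through a placeholder, so the same expected file can be checked against whichever DB/Schema the test was assigned.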
Most of the tests that you listed above fall into one of these categories. Please let me know if you find any that don't and I'll look at them more closely.

A separate but related matter is that if we commit this patch with TestBeeLineDriver enabled, the overall time to run all tests will roughly double from ~4hrs to ~8hrs. My preference is to deprecate TestCliDriver in favor of TestBeeLineDriver, but I can't make that decision on my own.

> Implement HiveServer2
> ---------------------
>
>                 Key: HIVE-2935
>                 URL: https://issues.apache.org/jira/browse/HIVE-2935
>             Project: Hive
>          Issue Type: New Feature
>          Components: Server Infrastructure
>            Reporter: Carl Steinbach
>            Assignee: Carl Steinbach
>              Labels: HiveServer2
>         Attachments: beelinepositive.tar.gz, HIVE-2935.1.notest.patch.txt,
> HIVE-2935.2.notest.patch.txt, HIVE-2935.2.nothrift.patch.txt
>
>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira