[
https://issues.apache.org/jira/browse/HIVE-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476550#comment-13476550
]
Carl Steinbach commented on HIVE-2935:
--------------------------------------
@Thejas: One of the problems we encountered with running qfile tests in
parallel is that many of these tests create temporary tables and indexes, and
several also modify the underlying default source tables. Both of these issues
affect the output of the tests, and when you add in concurrency you end up with
non-deterministic output that is impossible to validate with the diff-based
verification scheme we have in place today. QFileClient works around this
problem by running each qfile test in its own DB/Schema. This solves the
problem for the overwhelming majority of qfile tests that exist in Hive today.
However, there are several notable drawbacks:
# We can't support tests that create new DB/Schemas, use the 'USE' command to
switch databases, create catalog objects in separate DB/Schemas using fully
qualified names, invoke the 'SHOW DATABASES' command, etc. Tests which fall
into this category include ctas_uses_database_location.q, add_part_exist.q,
etc. I added most of these tests to the test.beeline.positive.exclude list in
build.properties, but clearly I missed some, and it also looks like there are
some tests in that list that need to be removed. I'll update this soon.
# Index tests such as alter_index.q are also affected since the full name of an
index catalog object is ${table_schema}__${table_name}_${index_name}__. We
should be able to work around this specific problem by defining a substitution
property for each test corresponding to the db name. I will file a separate
subtask for this issue.
# This partitioning scheme allows us to test concurrent DDL/DML commands in
separate namespaces, but doesn't provide any coverage for running concurrent
DDL/DML in the same namespace. I don't think it's feasible to do this with the
current set of qfiles, and propose instead that we create a separate test that
concurrently runs a carefully selected subset of these qfiles in the same
namespace.
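As a rough illustration of the substitution idea in the second item, the sketch below rewrites index names that embed a per-test schema back to the schema the expected output was recorded against. The helper names, the choice of "alter_index" as the per-test DB name, and "default" as the recorded schema are all assumptions for illustration; none of this is taken from QFileClient itself.

```python
import re


def index_table_name(schema: str, table: str, index: str) -> str:
    """Build the full index catalog name in the format quoted above:
    ${table_schema}__${table_name}_${index_name}__"""
    return f"{schema}__{table}_{index}__"


def normalize_index_names(output: str, test_db: str,
                          expected_db: str = "default") -> str:
    """Replace the per-test schema prefix of index names with expected_db.

    Hypothetical helper: only the '<test_db>__' prefix is rewritten, so
    other occurrences of the DB name in the output are left alone.
    """
    pattern = re.compile(r"\b" + re.escape(test_db) + r"__(?=\w)")
    # A callable replacement avoids backslash interpretation in expected_db.
    return pattern.sub(lambda m: expected_db + "__", output)


if __name__ == "__main__":
    # Assumed scenario: alter_index.q runs in a schema named "alter_index".
    actual = index_table_name("alter_index", "src", "src_index")
    print(normalize_index_names(actual, test_db="alter_index"))
```

With a per-test substitution like this applied before the diff, the index tests could in principle keep their existing expected-output files.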
Most of the tests that you listed above fall into one of these categories.
Please let me know if you find any that don't and I'll look at them more
closely.
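To help sort tests into the first category, a qfile can be scanned for statements that conflict with the one-DB-per-test scheme. This is only a sketch under stated assumptions: the function name and patterns are hypothetical, it only looks at statements that start a line, and it does not attempt to detect fully qualified object names.

```python
import re

# Hypothetical patterns for statements that break per-test schema isolation.
_FLAGS = re.IGNORECASE | re.MULTILINE
_PATTERNS = {
    "creates a database/schema": re.compile(r"^\s*CREATE\s+(DATABASE|SCHEMA)\b", _FLAGS),
    "switches database with USE": re.compile(r"^\s*USE\s+\w+", _FLAGS),
    "lists databases": re.compile(r"^\s*SHOW\s+(DATABASES|SCHEMAS)\b", _FLAGS),
}


def isolation_conflicts(qfile_text: str) -> list:
    """Return the reasons (if any) a qfile is unsafe to run in its own DB."""
    # Drop '--' comment lines so commented-out statements are not flagged.
    lines = [ln for ln in qfile_text.splitlines()
             if not ln.lstrip().startswith("--")]
    body = "\n".join(lines)
    return [reason for reason, pat in _PATTERNS.items() if pat.search(body)]


if __name__ == "__main__":
    sample = "CREATE DATABASE db1;\nUSE db1;\nSELECT 1;"
    for reason in isolation_conflicts(sample):
        print(reason)
```

Any qfile for which this returns a non-empty list would be a candidate for the test.beeline.positive.exclude list; an empty result does not prove a test is safe.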
A separate but related matter is that if we commit this patch with
TestBeeLineDriver enabled, the overall time to run all tests will roughly
double from ~4hrs to ~8hrs. My preference is to deprecate TestCliDriver in
favor of TestBeeLineDriver, but I can't make that decision on my own.
> Implement HiveServer2
> ---------------------
>
> Key: HIVE-2935
> URL: https://issues.apache.org/jira/browse/HIVE-2935
> Project: Hive
> Issue Type: New Feature
> Components: Server Infrastructure
> Reporter: Carl Steinbach
> Assignee: Carl Steinbach
> Labels: HiveServer2
> Attachments: beelinepositive.tar.gz, HIVE-2935.1.notest.patch.txt,
> HIVE-2935.2.notest.patch.txt, HIVE-2935.2.nothrift.patch.txt
>
>