Joe McDonnell created IMPALA-14258:
--------------------------------------
Summary: TestAcidRowValidation.test_row_validation fails with tuple caching
Key: IMPALA-14258
URL: https://issues.apache.org/jira/browse/IMPALA-14258
Project: IMPALA
Issue Type: Bug
Components: Frontend
Affects Versions: Impala 5.0.0
Reporter: Joe McDonnell
When running tuple caching with correctness checking,
TestAcidRowValidation.test_row_validation fails with a correctness issue:
{noformat}
query_test/test_acid_row_validation.py:74: in test_row_validation
    self.run_test_case('QueryTest/acid-row-validation-2', vector, use_db=unique_database)
common/impala_test_suite.py:886: in run_test_case
    result = exec_fn(query, user=test_section.get('USER', '').strip() or None)
common/impala_test_suite.py:816: in __exec_in_impala
    result = self.__execute_query(target_impalad_client, query, user=user)
common/impala_test_suite.py:1294: in __execute_query
    return impalad_client.execute(query, user=user)
common/impala_connection.py:692: in execute
    fetch_exec_summary=fetch_exec_summary, profile_format=profile_format)
common/impala_connection.py:705: in __fetch_results_and_profile
    profile_format=profile_format)
common/impala_connection.py:868: in __fetch_results
    result_tuples = cursor.fetchall()
/data/jenkins/workspace/impala-private-basic-parameterized/repos/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/impala/hiveserver2.py:624: in fetchall
    elements = self._pop_from_buffer(self.buffersize)
/data/jenkins/workspace/impala-private-basic-parameterized/repos/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/impala/hiveserver2.py:701: in _pop_from_buffer
    self._ensure_buffer_is_filled()
/data/jenkins/workspace/impala-private-basic-parameterized/repos/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/impala/hiveserver2.py:683: in _ensure_buffer_is_filled
    convert_strings_to_unicode=self.convert_strings_to_unicode)
/data/jenkins/workspace/impala-private-basic-parameterized/repos/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/impala/hiveserver2.py:1506: in fetch
    resp = self._rpc('FetchResults', req, False)
/data/jenkins/workspace/impala-private-basic-parameterized/repos/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/impala/hiveserver2.py:1181: in _rpc
    err_if_rpc_not_ok(response)
/data/jenkins/workspace/impala-private-basic-parameterized/repos/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/impala/hiveserver2.py:867: in err_if_rpc_not_ok
    raise HiveServer2Error(resp.status.errorMessage)
E HiveServer2Error: Query 82429bc4256e150b:567c70e400000000 failed:
E Inconsistent tuple cache found: Result '[("a5" "b6")]' of file '/data/jenkins/workspace/tmp/impala-tuplecache-debugdump-0/tuple-cache-debug-dump/3c74c05ecd7a27da552d80ec1c68f446_3279139167/82429bc4256e150b:567c70e400000001_2.bad' doesn't exist in the reference file: '/data/jenkins/workspace/tmp/impala-tuplecache-debugdump-0/tuple-cache-debug-dump/3c74c05ecd7a27da552d80ec1c68f446_3279139167/82429bc4256e150b:567c70e400000001_2_fa45023fc3594f7b:14f9671700000001_2_ref.bad'.
{noformat}
Full ACID tables have a validWriteIdList that can change the results for a
table even when the underlying files do not change. Tuple caching needs to
take the validWriteIdList into account.
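
To illustrate the failure mode, here is a minimal sketch of why a cache key built only from file metadata can return stale rows for a full ACID table. This is not Impala's tuple cache code: the cache_key helper, the file tuples, and the write ID strings are all hypothetical, and the sketch only assumes that the key is a hash over whatever inputs are fed to it.

```python
import hashlib


def cache_key(file_descriptors, valid_write_id_list=None):
    """Hash scan inputs into a hex key (hypothetical helper, not Impala's API).

    file_descriptors: iterable of (path, mtime, length) tuples.
    valid_write_id_list: optional string describing which writes are visible.
    """
    h = hashlib.sha256()
    # Sort so the key does not depend on the order files were listed in.
    for path, mtime, length in sorted(file_descriptors):
        h.update("{}|{}|{}\n".format(path, mtime, length).encode("utf-8"))
    if valid_write_id_list is not None:
        h.update(b"writeids:" + valid_write_id_list.encode("utf-8"))
    return h.hexdigest()


# The same files on disk, seen under two different sets of visible writes.
# Paths and write ID strings are made up for illustration only.
files = [
    ("/warehouse/t/delta_1/bucket_0", 1700000000, 512),
    ("/warehouse/t/delta_2/bucket_0", 1700000100, 498),
]
snapshot_a = "high_watermark=1"
snapshot_b = "high_watermark=2"

# Keyed on files alone: both snapshots collide, so a cached result computed
# under one snapshot would be served for the other.
files_only_a = cache_key(files)
files_only_b = cache_key(files)

# Keyed on files plus the write ID list: the snapshots get distinct keys.
with_ids_a = cache_key(files, snapshot_a)
with_ids_b = cache_key(files, snapshot_b)

print(files_only_a == files_only_b)
print(with_ids_a != with_ids_b)
```

Folding the write ID list into the key is only one possible approach; the issue does not say how the fix should be implemented.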