Hi Amos, I tried to see if IMPALA-3643 <https://issues.cloudera.org/browse/IMPALA-3643> is still reproducible for me. In the process I also ran into the "NoClassDefFoundError" error you saw and found out that this happens if my local Hadoop services (./testdata/bin/run-all.sh) aren't running.
Once I have the local Hadoop services running, I'm able to repro the issue in IMPALA-3643. The only thing I remember doing differently from the recommended dev-machine setup is using Java 8. The Java 8 docs on LinkedHashSet <https://docs.oracle.com/javase/8/docs/api/java/util/LinkedHashSet.html> don't read like they'd change the ordering; however, they don't say that the ordering has to be stable *between different versions of Java*. I will try to confirm that this issue does not repro with Java 7 when I have time.

Can you check the Java version you're running? Mine is this:

$ $JAVA_HOME/bin/java -version
java version "1.8.0_101"
Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)

Alex, could a difference in the implementation between Java 7 and 8 be a reasonable explanation for this?

Cheers,
Lars

On Thu, Sep 22, 2016 at 12:52 AM, Amos Bird <amosb...@gmail.com> wrote: > > Thank you for helping me out :D > > > Thanks for keeping this discussion going. > > > > On Sun, Sep 18, 2016 at 8:13 AM, Amos Bird <amosb...@gmail.com> wrote: > > > >> > >> > On Fri, Sep 16, 2016 at 9:06 PM, Amos Bird <amosb...@gmail.com> > wrote: > >> > > >> >> > >> >> Hi there, > >> >> > >> >> I followed the wiki > >> >> https://cwiki.apache.org/confluence/display/IMPALA/How+ > >> >> to+load+and+run+Impala+tests > >> >> carefully but still have some problems in my local env. > >> >> > >> >> 1. I need to manually execute "hdfs dfs -mkdir > >> /test-warehouse/emptytable" > >> >> to get rid of some fe test error. > >> >> > >> >> > >> > Ideally, you should not have to do this. Could you tell me what errors > >> you > >> > encountered? Sounds like there may be a test or data loading bug we > >> should > >> > fix. > >> > >> The error is : > >> > >> TestLoadData(com.cloudera.impala.analysis.AnalyzeStmtsTest) Time > >> elapsed: 0.033 sec <<< FAILURE! 
> >> java.lang.AssertionError: got error: > >> INPATH location 'hdfs://localhost:20500/test-warehouse/emptytable' does > >> not exist. > >> expected: > >> INPATH location 'hdfs://localhost:20500/test-warehouse/emptytable' > >> contains no visible files. > >> at org.junit.Assert.fail(Assert.java:88) > >> at org.junit.Assert.assertTrue(Assert.java:41) > >> at com.cloudera.impala.common.FrontendTestBase.AnalysisError( > >> FrontendTestBase.java:312) > >> at com.cloudera.impala.common.FrontendTestBase.AnalysisError( > >> FrontendTestBase.java:292) > >> at com.cloudera.impala.analysis.AnalyzeStmtsTest.TestLoadData( > >> AnalyzeStmtsTest.java:2860) > >> > >> > > Do you have a table functional.emptytable? If yes, then what location is > > reported in "show create table"? > Query: show create table functional.emptytable > +-------------------------------------------------------------+ > | result | > +-------------------------------------------------------------+ > | CREATE EXTERNAL TABLE functional.emptytable ( | > | field STRING | > | ) | > | PARTITIONED BY ( | > | f2 INT | > | ) | > | STORED AS TEXTFILE | > | LOCATION 'hdfs://localhost:20500/test-warehouse/emptytable' | > | TBLPROPERTIES ('transient_lastDdlTime'='1464782625') | > +-------------------------------------------------------------+ > Fetched 1 row(s) in 5.51s > > > Does the directory exist in HDFS? > No. > > > > > You could try to manually reload the table and see if the directory is > > created: > > bin/load-data.py -f -w functional-query --table_names=emptytable > > --table_formats=text/none > After executing this command the directory appears. > > > > >>> > >> >> 2. I have authz-policy.ini in HDFS, but I still get authorization > >> errors. > >> >> > >> >> TestSelect[0](com.cloudera.impala.analysis.AuthorizationTest) Time > >> >> elapsed: 0.333 sec <<< FAILURE! 
> >> >> java.lang.AssertionError: got error: > >> >> User 'amos' does not have privileges to execute 'SELECT' on: > >> default.nodb > >> >> expected: > >> >> User 'amos' does not have privileges to execute 'SELECT' on: > >> nodb.alltypes > >> >> at org.junit.Assert.fail(Assert.java:88) > >> >> at org.junit.Assert.assertTrue(Assert.java:41) > >> >> at com.cloudera.impala.analysis.AuthorizationTest.AuthzError( > >> >> AuthorizationTest.java:2220) > >> >> at com.cloudera.impala.analysis.AuthorizationTest.AuthzError( > >> >> AuthorizationTest.java:2203) > >> >> at com.cloudera.impala.analysis.AuthorizationTest.AuthzError( > >> >> AuthorizationTest.java:2197) > >> >> at com.cloudera.impala.analysis.AuthorizationTest.TestSelect( > >> >> AuthorizationTest.java:512) > >> >> > >> >> TestSelect[1](com.cloudera.impala.analysis.AuthorizationTest) Time > >> >> elapsed: 0.324 sec <<< FAILURE! > >> >> java.lang.AssertionError: got error: > >> >> User 'amos' does not have privileges to execute 'SELECT' on: > >> default.nodb > >> >> expected: > >> >> User 'amos' does not have privileges to execute 'SELECT' on: > >> nodb.alltypes > >> >> at org.junit.Assert.fail(Assert.java:88) > >> >> at org.junit.Assert.assertTrue(Assert.java:41) > >> >> at com.cloudera.impala.analysis.AuthorizationTest.AuthzError( > >> >> AuthorizationTest.java:2220) > >> >> at com.cloudera.impala.analysis.AuthorizationTest.AuthzError( > >> >> AuthorizationTest.java:2203) > >> >> at com.cloudera.impala.analysis.AuthorizationTest.AuthzError( > >> >> AuthorizationTest.java:2197) > >> >> at com.cloudera.impala.analysis.AuthorizationTest.TestSelect( > >> >> AuthorizationTest.java:512) > >> >> > >> >> > >> >> Results : > >> >> > >> >> Failed tests: > >> >> AuthorizationTest.TestSelect:512->AuthzError:2197-> > >> >> AuthzError:2203->AuthzError:2220 got error: > >> >> User 'amos' does not have privileges to execute 'SELECT' on: > >> default.nodb > >> >> expected: > >> >> User 'amos' does not have privileges to execute 
'SELECT' on: > >> nodb.alltypes > >> >> AuthorizationTest.TestSelect:512->AuthzError:2197-> > >> >> AuthzError:2203->AuthzError:2220 got error: > >> >> User 'amos' does not have privileges to execute 'SELECT' on: > >> default.nodb > >> >> expected: > >> >> User 'amos' does not have privileges to execute 'SELECT' on: > >> nodb.alltypes > >> >> > >> >> > >> >> > >> > Strange. In this test, we register two authorization requests, and it > >> seems > >> > like those are not checked in the expected order. However, that should > >> not > >> > be possible because we store them in a LinkedHashSet. > >> > Could you dig into this a little further to see if you can figure out > why > >> > the order is wrong? > >> > > >> > This is where we register the authorization requests: > >> > https://github.com/cloudera/Impala/blob/cdh5-trunk/fe/src/ > >> main/java/com/cloudera/impala/analysis/Analyzer.java#L544 > >> > > >> > This is where we check the authorization requests: > >> > https://github.com/cloudera/Impala/blob/cdh5-trunk/fe/src/ > >> main/java/com/cloudera/impala/analysis/AnalysisContext.java#L391 > >> > > >> > > >> > >> I tried directly executing "select 1 from nodb.alltypes" in > >> impala-shell, leading to this error: > >> ERROR: AnalysisException: Could not resolve table reference: > >> 'nodb.alltypes' > >> > >> How can I reproduce the authorization tests in impala-shell so I can > >> debug it? > >> > >> > >> > > FYI, this is actually a known issue and may have something to do with the > > JRE version you are running: https://issues.cloudera.org/ > browse/IMPALA-3643 > > As far as I can tell the bug should be "impossible" because we use a > > LinkedHashSet, but maybe certain JREs do not properly honor the > guarantees. > > > > The AuthorizationTests in particular require a non-trivial setup, so I'd > > not recommend trying to debug via the Impala shell. 
> > > > I'd recommend debugging in one of these ways: > > - Run the test manually via "mvn test -Dtest=AuthorizationTest" from the > FE > > directory. Attach debugger and break in TestSelect(). > > - Run the JUnit test from an IDE such as Eclipse and then debug the test. > > I'm afraid there is no easy way to just run that single query in our > > current test setup. You will need to run the whole suite, but you can > break > > TestSelect() or hack the code in various places to set useful > breakpoints. > > > > Hope that helps. > > I tried jdk1.8.0_102. Compilation works just fine but AuthorizationTest > fails with ClassNoDefine. > Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.008 sec > <<< FAILURE! - in com.cloudera.impala.analysis.AuthorizationTest > initializationError(com.cloudera.impala.analysis.AuthorizationTest) Time > elapsed: 0.006 sec <<< ERROR! > java.lang.NoClassDefFoundError: Could not initialize class > com.cloudera.impala.analysis.AuthorizationTest > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at sun.reflect.NativeMethodAccessorImpl.invoke( > NativeMethodAccessorImpl.java:62) > at sun.reflect.DelegatingMethodAccessorImpl.invoke( > DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall( > FrameworkMethod.java:50) > at org.junit.internal.runners.model.ReflectiveCallable.run( > ReflectiveCallable.java:12) > at org.junit.runners.model.FrameworkMethod.invokeExplosively( > FrameworkMethod.java:47) > at org.junit.runners.Parameterized.allParameters(Parameterized.java:280) > at org.junit.runners.Parameterized.<init>(Parameterized.java:248) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at sun.reflect.NativeConstructorAccessorImpl.newInstance( > NativeConstructorAccessorImpl.java:62) > at sun.reflect.DelegatingConstructorAccessorImpl.newInstance( > DelegatingConstructorAccessorImpl.java:45) 
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at org.junit.internal.builders.AnnotatedBuilder.buildRunner( > AnnotatedBuilder.java:104) > at org.junit.internal.builders.AnnotatedBuilder.runnerForClass( > AnnotatedBuilder.java:86) > at org.junit.runners.model.RunnerBuilder.safeRunnerForClass( > RunnerBuilder.java:59) > at org.junit.internal.builders.AllDefaultPossibilitiesBuilder > .runnerForClass(AllDefaultPossibilitiesBuilder.java:26) > at org.junit.runners.model.RunnerBuilder.safeRunnerForClass( > RunnerBuilder.java:59) > at org.junit.internal.requests.ClassRequest.getRunner( > ClassRequest.java:33) > at org.apache.maven.surefire.junit4.JUnit4Provider.execute( > JUnit4Provider.java:283) > at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun( > JUnit4Provider.java:173) > at org.apache.maven.surefire.junit4.JUnit4Provider. > executeTestSet(JUnit4Provider.java:153) > at org.apache.maven.surefire.junit4.JUnit4Provider.invoke( > JUnit4Provider.java:128) > at org.apache.maven.surefire.booter.ForkedBooter. > invokeProviderInSameClassLoader(ForkedBooter.java:203) > at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess( > ForkedBooter.java:155) > at org.apache.maven.surefire.booter.ForkedBooter.main( > ForkedBooter.java:103) > > > > >> > >> >> > >> >> > >> >> 3. For end-to-end tests, I encountered two kinds of errors > >> >> > >> >> a) connection refused. 
> >> >> > >> >> SET sync_ddl=False; > >> >> -- executing against localhost:21000 > >> >> DROP DATABASE `test_drop_cleans_hdfs_dirs_fdfd4f8` CASCADE; > >> >> > >> >> ___________________ ERROR at setup of TestLoadData.test_load[exec_ > >> option: > >> >> {'disable_codegen': False, 'abort_on_error': 1, > 'exec_single_node_rows_ > >> threshold': > >> >> 0, 'batch_size': 0, 'num_nodes': 0} | table_format: text/none] > >> >> ___________________ > >> >> [gw5] linux2 -- Python 2.6.6 /home/amos/incubator-impala/ > >> >> bin/../infra/python/env/bin/python > >> >> metadata/test_load.py:77: in setup_method > >> >> "{0}/{1}/100101.txt".format(STAGING_PATH, i)) > >> >> util/hdfs_util.py:122: in copy > >> >> data = self.read_file(src) > >> >> ../infra/python/env/lib/python2.6/site-packages/ > >> pywebhdfs/webhdfs.py:183: > >> >> in read_file > >> >> response = requests.get(uri, allow_redirects=True) > >> >> ../infra/python/env/lib/python2.6/site-packages/requests/api.py:69: > in > >> get > >> >> return request('get', url, params=params, **kwargs) > >> >> ../infra/python/env/lib/python2.6/site-packages/requests/api.py:50: > in > >> >> request > >> >> response = session.request(method=method, url=url, **kwargs) > >> >> ../infra/python/env/lib/python2.6/site-packages/ > >> requests/sessions.py:465: > >> >> in request > >> >> resp = self.send(prep, **send_kwargs) > >> >> ../infra/python/env/lib/python2.6/site-packages/ > >> requests/sessions.py:594: > >> >> in send > >> >> history = [resp for resp in gen] if allow_redirects else [] > >> >> ../infra/python/env/lib/python2.6/site-packages/ > >> requests/sessions.py:196: > >> >> in resolve_redirects > >> >> **adapter_kwargs > >> >> ../infra/python/env/lib/python2.6/site-packages/ > >> requests/sessions.py:573: > >> >> in send > >> >> r = adapter.send(request, **kwargs) > >> >> ../infra/python/env/lib/python2.6/site-packages/ > >> requests/adapters.py:415: > >> >> in send > >> >> raise ConnectionError(err, request=request) > >> >> E 
ConnectionError: ('Connection aborted.', error(111, 'Connection > >> >> refused')) > >> >> > >> >> > >> > The connection refused issue is very bizarre. One thing that I > noticed is > >> > that your Python does not seem to match what we use (Python 2.7.3). > >> > Could you re-run infra/python/bootstrap_virtualenv.py and see if you > get > >> > the expected version into infra/python/env/local/bin? > >> > > >> > Alternatively, maybe there's a problem with your /etc/hosts? You can > try > >> > searching online for WebHdfs and /etc/hosts > >> > > >> > >> well, I find this 'find_py26.py' file under deps. Is it normal? > >> > > > > Yes, that's normal. That file looks for Python 2.6 on your system but should > > not be relevant for running tests because we use the Python from our > > virtualenv and not the one on your system. > > > > What's your output when you run "impala-python --version"? You should get > > "Python 2.7.3". > > [amos@t450s tests]$ impala-python --version > Python 2.6.6 > > > Also, what's the Python version on your system? Our virtualenv will use > > Python 2.6 if your system has a Python < 2.6. > > You could try to upgrade your system Python and then > > re-run infra/python/bootstrap_virtualenv.py > > My system has python 2.6.6. > > > > > Still, theoretically Python 2.6 in the virtual env should work. I think > > it's more likely you are having a connection problem due to a > misconfigured > > /etc/hosts. > > > > Are you running the test from a shell that has bin/impala-config.sh and > > bin/set-classpath.sh sourced? > Yes. > > > > > To further debug this you could try to specify your namenode address when > > running the test to see whether it is somehow picking up a wrong address: > > cd tests > > ./run-tests.py metadata/test_load.py --namenode_http_address= > localhost:50070 > > > > And see if that works. > > Unfortunately, no. 
> > > > > > > > >> [amos@nobida143 incubator-impala]$ ls infra/python/deps/ > >> download_requirements find_py26.py pip_download.py requirements.txt > >> [amos@nobida143 incubator-impala]$ cat infra/python/deps/download_ > >> requirements > >> #!/bin/bash > >> > >> # Licensed to the Apache Software Foundation (ASF) under one > >> # or more contributor license agreements. See the NOTICE file > >> # distributed with this work for additional information > >> # regarding copyright ownership. The ASF licenses this file > >> # to you under the Apache License, Version 2.0 (the > >> # "License"); you may not use this file except in compliance > >> # with the License. You may obtain a copy of the License at > >> # > >> # http://www.apache.org/licenses/LICENSE-2.0 > >> # > >> # Unless required by applicable law or agreed to in writing, > >> # software distributed under the License is distributed on an > >> # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY > >> # KIND, either express or implied. See the License for the > >> # specific language governing permissions and limitations > >> # under the License. > >> > >> set -euo pipefail > >> > >> DIR="$(dirname "$0")" > >> > >> pushd "$DIR" > >> PY26="$(./find_py26.py)" > >> # Directly download packages listed in requirements.txt, but don't > install > >> them. > >> "$PY26" pip_download.py > >> # For virtualenv, other scripts rely on the .tar.gz package (not a .whl > >> package). > >> "$PY26" pip_download.py virtualenv 13.1.0 > >> # kudu-python is downloaded separately because pip install attempts to > >> execute a > >> # setup.py subcommand for kudu-python that can fail even if the download > >> succeeds. 
> >> "$PY26" pip_download.py kudu-python 0.1.1 > >> popd > >> > >> > >> > >> > b) stats not match > >> >> > >> >> [gw4] linux2 -- Python 2.6.6 /home/amos/incubator-impala/ > >> >> bin/../infra/python/env/bin/python > >> >> metadata/test_metadata_query_statements.py:67: in test_show_stats > >> >> self.run_test_case('QueryTest/show-stats', vector, "functional") > >> >> common/impala_test_suite.py:342: in run_test_case > >> >> self.__verify_results_and_errors(vector, test_section, result, > >> use_db) > >> >> common/impala_test_suite.py:234: in __verify_results_and_errors > >> >> replace_filenames_with_placeholder) > >> >> common/test_result_verifier.py:398: in verify_raw_results > >> >> VERIFIER_MAP[verifier](expected, actual) > >> >> common/test_result_verifier.py:231: in verify_query_result_is_equal > >> >> assert expected_results == actual_results > >> >> > >> >> ... > >> >> > >> >> -- executing against localhost:21000 > >> >> show column stats alltypes_clone; > >> >> > >> >> MainThread: Comparing QueryTestResults (expected vs actual): > >> >> 'bigint_col','BIGINT',10,-1,8,8 == 'bigint_col','BIGINT',10,-1,8,8 > >> >> 'bool_col','BOOLEAN',2,-1,1,1 == 'bool_col','BOOLEAN',2,-1,1,1 > >> >> 'date_string_col','STRING',736,-1,8,8 == 'date_string_col','STRING', > >> >> 736,-1,8,8 > >> >> 'double_col','DOUBLE',-1,-1,8,8 == 'double_col','DOUBLE',-1,-1,8,8 > >> >> 'float_col','FLOAT',10,-1,4,4 == 'float_col','FLOAT',10,-1,4,4 > >> >> 'id','INT',7505,-1,4,4 == 'id','INT',7505,-1,4,4 > >> >> 'int_col','INT',-1,-1,4,4 == 'int_col','INT',-1,-1,4,4 > >> >> 'month','INT',12,0,4,4 == 'month','INT',12,0,4,4 > >> >> 'smallint_col','SMALLINT',10,-1,2,2 == > 'smallint_col','SMALLINT',10,- > >> 1,2,2 > >> >> 'string_col','STRING',10,-1,-1,-1 == 'string_col','STRING',10,-1,- > 1,-1 > >> >> 'timestamp_col','TIMESTAMP',7554,-1,16,16 != > >> 'timestamp_col','TIMESTAMP', > >> >> 7552,-1,16,16 > >> >> 'tinyint_col','TINYINT',10,-1,1,1 == 'tinyint_col','TINYINT',10,-1, > 1,1 > >> >> 
'year','INT',2,0,4,4 == 'year','INT',2,0,4,4 > >> >> > >> >> > >> >> Very strange. Can you do a compute stats on functional.alltypes and > >> > confirm that the NDV for timestamp_col are 7552 in your setup? > >> > >> Yes. > >> > > > > I'll need to ask around for help. I have no idea why this is happening. > > Thanks :) > > > > >> > >> > > >> > > >> > > >> >> I'm using CentOS 6.8 final. I have no idea what goes wrong. Any help > is > >> >> much appreciated! > >> > > >> > > >> > > >> > > >> >> > >> >> Best regards, > >> >> Amos > >> >> > >> > >> > >
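
For anyone who wants to check the LinkedHashSet question from this thread on their own JRE, here is a minimal standalone sketch. It is not Impala code: the class name, the helper method, and the use of plain strings in place of the real privilege request objects are all assumptions made for illustration, and the order in which the two names are added is arbitrary. It only shows the documented behaviour that iteration follows insertion order and that re-adding an existing element does not move it.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class LinkedHashSetOrderDemo {
  // Hypothetical stand-in for "registering" requests; plain strings are used
  // here instead of Impala's actual request objects.
  static List<String> registrationOrder(String... requests) {
    Set<String> registered = new LinkedHashSet<>();
    for (String request : requests) {
      // add() returns false for a duplicate and leaves the existing position untouched.
      registered.add(request);
    }
    // Copy into a list so the iteration order can be inspected or compared.
    return new ArrayList<>(registered);
  }

  public static void main(String[] args) {
    // The two names from the failing test, in an order chosen only for this demo.
    List<String> order =
        registrationOrder("nodb.alltypes", "default.nodb", "nodb.alltypes");
    // Print the JRE version next to the observed order for easy comparison.
    System.out.println(System.getProperty("java.version") + " -> " + order);
  }
}
```

Running this under both Java 7 and Java 8 and comparing the printed order may help narrow things down. If both print the same order, that would suggest the difference comes from somewhere other than LinkedHashSet itself, for example the order in which the requests get registered, though this sketch cannot confirm that.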