Hi Quanlong,

It looks like you're missing the TPC-H data. In older versions of Impala you had to generate the data manually and put it in that directory (testdata/impala-data). We've automated that in more recent versions (I think for about a year now). If you can switch to a newer version, this will just work. Data loading is also a lot more reliable now.
Otherwise, this is the script that generates the data. You can probably copy it into your repository and run it by hand:
https://github.com/apache/incubator-impala/blob/master/testdata/datasets/tpch/preload

You will also need to do the same for TPC-DS:
https://github.com/apache/incubator-impala/blob/master/testdata/datasets/tpcds/preload

Cheers,
Tim

On Thu, Jun 1, 2017 at 12:54 AM, 黄权隆 <huang_quanl...@126.com> wrote:

> Hi friends,
>
> I'm trying to run the Impala tests. I followed the wiki page 'How to
> load and run Impala tests'.
> Although I just want to run some end-to-end tests, I know I should load
> the test data first. So I ran
>
>     ./buildall.sh -noclean -testdata
>
> It loaded the functional test data successfully, but failed to load the
> tpch data set. Here are some related logs:
>
> /home/CORP/quanlong.huang/workspace/Impala-cdh5.7.3-release/testdata/target
> SUCCESS, data generated into /home/CORP/quanlong.huang/workspace/Impala-cdh5.7.3-release/testdata/target
> Loading Hive Builtins (logging to load-hive-builtins.log)... OK
> Generating HBase data (logging to create-hbase.log)... OK
> Creating /test-warehouse HDFS directory (logging to create-test-warehouse-dir.log)... OK
> Starting Impala cluster (logging to start-impala-cluster.log)... OK
> Setting up HDFS environment (logging to setup-hdfs-env.log)... OK
> Loading custom schemas (logging to load-custom-schemas.log)... OK
> Loading functional-query data (logging to load-functional-query.log)... OK
> Loading TPC-H data (logging to load-tpch.log)... FAILED
> 'load-data tpch core' failed. Tail of log:
> Log for command 'load-data tpch core'
> Loading workload 'tpch' Using exploration strategy 'core'. Logging to /home/CORP/quanlong.huang/workspace/Impala-cdh5.7.3-release/cluster_logs/data_loading/data-load-tpch-core.log
> Error loading data.
> The end of the log file is:
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>     at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:23 Invalid path ''/home/CORP/quanlong.huang/workspace/Impala-cdh5.7.3-release/testdata/impala-data/tpch/lineitem'': No files matching path file:/home/CORP/quanlong.huang/workspace/Impala-cdh5.7.3-release/testdata/impala-data/tpch/lineitem
>     at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.applyConstraints(LoadSemanticAnalyzer.java:139)
>     at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.analyzeInternal(LoadSemanticAnalyzer.java:230)
>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:445)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:311)
>     at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1189)
>     at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1176)
>     at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:134)
>     ... 26 more
>
> Closing: 0: jdbc:hive2://localhost:11050/default;auth=none
> Error executing file from Hive: load-tpch-core-hive-generated.sql
> Error in /home/CORP/quanlong.huang/workspace/Impala-cdh5.7.3-release/testdata/bin/create-load-data.sh at line 41: while [ -n "$*" ]
> Error in ./buildall.sh at line 368: ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS} <<< Y
>
> I'm using version cdh5.7.3-release. The directory
> ${IMPALA_HOME}/testdata/impala-data does not exist.
>
> Could you tell me how to generate this data set? Or where can I download
> the snapshot file of test-warehouse so I can skip this step?
>
> Thanks
> ----
> Quanlong
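
The failure in this thread comes down to Hive finding no files under testdata/impala-data/tpch/lineitem. As a rough sketch only, a check like the one below could be run before ./buildall.sh -testdata to confirm that the generated data is in place. The function name check_data_dirs is hypothetical and not part of the Impala tree; the only directory taken from the log above is tpch/lineitem, so any other table names passed to it are up to the caller.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Impala): report which expected data
# directories under a base directory are missing or empty.
# Usage: check_data_dirs BASE_DIR TABLE [TABLE...]
# Returns 0 if every directory exists and is non-empty, 1 otherwise.
check_data_dirs() {
  local base_dir="$1"
  shift
  local missing=0
  local table dir
  for table in "$@"; do
    dir="${base_dir}/${table}"
    # A directory counts as populated only if it exists and has at least one entry.
    if [ -d "${dir}" ] && [ -n "$(ls -A "${dir}" 2>/dev/null)" ]; then
      echo "OK       ${dir}"
    else
      echo "MISSING  ${dir}"
      missing=1
    fi
  done
  return "${missing}"
}
```

For the case above, this could be called as `check_data_dirs "${IMPALA_HOME}/testdata/impala-data/tpch" lineitem`. A non-zero exit status means the data still has to be generated (for example with the preload script Tim links to) before the load step can succeed.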