[ https://issues.apache.org/jira/browse/IMPALA-11593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609619#comment-17609619 ]
Joe McDonnell commented on IMPALA-11593:
----------------------------------------

This seems to impact various tests, so making this a more generic JIRA.

> Disk I/O error with NullPointerException from libhdfs in S3 builds
> ------------------------------------------------------------------
>
>                 Key: IMPALA-11593
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11593
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>              Labels: broken-build
>
> Saw this failure on an S3 build:
> {noformat}
> custom_cluster/test_mem_reservations.py:102: in test_per_backend_min_reservation
>     assert t.error is None
> E   assert 'ImpalaBeeswaxException:\n Query aborted:Disk I/O error on impala-ec2-centos79-m6i-4xlarge-ondemand-1db1.vpc.cloudera....warehouse/alltypes/year=2009/month=9/090901.txt\nError(255): Unknown error 255\nRoot cause: NullPointerException: \n\n' is None
> E    + where 'ImpalaBeeswaxException:\n Query aborted:Disk I/O error on impala-ec2-centos79-m6i-4xlarge-ondemand-1db1.vpc.cloudera....warehouse/alltypes/year=2009/month=9/090901.txt\nError(255): Unknown error 255\nRoot cause: NullPointerException: \n\n' = <QuerySubmitThread(Thread-165, stopped 140272709113600)>.error
> {noformat}
> Impalad logs for the query:
> {noformat}
> I0915 03:12:33.839942 21677 impala-server.cc:1333] 09439d05a2468038:3816f0f200000000] Registered query query_id=09439d05a2468038:3816f0f200000000 session_id=874c5100c59607af:a86e04c8f62bb9a9
> I0915 03:12:33.889168 21677 Frontend.java:1628] 09439d05a2468038:3816f0f200000000] Analyzing query: select max(t.c1), avg(t.c2), min(t.c3), avg(c4), avg(c5), avg(c6)
> from (select
>   max(tinyint_col) over (order by int_col) c1,
>   avg(tinyint_col) over (order by smallint_col) c2,
>   min(tinyint_col) over (order by smallint_col desc) c3,
>   rank() over (order by int_col desc) c4,
>   dense_rank() over (order by bigint_col) c5,
>   first_value(tinyint_col) over (order by bigint_col desc) c6
> from functional.alltypes) t; db: default
> I0915 03:12:33.981251 21677 FeSupport.java:315] 09439d05a2468038:3816f0f200000000] Requesting prioritized load of table(s): functional.alltypes
> I0915 03:12:33.986737 21677 thrift-util.cc:99] 09439d05a2468038:3816f0f200000000] TSocket::open() connect() <Host: localhost Port: 26000>: Connection refused
> I0915 03:12:34.582643 21677 BaseAuthorizationChecker.java:113] 09439d05a2468038:3816f0f200000000] Authorization check took 693 ms
> I0915 03:12:34.582674 21677 Frontend.java:1671] 09439d05a2468038:3816f0f200000000] Analysis and authorization finished.
> I0915 03:12:34.723712 21208 control-service.cc:148] 4a4ebd3b7575254c:eb71cd8000000000] ExecQueryFInstances(): query_id=4a4ebd3b7575254c:eb71cd8000000000 coord=impala-ec2-centos79-m6i-4xlarge-ondemand-1db1.vpc.cloudera.com:27000 #instances=1
> I0915 03:12:34.738032 21758 query-state.cc:942] 4a4ebd3b7575254c:eb71cd8000000002] Executing instance. instance_id=4a4ebd3b7575254c:eb71cd8000000002 fragment_idx=1 per_fragment_instance_idx=1 coord_state_idx=1 #in-flight=1
> I0915 03:12:34.850791 21820 admission-controller.cc:1819] 09439d05a2468038:3816f0f200000000] Trying to admit id=09439d05a2468038:3816f0f200000000 in pool_name=default-pool executor_group_name=default per_host_mem_estimate=1.34 GB dedicated_coord_mem_estimate=1.10 GB max_requests=-1 max_queued=200 max_mem=-1.00 B
> I0915 03:12:34.850811 21820 admission-controller.cc:1827] 09439d05a2468038:3816f0f200000000] Stats: agg_num_running=1, agg_num_queued=0, agg_mem_reserved=1.56 GB, local_host(local_mem_admitted=0, num_admitted_running=0, num_queued=0, backend_mem_reserved=192.46 MB, topN_query_stats: queries=[4a4ebd3b7575254c:eb71cd8000000000], total_mem_consumed=192.46 MB, fraction_of_pool_total_mem=1; pool_level_stats: num_running=1, min=192.46 MB, max=192.46 MB, pool_total_mem=192.46 MB, average_per_query=192.46 MB)
> I0915 03:12:34.850852 21820 admission-controller.cc:1218] 09439d05a2468038:3816f0f200000000] Admitting query id=09439d05a2468038:3816f0f200000000
> I0915 03:12:34.850939 21820 impala-server.cc:2159] 09439d05a2468038:3816f0f200000000] Registering query locations
> I0915 03:12:34.850998 21820 coordinator.cc:150] 09439d05a2468038:3816f0f200000000] Exec() query_id=09439d05a2468038:3816f0f200000000 stmt=select max(t.c1), avg(t.c2), min(t.c3), avg(c4), avg(c5), avg(c6)
> from (select
>   max(tinyint_col) over (order by int_col) c1,
>   avg(tinyint_col) over (order by smallint_col) c2,
>   min(tinyint_col) over (order by smallint_col desc) c3,
>   rank() over (order by int_col desc) c4,
>   dense_rank() over (order by bigint_col) c5,
>   first_value(tinyint_col) over (order by bigint_col desc) c6
> from functional.alltypes) t;
> I0915 03:12:34.851434 21820 coordinator.cc:474] 09439d05a2468038:3816f0f200000000] starting execution on 3 backends for query_id=09439d05a2468038:3816f0f200000000
> I0915 03:12:34.856995 21208 control-service.cc:148] 09439d05a2468038:3816f0f200000000] ExecQueryFInstances(): query_id=09439d05a2468038:3816f0f200000000 coord=impala-ec2-centos79-m6i-4xlarge-ondemand-1db1.vpc.cloudera.com:27001 #instances=2
> I0915 03:12:34.858456 21820 coordinator.cc:533] 09439d05a2468038:3816f0f200000000] started execution on 3 backends for query_id=09439d05a2468038:3816f0f200000000
> I0915 03:12:34.860503 21841 query-state.cc:942] 09439d05a2468038:3816f0f200000002] Executing instance. instance_id=09439d05a2468038:3816f0f200000002 fragment_idx=1 per_fragment_instance_idx=1 coord_state_idx=0 #in-flight=2
> I0915 03:12:34.860591 21843 query-state.cc:942] 09439d05a2468038:3816f0f200000000] Executing instance. instance_id=09439d05a2468038:3816f0f200000000 fragment_idx=0 per_fragment_instance_idx=0 coord_state_idx=0 #in-flight=3
> I0915 03:12:35.057634 21208 coordinator.cc:1032] Backend completed: host=impala-ec2-centos79-m6i-4xlarge-ondemand-1db1.vpc.cloudera.com:27002 remaining=3 query_id=09439d05a2468038:3816f0f200000000
> I0915 03:12:35.057649 21208 coordinator-backend-state.cc:371] query_id=09439d05a2468038:3816f0f200000000: first in-progress backend: impala-ec2-centos79-m6i-4xlarge-ondemand-1db1.vpc.cloudera.com:27001
> I0915 03:12:35.149704 21208 coordinator.cc:1032] Backend completed: host=impala-ec2-centos79-m6i-4xlarge-ondemand-1db1.vpc.cloudera.com:27000 remaining=2 query_id=09439d05a2468038:3816f0f200000000
> I0915 03:12:35.149719 21208 coordinator-backend-state.cc:371] query_id=09439d05a2468038:3816f0f200000000: first in-progress backend: impala-ec2-centos79-m6i-4xlarge-ondemand-1db1.vpc.cloudera.com:27001
> I0915 03:12:35.106189 21377 status.cc:71] Disk I/O error on impala-ec2-centos79-m6i-4xlarge-ondemand-1db1.vpc.cloudera.com:27001: Failed to open HDFS file s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=9/090901.txt
> Error(255): Unknown error 255
> Root cause: NullPointerException:
>     @          0x1f096f4  impala::Status::Status()
>     @          0x1f1dfdb  impala::Status::Status()
>     @          0x2ea28bb  impala::io::OpenHdfsFileOp::Execute()
>     @          0x2ea3884  impala::SynchronousWorkItem::WorkerExecute()
>     @          0x2ea4071  impala::SynchronousThreadPool::Worker()
>     @          0x2ea6a89  boost::detail::function::void_function_invoker2<>::invoke()
>     @          0x2ea6734  boost::function2<>::operator()()
>     @          0x2ea56f1  impala::ThreadPool<>::WorkerThread()
>     @          0x2ea8819  boost::_mfi::mf1<>::operator()()
>     @          0x2ea8645  boost::_bi::list2<>::operator()<>()
>     @          0x2ea82d2  boost::_bi::bind_t<>::operator()()
>     @          0x2ea7eaf  boost::detail::function::void_function_obj_invoker0<>::invoke()
>     @          0x221e4f7  boost::function0<>::operator()()
>     @          0x29a898f  impala::Thread::SuperviseThread()
>     @          0x29b12f0  boost::_bi::list5<>::operator()<>()
>     @          0x29b1214  boost::_bi::bind_t<>::operator()()
>     @          0x29b11d5  boost::detail::thread_data<>::run()
>     @          0x42018b1  thread_proxy
>     @     0x7f4590612ea4  start_thread
>     @     0x7f458d00cb0c  __clone
> I0915 03:12:35.195694 21854 hdfs-scan-node.cc:515] 09439d05a2468038:3816f0f200000002] Scan node (id=0) ran into a parse error for scan range s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=9/090901.txt(0:20179). Processed 0 bytes.
> I0915 03:12:35.196115 21841 query-state.cc:951] 09439d05a2468038:3816f0f200000002] Instance completed.
> instance_id=09439d05a2468038:3816f0f200000002 #in-flight=1 status=DISK_IO_ERROR: Disk I/O error on impala-ec2-centos79-m6i-4xlarge-ondemand-1db1.vpc.cloudera.com:27001: Failed to open HDFS file s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=9/090901.txt
> Error(255): Unknown error 255
> Root cause: NullPointerException:
> I0915 03:12:35.196135 21829 query-state.cc:462] 09439d05a2468038:3816f0f200000000] UpdateBackendExecState(): last report for 09439d05a2468038:3816f0f200000000
> I0915 03:12:35.198632 21208 coordinator.cc:1032] Backend completed: host=impala-ec2-centos79-m6i-4xlarge-ondemand-1db1.vpc.cloudera.com:27001 remaining=1 query_id=09439d05a2468038:3816f0f200000000
> I0915 03:12:35.198649 21208 coordinator.cc:752] ExecState: query id=09439d05a2468038:3816f0f200000000 finstance=09439d05a2468038:3816f0f200000002 on host=impala-ec2-centos79-m6i-4xlarge-ondemand-1db1.vpc.cloudera.com:27001 (EXECUTING -> ERROR) status=Disk I/O error on impala-ec2-centos79-m6i-4xlarge-ondemand-1db1.vpc.cloudera.com:27001: Failed to open HDFS file s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=9/090901.txt
> Error(255): Unknown error 255
> Root cause: NullPointerException:
> I0915 03:12:35.198689 21208 coordinator-backend-state.cc:974] query_id=09439d05a2468038:3816f0f200000000 target backend=127.0.0.1:27001: Not cancelling because the backend is already done: Disk I/O error on impala-ec2-centos79-m6i-4xlarge-ondemand-1db1.vpc.cloudera.com:27001: Failed to open HDFS file s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=9/090901.txt
> Error(255): Unknown error 255
> Root cause: NullPointerException:
> I0915 03:12:35.198695 21208 coordinator-backend-state.cc:974] query_id=09439d05a2468038:3816f0f200000000 target backend=127.0.0.1:27000: Not cancelling because the backend is already done:
> I0915 03:12:35.198702 21208 coordinator-backend-state.cc:974] query_id=09439d05a2468038:3816f0f200000000 target backend=127.0.0.1:27002: Not cancelling because the backend is already done:
> I0915 03:12:35.198706 21208 coordinator.cc:999] CancelBackends() query_id=09439d05a2468038:3816f0f200000000, tried to cancel 0 backends
> I0915 03:12:35.198752 21208 coordinator.cc:1375] Release admission control resources for query_id=09439d05a2468038:3816f0f200000000
> {noformat}
> This could be due to the same cause as IMPALA-11592. Maybe there is an issue inside the HDFS client. Hadoop version: hadoop-3.1.1.7.2.16.0-171
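For anyone picking this up: the lowest Impala frame in the backtrace is impala::io::OpenHdfsFileOp::Execute(), i.e. the failure happens when the I/O manager asks libhdfs to open the S3A file and the JNI call comes back with a Java NullPointerException. The snippet below is not Impala's actual code, just a minimal sketch of that interaction, assuming the stock libhdfs C API from Hadoop 3.x (hdfsOpenFile(), hdfsGetLastExceptionRootCause(), EINTERNAL defined as 255 in hdfs.h) and a hypothetical TryOpen() helper, to show why a Java-side NPE surfaces as "Error(255): Unknown error 255" plus "Root cause: NullPointerException".

{code:cpp}
// Minimal sketch (NOT Impala's actual implementation) of how a Java-side
// NullPointerException inside the Hadoop/S3A client shows up to a C++
// caller of libhdfs. Assumes the standard libhdfs C API from Hadoop 3.x.
#include <fcntl.h>      // O_RDONLY
#include <cerrno>
#include <cstdio>
#include <cstring>      // strerror
#include <string>
#include <hdfs/hdfs.h>  // libhdfs; header may also be installed as plain hdfs.h

// Hypothetical helper that builds an error string shaped like the one in the logs.
std::string TryOpen(hdfsFS fs, const char* path) {
  hdfsFile file = hdfsOpenFile(fs, path, O_RDONLY, /*bufferSize=*/0,
                               /*replication=*/0, /*blocksize=*/0);
  if (file != nullptr) {
    hdfsCloseFile(fs, file);
    return "OK";
  }
  // On failure libhdfs returns NULL and sets errno. When the Java exception
  // cannot be mapped to a POSIX code it falls back to EINTERNAL (255 in
  // hdfs.h), which strerror() renders as "Unknown error 255".
  int err = errno;
  // Root cause of the last exception libhdfs saw on this thread,
  // e.g. "NullPointerException: "; owned by libhdfs, so not freed here.
  char* root_cause = hdfsGetLastExceptionRootCause();
  char buf[512];
  snprintf(buf, sizeof(buf),
           "Failed to open HDFS file %s\nError(%d): %s\nRoot cause: %s",
           path, err, strerror(err),
           root_cause != nullptr ? root_cause : "<none>");
  return std::string(buf);
}
{code}

If this reproduces, logging hdfsGetLastExceptionStackTrace() (declared next to hdfsGetLastExceptionRootCause() in hdfs.h) alongside the root cause should show which Hadoop/S3A frame actually throws the NPE, and whether it is the same hdfs-client issue suspected in IMPALA-11592.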