[jira] [Commented] (IMPALA-10186) Write invalid parquet PageLocations which table sort by some columns

2020-09-24 Thread guojingfeng (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201316#comment-17201316
 ] 

guojingfeng commented on IMPALA-10186:
--

Are there other cases that cause empty pages? From my inspection, my column is a 
fixed-size (46-byte) STRING, which is smaller than DEFAULT_DATA_PAGE_SIZE (64KB).

> Write invalid parquet PageLocations which table sort by some columns
> 
>
> Key: IMPALA-10186
> URL: https://issues.apache.org/jira/browse/IMPALA-10186
> Project: IMPALA
>  Issue Type: Bug
>Reporter: guojingfeng
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>
> The current Parquet writer writes -1 for PageLocation.offset and 
> PageLocation.first_row_index when it encounters an empty page. 
>  hdfs-parquet-file-writer.cc, lines 808-819:
> {code:java}
>   // Write data pages
>   for (const DataPage& page : pages_) {
>     if (page.header.data_page_header.num_values == 0) {
>       // Skip empty pages
>       location.offset = -1;
>       location.compressed_page_size = 0;
>       location.first_row_index = -1;
>       AddLocationToOffsetIndex(location);
>       continue;
>     }
> {code}
> But the -1 values may cause the ComputeCandidatePages function to run into an 
> unexpected state:
> {code:java}
> bool ComputeCandidatePages(
>     const vector<parquet::PageLocation>& page_locations,
>     const vector<RowRange>& candidate_ranges,
>     const int64_t num_rows, vector<int>* candidate_pages) {
>   if (!ValidatePageLocations(page_locations, num_rows)) return false;
> {code}
> which then causes IMPALA-9952.
>  
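To illustrate why -1 entries are a problem for a reader, here is a minimal sketch of the kind of sanity check a reader might run over an offset index. The PageLocation struct and PageLocationsLookValid are hypothetical stand-ins written for this example; they are not Impala's ValidatePageLocations, and the exact rules Impala enforces are assumed rather than copied.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the Thrift-generated parquet::PageLocation.
struct PageLocation {
  int64_t offset;
  int32_t compressed_page_size;
  int64_t first_row_index;
};

// Assumed validation rules (not taken from Impala's source): every page must
// have a non-negative file offset, and first_row_index must be strictly
// increasing and smaller than num_rows.
bool PageLocationsLookValid(const std::vector<PageLocation>& locations,
                            int64_t num_rows) {
  int64_t prev_first_row = -1;
  for (const PageLocation& loc : locations) {
    if (loc.offset < 0) return false;
    if (loc.first_row_index < 0 || loc.first_row_index >= num_rows) return false;
    if (loc.first_row_index <= prev_first_row) return false;
    prev_first_row = loc.first_row_index;
  }
  return true;
}
```

Under these assumed rules, a single {-1, 0, -1} entry written for an empty page is enough to reject the whole offset index.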



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10183) Hit promise DCHECK while looping result spooling tests

2020-09-24 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201323#comment-17201323
 ] 

ASF subversion and git services commented on IMPALA-10183:
--

Commit c4277d8f7558451731b6da4d984df7bd78942050 in impala's branch 
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c4277d8 ]

IMPALA-10183: Fix hitting DCHECK when cancelling a query with result spooling

BufferedPlanRootSink has a Promise, all_results_spooled_, that could be
accessed by different threads, e.g. the fragment execution thread and
cancellation threads. The main purpose of setting this Promise is to
unblock the coordinator if it's waiting for this. So we can simply
declare this Promise's mode to be MULTIPLE_PRODUCER to avoid hitting the
DCHECK in Promise.Set().

Tests:
 - Run TestResultSpoolingFailpoints::test_failpoints for more than 4000
   iterations

Change-Id: Iaba0ed729ef984f9c51347df02e9fb6f90bc71e0
Reviewed-on: http://gerrit.cloudera.org:8080/16489
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
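The two Promise modes named in the commit message can be illustrated with a minimal sketch. SimplePromise is a hypothetical class written for this example, not impala::Promise: the exception stands in for the DCHECK, and the "first value wins" behavior in MULTIPLE_PRODUCER mode is an assumption of the sketch.

```cpp
#include <condition_variable>
#include <mutex>
#include <stdexcept>

enum class PromiseMode { SINGLE_PRODUCER, MULTIPLE_PRODUCER };

// Hypothetical, simplified promise: one value, set once, read by waiters.
template <typename T>
class SimplePromise {
 public:
  explicit SimplePromise(PromiseMode mode) : mode_(mode) {}

  // Stores the value and wakes all waiters. A second call throws in
  // SINGLE_PRODUCER mode (standing in for the DCHECK) and is ignored in
  // MULTIPLE_PRODUCER mode, so the first value is kept.
  void Set(const T& val) {
    std::lock_guard<std::mutex> l(lock_);
    if (is_set_) {
      if (mode_ == PromiseMode::SINGLE_PRODUCER) {
        throw std::logic_error("Set() called twice in SINGLE_PRODUCER mode");
      }
      return;
    }
    val_ = val;
    is_set_ = true;
    cv_.notify_all();
  }

  // Blocks until some producer has called Set(), then returns the value.
  T Get() {
    std::unique_lock<std::mutex> l(lock_);
    cv_.wait(l, [this] { return is_set_; });
    return val_;
  }

 private:
  const PromiseMode mode_;
  std::mutex lock_;
  std::condition_variable cv_;
  bool is_set_ = false;
  T val_{};
};
```

With a promise like this, a fragment execution thread and a cancellation thread can both call Set() on a MULTIPLE_PRODUCER instance without tripping the check, which is the situation the commit describes.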


> Hit promise DCHECK while looping result spooling tests
> --
>
> Key: IMPALA-10183
> URL: https://issues.apache.org/jira/browse/IMPALA-10183
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Quanlong Huang
>Priority: Major
> Attachments: impalad.ERROR.gz, impalad.FATAL.gz, impalad.INFO.gz
>
>
> {noformat}
> while impala-py.test tests/query_test/test_result_spooling.py -n4 ; do date; done
> {noformat}
> {noformat}
> F0921 10:14:35.355281  5842 promise.h:61] Check failed: mode == PromiseMode::MULTIPLE_PRODUCER [ mode = 0 , PromiseMode::MULTIPLE_PRODUCER = 1 ]Called Set(..) twice on the same Promise in SINGLE_PRODUCER mode
> *** Check failure stack trace: ***
> @  0x52087fc  google::LogMessage::Fail()
> @  0x520a0ec  google::LogMessage::SendToLog()
> @  0x520815a  google::LogMessage::Flush()
> @  0x520bd58  google::LogMessageFatal::~LogMessageFatal()
> @  0x223cc50  impala::Promise<>::Set()
> @  0x293f21d  impala::BufferedPlanRootSink::Cancel()
> @  0x2317856  impala::FragmentInstanceState::Cancel()
> @  0x2284c62  impala::QueryState::Cancel()
> @  0x2464728  impala::ControlService::CancelQueryFInstances()
> @  0x253df37  _ZZN6impala16ControlServiceIfC4ERK13scoped_refptrIN4kudu12MetricEntityEERKS1_INS2_3rpc13ResultTrackerEEENKUlPKN6google8protobuf7MessageEPSE_PNS7_10RpcContextEE4_clESG_SH_SJ_
> @  0x253fb65  _ZNSt17_Function_handlerIFvPKN6google8protobuf7MessageEPS2_PN4kudu3rpc10RpcContextEEZN6impala16ControlServiceIfC4ERK13scoped_refptrINS6_12MetricEntityEERKSD_INS7_13ResultTrackerEEEUlS4_S5_S9_E4_E9_M_invokeERKSt9_Any_dataOS4_OS5_OS9_
> @  0x2c9612f  std::function<>::operator()()
> @  0x2c95ade  kudu::rpc::GeneratedServiceIf::Handle()
> @  0x21d8c55  impala::ImpalaServicePool::RunThread()
> @  0x21de836  boost::_mfi::mf0<>::operator()()
> @  0x21de468  boost::_bi::list1<>::operator()<>()
> @  0x21de02e  boost::_bi::bind_t<>::operator()()
> @  0x21ddaa5  boost::detail::function::void_function_obj_invoker0<>::invoke()
> @  0x2140b55  boost::function0<>::operator()()
> @  0x271e1a9  impala::Thread::SuperviseThread()
> @  0x2726146  boost::_bi::list5<>::operator()<>()
> @  0x272606a  boost::_bi::bind_t<>::operator()()
> @  0x272602b  boost::detail::thread_data<>::run()
> @  0x3f0f621  thread_proxy
> @ 0x7f4db3f356da  start_thread
> @ 0x7f4db092ca3e  clone
> Wrote minidump to /home/tarmstrong/Impala/impala/logs/cluster/minidumps/impalad/3204ffe5-6905-4842-d702c395-21c4eca5.dmp
> (END)
> {noformat}






[jira] [Commented] (IMPALA-10170) Data race on Webserver::UrlHandler::is_on_nav_bar_

2020-09-24 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201322#comment-17201322
 ] 

ASF subversion and git services commented on IMPALA-10170:
--

Commit d6442e0d8afb9fc4f5070583f1ce2ee143b7f13b in impala's branch 
refs/heads/master from Sahil Takiar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d6442e0 ]

IMPALA-10170: Data race on Webserver::UrlHandler::is_on_nav_bar_

This data race can be reproduced by
TestCompactCatalogUpdates.test_restart_catalogd, although it does not
seem to always occur. The data race was originally reported in a Jenkins
job, but I could not reproduce it locally.

The fix is to acquire a read lock while reading some UrlHandler objects.
I cleaned up some of the other involved variables and made them const.
These variables are set during construction time, and never modified
afterwards.

Testing:
* Ran be and custom cluster TSAN tests

Change-Id: I6923af4754e3fe72b8b04c5303a1e7a79da7613a
Reviewed-on: http://gerrit.cloudera.org:8080/16459
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
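The locking pattern described in the commit message (a read lock around reads of handler objects, with set-once fields made const) might look roughly like the sketch below. HandlerRegistry and its members are hypothetical names for this example and do not mirror Impala's Webserver class; std::shared_mutex requires C++17.

```cpp
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <vector>

// Hypothetical registry of URL handlers guarded by a reader-writer lock.
class HandlerRegistry {
 public:
  // Writers hold the lock exclusively while the map is modified.
  void Register(const std::string& path, bool is_on_nav_bar) {
    std::unique_lock<std::shared_mutex> l(lock_);
    handlers_.emplace(path, Handler{is_on_nav_bar});
  }

  // Readers hold the lock in shared mode: concurrent reads proceed together,
  // but none of them can overlap with a Register() call.
  std::vector<std::string> NavBarPaths() const {
    std::shared_lock<std::shared_mutex> l(lock_);
    std::vector<std::string> paths;
    for (const auto& entry : handlers_) {
      if (entry.second.is_on_nav_bar) paths.push_back(entry.first);
    }
    return paths;
  }

 private:
  struct Handler {
    // Set at construction and never modified afterwards, hence const.
    const bool is_on_nav_bar;
  };

  mutable std::shared_mutex lock_;
  std::map<std::string, Handler> handlers_;
};
```

Making the flag const documents that only the map itself, not the handler fields, changes after construction, so the lock only has to cover map access.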


> Data race on Webserver::UrlHandler::is_on_nav_bar_
> --
>
> Key: IMPALA-10170
> URL: https://issues.apache.org/jira/browse/IMPALA-10170
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
>
> {code}
> WARNING: ThreadSanitizer: data race (pid=31102)
>   Read of size 1 at 0x7b2c0006e3b0 by thread T42:
> #0 impala::Webserver::UrlHandler::is_on_nav_bar() const 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/util/webserver.h:152:41
>  (impalad+0x256ff39)
> #1 
> impala::Webserver::GetCommonJson(rapidjson::GenericDocument,
>  rapidjson::MemoryPoolAllocator, 
> rapidjson::CrtAllocator>*, sq_connection const*, 
> kudu::WebCallbackRegistry::WebRequest const&) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/util/webserver.cc:527:24
>  (impalad+0x256be13)
> #2 impala::Webserver::RenderUrlWithTemplate(sq_connection const*, 
> kudu::WebCallbackRegistry::WebRequest const&, impala::Webserver::UrlHandler 
> const&, std::__cxx11::basic_stringstream, 
> std::allocator >*, impala::ContentType*) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/util/webserver.cc:816:3
>  (impalad+0x256e882)
> #3 impala::Webserver::BeginRequestCallback(sq_connection*, 
> sq_request_info*) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/util/webserver.cc:714:5
>  (impalad+0x256cfbb)
> #4 impala::Webserver::BeginRequestCallbackStatic(sq_connection*) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/util/webserver.cc:556:20
>  (impalad+0x256ba98)
> #5 handle_request  (impalad+0x2582d59)
>   Previous write of size 2 at 0x7b2c0006e3b0 by main thread:
> #0 
> impala::Webserver::UrlHandler::UrlHandler(impala::Webserver::UrlHandler&&) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/util/webserver.h:141:9
>  (impalad+0x2570dbc)
> #1 std::pair, 
> std::allocator > const, 
> impala::Webserver::UrlHandler>::pair std::char_traits, std::allocator >, 
> impala::Webserver::UrlHandler, 
> true>(std::pair, 
> std::allocator >, impala::Webserver::UrlHandler>&&) 
> /data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/lib/gcc/x86_64-pc-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/stl_pair.h:362:4
>  (impalad+0x25738b3)
> #2 void 
> __gnu_cxx::new_allocator  std::char_traits, std::allocator > const, 
> impala::Webserver::UrlHandler> > 
> >::construct std::char_traits, std::allocator > const, 
> impala::Webserver::UrlHandler>, std::pair std::char_traits, std::allocator >, 
> impala::Webserver::UrlHandler> >(std::pair std::char_traits, std::allocator > const, 
> impala::Webserver::UrlHandler>*, std::pair std::char_traits, std::allocator >, 
> impala::Webserver::UrlHandler>&&) 
> /data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/lib/gcc/x86_64-pc-linux-gnu/7.5.0/../../../../include/c++/7.5.0/ext/new_allocator.h:136:23
>  (impalad+0x2573848)
> #3 void 
> std::allocator_traits  std::char_traits, std::allocator > const, 
> impala::Webserver::UrlHandler> > > 
> >::construct std::char_traits, std::allocator > const, 
> impala::Webserver::UrlHandler>, std::pair std::char_traits, std::allocator >, 
> impala::Webserver::UrlHandler> 
> >(std::allocator  std::char_traits, std::allocator > const, 
> impala::Webserver::UrlHandler> > >&, 
> std::pair, 
> std::allocator > const, impala::Webserver::UrlHandler>*, 
> std::pair, 
> std::allocator >, impala::Webserver:

[jira] [Resolved] (IMPALA-10183) Hit promise DCHECK while looping result spooling tests

2020-09-24 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang resolved IMPALA-10183.
-
Fix Version/s: Impala 4.0
   Resolution: Fixed

> Hit promise DCHECK while looping result spooling tests
> --
>
> Key: IMPALA-10183
> URL: https://issues.apache.org/jira/browse/IMPALA-10183
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Quanlong Huang
>Priority: Major
> Fix For: Impala 4.0
>
> Attachments: impalad.ERROR.gz, impalad.FATAL.gz, impalad.INFO.gz
>
>






[jira] [Assigned] (IMPALA-9822) Impala does not notify user that row format delimited fields is only logical when using STORED AS TEXTFILE

2020-09-24 Thread Fucun Chu (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fucun Chu reassigned IMPALA-9822:
-

Assignee: Fucun Chu

> Impala does not notify user that row format delimited fields is only logical 
> when using STORED AS TEXTFILE
> --
>
> Key: IMPALA-9822
> URL: https://issues.apache.org/jira/browse/IMPALA-9822
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.4.0
>Reporter: Alexandra Dunai
>Assignee: Fucun Chu
>Priority: Minor
>  Labels: newbie, ramp-up, usability
>
> When creating a table with added "ROW FORMAT DELIMITED FIELDS", Impala does 
> not alert the user that this is only logical when using STORED AS TEXTFILE.
> You only discover that you made a mistake after trying to run a select from 
> the table.
>  Table creation:
> {code:java}
> [adunai-1.adunai.root.hwx.site:21000] default> CREATE EXTERNAL TABLE 
> sales_fact_1997(product_id INT,time_id INT,customer_id INT,promotion_id 
> INT,store_id INT,store_sales DECIMAL(10,4),store_cost 
> DECIMAL(10,4),unit_sales DECIMAL(10,4))
>   > row format delimited fields terminated by '\011' STORED AS PARQUET
>   > location '/user/impala/mondrian/sales_fact_1997';
> Query: CREATE EXTERNAL TABLE sales_fact_1997(product_id INT,time_id 
> INT,customer_id INT,promotion_id INT,store_id INT,store_sales 
> DECIMAL(10,4),store_cost DECIMAL(10,4),unit_sales DECIMAL(10,4))
> row format delimited fields terminated by '\011' STORED AS PARQUET
> location '/user/impala/mondrian/sales_fact_1997'
> +-------------------------+
> | summary                 |
> +-------------------------+
> | Table has been created. |
> +-------------------------+
> Fetched 1 row(s) in 0.10s
> {code}
>  
> Select: 
> {code:java}
> [adunai-1.adunai.root.hwx.site:21000] mondrian> select count(*) from agg_c_10_sales_fact_1997;
> Query: select count(*) from agg_c_10_sales_fact_1997
> Query submitted at: 2020-06-03 11:55:06 (Coordinator: http://adunai-1.adunai.root.hwx.site:25000)
> Query progress can be monitored at: http://adunai-1.adunai.root.hwx.site:25000/query_plan?query_id=d547fafd0162da4e:872a95c1
> ERROR: File 'hdfs://adunai-2.adunai.root.hwx.site:8020/user/impala/mondrian/agg_c_10_sales_fact_1997/agg_c_10_sales_fact_1997.tsv' 
> has an invalid Parquet version number: 717. Please check that it is a valid 
> Parquet file. This error can also occur due to stale metadata. If you believe 
> this is a valid Parquet file, try running "refresh 
> mondrian.agg_c_10_sales_fact_1997".
>  {code}
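The improvement requested here amounts to a check at table-creation time. A minimal sketch of such a check follows; FileFormat, CreateTableSpec and RowFormatWarning are hypothetical names invented for this example and do not correspond to Impala's frontend classes, and the rule itself is taken directly from the issue text (delimiters are only meaningful for STORED AS TEXTFILE).

```cpp
#include <string>

// Hypothetical, simplified view of a CREATE TABLE statement.
enum class FileFormat { TEXT, PARQUET, ORC, AVRO };

struct CreateTableSpec {
  FileFormat format;
  bool has_row_format_delimiters;  // true if ROW FORMAT DELIMITED was given
};

// Returns a warning message when delimiters are specified for a non-text
// format, or an empty string when there is nothing to report.
std::string RowFormatWarning(const CreateTableSpec& spec) {
  if (spec.has_row_format_delimiters && spec.format != FileFormat::TEXT) {
    return "ROW FORMAT DELIMITED FIELDS is only meaningful with STORED AS "
           "TEXTFILE and will be ignored for this file format.";
  }
  return "";
}
```

Surfacing a message like this when the table is created would let the user notice the mismatch before the first SELECT fails.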






[jira] [Commented] (IMPALA-9952) Invalid offset index in Parquet file

2020-09-24 Thread Jira


[ 
https://issues.apache.org/jira/browse/IMPALA-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201539#comment-17201539
 ] 

Zoltán Borók-Nagy commented on IMPALA-9952:
---

Uploaded fix for review: [https://gerrit.cloudera.org/#/c/16503/]

>  Invalid offset index in Parquet file
> -
>
> Key: IMPALA-9952
> URL: https://issues.apache.org/jira/browse/IMPALA-9952
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.4.0
>Reporter: guojingfeng
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>  Labels: Parquet
>
> When reading parquet file in impala 3.4, encountered the following error:
> {code:java}
> I0714 16:11:48.307806 1075820 runtime-state.cc:207] 
> 8c43203adb2d4fc8:0478df9b018b] Error from query 
> 8c43203adb2d4fc8:0478df9b: Invalid offset index in Parquet file 
> hdfs://path/4844de7af4545a39-e8ebc7da005f_2015704758_data.0.parq.
> I0714 16:11:48.834901 1075838 status.cc:126] 
> 8c43203adb2d4fc8:0478df9b02c0] Invalid offset index in Parquet file 
> hdfs://path/4844de7af4545a39-e8ebc7da005f_2015704758_data.0.parq.
> @   0xbf4ef9
> @  0x1748c41
> @  0x174e170
> @  0x1750e58
> @  0x17519f0
> @  0x1748559
> @  0x1510b41
> @  0x1512c8f
> @  0x137488a
> @  0x1375759
> @  0x1b48a19
> @ 0x7f34509f5e24
> @ 0x7f344d5ed35c
> I0714 16:11:48.835763 1075838 runtime-state.cc:207] 
> 8c43203adb2d4fc8:0478df9b02c0] Error from query 
> 8c43203adb2d4fc8:0478df9b: Invalid offset index in Parquet file 
> hdfs://path/4844de7af4545a39-e8ebc7da005f_2015704758_data.0.parq.
> I0714 16:11:48.893784 1075820 status.cc:126] 
> 8c43203adb2d4fc8:0478df9b018b] Top level rows aren't in sync during page 
> filtering in file 
> hdfs://path/4844de7af4545a39-e8ebc7da005f_2015704758_data.0.parq.
> @   0xbf4ef9
> @  0x1749104
> @  0x17494cc
> @  0x1751aee
> @  0x1748559
> @  0x1510b41
> @  0x1512c8f
> @  0x137488a
> @  0x1375759
> @  0x1b48a19
> @ 0x7f34509f5e24
> @ 0x7f344d5ed35c
> {code}
>  Corresponding source code:
> {code:java}
> Status HdfsParquetScanner::CheckPageFiltering() {
>   if (candidate_ranges_.empty() || scalar_readers_.empty()) return Status::OK();
>
>   int64_t current_row = scalar_readers_[0]->LastProcessedRow();
>   for (int i = 1; i < scalar_readers_.size(); ++i) {
>     if (current_row != scalar_readers_[i]->LastProcessedRow()) {
>       DCHECK(false);
>       return Status(Substitute(
>           "Top level rows aren't in sync during page filtering in file $0.", 
>           filename()));
>     }
>   }
>   return Status::OK();
> }
> {code}






[jira] [Commented] (IMPALA-10186) Write invalid parquet PageLocations which table sort by some columns

2020-09-24 Thread Jira


[ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201542#comment-17201542
 ] 

Zoltán Borók-Nagy commented on IMPALA-10186:


Yeah, it seems the "compressed size" somehow crossed the page size, so Impala had 
to start a new page. I'll take a look at why this happens and how we can avoid 
writing empty pages.
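One conceivable way to keep -1 entries out of the offset index is to emit no entry at all for an empty page. The sketch below shows that idea only; it is not the fix that was eventually uploaded, and DataPage, PageLocation and BuildOffsetIndex are simplified, hypothetical types and names for this example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified page descriptor. compressed_size is assumed to
// cover everything the page occupies in the file, header included.
struct DataPage {
  int32_t num_values;
  int32_t compressed_size;
  int64_t num_rows;
};

struct PageLocation {
  int64_t offset;
  int32_t compressed_page_size;
  int64_t first_row_index;
};

// Builds the offset index for one column chunk, skipping empty pages so that
// no placeholder entries are ever written.
std::vector<PageLocation> BuildOffsetIndex(const std::vector<DataPage>& pages,
                                           int64_t start_offset) {
  std::vector<PageLocation> index;
  int64_t offset = start_offset;
  int64_t first_row = 0;
  for (const DataPage& page : pages) {
    if (page.num_values == 0) continue;  // empty page: nothing written, no entry
    index.push_back({offset, page.compressed_size, first_row});
    offset += page.compressed_size;
    first_row += page.num_rows;
  }
  return index;
}
```

A real writer would also have to keep any per-page column index entries aligned with the offset index, so avoiding the creation of empty pages in the first place may be the cleaner option.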







[jira] [Commented] (IMPALA-10078) Proper codegen for KuduPartitionExpr

2020-09-24 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201580#comment-17201580
 ] 

Daniel Becker commented on IMPALA-10078:


https://gerrit.cloudera.org/#/c/16419/

> Proper codegen for KuduPartitionExpr
> 
>
> Key: IMPALA-10078
> URL: https://issues.apache.org/jira/browse/IMPALA-10078
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
>
> Implement codegen for KuduPartitionExpr and remove the use of 
> GetCodegendComputeFnWrapper.






[jira] [Resolved] (IMPALA-10170) Data race on Webserver::UrlHandler::is_on_nav_bar_

2020-09-24 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar resolved IMPALA-10170.
---
Fix Version/s: Impala 4.0
   Resolution: Fixed

> Data race on Webserver::UrlHandler::is_on_nav_bar_
> --
>
> Key: IMPALA-10170
> URL: https://issues.apache.org/jira/browse/IMPALA-10170
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Fix For: Impala 4.0
>
>
> {code}
> WARNING: ThreadSanitizer: data race (pid=31102)
>   Read of size 1 at 0x7b2c0006e3b0 by thread T42:
> #0 impala::Webserver::UrlHandler::is_on_nav_bar() const 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/util/webserver.h:152:41
>  (impalad+0x256ff39)
> #1 
> impala::Webserver::GetCommonJson(rapidjson::GenericDocument,
>  rapidjson::MemoryPoolAllocator, 
> rapidjson::CrtAllocator>*, sq_connection const*, 
> kudu::WebCallbackRegistry::WebRequest const&) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/util/webserver.cc:527:24
>  (impalad+0x256be13)
> #2 impala::Webserver::RenderUrlWithTemplate(sq_connection const*, 
> kudu::WebCallbackRegistry::WebRequest const&, impala::Webserver::UrlHandler 
> const&, std::__cxx11::basic_stringstream, 
> std::allocator >*, impala::ContentType*) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/util/webserver.cc:816:3
>  (impalad+0x256e882)
> #3 impala::Webserver::BeginRequestCallback(sq_connection*, 
> sq_request_info*) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/util/webserver.cc:714:5
>  (impalad+0x256cfbb)
> #4 impala::Webserver::BeginRequestCallbackStatic(sq_connection*) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/util/webserver.cc:556:20
>  (impalad+0x256ba98)
> #5 handle_request  (impalad+0x2582d59)
>   Previous write of size 2 at 0x7b2c0006e3b0 by main thread:
> #0 
> impala::Webserver::UrlHandler::UrlHandler(impala::Webserver::UrlHandler&&) 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/util/webserver.h:141:9
>  (impalad+0x2570dbc)
> #1 std::pair, 
> std::allocator > const, 
> impala::Webserver::UrlHandler>::pair std::char_traits, std::allocator >, 
> impala::Webserver::UrlHandler, 
> true>(std::pair, 
> std::allocator >, impala::Webserver::UrlHandler>&&) 
> /data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/lib/gcc/x86_64-pc-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/stl_pair.h:362:4
>  (impalad+0x25738b3)
> #2 void 
> __gnu_cxx::new_allocator  std::char_traits, std::allocator > const, 
> impala::Webserver::UrlHandler> > 
> >::construct std::char_traits, std::allocator > const, 
> impala::Webserver::UrlHandler>, std::pair std::char_traits, std::allocator >, 
> impala::Webserver::UrlHandler> >(std::pair std::char_traits, std::allocator > const, 
> impala::Webserver::UrlHandler>*, std::pair std::char_traits, std::allocator >, 
> impala::Webserver::UrlHandler>&&) 
> /data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/lib/gcc/x86_64-pc-linux-gnu/7.5.0/../../../../include/c++/7.5.0/ext/new_allocator.h:136:23
>  (impalad+0x2573848)
> #3 void 
> std::allocator_traits  std::char_traits, std::allocator > const, 
> impala::Webserver::UrlHandler> > > 
> >::construct std::char_traits, std::allocator > const, 
> impala::Webserver::UrlHandler>, std::pair std::char_traits, std::allocator >, 
> impala::Webserver::UrlHandler> 
> >(std::allocator  std::char_traits, std::allocator > const, 
> impala::Webserver::UrlHandler> > >&, 
> std::pair, 
> std::allocator > const, impala::Webserver::UrlHandler>*, 
> std::pair, 
> std::allocator >, impala::Webserver::UrlHandler>&&) 
> /data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/lib/gcc/x86_64-pc-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/alloc_traits.h:475:8
>  (impalad+0x25737f1)
> #4 void std::_Rb_tree std::char_traits, std::allocator >, 
> std::pair, 
> std::allocator > const, impala::Webserver::UrlHandler>, 
> std::_Select1st std::char_traits, std::allocator > const, 
> impala::Webserver::UrlHandler> >, std::less std::char_traits, std::allocator > >, 
> std::allocator std::char_traits, std::allocator > const, 
> impala::Webserver::UrlHandler> > 
> >::_M_construct_node std::char_traits, std::allocator >, 
> impala::Webserver::UrlHandler> 
> >(std::_Rb_tree_node std::char_traits, std::allocator > const, 
> impala::Webserver::UrlHandler> >*, std::pair std::char_traits, std::allocator >, 
> impala::Webserver::UrlHandler>&&) 
> /data/j

[jira] [Assigned] (IMPALA-9779) Unnecessarily reloading file metadata in some DDLs

2020-09-24 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-9779:
-

Assignee: Tim Armstrong

> Unnecessarily reloading file metadata in some DDLs
> --
>
> Key: IMPALA-9779
> URL: https://issues.apache.org/jira/browse/IMPALA-9779
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 2.8.0, Impala 2.9.0, Impala 2.10.0, Impala 
> 2.11.0, Impala 3.0, Impala 2.12.0, Impala 3.1.0, Impala 3.2.0, Impala 3.3.0, 
> Impala 3.4.0
>Reporter: Quanlong Huang
>Assignee: Tim Armstrong
>Priority: Critical
>
> Some DDLs do not modify the actual table data, so we don't need to reload 
> file metadata for them. These DDLs include:
> * Compute (incremental) stats
> * Drop stats
> * Alter table set row format
> * Alter table set file format
> Their code paths all call CatalogOpExecutor.bulkAlterPartitions(), where the 
> related partitions are marked as "dirty" anyway. Dirty partitions are 
> dropped and reloaded at the end of 
> CatalogOpExecutor.alterTable(TAlterTableParams, TDdlExecResponse). See the 
> details in HdfsTable.updatePartitionsFromHms().
> We can consider not marking the related partitions as "dirty" in these DDLs.






[jira] [Commented] (IMPALA-9779) Unnecessarily reloading file metadata in some DDLs

2020-09-24 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201660#comment-17201660
 ] 

Tim Armstrong commented on IMPALA-9779:
---

I noticed that with the drop stats code paths, they don't actually remove the 
row counts from the partitions in all cases. So reloading the partitions is 
actually preventing them from getting stale.







[jira] [Commented] (IMPALA-9779) Unnecessarily reloading file metadata in some DDLs

2020-09-24 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201678#comment-17201678
 ] 

Tim Armstrong commented on IMPALA-9779:
---

I think for both of these we need to a) make sure that the state of the 
Partition object is cleanly set without a reload (including directly relevant 
fields and dependent fields like HdfsTable::hasAvroData_) and b) avoid setting 
markDirty via bulkAlterPartitions.
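The two requirements above can be sketched as follows: update the in-memory partition state directly and leave the dirty flag alone, so that nothing is queued for reload. Partition, DropStats and PartitionsToReload are hypothetical, heavily simplified names for this example; the real catalog code is not shown in this thread and is not reproduced here.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical in-memory partition state.
struct Partition {
  std::string name;
  int64_t num_rows = -1;  // -1 means "no stats"
  bool dirty = false;     // dirty partitions get dropped and reloaded
};

// a) Sets the partition state directly so it is correct without a reload.
// b) Deliberately leaves the dirty flag untouched.
void DropStats(std::vector<Partition>& partitions) {
  for (Partition& p : partitions) p.num_rows = -1;
}

// Returns the names of the partitions that would be reloaded.
std::vector<std::string> PartitionsToReload(const std::vector<Partition>& partitions) {
  std::vector<std::string> names;
  for (const Partition& p : partitions) {
    if (p.dirty) names.push_back(p.name);
  }
  return names;
}
```

The point of the sketch is that skipping the reload is only safe once every field the reload used to refresh is set explicitly, which is the concern raised about row counts going stale.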







[jira] [Created] (IMPALA-10189) Avoid unnecessarily loading metadata for compute/drop stats DDLs

2020-09-24 Thread Tim Armstrong (Jira)
Tim Armstrong created IMPALA-10189:
--

 Summary: Avoid unnecessarily loading metadata for compute/drop 
stats DDLs
 Key: IMPALA-10189
 URL: https://issues.apache.org/jira/browse/IMPALA-10189
 Project: IMPALA
  Issue Type: Sub-task
  Components: Catalog
Affects Versions: Impala 3.4.0, Impala 3.3.0, Impala 3.2.0, Impala 3.1.0, 
Impala 2.12.0, Impala 3.0, Impala 2.11.0, Impala 2.10.0
Reporter: Tim Armstrong
Assignee: Tim Armstrong









[jira] [Updated] (IMPALA-10189) Avoid unnecessarily loading metadata for compute/drop stats DDLs

2020-09-24 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-10189:
---
Priority: Critical  (was: Major)

> Avoid unnecessarily loading metadata for compute/drop stats DDLs
> 
>
> Key: IMPALA-10189
> URL: https://issues.apache.org/jira/browse/IMPALA-10189
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog
>Affects Versions: Impala 2.10.0, Impala 2.11.0, Impala 3.0, Impala 2.12.0, 
> Impala 3.1.0, Impala 3.2.0, Impala 3.3.0, Impala 3.4.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Critical
>







[jira] [Reopened] (IMPALA-9833) query_test.test_observability.TestQueryStates.test_error_query_state is flaky

2020-09-24 Thread David Rorke (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Rorke reopened IMPALA-9833:
-

I also just hit this in our nightlies.  Not sure if it's the same occurrence 
[~wzhou] noticed:
{noformat}
query_test/test_observability.py:777: in test_error_query_state
    lambda: self.client.get_runtime_profile(handle))
common/impala_test_suite.py:1141: in assert_eventually
    count, timeout_s, error_msg_str))
E   Timeout: Check failed to return True after 300 tries and 300 seconds
{noformat}

Reopening issue.
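
For readers unfamiliar with this kind of failure, a polling helper of the sort named in the traceback can be sketched as follows. This is an illustration under assumptions, not the code in common/impala_test_suite.py: the signature, the `Timeout` class, and the injectable `sleep`/`clock` parameters are hypothetical. Only the shape of the error message is taken from the output above.

```python
import time


class Timeout(Exception):
    """Raised when the condition never became true (hypothetical class)."""


def assert_eventually(timeout_s, period_s, condition, error_msg=None,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll condition() every period_s seconds until it returns True.

    Returns the number of attempts made. Raises Timeout once timeout_s
    seconds have elapsed without success.
    """
    count = 0
    start = clock()
    while clock() - start < timeout_s:
        count += 1
        if condition():
            return count
        sleep(period_s)
    msg = "Check failed to return True after %s tries and %s seconds" % (
        count, timeout_s)
    if error_msg is not None:
        # error_msg is a callable so the message reflects the final state.
        msg += " error message: " + error_msg()
    raise Timeout(msg)
```

With a one-second period, the number of tries tracks the timeout in seconds, which matches the "300 tries and 300 seconds" pattern in the failure above.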

> query_test.test_observability.TestQueryStates.test_error_query_state is flaky
> -
>
> Key: IMPALA-9833
> URL: https://issues.apache.org/jira/browse/IMPALA-9833
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.0
>Reporter: Xiaomeng Zhang
>Assignee: Sahil Takiar
>Priority: Major
> Fix For: Impala 4.0
>
>
> [https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2521/testReport/junit/query_test.test_observability/TestQueryStates/test_error_query_state/]
> It seems the test could not get the query profile after retrying for 30s.
> {code:java}
> Stacktrace
> query_test/test_observability.py:777: in test_error_query_state
> lambda: self.client.get_runtime_profile(handle))
> common/impala_test_suite.py:1120: in assert_eventually
> count, timeout_s, error_msg_str))
> E   Timeout: Check failed to return True after 30 tries and 30 seconds error 
> message: Query (id=fe45e8bfd138acd3:c67a3796)
> {code}
>  






[jira] [Commented] (IMPALA-9833) query_test.test_observability.TestQueryStates.test_error_query_state is flaky

2020-09-24 Thread Wenzhe Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201774#comment-17201774
 ] 

Wenzhe Zhou commented on IMPALA-9833:
-

Same occurrence [~drorke] noticed.

 

> query_test.test_observability.TestQueryStates.test_error_query_state is flaky
> -
>
> Key: IMPALA-9833
> URL: https://issues.apache.org/jira/browse/IMPALA-9833
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.0
>Reporter: Xiaomeng Zhang
>Assignee: Sahil Takiar
>Priority: Major
> Fix For: Impala 4.0
>
>
> [https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2521/testReport/junit/query_test.test_observability/TestQueryStates/test_error_query_state/]
> It seems the test could not get the query profile after retrying for 30s.
> {code:java}
> Stacktrace
> query_test/test_observability.py:777: in test_error_query_state
> lambda: self.client.get_runtime_profile(handle))
> common/impala_test_suite.py:1120: in assert_eventually
> count, timeout_s, error_msg_str))
> E   Timeout: Check failed to return True after 30 tries and 30 seconds error 
> message: Query (id=fe45e8bfd138acd3:c67a3796)
> {code}
>  






[jira] [Work started] (IMPALA-10189) Avoid unnecessarily loading metadata for compute/drop stats DDLs

2020-09-24 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-10189 started by Tim Armstrong.
--
> Avoid unnecessarily loading metadata for compute/drop stats DDLs
> 
>
> Key: IMPALA-10189
> URL: https://issues.apache.org/jira/browse/IMPALA-10189
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog
>Affects Versions: Impala 2.10.0, Impala 2.11.0, Impala 3.0, Impala 2.12.0, 
> Impala 3.1.0, Impala 3.2.0, Impala 3.3.0, Impala 3.4.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Critical
>







[jira] [Created] (IMPALA-10190) Remove impalad_coord_exec Dockerfile

2020-09-24 Thread Sahil Takiar (Jira)
Sahil Takiar created IMPALA-10190:
-

 Summary: Remove impalad_coord_exec Dockerfile
 Key: IMPALA-10190
 URL: https://issues.apache.org/jira/browse/IMPALA-10190
 Project: IMPALA
  Issue Type: Improvement
Reporter: Sahil Takiar


The impalad_coord_exec Dockerfile is a bit redundant because it basically 
contains all the same dependencies as the impalad_coordinator Dockerfile. The 
only difference between the two files is that the startup flags for 
impalad_coordinator contain {{is_executor=false}}. We should find a way to 
remove the {{impalad_coord_exec}} Dockerfile altogether.






[jira] [Created] (IMPALA-10191) Test impalad_coordinator and impalad_executor in Dockerized tests

2020-09-24 Thread Sahil Takiar (Jira)
Sahil Takiar created IMPALA-10191:
-

 Summary: Test impalad_coordinator and impalad_executor in 
Dockerized tests
 Key: IMPALA-10191
 URL: https://issues.apache.org/jira/browse/IMPALA-10191
 Project: IMPALA
  Issue Type: Improvement
Reporter: Sahil Takiar


Currently only the impalad_coord_exec images are tested in the Dockerized 
tests. It would be nice to get test coverage for the other images as well.






[jira] [Commented] (IMPALA-9918) HdfsOrcScanner crash on resolving columns

2020-09-24 Thread David Rorke (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201830#comment-17201830
 ] 

David Rorke commented on IMPALA-9918:
-

Hit this again while running
{noformat}
TestFullAcid::()::test_full_acid_support{noformat}
 

Similar backtrace to the earlier report (attached). I also see the following 
log output from the failed DCHECK in PrintPath:
{noformat}
F0924 06:46:15.196028 28193 debug-util.cc:260] 
a44074e914841df2:e8a77ca30003] Check failed: false [7 0] 1 INT
{noformat}
 

> HdfsOrcScanner crash on resolving columns
> -
>
> Key: IMPALA-9918
> URL: https://issues.apache.org/jira/browse/IMPALA-9918
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
> Environment: BUILD_TAG
> jenkins-impala-cdpd-master-core-ubsan-111
>Reporter: Wenzhe Zhou
>Assignee: Csaba Ringhofer
>Priority: Major
>  Labels: broken-build
> Attachments: backtraces.txt, backtraces.txt
>
>
> Core file generated in impala-cdpd-master-core-ubsan build
> Back traces:
> CORE: ./tests/core.1594000709.13971.impalad
> BINARY: ./be/build/latest/service/impalad
> Core was generated by 
> `/data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/build/lat'.
> Program terminated with signal SIGABRT, Aborted.
> #0 0x7f7a481851f7 in raise () from /lib64/libc.so.6
> To enable execution of this file add
>  add-auto-load-safe-path 
> /data0/jenkins/workspace/impala-cdpd-master-core-ubsan/Impala-Toolchain/gcc-4.9.2/lib64/libstdc++.so.6.0.20-gdb.py
> line to your configuration file "/var/lib/jenkins/.gdbinit".
> To completely disable this security protection add
>  set auto-load safe-path /
> line to your configuration file "/var/lib/jenkins/.gdbinit".
> For more information about this security protection see the
> "Auto-loading safe path" section in the GDB manual. E.g., run from the shell:
>  info "(gdb)Auto-loading safe path"
> #0 0x7f7a481851f7 in raise () from /lib64/libc.so.6
> #1 0x7f7a481868e8 in abort () from /lib64/libc.so.6
> #2 0x083401c4 in google::DumpStackTraceAndExit() ()
> #3 0x08336b5d in google::LogMessage::Fail() ()
> #4 0x08338402 in google::LogMessage::SendToLog() ()
> #5 0x08336537 in google::LogMessage::Flush() ()
> #6 0x08339afe in google::LogMessageFatal::~LogMessageFatal() ()
> #7 0x03215662 in impala::PrintPath (tbl_desc=..., path=...) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/util/debug-util.cc:259
> #8 0x0370dfe9 in impala::HdfsOrcScanner::ResolveColumns 
> (this=0x14555c00, tuple_desc=..., selected_nodes=0x7f79722730a8, 
> pos_slots=0x7f7972273058) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-orc-scanner.cc:436
> #9 0x037099dd in impala::HdfsOrcScanner::SelectColumns 
> (this=0x14555c00, tuple_desc=...) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-orc-scanner.cc:456
> #10 0x03707688 in impala::HdfsOrcScanner::Open (this=0x14555c00, 
> context=0x7f7972274700) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-orc-scanner.cc:221
> #11 0x035e0a48 in 
> impala::HdfsScanNodeBase::CreateAndOpenScannerHelper (this=0x1b1c7100, 
> partition=0x142f9d00, context=0x7f7972274700, scanner=0x7f79722746f8) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-scan-node-base.cc:882
> #12 0x039df2e8 in impala::HdfsScanNode::ProcessSplit 
> (this=0x1b1c7100, filter_ctxs=..., expr_results_pool=0x7f7972274bd8, 
> scan_range=0x12a16c40, scanner_thread_reservation=0x7f7972274e18) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-scan-node.cc:480
> #13 0x039ddd85 in impala::HdfsScanNode::ScannerThread 
> (this=0x1b1c7100, first_thread=true, scanner_thread_reservation=8192) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-scan-node.cc:418
> #14 0x039e1980 in 
> impala::HdfsScanNode::ThreadTokenAvailableCb(impala::ThreadResourcePool*)::$_0::operator()()
>  const (this=0x7f7972275450) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-scan-node.cc:339
> #15 0x039e13b2 in 
> boost::detail::function::void_function_obj_invoker0  void>::invoke(boost::detail::function::function_buffer&) 
> (function_obj_ptr=...) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/Impala-Toolchain/boost-1.61.0-p2/include/boost/function/function_template.hpp:159
> #16 0x024d46f0 in boost::function0::operator() 
> (this=0x7f7972275448) at 
> /data/jenkins/workspace/impal

[jira] [Updated] (IMPALA-9918) HdfsOrcScanner crash on resolving columns

2020-09-24 Thread David Rorke (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Rorke updated IMPALA-9918:

Attachment: 092420_backtraces.txt

> HdfsOrcScanner crash on resolving columns
> -
>
> Key: IMPALA-9918
> URL: https://issues.apache.org/jira/browse/IMPALA-9918
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
> Environment: BUILD_TAG
> jenkins-impala-cdpd-master-core-ubsan-111
>Reporter: Wenzhe Zhou
>Assignee: Csaba Ringhofer
>Priority: Major
>  Labels: broken-build
> Attachments: 092420_backtraces.txt, backtraces.txt, backtraces.txt
>
>
> Core file generated in impala-cdpd-master-core-ubsan build
> Back traces:
> CORE: ./tests/core.1594000709.13971.impalad
> BINARY: ./be/build/latest/service/impalad
> Core was generated by 
> `/data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/build/lat'.
> Program terminated with signal SIGABRT, Aborted.
> #0 0x7f7a481851f7 in raise () from /lib64/libc.so.6
> To enable execution of this file add
>  add-auto-load-safe-path 
> /data0/jenkins/workspace/impala-cdpd-master-core-ubsan/Impala-Toolchain/gcc-4.9.2/lib64/libstdc++.so.6.0.20-gdb.py
> line to your configuration file "/var/lib/jenkins/.gdbinit".
> To completely disable this security protection add
>  set auto-load safe-path /
> line to your configuration file "/var/lib/jenkins/.gdbinit".
> For more information about this security protection see the
> "Auto-loading safe path" section in the GDB manual. E.g., run from the shell:
>  info "(gdb)Auto-loading safe path"
> #0 0x7f7a481851f7 in raise () from /lib64/libc.so.6
> #1 0x7f7a481868e8 in abort () from /lib64/libc.so.6
> #2 0x083401c4 in google::DumpStackTraceAndExit() ()
> #3 0x08336b5d in google::LogMessage::Fail() ()
> #4 0x08338402 in google::LogMessage::SendToLog() ()
> #5 0x08336537 in google::LogMessage::Flush() ()
> #6 0x08339afe in google::LogMessageFatal::~LogMessageFatal() ()
> #7 0x03215662 in impala::PrintPath (tbl_desc=..., path=...) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/util/debug-util.cc:259
> #8 0x0370dfe9 in impala::HdfsOrcScanner::ResolveColumns 
> (this=0x14555c00, tuple_desc=..., selected_nodes=0x7f79722730a8, 
> pos_slots=0x7f7972273058) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-orc-scanner.cc:436
> #9 0x037099dd in impala::HdfsOrcScanner::SelectColumns 
> (this=0x14555c00, tuple_desc=...) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-orc-scanner.cc:456
> #10 0x03707688 in impala::HdfsOrcScanner::Open (this=0x14555c00, 
> context=0x7f7972274700) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-orc-scanner.cc:221
> #11 0x035e0a48 in 
> impala::HdfsScanNodeBase::CreateAndOpenScannerHelper (this=0x1b1c7100, 
> partition=0x142f9d00, context=0x7f7972274700, scanner=0x7f79722746f8) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-scan-node-base.cc:882
> #12 0x039df2e8 in impala::HdfsScanNode::ProcessSplit 
> (this=0x1b1c7100, filter_ctxs=..., expr_results_pool=0x7f7972274bd8, 
> scan_range=0x12a16c40, scanner_thread_reservation=0x7f7972274e18) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-scan-node.cc:480
> #13 0x039ddd85 in impala::HdfsScanNode::ScannerThread 
> (this=0x1b1c7100, first_thread=true, scanner_thread_reservation=8192) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-scan-node.cc:418
> #14 0x039e1980 in 
> impala::HdfsScanNode::ThreadTokenAvailableCb(impala::ThreadResourcePool*)::$_0::operator()()
>  const (this=0x7f7972275450) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-scan-node.cc:339
> #15 0x039e13b2 in 
> boost::detail::function::void_function_obj_invoker0  void>::invoke(boost::detail::function::function_buffer&) 
> (function_obj_ptr=...) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/Impala-Toolchain/boost-1.61.0-p2/include/boost/function/function_template.hpp:159
> #16 0x024d46f0 in boost::function0::operator() 
> (this=0x7f7972275448) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/Impala-Toolchain/boost-1.61.0-p2/include/boost/function/function_template.hpp:770
> #17 0x03425ba7 in impala::Thread::SuperviseThread(std::string const&, 
> std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) (name=..., category=..., 
> functor=..., parent_thread_info=0x7f797006d068,