[jira] [Created] (IMPALA-10383) Data race in AdmissionController::WaitOnQueued

2020-12-08 Thread Attila Jeges (Jira)
Attila Jeges created IMPALA-10383:
-

 Summary: Data race in AdmissionController::WaitOnQueued
 Key: IMPALA-10383
 URL: https://issues.apache.org/jira/browse/IMPALA-10383
 Project: IMPALA
  Issue Type: Bug
Affects Versions: Impala 4.0
Reporter: Attila Jeges
Assignee: Thomas Tauber-Marshall


TSAN is reporting a data race in {{AdmissionController::WaitOnQueued}}:
{code:java}
WARNING: ThreadSanitizer: data race (pid=4257)
  Write of size 8 at 0x7b58000901b0 by thread T416:
#0 std::_Hashtable, 
std::allocator >, std::__detail::_Select1st, 
std::equal_to, std::hash, 
std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, 
std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits >::_M_erase(unsigned long, std::__detail::_Hash_node_base*, 
std::__detail::_Hash_node, true>*) 
/data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/lib/gcc/x86_64-pc-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/hashtable.h:1891:7
 (impalad+0x22bbe78)
#1 std::_Hashtable, 
std::allocator >, std::__detail::_Select1st, 
std::equal_to, std::hash, 
std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, 
std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits >::_M_erase(std::integral_constant, impala::UniqueIdPB 
const&) 
/data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/lib/gcc/x86_64-pc-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/hashtable.h:1916:7
 (impalad+0x22bbd3a)
#2 std::_Hashtable, 
std::allocator >, std::__detail::_Select1st, 
std::equal_to, std::hash, 
std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, 
std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits >::erase(impala::UniqueIdPB const&) 
/data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/lib/gcc/x86_64-pc-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/hashtable.h:759:16
 (impalad+0x22bbca0)
#3 std::unordered_map, 
std::equal_to, std::allocator > >::erase(impala::UniqueIdPB 
const&) 
/data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/lib/gcc/x86_64-pc-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/unordered_map.h:814:21
 (impalad+0x22bbc50)
#4 impala::AdmissionController::WaitOnQueued(impala::UniqueIdPB const&, 
std::unique_ptr >*, long, 
bool*)::$_6::operator()() const 
/data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/scheduling/admission-controller.cc:1278:49
 (impalad+0x229d199)
#5 
impala::ScopeExitTrigger >*, long, 
bool*)::$_6>::~ScopeExitTrigger() 
/data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/util/scope-exit-trigger.h:40:25
 (impalad+0x2294ca9)
#6 impala::AdmissionController::WaitOnQueued(impala::UniqueIdPB const&, 
std::unique_ptr >*, long, bool*) 
/data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/scheduling/admission-controller.cc:1333:1
 (impalad+0x2294912)
#7 
impala::LocalAdmissionControlClient::SubmitForAdmission(impala::AdmissionController::AdmissionRequest
 const&, impala::RuntimeProfile::EventSequence*, 
std::unique_ptr >*) 
/data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/scheduling/local-admission-control-client.cc:45:62
 (impalad+0x2c1e40e)
#8 impala::ClientRequestState::FinishExecQueryOrDmlRequest() 
/data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/service/client-request-state.cc:578:52
 (impalad+0x245c651)
#9 boost::_mfi::mf0::operator()(impala::ClientRequestState*) const 
/data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/bind/mem_fn_template.hpp:49:29
 (impalad+0x2468ef6)
#10 void boost::_bi::list1 
>::operator(), 
boost::_bi::list0>(boost::_bi::type, boost::_mfi::mf0&, boost::_bi::list0&, int) 
/data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/bind/bind.hpp:259:9
 (impalad+0x2468e4a)
#11 boost::_bi::bind_t, 
boost::_bi::list1 > 
>::operator()() 
/data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/bind/bind.hpp:1222:16
 (impalad+0x2468dd3)
#12 
boost::detail::function::void_function_obj_invoker0, 
boost::_bi::list1 > >, 
void>::invoke(boost::detail::function::function_buffer&) 
/data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:159:11
 (impalad+0x2468bc9)
#13 boost::function0::operator()() const 
/data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/b

[jira] [Updated] (IMPALA-10383) Data race in AdmissionController::WaitOnQueued

2020-12-08 Thread Attila Jeges (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges updated IMPALA-10383:
--
Labels: broken-build flaky  (was: broken-build)

> Data race in AdmissionController::WaitOnQueued
> --
>
> Key: IMPALA-10383
> URL: https://issues.apache.org/jira/browse/IMPALA-10383
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.0
>Reporter: Attila Jeges
>Assignee: Thomas Tauber-Marshall
>Priority: Major
>  Labels: broken-build, flaky
>
> TSAN is reporting a data race in {{AdmissionController::WaitOnQueued}}:
> {code:java}
> WARNING: ThreadSanitizer: data race (pid=4257)
>   Write of size 8 at 0x7b58000901b0 by thread T416:
> #0 std::_Hashtable const, impala::AdmissionController::QueueNode>, 
> std::allocator impala::AdmissionController::QueueNode> >, std::__detail::_Select1st, 
> std::equal_to, std::hash, 
> std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, 
> std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits false, true> >::_M_erase(unsigned long, std::__detail::_Hash_node_base*, 
> std::__detail::_Hash_node impala::AdmissionController::QueueNode>, true>*) 
> /data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/lib/gcc/x86_64-pc-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/hashtable.h:1891:7
>  (impalad+0x22bbe78)
> #1 std::_Hashtable const, impala::AdmissionController::QueueNode>, 
> std::allocator impala::AdmissionController::QueueNode> >, std::__detail::_Select1st, 
> std::equal_to, std::hash, 
> std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, 
> std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits false, true> >::_M_erase(std::integral_constant, 
> impala::UniqueIdPB const&) 
> /data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/lib/gcc/x86_64-pc-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/hashtable.h:1916:7
>  (impalad+0x22bbd3a)
> #2 std::_Hashtable const, impala::AdmissionController::QueueNode>, 
> std::allocator impala::AdmissionController::QueueNode> >, std::__detail::_Select1st, 
> std::equal_to, std::hash, 
> std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, 
> std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits false, true> >::erase(impala::UniqueIdPB const&) 
> /data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/lib/gcc/x86_64-pc-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/hashtable.h:759:16
>  (impalad+0x22bbca0)
> #3 std::unordered_map impala::AdmissionController::QueueNode, std::hash, 
> std::equal_to, 
> std::allocator impala::AdmissionController::QueueNode> > >::erase(impala::UniqueIdPB const&) 
> /data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/lib/gcc/x86_64-pc-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/unordered_map.h:814:21
>  (impalad+0x22bbc50)
> #4 impala::AdmissionController::WaitOnQueued(impala::UniqueIdPB const&, 
> std::unique_ptr std::default_delete >*, long, 
> bool*)::$_6::operator()() const 
> /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/scheduling/admission-controller.cc:1278:49
>  (impalad+0x229d199)
> #5 
> impala::ScopeExitTrigger  const&, std::unique_ptr std::default_delete >*, long, 
> bool*)::$_6>::~ScopeExitTrigger() 
> /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/util/scope-exit-trigger.h:40:25
>  (impalad+0x2294ca9)
> #6 impala::AdmissionController::WaitOnQueued(impala::UniqueIdPB const&, 
> std::unique_ptr std::default_delete >*, long, bool*) 
> /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/scheduling/admission-controller.cc:1333:1
>  (impalad+0x2294912)
> #7 
> impala::LocalAdmissionControlClient::SubmitForAdmission(impala::AdmissionController::AdmissionRequest
>  const&, impala::RuntimeProfile::EventSequence*, 
> std::unique_ptr std::default_delete >*) 
> /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/scheduling/local-admission-control-client.cc:45:62
>  (impalad+0x2c1e40e)
> #8 impala::ClientRequestState::FinishExecQueryOrDmlRequest() 
> /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/service/client-request-state.cc:578:52
>  (impalad+0x245c651)
> #9 boost::_mfi::mf0 impala::ClientRequestState>::operator()(impala::ClientRequestState*) const 
> /data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/bind/mem_fn_template.hpp:49:29
>  (impalad+0x2468ef6)
> #10 void boost::_bi::list1 
> >::op

[jira] [Updated] (IMPALA-10383) Data race in AdmissionController::WaitOnQueued

2020-12-08 Thread Attila Jeges (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges updated IMPALA-10383:
--
Labels: broken-build  (was: )

> Data race in AdmissionController::WaitOnQueued
> --
>
> Key: IMPALA-10383
> URL: https://issues.apache.org/jira/browse/IMPALA-10383
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.0
>Reporter: Attila Jeges
>Assignee: Thomas Tauber-Marshall
>Priority: Major
>  Labels: broken-build
>

[jira] [Updated] (IMPALA-10383) Data race in AdmissionController::WaitOnQueued

2020-12-08 Thread Attila Jeges (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges updated IMPALA-10383:
--
Labels: broken-build data flaky  (was: broken-build flaky)

> Data race in AdmissionController::WaitOnQueued
> --
>
> Key: IMPALA-10383
> URL: https://issues.apache.org/jira/browse/IMPALA-10383
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.0
>Reporter: Attila Jeges
>Assignee: Thomas Tauber-Marshall
>Priority: Major
>  Labels: broken-build, data, flaky
>

[jira] [Updated] (IMPALA-10383) Data race in AdmissionController::WaitOnQueued

2020-12-08 Thread Attila Jeges (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges updated IMPALA-10383:
--
Labels: broken-build flaky  (was: broken-build data flaky)

> Data race in AdmissionController::WaitOnQueued
> --
>
> Key: IMPALA-10383
> URL: https://issues.apache.org/jira/browse/IMPALA-10383
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.0
>Reporter: Attila Jeges
>Assignee: Thomas Tauber-Marshall
>Priority: Major
>  Labels: broken-build, flaky
>

[jira] [Created] (IMPALA-10384) Make partition names consistent between BE and FE

2020-12-08 Thread Jira
Zoltán Borók-Nagy created IMPALA-10384:
--

 Summary: Make partition names consistent between BE and FE
 Key: IMPALA-10384
 URL: https://issues.apache.org/jira/browse/IMPALA-10384
 Project: IMPALA
  Issue Type: Bug
Reporter: Zoltán Borók-Nagy


In the BE we build partition names with a trailing '/' character. In the FE we 
build partition names without one. We should make this consistent.

I think the correct form is the one without the trailing '/'. Hive also prints 
partition names without the trailing '/', and Iceberg expects partition names 
without it as well.
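
As a rough illustration of the inconsistency, the two conventions and one possible normalization could look like the sketch below. The helper names (be_style_name, fe_style_name, normalize) and the sample keys are hypothetical and are not taken from the Impala code base.

```python
# Hypothetical helpers; they only illustrate the two naming conventions
# described above and do not mirror actual Impala code.

def be_style_name(keys_and_values):
    """Build 'k1=v1/k2=v2/' (with a trailing '/'), as described for the BE."""
    return "".join(f"{k}={v}/" for k, v in keys_and_values)


def fe_style_name(keys_and_values):
    """Build 'k1=v1/k2=v2' (no trailing '/'), as described for the FE."""
    return "/".join(f"{k}={v}" for k, v in keys_and_values)


def normalize(partition_name):
    """Drop a single trailing '/' so that both styles compare equal."""
    return partition_name[:-1] if partition_name.endswith("/") else partition_name


parts = [("year", 2009), ("month", 1)]
be_name = be_style_name(parts)
fe_name = fe_style_name(parts)

# The raw names differ, but they agree once the trailing '/' is dropped.
print(be_name != fe_name, normalize(be_name) == fe_name)
```

Normalizing at comparison time is only a stopgap; the proposal above is to stop producing the trailing '/' in the first place.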



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-10384) Make partition names consistent between BE and FE

2020-12-08 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IMPALA-10384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy reassigned IMPALA-10384:
--

Assignee: Zoltán Borók-Nagy

> Make partition names consistent between BE and FE
> -
>
> Key: IMPALA-10384
> URL: https://issues.apache.org/jira/browse/IMPALA-10384
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Zoltán Borók-Nagy
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>
> In the BE we build partition names with a trailing '/' character. In the FE we 
> build partition names without one. We should make this consistent.
> I think the correct form is the one without the trailing '/'. Hive also prints 
> partition names without the trailing '/', and Iceberg expects partition 
> names without it as well.






[jira] [Commented] (IMPALA-10252) Query returns less number of rows with run-time filtering on integer column in a subquery against functional_parquet schema

2020-12-08 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245987#comment-17245987
 ] 

ASF subversion and git services commented on IMPALA-10252:
--

Commit f684ed72c541fa04dc1841a1aab83a7c9847f1a2 in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=f684ed7 ]

IMPALA-10252: fix invalid runtime filters for outer joins

The planner generates runtime filters for non-join conjuncts
assigned to LEFT OUTER and FULL OUTER JOIN nodes. This is
correct in many cases where NULLs stemming from unmatched rows
would result in the predicate evaluating to false. E.g.
x = y is always false if y is NULL.

However, it is incorrect if the NULL returned from the unmatched
row can result in the predicate evaluating to true. E.g.
x = isnull(y, 1) can return true even if y is NULL.

The fix is to detect cases when the source expression from the
left input of the join returns non-NULL for null inputs and then
skip generating the filter.

Examples of expressions that may be affected by this change are
COALESCE and ISNULL.

Testing:
Added regression tests:
* Planner tests for LEFT OUTER and FULL OUTER where the runtime
  filter was incorrectly generated before this patch.
* Enabled end-to-end test that was previously failing.
* Added a new runtime filter test that will execute on both
  Parquet and Kudu (which are subtly different because of nullability of
  slots).

Ran exhaustive tests.

Change-Id: I507af1cc8df15bca21e0d8555019997812087261
Reviewed-on: http://gerrit.cloudera.org:8080/16622
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
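
To make the NULL reasoning in the commit message concrete, here is a minimal sketch that models SQL NULL as Python's None. The helpers sql_eq, isnull and passes_where are hypothetical stand-ins for SQL's "=", Impala's ISNULL and WHERE-clause evaluation; they are not planner code.

```python
def sql_eq(x, y):
    """SQL '=': the result is NULL (None) if either operand is NULL."""
    if x is None or y is None:
        return None
    return x == y


def isnull(value, default):
    """Stand-in for ISNULL(value, default): replace NULL with default."""
    return default if value is None else value


def passes_where(result):
    """A row is kept only when the predicate is TRUE; NULL is not TRUE."""
    return result is True


y = None  # value produced for an unmatched row of an outer join
x = 1

# x = y can never be true for the NULL-extended row, so a runtime filter
# derived from this predicate cannot drop a row that should be returned.
plain = passes_where(sql_eq(x, y))

# x = isnull(y, 1) is true for that same row, so a runtime filter that
# assumes NULL-extended rows never match could wrongly drop it.
wrapped = passes_where(sql_eq(x, isnull(y, 1)))

print(plain, wrapped)
```

This is the distinction the fix relies on: when the source expression returns non-NULL for NULL inputs, the filter is not generated.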


> Query returns less number of rows with run-time filtering on integer column 
> in a subquery against functional_parquet schema
> ---
>
> Key: IMPALA-10252
> URL: https://issues.apache.org/jira/browse/IMPALA-10252
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.5.0, Impala 2.6.0, Impala 2.7.0, Impala 2.8.0, 
> Impala 2.9.0, Impala 2.10.0, Impala 2.11.0, Impala 3.0, Impala 2.12.0, Impala 
> 3.1.0, Impala 3.2.0, Impala 3.3.0, Impala 3.4.0
>Reporter: Qifan Chen
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: correctness
> Fix For: Impala 4.0
>
>
> During the work to address IMPALA-6628 (Use unqualified table references in 
> .test files run from test_queries.py), it was found that a query against the 
> functional_parquet database returns 1 row, while the same query returns 12 
> rows when run-time filtering is turned off or when it is run against the 
> functional database. 
>  
>  
> {code:java}
> Query: --SET RUNTIME_FILTER_MODE=OFF;
> select id, int_col, year, month
>  from functional_parquet.alltypessmall s
>  where s.int_col = (select count(*) from functional_parquet.alltypestiny t 
> where s.id = t.id)
>  order by id
> Query submitted at: 2020-10-18 12:41:15 (Coordinator: 
> http://qifan-10229:25000)
> Query progress can be monitored at: 
> http://qifan-10229:25000/query_plan?query_id=394a61d8f0002336:fd45e073
> +----+---------+------+-------+
> | id | int_col | year | month |
> +----+---------+------+-------+
> | 1  | 1       | 2009 | 1     |
> +----+---------+------+-------+
> {code}
>  
>  
> {code:java}
> RUNTIME_FILTER_MODE set to OFF
> Query: select id, int_col, year, month 
>  from functional_parquet.alltypessmall s 
>  where s.int_col = (select count(*) from functional_parquet.alltypestiny t 
> where s.id = t.id) 
>  order by id
> Query submitted at: 2020-10-18 12:40:58 (Coordinator: 
> http://qifan-10229:25000)
> Query progress can be monitored at: 
> http://qifan-10229:25000/query_plan?query_id=304c095f478607fc:7d2d03ff
> +----+---------+------+-------+
> | id | int_col | year | month |
> +----+---------+------+-------+
> | 1  | 1       | 2009 | 1     |
> | 10 | 0       | 2009 | 1     |
> | 20 | 0       | 2009 | 1     |
> | 25 | 0       | 2009 | 2     |
> | 35 | 0       | 2009 | 2     |
> | 45 | 0       | 2009 | 2     |
> | 50 | 0       | 2009 | 3     |
> | 60 | 0       | 2009 | 3     |
> | 70 | 0       | 2009 | 3     |
> | 75 | 0       | 2009 | 4     |
> | 85 | 0       | 2009 | 4     |
> | 95 | 0       | 2009 | 4     |
> +----+---------+------+-------+
> {code}
>  
> Query against functional database.
> {code:java}
> Query: select id, int_col, year, month 
>  from functional.alltypessmall s 
>  where s.int_col = (select count(*) from functional.alltypestiny t where s.id 
> = t.id) 
>  order by id
> Query submitted at: 2020-10-18 12:35:24 (Coordinator: 
> http://qifan-10229:25000)
> Query progress can be monitored at: 
> http://qifan-10229:25000/query_plan?query_id=104bd5d7a6d5fe74:09a6c090
> +----+---------+------+-------+
> | id | int_col | year | month |
> +----+---------+------+-------+
> | 1 

[jira] [Resolved] (IMPALA-10252) Query returns less number of rows with run-time filtering on integer column in a subquery against functional_parquet schema

2020-12-08 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-10252.

Fix Version/s: Impala 4.0
   Resolution: Fixed

> Query returns less number of rows with run-time filtering on integer column 
> in a subquery against functional_parquet schema
> ---
>
> Key: IMPALA-10252
> URL: https://issues.apache.org/jira/browse/IMPALA-10252
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.5.0, Impala 2.6.0, Impala 2.7.0, Impala 2.8.0, 
> Impala 2.9.0, Impala 2.10.0, Impala 2.11.0, Impala 3.0, Impala 2.12.0, Impala 
> 3.1.0, Impala 3.2.0, Impala 3.3.0, Impala 3.4.0
>Reporter: Qifan Chen
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: correctness
> Fix For: Impala 4.0
>
>
> During the work to address IMPALA-6628 (Use unqualified table references in 
> .test files run from test_queries.py), it was found that a query against the 
> functional_parquet database returns 1 row, while the same query returns 12 
> rows when run-time filtering is turned off or when it is run against the 
> functional database. 
>  
>  
> {code:java}
> Query: --SET RUNTIME_FILTER_MODE=OFF;
> select id, int_col, year, month
>  from functional_parquet.alltypessmall s
>  where s.int_col = (select count(*) from functional_parquet.alltypestiny t 
> where s.id = t.id)
>  order by id
> Query submitted at: 2020-10-18 12:41:15 (Coordinator: 
> http://qifan-10229:25000)
> Query progress can be monitored at: 
> http://qifan-10229:25000/query_plan?query_id=394a61d8f0002336:fd45e073
> +----+---------+------+-------+
> | id | int_col | year | month |
> +----+---------+------+-------+
> | 1  | 1       | 2009 | 1     |
> +----+---------+------+-------+
> {code}
>  
>  
> {code:java}
> RUNTIME_FILTER_MODE set to OFF
> Query: select id, int_col, year, month 
>  from functional_parquet.alltypessmall s 
>  where s.int_col = (select count(*) from functional_parquet.alltypestiny t 
> where s.id = t.id) 
>  order by id
> Query submitted at: 2020-10-18 12:40:58 (Coordinator: 
> http://qifan-10229:25000)
> Query progress can be monitored at: 
> http://qifan-10229:25000/query_plan?query_id=304c095f478607fc:7d2d03ff
> +----+---------+------+-------+
> | id | int_col | year | month |
> +----+---------+------+-------+
> | 1  | 1       | 2009 | 1     |
> | 10 | 0       | 2009 | 1     |
> | 20 | 0       | 2009 | 1     |
> | 25 | 0       | 2009 | 2     |
> | 35 | 0       | 2009 | 2     |
> | 45 | 0       | 2009 | 2     |
> | 50 | 0       | 2009 | 3     |
> | 60 | 0       | 2009 | 3     |
> | 70 | 0       | 2009 | 3     |
> | 75 | 0       | 2009 | 4     |
> | 85 | 0       | 2009 | 4     |
> | 95 | 0       | 2009 | 4     |
> +----+---------+------+-------+{code}
>  
> Query against functional database.
> {code:java}
> Query: select id, int_col, year, month 
>  from functional.alltypessmall s 
>  where s.int_col = (select count(*) from functional.alltypestiny t where s.id 
> = t.id) 
>  order by id
> Query submitted at: 2020-10-18 12:35:24 (Coordinator: 
> http://qifan-10229:25000)
> Query progress can be monitored at: 
> http://qifan-10229:25000/query_plan?query_id=104bd5d7a6d5fe74:09a6c090
> +----+---------+------+-------+
> | id | int_col | year | month |
> +----+---------+------+-------+
> | 1  | 1       | 2009 | 1     |
> | 10 | 0       | 2009 | 1     |
> | 20 | 0       | 2009 | 1     |
> | 25 | 0       | 2009 | 2     |
> | 35 | 0       | 2009 | 2     |
> | 45 | 0       | 2009 | 2     |
> | 50 | 0       | 2009 | 3     |
> | 60 | 0       | 2009 | 3     |
> | 70 | 0       | 2009 | 3     |
> | 75 | 0       | 2009 | 4     |
> | 85 | 0       | 2009 | 4     |
> | 95 | 0       | 2009 | 4     |
> +----+---------+------+-------+{code}
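
For readers following along, here is a minimal sketch of why the 12-row answer is the expected one under standard SQL semantics: count(*) over an empty correlated set is 0, so a row with int_col = 0 and no match in the subquery table still qualifies. It uses Python's sqlite3 rather than Impala, and the tables `small` and `tiny` and their contents are made up for illustration. The second query, with its extra `IN` condition, is only an assumed stand-in for a filter that wrongly discards unmatched rows; it is not taken from the Impala planner.

```python
import sqlite3


def correlated_count_demo():
    """Return (expected_rows, over_filtered_rows) for a toy correlated count(*) query."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical miniature stand-ins for alltypessmall / alltypestiny.
    conn.executescript("""
        CREATE TABLE small (id INTEGER, int_col INTEGER);
        CREATE TABLE tiny (id INTEGER);
        INSERT INTO small VALUES (0, 0), (1, 1), (10, 0), (20, 0);
        INSERT INTO tiny VALUES (0), (1);
    """)
    base_query = """
        SELECT s.id, s.int_col FROM small s
        WHERE s.int_col = (SELECT count(*) FROM tiny t WHERE s.id = t.id)
    """
    # Correct semantics: ids 10 and 20 have no match, count(*) is 0, int_col is 0.
    expected = conn.execute(base_query + " ORDER BY s.id").fetchall()
    # Assumed illustration of an over-aggressive filter: keep only ids present in tiny.
    filtered = conn.execute(
        base_query + " AND s.id IN (SELECT id FROM tiny) ORDER BY s.id"
    ).fetchall()
    conn.close()
    return expected, filtered
```

In this toy data set the over-filtered variant keeps only the row whose id exists in `tiny`, which mirrors the shape of the 1-row versus 12-row difference reported above.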



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10382) Predicate with coalesce on both sides of LOJ isn't NULL filtering

2020-12-08 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245988#comment-17245988
 ] 

Tim Armstrong commented on IMPALA-10382:


IMPALA-10252 is a related bug.

> Predicate with coalesce on both sides of LOJ isn't NULL filtering
> -
>
> Key: IMPALA-10382
> URL: https://issues.apache.org/jira/browse/IMPALA-10382
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Shant Hovsepian
>Assignee: Xianqing He
>Priority: Critical
>  Labels: correctness
>
> A query like the one below has its outer join simplified to an inner join 
> even though the predicate with coalesce isn't always NULL-filtering.
> {code:sql}
> select t1.int_col from alltypestiny t1 left outer join alltypesagg t2 on 
> t1.tinyint_col = t2.tinyint_col left outer join alltypes t3 on t1.int_col = 
> t3.int_col where t2.tinyint_col >= coalesce(t1.int_col, t2.int_col);
> {code}
> {noformat}
> functional> set ENABLE_OUTER_JOIN_TO_INNER_TRANSFORMATION=true;
> functional> explain select t1.int_col from alltypestiny t1 left outer join 
> alltypesagg t2 on t1.tinyint_col = t2.tinyint_col left outer join alltypes t3 
> on t1.int_col = t3.int_col where t2.tinyint_col >= coalesce(t1.int_col, 
> t2.int_col);
> Query: explain select t1.int_col from alltypestiny t1 left outer join 
> alltypesagg t2 on t1.tinyint_col = t2.tinyint_col left outer join alltypes t3 
> on t1.int_col = t3.int_col where t2.tinyint_col >= coalesce(t1.int_col, 
> t2.int_col)
> +------------------------------------------------------------+
> | Explain String                                             |
> +------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=5.04MB Threads=8 |
> | Per-Host Resource Estimates: Memory=287MB                  |
> |                                                            |
> | PLAN-ROOT SINK                                             |
> | 08:EXCHANGE [UNPARTITIONED]                                |
> | 04:HASH JOIN [LEFT OUTER JOIN, PARTITIONED]                |
> | |--07:EXCHANGE [HASH(t3.int_col)]                          |
> | |  02:SCAN HDFS [functional.alltypes t3]                   |
> | 06:EXCHANGE [HASH(t1.int_col)]                             |
> | 03:HASH JOIN [INNER JOIN, BROADCAST]                       |
> | |--05:EXCHANGE [BROADCAST]                                 |
> | |  00:SCAN HDFS [functional.alltypestiny t1]               |
> | 01:SCAN HDFS [functional.alltypesagg t2]                   |
> +------------------------------------------------------------+
> Fetched 13 row(s) in 0.02s
> {noformat}
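
As general background on what "NULL-filtering" means here, the following is a small sketch using Python's sqlite3 (not Impala). The tables, columns, and the predicate `coalesce(t2.v, t1.v) >= 10` are hypothetical and deliberately simpler than the query in the report. The point is only that a coalesce spanning both sides of a left outer join can evaluate to TRUE on a NULL-extended row, so replacing the outer join with an inner join changes the result.

```python
import sqlite3


def outer_vs_inner():
    """Return (left_outer_rows, inner_rows) for the same coalesce predicate."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical toy tables; k is the join key, v is the value being compared.
    conn.executescript("""
        CREATE TABLE t1 (k INTEGER, v INTEGER);
        CREATE TABLE t2 (k INTEGER, v INTEGER);
        INSERT INTO t1 VALUES (1, 10), (2, 20);
        INSERT INTO t2 VALUES (1, 5);
    """)
    # coalesce falls back to t1.v when t2.v is NULL, so the predicate can be
    # TRUE for a t1 row that has no partner in t2.
    where = "WHERE coalesce(t2.v, t1.v) >= 10"
    outer = conn.execute(
        "SELECT t1.k FROM t1 LEFT OUTER JOIN t2 ON t1.k = t2.k "
        + where + " ORDER BY t1.k"
    ).fetchall()
    inner = conn.execute(
        "SELECT t1.k FROM t1 INNER JOIN t2 ON t1.k = t2.k "
        + where + " ORDER BY t1.k"
    ).fetchall()
    conn.close()
    return outer, inner
```

With this data the left outer join keeps the unmatched row k = 2 while the inner join returns nothing, which is the kind of difference an unsafe outer-to-inner transformation would introduce.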






[jira] [Commented] (IMPALA-9985) CentOS 8 builds break with __glibc_has_include ("__linux__/stat.h")

2020-12-08 Thread Laszlo Gaal (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17246013#comment-17246013
 ] 

Laszlo Gaal commented on IMPALA-9985:
-

The change to the native-toolchain project is now committed, see
https://github.com/cloudera/native-toolchain/commit/7644f7fe9cb87a88821f14072181319c9d1c7018

> CentOS 8 builds break with __glibc_has_include ("__linux__/stat.h")
> ---
>
> Key: IMPALA-9985
> URL: https://issues.apache.org/jira/browse/IMPALA-9985
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 4.0
>Reporter: Laszlo Gaal
>Assignee: Laszlo Gaal
>Priority: Blocker
>
> Currently Docker-based builds are running; they are breaking early in the 
> build, during virtualenv construction, when the Python bitarray module is 
> compiled:
> {code}
> 2020-07-21 07:44:32.913375 Complete output from command 
> /home/impdev/Impala/bin/../infra/python/env-gcc7.5.0/bin/python -c "import 
> setuptools, 
> tokenize;__file__='/tmp/pip-build-NK6_23/bitarray/setup.py';exec(compile(getattr(tokenize,
>  'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" 
> install --record /tmp/pip-L3NjnK-record/install-record.txt 
> --single-version-externally-managed --compile --install-headers 
> /home/impdev/Impala/bin/../infra/python/env-gcc7.5.0/include/site/python2.7/bitarray:
> 2020-07-21 07:44:32.913393 running install
> 2020-07-21 07:44:32.913411 running build
> 2020-07-21 07:44:32.913430 running build_py
> 2020-07-21 07:44:32.913447 creating build
> 2020-07-21 07:44:32.913476 creating build/lib.linux-x86_64-2.7
> 2020-07-21 07:44:32.913510 creating build/lib.linux-x86_64-2.7/bitarray
> 2020-07-21 07:44:32.913553 copying bitarray/util.py -> 
> build/lib.linux-x86_64-2.7/bitarray
> 2020-07-21 07:44:32.913599 copying bitarray/test_util.py -> 
> build/lib.linux-x86_64-2.7/bitarray
> 2020-07-21 07:44:32.913645 copying bitarray/__init__.py -> 
> build/lib.linux-x86_64-2.7/bitarray
> 2020-07-21 07:44:32.913695 copying bitarray/test_bitarray.py -> 
> build/lib.linux-x86_64-2.7/bitarray
> 2020-07-21 07:44:32.913715 running build_ext
> 2020-07-21 07:44:32.913746 building 'bitarray._bitarray' extension
> 2020-07-21 07:44:32.913775 creating build/temp.linux-x86_64-2.7
> 2020-07-21 07:44:32.913809 creating build/temp.linux-x86_64-2.7/bitarray
> 2020-07-21 07:44:32.914022 ccache 
> /home/impdev/Impala/toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/bin/gcc 
> -fno-strict-aliasing -I/usr/include/ncurses 
> -I/mnt/build/bzip2-1.0.6-p2/include -DNDEBUG -g -fwrapv -O3 -Wall 
> -Wstrict-prototypes -fPIC 
> -I/home/impdev/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/include/python2.7
>  -c bitarray/_bitarray.c -o build/temp.linux-x86_64-2.7/bitarray/_bitarray.o
> 2020-07-21 07:44:32.914059 In file included from 
> /usr/include/sys/stat.h:446:0,
> 2020-07-21 07:44:32.914135  from 
> /home/impdev/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/include/python2.7/pyport.h:390,
> 2020-07-21 07:44:32.914210  from 
> /home/impdev/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/include/python2.7/Python.h:61,
> 2020-07-21 07:44:32.914244  from bitarray/_bitarray.c:12:
> 2020-07-21 07:44:32.914350 
> /home/impdev/Impala/toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/lib/gcc/x86_64-pc-linux-gnu/7.5.0/include-fixed/bits/statx.h:38:25:
>  error: missing binary operator before token "("
> 2020-07-21 07:44:32.914384  #if __glibc_has_include ("__linux__/stat.h")
> 2020-07-21 07:44:32.914408  ^
> 2020-07-21 07:44:32.91 error: command 'ccache' failed with exit 
> status 1
> 20{code}






[jira] [Comment Edited] (IMPALA-9985) CentOS 8 builds break with __glibc_has_include ("__linux__/stat.h")

2020-12-08 Thread Laszlo Gaal (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17246013#comment-17246013
 ] 

Laszlo Gaal edited comment on IMPALA-9985 at 12/8/20, 5:21 PM:
---

The change to the native-toolchain project is now committed, see
https://github.com/cloudera/native-toolchain/commit/7644f7fe9cb87a88821f14072181319c9d1c7018


was (Author: laszlog):
The change to the native-toolchian project is now committed, see
https://github.com/cloudera/native-toolchain/commit/7644f7fe9cb87a88821f14072181319c9d1c7018







[jira] [Assigned] (IMPALA-10371) test_java_udfs crash impalad if result spooling is enabled

2020-12-08 Thread Riza Suminto (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Riza Suminto reassigned IMPALA-10371:
-

Assignee: Daniel Becker

> test_java_udfs crash impalad if result spooling is enabled
> --
>
> Key: IMPALA-10371
> URL: https://issues.apache.org/jira/browse/IMPALA-10371
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.4.0
>Reporter: Riza Suminto
>Assignee: Daniel Becker
>Priority: Major
> Attachments: 46a19881-resolved.txt, hs_err_pid12878.log
>
>
> The following test query from TestUdfExecution::test_java_udfs crashes impalad 
> when result spooling is enabled.
> {code:java}
> select throws_exception() from functional.alltypestiny{code}
> The following is a truncated JVM crash log related to the crash
> {code:java}
> ---  T H R E A D  ---
> Current thread (0x0fb4c000):  JavaThread "Thread-700" [_thread_in_native, id=30853, stack(0x7f79715ff000,0x7f7971dff000)]
> Stack: [0x7f79715ff000,0x7f7971dff000],  sp=0x7f7971dfa280,  free space=8172k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0xb6b032]
> V  [libjvm.so+0x4f14bd]
> V  [libjvm.so+0x80fa8f]
> V  [libjvm.so+0x7e0991]
> V  [libjvm.so+0x69fa10]
> j  
> org.apache.impala.TestUdfException.evaluate()Lorg/apache/hadoop/io/BooleanWritable;+9
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0x6af9ba]
> V  [libjvm.so+0xa1def8]
> V  [libjvm.so+0xa1f8d5]
> V  [libjvm.so+0x7610f8]  JVM_InvokeMethod+0x128
> J 2286  
> sun.reflect.NativeMethodAccessorImpl.invoke0(Ljava/lang/reflect/Method;Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
>  (0 bytes) @ 0x7f7acb553ced [0x7f7acb553c00+0xed]
> J 6921 C2 
> sun.reflect.NativeMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
>  (104 bytes) @ 0x7f7acbd1de38 [0x7f7acbd1ddc0+0x78]
> J 3645 C2 org.apache.impala.hive.executor.UdfExecutor.evaluate()V (396 bytes) 
> @ 0x7f7acaf6e894 [0x7f7acaf6e640+0x254]
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0x6af9ba]
> V  [libjvm.so+0x72c046]
> V  [libjvm.so+0x730523]
> C  0x7f7ab4c5d0d0
> C  [impalad+0x26a2648]  
> impala::ScalarExprEvaluator::GetValue(impala::ScalarExpr const&, 
> impala::TupleRow const*)+0x7a
> C  [impalad+0x26a25cb]  
> impala::ScalarExprEvaluator::GetValue(impala::TupleRow const*)+0x2b
> C  [impalad+0x21f4f78]  
> impala::AsciiQueryResultSet::AddRows(std::vector  std::allocator > const&, impala::RowBatch*, 
> int, int)+0x4c2
> C  [impalad+0x25c5862]  
> impala::BufferedPlanRootSink::GetNext(impala::RuntimeState*, 
> impala::QueryResultSet*, int, bool*, long)+0x70c
> C  [impalad+0x296cf17]  impala::Coordinator::GetNext(impala::QueryResultSet*, 
> int, bool*, long)+0x557
> C  [impalad+0x219f5fe]  impala::ClientRequestState::FetchRowsInternal(int, 
> impala::QueryResultSet*, long)+0x6b2
> C  [impalad+0x219d98e]  impala::ClientRequestState::FetchRows(int, 
> impala::QueryResultSet*, long)+0x46
> C  [impalad+0x21c1d29]  
> impala::ImpalaServer::FetchInternal(impala::TUniqueId, bool, int, 
> beeswax::Results*)+0x717
> C  [impalad+0x21bbde9]  impala::ImpalaServer::fetch(beeswax::Results&, 
> beeswax::QueryHandle const&, bool, int)+0x577
> {code}
> If result spooling is enabled, BufferedPlanRootSink is used and 
> ScalarExprEvaluation is called in BufferedPlanRootSink::GetNext, leading 
> to this crash.
> Without result spooling, BlockingPlanRootSink is used and 
> ScalarExprEvaluation is called in BlockingPlanRootSink::Send. No crash happens 
> when result spooling is disabled.
> Attached are the full JVM crash log and the resolved minidump.
>  






[jira] [Commented] (IMPALA-10371) test_java_udfs crash impalad if result spooling is enabled

2020-12-08 Thread Riza Suminto (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17246060#comment-17246060
 ] 

Riza Suminto commented on IMPALA-10371:
---

Hi [~daniel.becker], I'm assigning this to you since you recently worked on 
IMPALA-7658 and have better knowledge about codegen in HiveUdfCall.







[jira] [Created] (IMPALA-10385) bootstrap_system.sh fails to install snappy-devel Centos 8.3

2020-12-08 Thread Laszlo Gaal (Jira)
Laszlo Gaal created IMPALA-10385:


 Summary: bootstrap_system.sh fails to install snappy-devel Centos 
8.3
 Key: IMPALA-10385
 URL: https://issues.apache.org/jira/browse/IMPALA-10385
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 4.0
Reporter: Laszlo Gaal
Assignee: Laszlo Gaal


On Centos 8 the package {{snappy-devel}} lives in the PowerTools repo, which is 
not enabled by default; so {{bootstrap_system.sh}} installs it by enabling the 
repo using a command-line argument passed to {{dnf}} (the Centos 8 package 
manager).
Centos 8.3 changed the capitalization of repo names: earlier names used 
MixedCase, while 8.3 changed to repo IDs in all lowercase. Unfortunately this 
causes the {{dnf install}} call to fail with an "unknown repo" error.
This breaks Docker-based tests for Centos 8, which is currently the only way to 
run Impala tests on that platform.
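
One way to make the repo selection tolerant of the capitalization change is to match the repo ID case-insensitively before building the install command. The sketch below is only an illustration of that idea in Python, not the change made to {{bootstrap_system.sh}}. The helper names are hypothetical, and the list of repo IDs is assumed to have been collected elsewhere (for example from `dnf repolist --all`); the code does not run dnf itself.

```python
def pick_repo_id(available_ids, wanted="PowerTools"):
    """Return the repo ID from available_ids that equals wanted, ignoring case."""
    for repo_id in available_ids:
        if repo_id.lower() == wanted.lower():
            return repo_id
    return None


def dnf_install_args(available_ids, package="snappy-devel"):
    """Build a dnf command line that enables whichever PowerTools spelling exists."""
    repo_id = pick_repo_id(available_ids)
    if repo_id is None:
        raise ValueError("no PowerTools repo found among: %r" % (available_ids,))
    # --enablerepo takes the repo ID exactly as the system spells it.
    return ["dnf", "install", "-y", "--enablerepo=" + repo_id, package]
```

Given `["baseos", "appstream", "powertools"]` the helper selects `powertools`, and given the older MixedCase IDs it selects `PowerTools`, so the same call works on both sides of the rename.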






[jira] [Updated] (IMPALA-10385) bootstrap_system.sh fails when installing snappy-devel on Centos 8.3

2020-12-08 Thread Laszlo Gaal (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Gaal updated IMPALA-10385:
-
Summary: bootstrap_system.sh fails when installing snappy-devel on Centos 
8.3  (was: bootstrap_system.sh fails to install snappy-devel Centos 8.3)

> bootstrap_system.sh fails when installing snappy-devel on Centos 8.3
> 
>
> Key: IMPALA-10385
> URL: https://issues.apache.org/jira/browse/IMPALA-10385
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 4.0
>Reporter: Laszlo Gaal
>Assignee: Laszlo Gaal
>Priority: Critical
>  Labels: broken-build
>
> On Centos 8 the package {{snappy-devel}} lives in the PowerTools repo, which 
> is not enabled by default; so {{bootstrap_system.sh}} installs it by enabling 
> the repo using a command-line argument passed to {{dnf}} (the Centos 8 
> package manager).
> Centos 8.3 changed the capitalization of repo names: earlier names used 
> MixedCase, while 8.3 changed to repo IDs in all lowercase. Unfortunately this 
> causes the {{dnf install}} call to fail with an "unknown repo" error.
> This breaks Docker-based tests for Centos 8, which is currently the only way 
> to run Impala tests on that platform.






[jira] [Resolved] (IMPALA-9991) TestShellClient.test_fetch_size_result_spooling is flaky

2020-12-08 Thread Riza Suminto (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Riza Suminto resolved IMPALA-9991.
--
Fix Version/s: Not Applicable
   Resolution: Cannot Reproduce

This issue has not appeared again since it was last reported.
I also ran this test in a loop a thousand times on a recent asf-master branch 
and could not reproduce the issue.

Resolving this Jira as "Cannot Reproduce".

> TestShellClient.test_fetch_size_result_spooling is flaky
> 
>
> Key: IMPALA-9991
> URL: https://issues.apache.org/jira/browse/IMPALA-9991
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Sahil Takiar
>Assignee: Riza Suminto
>Priority: Critical
>  Labels: broken-build, flaky
> Fix For: Not Applicable
>
>
> shell.test_shell_client.TestShellClient.test_fetch_size_result_spooling[table_format_and_file_extension:
>  ('parquet', '.parq') | protocol: hs2] (from pytest)
> h3. Error Message
> shell/test_shell_client.py:70: in test_fetch_size_result_spooling
>     self.__fetch_rows(client.fetch(handle), num_rows / fetch_size, num_rows)
> shell/test_shell_client.py:80: in __fetch_rows
>     for fetch_batch in fetch_batches:
> ../shell/impala_client.py:787: in fetch
>     yield self._transpose(col_value_converters, resp.results.columns)
> E   AttributeError: 'NoneType' object has no attribute 'columns'
> h3. Stacktrace
> shell/test_shell_client.py:70: in test_fetch_size_result_spooling
>     self.__fetch_rows(client.fetch(handle), num_rows / fetch_size, num_rows)
> shell/test_shell_client.py:80: in __fetch_rows
>     for fetch_batch in fetch_batches:
> ../shell/impala_client.py:787: in fetch
>     yield self._transpose(col_value_converters, resp.results.columns)
> E   AttributeError: 'NoneType' object has no attribute 'columns'
> h3. Standard Error
> Opened TCP connection to localhost:21050






[jira] [Work started] (IMPALA-10360) Allow a simple limit to be treated as a sampling hint where applicable

2020-12-08 Thread Aman Sinha (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-10360 started by Aman Sinha.
---
> Allow a simple limit to be treated as a sampling hint where applicable
> --
>
> Key: IMPALA-10360
> URL: https://issues.apache.org/jira/browse/IMPALA-10360
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.4.0
>Reporter: Aman Sinha
>Assignee: Aman Sinha
>Priority: Major
>
> As a follow-up to IMPALA-10314, it is sometimes useful to consider a simple 
> limit as a way to sample from a table if a relevant hint has been provided.  
> This is especially useful if the query is against a view because a 
> TABLESAMPLE clause is only supported for base tables, not views.  Here's an 
> example that illustrates the motivation:
> {noformat}
> set optimize_simple_limit = true;
> with v1 as 
> (select * from fact_table /* +some_hint_for_table */
>   where col in (select col from dim_table where ...)) 
> select * from v1 limit 10;
> {noformat}
> In this case, the outer query just wants any 10 rows that satisfy the WHERE 
> predicate in v1 and if we can specify the hint for the large fact_table to 
> treat the simple limit as a hint for sampling, it would substantially reduce 
> the query planning time without significantly compromising on the 
> correctness.  Without such optimization, during planning the scan ranges will 
> be computed for the entire fact_table which is expensive.
> Also, note that naively pushing the limit down to the fact table is not 
> advisable because the planner may then decide (under the 
> optimize_simple_limit=true setting) to look only at the first few partitions or 
> files within a partition, and those rows may not satisfy the join condition. 
> Sampling is spread more uniformly across the partitions, so the 
> chances of producing enough qualifying rows are much higher.
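
The difference between the two strategies can be seen in a toy model. The Python sketch below is an assumption-laden illustration, not Impala's planner or its TABLESAMPLE implementation: partitions are plain lists, "sampling" is approximated by a fixed stride over partitions so the outcome is deterministic, and the join keys are invented so that only the later partitions have matches in the dimension table.

```python
def scan_first_partitions(partitions, num_partitions):
    """Naive limit pushdown: read only the first few partitions."""
    return [row for part in partitions[:num_partitions] for row in part]


def scan_spread_sample(partitions, step):
    """Sampling stand-in: read every step-th partition across the whole table."""
    return [row for part in partitions[::step] for row in part]


def count_qualifying(rows, dim_keys):
    """Count rows whose join key exists in the (hypothetical) dimension table."""
    return sum(1 for _, key in rows if key in dim_keys)


def build_demo():
    """Build 10 partitions of 4 rows each; a row is (partition_index, join_key)."""
    partitions = [[(p, p * 4 + i) for i in range(4)] for p in range(10)]
    # Assumed skew: only keys stored in partitions 6..9 appear in the dimension table.
    dim_keys = set(range(24, 40))
    return partitions, dim_keys
```

Both scans read 16 rows in this setup (the first four partitions versus partitions 0, 3, 6 and 9), but only the spread-out scan reaches rows that satisfy the join condition.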


