[jira] [Commented] (DRILL-5541) C++ Client Crashes During Simple "Man in the Middle" Attack Test with Exploitable Write AV
[ https://issues.apache.org/jira/browse/DRILL-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037936#comment-16037936 ]

Rob Wu commented on DRILL-5541:
-------------------------------

I set up a proxy server that randomly corrupts the incoming data before returning it, to see whether the C++ client handles invalid data gracefully.

DrillClient <--> Proxy <---> Server

connect()          O--->
select * from Tab  O--->
                   <--- Flip random bits (do work on the data) <- Data
Process            X
Crash              X

> C++ Client Crashes During Simple "Man in the Middle" Attack Test with Exploitable Write AV
> ------------------------------------------------------------------------------------------
>
> Key: DRILL-5541
> URL: https://issues.apache.org/jira/browse/DRILL-5541
> Project: Apache Drill
> Issue Type: Bug
> Components: Client - C++
> Affects Versions: 1.10.0
> Reporter: Rob Wu
> Priority: Minor
>
> drillClient!boost_sb::shared_ptr::reset+0xa7:
> 07fe`c292f827 f0ff4b08 lock dec dword ptr [rbx+8] ds:07fe`c2b3de78=c29e6060
> Exploitability Classification: EXPLOITABLE
> Recommended Bug Title: Exploitable - User Mode Write AV starting at drillClient!boost_sb::shared_ptr::reset+0x00a7 (Hash=0x4ae7fdff.0xb15af658)
> User mode write access violations that are not near NULL are exploitable.
> ==
> Stack Trace:
> Child-SP RetAddr Call Site
> `030df630 07fe`c295bca1 drillClient!boost_sb::shared_ptr::reset+0xa7 [c:\users\bamboo\desktop\make_win_drill\sb_boost\include\boost-1_57\boost\smart_ptr\shared_ptr.hpp @ 620]
> `030df680 07fe`c295433c drillClient!Drill::DrillClientImpl::processSchemasResult+0x281 [c:\users\bamboo\desktop\make_win_drill\drill-1.10.0.1\drill-1.10.0.1\contrib\native\client\src\clientlib\drillclientimpl.cpp @ 1227]
> `030df7a0 07fe`c294cbf6 drillClient!Drill::DrillClientImpl::handleRead+0x75c [c:\users\bamboo\desktop\make_win_drill\drill-1.10.0.1\drill-1.10.0.1\contrib\native\client\src\clientlib\drillclientimpl.cpp @ 1555]
> `030df9c0 07fe`c294ce9f drillClient!boost_sb::asio::detail::win_iocp_socket_recv_op > > >,boost_sb::asio::mutable_buffers_1,boost_sb::asio::detail::transfer_all_t,boost_sb::_bi::bind_t char * __ptr64,boost_sb::system::error_code const & __ptr64,unsigned __int64>,boost_sb::_bi::list4 __ptr64>,boost_sb::_bi::value __ptr64>,boost_sb::arg<1>,boost_sb::arg<2> > > > >::do_complete+0x166 [c:\users\bamboo\desktop\make_win_drill\sb_boost\include\boost-1_57\boost\asio\detail\win_iocp_socket_recv_op.hpp @ 97]
> `030dfa90 07fe`c296009d drillClient!boost_sb::asio::detail::win_iocp_io_service::do_one+0x27f [c:\users\bamboo\desktop\make_win_drill\sb_boost\include\boost-1_57\boost\asio\detail\impl\win_iocp_io_service.ipp @ 406]
> `030dfb70 07fe`c295ffc9 drillClient!boost_sb::asio::detail::win_iocp_io_service::run+0xad [c:\users\bamboo\desktop\make_win_drill\sb_boost\include\boost-1_57\boost\asio\detail\impl\win_iocp_io_service.ipp @ 164]
> `030dfbd0 07fe`c2aa5b53 drillClient!boost_sb::asio::io_service::run+0x29 [c:\users\bamboo\desktop\make_win_drill\sb_boost\include\boost-1_57\boost\asio\impl\io_service.ipp @ 60]
> `030dfc10 07fe`c2ad3e03 drillClient!boost_sb::`anonymous namespace'::thread_start_function+0x43
> `030dfc50 07fe`c2ad404e drillClient!_callthreadstartex+0x17 [f:\dd\vctools\crt\crtw32\startup\threadex.c @ 376]
> `030dfc80 `779e59cd drillClient!_threadstartex+0x102 [f:\dd\vctools\crt\crtw32\startup\threadex.c @ 354]
> `030dfcb0 `77c1a561 kernel32!BaseThreadInitThunk+0xd
> `030dfce0 ` ntdll!RtlUserThreadStart+0x1d
> ==
> Register:
> rax=0284bae0 rbx=07fec2b3de70 rcx=027ec210
> rdx=027ec210 rsi=027f2638 rdi=027f25d0
> rip=07fec292f827 rsp=030df630 rbp=027ec210
> r8=027ec210 r9= r10=027d32fc
> r11=27eb001b0003 r12= r13=028035a0
> r14=027ec210 r15=
> iopl=0 nv up ei pl nz na pe nc
> cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010200
> drillClient!boost_sb::shared_ptr::reset+0xa7:
> 07fe`c292f827 f0ff4b08 lock dec dword ptr [rbx+8] ds:07fe`c2b3de78=c29e6060

--
This message was sent by Atlassian JIRA
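The "flip random bits" step of the proxy described in the comment above could look roughly like the following. This is only a minimal sketch in Java, not the proxy actually used for the test: the class name `BitFlipper`, the method `flipRandomBits`, and the number of flips are all hypothetical, and the socket forwarding between client and server is left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Random;

/** Hypothetical helper for a corrupting test proxy; not part of Drill. */
final class BitFlipper {

    /** Returns a copy of {@code data} in which {@code flips} randomly chosen bits are inverted. */
    static byte[] flipRandomBits(byte[] data, int flips, Random rng) {
        byte[] out = Arrays.copyOf(data, data.length); // never mutate the caller's buffer
        if (out.length == 0) {
            return out; // nothing to corrupt
        }
        for (int i = 0; i < flips; i++) {
            int byteIndex = rng.nextInt(out.length);
            int bitIndex = rng.nextInt(8);
            out[byteIndex] ^= (byte) (1 << bitIndex); // the same bit may be picked twice
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for a buffer read from the server before it is relayed to the client.
        byte[] payload = "response bytes from the server".getBytes(StandardCharsets.US_ASCII);
        byte[] corrupted = flipRandomBits(payload, 3, new Random());
        System.out.println(Arrays.toString(corrupted));
    }
}
```

Passing a seeded `Random` makes a crashing input reproducible, which helps when a corrupted response has to be replayed against a patched client.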
[jira] [Issue Comment Deleted] (DRILL-5541) C++ Client Crashes During Simple "Man in the Middle" Attack Test with Exploitable Write AV
[ https://issues.apache.org/jira/browse/DRILL-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Wu updated DRILL-5541:
-------------------------
    Comment: was deleted

(was: I set up a server)

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (DRILL-5541) C++ Client Crashes During Simple "Man in the Middle" Attack Test with Exploitable Write AV
[ https://issues.apache.org/jira/browse/DRILL-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037935#comment-16037935 ]

Rob Wu commented on DRILL-5541:
-------------------------------

I set up a server
[jira] [Commented] (DRILL-5541) C++ Client Crashes During Simple "Man in the Middle" Attack Test with Exploitable Write AV
[ https://issues.apache.org/jira/browse/DRILL-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037912#comment-16037912 ]

Parth Chandra commented on DRILL-5541:
--------------------------------------

Curious to know how you created this issue.
[jira] [Commented] (DRILL-5541) C++ Client Crashes During Simple "Man in the Middle" Attack Test with Exploitable Write AV
[ https://issues.apache.org/jira/browse/DRILL-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037601#comment-16037601 ]

ASF GitHub Bot commented on DRILL-5541:
---------------------------------------

GitHub user superbstreak opened a pull request:

    https://github.com/apache/drill/pull/850

    DRILL-5541: C++ Client Crashes During Simple "Man in the Middle" Atta…

    …ck Test with Exploitable Write AV

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/superbstreak/drill DRILL-5541

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/850.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #850

commit 716db51df61d0ee47804217a6a133d1d1152b64a
Author: Rob Wu
Date:   2017-06-05T21:06:33Z

    DRILL-5541: C++ Client Crashes During Simple "Man in the Middle" Attack Test with Exploitable Write AV
[jira] [Updated] (DRILL-5568) Include hadoop-common jars inside drill-jdbc-all.jar
[ https://issues.apache.org/jira/browse/DRILL-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sorabh Hamirwasia updated DRILL-5568:
------------------------------------
    Reviewer: Paul Rogers

> Include hadoop-common jars inside drill-jdbc-all.jar
> ----------------------------------------------------
>
> Key: DRILL-5568
> URL: https://issues.apache.org/jira/browse/DRILL-5568
> Project: Apache Drill
> Issue Type: Bug
> Components: Client - JDBC
> Reporter: Sorabh Hamirwasia
> Assignee: Sorabh Hamirwasia
>
> With SASL support in 1.10, authentication using username/password was moved to the Plain mechanism of the SASL framework. A couple of Hadoop classes defined in the hadoop-common package, such as Configuration.java and UserGroupInformation.java, are used in DrillClient for security mechanisms like Plain and Kerberos. Because of this, we need to add the Hadoop dependency inside _drill-jdbc-all.jar_. Without it, an application using this driver will fail to connect to Drill with authentication enabled.
> Today this jar (which is the JDBC driver for Drill) already includes many other dependencies that DrillClient relies on, such as Netty. These dependencies are added under the *oadd* namespace so that an application using this driver won't end up in conflict with its own version of the same dependencies. As part of this JIRA, the hadoop-common dependencies will be included under the same namespace. This will allow an application to connect to Drill using this driver with security enabled.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (DRILL-5568) Include hadoop-common jars inside drill-jdbc-all.jar
[ https://issues.apache.org/jira/browse/DRILL-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037583#comment-16037583 ]

ASF GitHub Bot commented on DRILL-5568:
---------------------------------------

GitHub user sohami opened a pull request:

    https://github.com/apache/drill/pull/849

    DRILL-5568: Include hadoop-common jars inside drill-jdbc-all.jar

    More details on this PR are in [JIRA](https://issues.apache.org/jira/browse/DRILL-5568)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sohami/drill DRILL-5568

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/849.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #849

commit e84ce5bb6317e7a8caa50c7ffc85dfc416616596
Author: Sorabh Hamirwasia
Date:   2017-06-05T20:45:27Z

    DRILL-5568: Include hadoop-common jars inside drill-jdbc-all.jar
[jira] [Resolved] (DRILL-5567) Review changes for DRILL 5514
[ https://issues.apache.org/jira/browse/DRILL-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthikeyan Manivannan resolved DRILL-5567.
------------------------------------------
    Resolution: Done

> Review changes for DRILL 5514
> -----------------------------
>
> Key: DRILL-5567
> URL: https://issues.apache.org/jira/browse/DRILL-5567
> Project: Apache Drill
> Issue Type: Sub-task
> Reporter: Karthikeyan Manivannan
> Assignee: Karthikeyan Manivannan
> Fix For: 1.11.0
>
> Original Estimate: 2h
> Remaining Estimate: 2h

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (DRILL-5514) Enhance VectorContainer to merge two row sets
[ https://issues.apache.org/jira/browse/DRILL-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037580#comment-16037580 ]

ASF GitHub Bot commented on DRILL-5514:
---------------------------------------

Github user bitblender commented on a diff in the pull request:

    https://github.com/apache/drill/pull/837#discussion_r118797793

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/BatchSchema.java ---
    @@ -157,4 +158,26 @@ private boolean majorTypeEqual(MajorType t1, MajorType t2) {
         return true;
       }

    +  /**
    +   * Merge two schema to produce a new, merged schema. The caller is responsible
    +   * for ensuring that column names are unique. The order of the fields in the
    +   * new schema is the same as that of this schema, with the other schema's fields
    +   * appended in the order defined in the other schema. The resulting selection
    +   * vector mode is the same as this schema. (That is, this schema is assumed to
    +   * be the main part of the batch, possibly with a selection vector, with the
    +   * other schema representing additional, new columns.)
    +   * @param otherSchema the schema to merge with this one
    +   * @return the new, merged, schema
    +   */
    +
    +  public BatchSchema merge(BatchSchema otherSchema) {
    +    if (otherSchema.selectionVectorMode != SelectionVectorMode.NONE &&
    +        selectionVectorMode != otherSchema.selectionVectorMode) {
    +      throw new IllegalArgumentException("Left schema must carry the selection vector mode");
    --- End diff --

    "Left schema must carry the same selection vector mode" + "as the right schema"?

> Enhance VectorContainer to merge two row sets
> ---------------------------------------------
>
> Key: DRILL-5514
> URL: https://issues.apache.org/jira/browse/DRILL-5514
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.10.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Minor
> Fix For: 1.11.0
>
> Consider the concept of a "record batch" in Drill. On the one hand, one can envision a record batch as a stack of records:
> {code}
> | a1 | b1 | c1 |
>
> | a2 | b2 | c2 |
> {code}
> But, Drill is columnar. So a record batch is really a "bundle" of vectors:
> {code}
> | a1 || b1 || c1 |
> | a2 || b2 || c2 |
> {code}
> There are times when it is handy to build up a record batch as a merge of two different vector bundles:
> {code}
> -- bundle 1 ---- bundle 2 --
> | a1 || b1 || c1 |
> | a2 || b2 || c2 |
> {code}
> For example, consider a reader. The reader implementation might read columns (a, b) from a file, say. Then, the "{{ScanBatch}}" might add (c) as an implicit vector (the file name, say.) The merged set of vectors comprises the final schema: (a, b, c).
> This ticket asks for the code to do the merge:
> * Merge two schemas A = (a, b), B = (c) to create schema C = (a, b, c).
> * Merge two vector containers C1 and C2 to create a new container, C3, that holds the merger of the vectors from the first two.
> Clearly, the merge only makes sense if:
> * The two input containers have the same row count, and
> * The columns in each input container are distinct.
> Because this feature is also useful for tests, add the merge to the "row set" tools also.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
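The container merge described in the ticket, together with its two preconditions (equal row counts, distinct column names), can be sketched as follows. This is an illustration only and does not use Drill's `VectorContainer` API: `ColumnBundle` and all of its methods are hypothetical stand-ins, with plain Java lists in place of value vectors.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a vector container: ordered, named columns of equal length. */
final class ColumnBundle {
    private final Map<String, List<Object>> columns = new LinkedHashMap<>();
    private int rowCount = -1; // -1 until the first column is added

    /** Adds a column, rejecting duplicate names and mismatched lengths. */
    ColumnBundle add(String name, List<?> values) {
        if (columns.containsKey(name)) {
            throw new IllegalArgumentException("Duplicate column: " + name);
        }
        if (rowCount >= 0 && values.size() != rowCount) {
            throw new IllegalArgumentException("Row count mismatch for column: " + name);
        }
        rowCount = values.size();
        columns.put(name, new ArrayList<Object>(values));
        return this;
    }

    int rowCount() {
        return Math.max(rowCount, 0);
    }

    List<String> columnNames() {
        return new ArrayList<>(columns.keySet());
    }

    /** Returns a new bundle holding this bundle's columns followed by the other's. */
    ColumnBundle merge(ColumnBundle other) {
        if (rowCount() != other.rowCount()) {
            throw new IllegalArgumentException("Both bundles must have the same row count");
        }
        ColumnBundle merged = new ColumnBundle();
        for (Map.Entry<String, List<Object>> e : columns.entrySet()) {
            merged.add(e.getKey(), e.getValue());
        }
        for (Map.Entry<String, List<Object>> e : other.columns.entrySet()) {
            merged.add(e.getKey(), e.getValue()); // add() rejects a name already present
        }
        return merged;
    }
}
```

With bundle 1 holding (a, b) and bundle 2 holding (c), `bundle1.merge(bundle2)` yields columns (a, b, c) in that order, which matches the reader plus implicit column example in the ticket.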
[jira] [Commented] (DRILL-5514) Enhance VectorContainer to merge two row sets
[ https://issues.apache.org/jira/browse/DRILL-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037581#comment-16037581 ]

ASF GitHub Bot commented on DRILL-5514:
---------------------------------------

Github user bitblender commented on a diff in the pull request:

    https://github.com/apache/drill/pull/837#discussion_r120198724

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/BatchSchema.java ---
    @@ -157,4 +158,26 @@ private boolean majorTypeEqual(MajorType t1, MajorType t2) {
         return true;
       }

    +  /**
    +   * Merge two schema to produce a new, merged schema. The caller is responsible
    +   * for ensuring that column names are unique. The order of the fields in the
    +   * new schema is the same as that of this schema, with the other schema's fields
    +   * appended in the order defined in the other schema. The resulting selection
    +   * vector mode is the same as this schema. (That is, this schema is assumed to
    +   * be the main part of the batch, possibly with a selection vector, with the
    +   * other schema representing additional, new columns.)
    +   * @param otherSchema the schema to merge with this one
    +   * @return the new, merged, schema
    +   */
    +
    +  public BatchSchema merge(BatchSchema otherSchema) {
    +    if (otherSchema.selectionVectorMode != SelectionVectorMode.NONE &&
    +        selectionVectorMode != otherSchema.selectionVectorMode) {
    +      throw new IllegalArgumentException("Left schema must carry the selection vector mode");
    +    }
    +    List mergedFields = new ArrayList<>();
    --- End diff --

    List mergedFields = new ArrayList(this.fields.size() + otherSchema.fields.size()) would avoid having to potentially grow the ArrayList twice.
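The two review suggestions on `BatchSchema.merge` (a clearer exception message and a pre-sized list) can be seen together in a small self-contained sketch. It deliberately avoids Drill's real classes: `SimpleSchema`, its `Mode` enum, and the use of plain strings for fields are assumptions made only for illustration, and the exception text is the reviewer's proposed wording rather than what was merged.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a batch schema: a selection vector mode plus ordered field names. */
final class SimpleSchema {
    enum Mode { NONE, TWO_BYTE, FOUR_BYTE }

    final Mode mode;
    final List<String> fields;

    SimpleSchema(Mode mode, List<String> fields) {
        this.mode = mode;
        this.fields = Collections.unmodifiableList(new ArrayList<>(fields));
    }

    /** Appends the other schema's fields to this schema's fields, keeping this schema's mode. */
    SimpleSchema merge(SimpleSchema other) {
        // The right-hand schema may either have no selection vector or share this schema's mode.
        if (other.mode != Mode.NONE && mode != other.mode) {
            throw new IllegalArgumentException(
                "Left schema must carry the same selection vector mode as the right schema");
        }
        // Sizing the list up front means it never has to grow while the fields are copied.
        List<String> merged = new ArrayList<>(fields.size() + other.fields.size());
        merged.addAll(fields);
        merged.addAll(other.fields);
        return new SimpleSchema(mode, merged);
    }
}
```

For two small schemas the capacity hint makes little measurable difference; it mostly documents the final size and avoids intermediate array copies when the field lists are long.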
[jira] [Updated] (DRILL-5568) Include hadoop-common jars inside drill-jdbc-all.jar
[ https://issues.apache.org/jira/browse/DRILL-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sorabh Hamirwasia updated DRILL-5568:
------------------------------------
    Summary: Include hadoop-common jars inside drill-jdbc-all.jar  (was: Include Hadoop dependency jars inside drill-jdbc-all.jar)
[jira] [Created] (DRILL-5568) Include Hadoop dependency jars inside drill-jdbc-all.jar
Sorabh Hamirwasia created DRILL-5568:

    Summary: Include Hadoop dependency jars inside drill-jdbc-all.jar
    Key: DRILL-5568
    URL: https://issues.apache.org/jira/browse/DRILL-5568
    Project: Apache Drill
    Issue Type: Bug
    Components: Client - JDBC
    Reporter: Sorabh Hamirwasia
    Assignee: Sorabh Hamirwasia

With SASL support in 1.10, username/password authentication was moved to the Plain mechanism of the SASL framework. DrillClient uses a couple of Hadoop classes, such as Configuration.java and UserGroupInformation.java from the hadoop-common package, for security mechanisms like Plain and Kerberos. Because of this, we need to add the Hadoop dependency inside _drill-jdbc-all.jar_. Without it, an application using this driver will fail to connect to Drill with authentication enabled.

Today this jar (the JDBC driver for Drill) already bundles many other dependencies that DrillClient relies on, such as Netty. These dependencies are added under the *oadd* namespace so that an application using this driver does not end up in conflict with its own version of the same dependencies. As part of this JIRA, the hadoop-common dependencies will be included under the same namespace. This will allow an application to connect to Drill using this driver with security enabled.
[jira] [Updated] (DRILL-5567) Review changes for DRILL 5514
[ https://issues.apache.org/jira/browse/DRILL-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan updated DRILL-5567: -- Remaining Estimate: 2h Original Estimate: 2h > Review changes for DRILL 5514 > - > > Key: DRILL-5567 > URL: https://issues.apache.org/jira/browse/DRILL-5567 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Karthikeyan Manivannan >Assignee: Karthikeyan Manivannan > Fix For: 1.11.0 > > Original Estimate: 2h > Remaining Estimate: 2h > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (DRILL-5514) Enhance VectorContainer to merge two row sets
[ https://issues.apache.org/jira/browse/DRILL-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthikeyan Manivannan updated DRILL-5514:
    Reviewer: Karthikeyan Manivannan (was: Sorabh Hamirwasia)

> Enhance VectorContainer to merge two row sets
>
> Key: DRILL-5514
> URL: https://issues.apache.org/jira/browse/DRILL-5514
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.10.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Minor
> Fix For: 1.11.0
>
> Consider the concept of a "record batch" in Drill. On the one hand, one can
> envision a record batch as a stack of records:
> {code}
> | a1 | b1 | c1 |
> | a2 | b2 | c2 |
> {code}
> But, Drill is columnar. So a record batch is really a "bundle" of vectors:
> {code}
> | a1 || b1 || c1 |
> | a2 || b2 || c2 |
> {code}
> There are times when it is handy to build up a record batch as a merge of two
> different vector bundles:
> {code}
> -- bundle 1 ---- bundle 2 --
> | a1 || b1 || c1 |
> | a2 || b2 || c2 |
> {code}
> For example, consider a reader. The reader implementation might read columns
> (a, b) from a file, say. Then, the "{{ScanBatch}}" might add (c) as an
> implicit vector (the file name, say.) The merged set of vectors comprises the
> final schema: (a, b, c).
> This ticket asks for the code to do the merge:
> * Merge two schemas A = (a, b), B = (c) to create schema C = (a, b, c).
> * Merge two vector containers C1 and C2 to create a new container, C3, that
> holds the merger of the vectors from the first two.
> Clearly, the merge only makes sense if:
> * The two input containers have the same row count, and
> * The columns in each input container are distinct.
> Because this feature is also useful for tests, add the merge to the "row set"
> tools also.
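The merge rules in the ticket can be sketched in a few lines of Java. This is only a toy model under stated assumptions: ColumnBundle is a hypothetical stand-in that maps column names to lists of values, and it is not Drill's VectorContainer API. It shows the two preconditions (equal row counts, distinct column names) and the resulting schema order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a vector container: column name -> column values.
// Not Drill's VectorContainer; it only illustrates the merge rules.
final class ColumnBundle {
    private final Map<String, List<?>> columns = new LinkedHashMap<>();
    private final int rowCount;

    ColumnBundle(int rowCount) {
        this.rowCount = rowCount;
    }

    // Add one column; every column must have exactly rowCount values
    // and a name that is not already present.
    ColumnBundle add(String name, List<?> values) {
        if (values.size() != rowCount) {
            throw new IllegalArgumentException("column " + name + " has wrong row count");
        }
        if (columns.putIfAbsent(name, values) != null) {
            throw new IllegalArgumentException("duplicate column " + name);
        }
        return this;
    }

    // Column names in insertion order, i.e. the schema.
    List<String> schema() {
        return new ArrayList<>(columns.keySet());
    }

    int rowCount() {
        return rowCount;
    }

    // Merge C1 and C2 into a new bundle C3 holding the columns of both.
    // The column data is shared with the inputs, not copied.
    static ColumnBundle merge(ColumnBundle c1, ColumnBundle c2) {
        if (c1.rowCount != c2.rowCount) {
            throw new IllegalArgumentException("row counts differ");
        }
        ColumnBundle c3 = new ColumnBundle(c1.rowCount);
        c1.columns.forEach(c3::add);
        c2.columns.forEach(c3::add); // add() rejects duplicate names
        return c3;
    }
}
```

Merging a bundle with columns (a, b) and a bundle with column (c) yields the schema (a, b, c); merging bundles that share a column name or differ in row count fails fast, which matches the two conditions listed in the ticket.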
[jira] [Created] (DRILL-5567) Review changes for DRILL 5514
Karthikeyan Manivannan created DRILL-5567: - Summary: Review changes for DRILL 5514 Key: DRILL-5567 URL: https://issues.apache.org/jira/browse/DRILL-5567 Project: Apache Drill Issue Type: Sub-task Reporter: Karthikeyan Manivannan Assignee: Karthikeyan Manivannan -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (DRILL-5565) Directory Query fails with Permission denied: access=EXECUTE if dirN name is 'year=2017' or 'month=201704'
[ https://issues.apache.org/jira/browse/DRILL-5565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ehur updated DRILL-5565: Environment: CentOS release 6.8 > Directory Query fails with Permission denied: access=EXECUTE if dirN name is > 'year=2017' or 'month=201704' > -- > > Key: DRILL-5565 > URL: https://issues.apache.org/jira/browse/DRILL-5565 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Drill, SQL Parser >Affects Versions: 1.6.0 > Environment: CentOS release 6.8 >Reporter: ehur > > running a query like this works fine, when the name dir0 contains numerics > only: > select * from all.my.records > where dir0 >= '20170322' > limit 10; > if the dirN is named according to this convention: year=2017 we get one of > the following problems: > 1. Either "system error permission denied" in: > select * from all.my.records > where dir0 >= 'year=2017' > limit 10; > SYSTEM ERROR: RemoteException: Permission denied: user=myuser, > access=EXECUTE, > inode: > /user/myuser/all/my/records/year=2017/month=201701/day=20170101/application_1485464650247_1917/part-r-0.gz.parquet":myuser:supergroup:-rw-r--r-- > at > org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257) > at > org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238) > at > org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkTraverse(DefaultAuthorizationProvider.java:180) > at > org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:137) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:138) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6609) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:4223) > at > 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:894) > at > org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getFileInfo(AuthorizationProviderProxyClientProtocol.java:526) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:822) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080) > 2. OR, if the where clause only specifies numerics in the dirname, it does > not blow up, but neither does it return the relevant data, since that where > clause is not the correct path to our data: > select * from all.my.records > where dir0 >= '2017' > limit 10; -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (DRILL-5546) Schema change problems caused by empty batch
[ https://issues.apache.org/jira/browse/DRILL-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037481#comment-16037481 ] Paul Rogers commented on DRILL-5546: Agreed. Just to clarify, the idea of a "non-existent batch" is confusing. Yes, NONE means that there are no more batches. If the only outcome is NONE ("fast NONE"), then that means that there is no output: not a batch with empty schema, rather that there is no batch at all (a null batch.) That's what the ∧ is supposed to mean... > Schema change problems caused by empty batch > > > Key: DRILL-5546 > URL: https://issues.apache.org/jira/browse/DRILL-5546 > Project: Apache Drill > Issue Type: Bug >Reporter: Jinfeng Ni >Assignee: Jinfeng Ni > > There have been a few JIRAs opened related to schema change failure caused by > empty batch. This JIRA is opened as an umbrella for all those related JIRAS ( > such as DRILL-4686, DRILL-4734, DRILL4476, DRILL-4255, etc). > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (DRILL-5566) AssertionError: Internal error: invariant violated: call to wrong operator
Khurram Faraaz created DRILL-5566: - Summary: AssertionError: Internal error: invariant violated: call to wrong operator Key: DRILL-5566 URL: https://issues.apache.org/jira/browse/DRILL-5566 Project: Apache Drill Issue Type: Bug Components: Execution - Flow Affects Versions: 1.11.0 Reporter: Khurram Faraaz CHARACTER_LENGTH is a non-reserved keyword as per the SQL specification. It is a monadic function that accepts exactly one operand or parameter. {noformat} ::= | | | | ... ... ::= | ::= { CHAR_LENGTH | CHARACTER_LENGTH } [ USING ] ... ... ::= CHARACTERS | OCTETS {noformat} Drill reports an assertion error in drillbit.log when character_length function is used in a SQL query. {noformat} 0: jdbc:drill:schema=dfs.tmp> select character_length(cast('hello' as varchar(10))) col1 from (values(1)); Error: SYSTEM ERROR: AssertionError: Internal error: invariant violated: call to wrong operator [Error Id: 49198839-5a1b-4786-9257-59739b27d2a8 on centos-01.qa.lab:31010] (org.apache.drill.exec.work.foreman.ForemanException) Unexpected exception during fragment initialization: Internal error: invariant violated: call to wrong operator org.apache.drill.exec.work.foreman.Foreman.run():297 java.util.concurrent.ThreadPoolExecutor.runWorker():1145 java.util.concurrent.ThreadPoolExecutor$Worker.run():615 java.lang.Thread.run():745 Caused By (java.lang.AssertionError) Internal error: invariant violated: call to wrong operator org.apache.calcite.util.Util.newInternal():777 org.apache.calcite.util.Util.permAssert():885 org.apache.calcite.sql2rel.ReflectiveConvertletTable$3.convertCall():219 org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall():59 org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit():4148 org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit():3581 org.apache.calcite.sql.SqlCall.accept():130 org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression():4040 
org.apache.calcite.sql2rel.StandardConvertletTable$8.convertCall():185 org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall():59 org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit():4148 org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit():3581 org.apache.calcite.sql.SqlCall.accept():130 org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression():4040 org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectList():3411 org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl():612 org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect():568 org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive():2773 org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery():522 org.apache.drill.exec.planner.sql.SqlConverter.toRel():269 org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRel():623 org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():195 org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():164 org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():131 org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():79 org.apache.drill.exec.work.foreman.Foreman.runSQL():1050 org.apache.drill.exec.work.foreman.Foreman.run():280 java.util.concurrent.ThreadPoolExecutor.runWorker():1145 java.util.concurrent.ThreadPoolExecutor$Worker.run():615 java.lang.Thread.run():745 (state=,code=0) {noformat} Calcite supports character_length function {noformat} [root@centos-0170 csv]# ./sqlline sqlline version 1.1.9 sqlline> !connect jdbc:calcite:model=target/test-classes/model.json admin admin SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. 
0: jdbc:calcite:model=target/test-classes/mod> select character_length(cast('hello' as varchar(10))) col1 from (values(1)); ++ |COL1| ++ | 5 | ++ 1 row selected (1.379 seconds) {noformat} Postgres 9.3 also supports character_length function {noformat} postgres=# select character_length(cast('hello' as varchar(10))) col1 from (values(1)) foo; col1 -- 5 (1 row) {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
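For reference, the result the reporter expects (5 for 'hello') and the CHARACTERS versus OCTETS distinction in the quoted grammar can be sketched in plain Java. This is only an illustration of the semantics: the class CharLengthSketch is hypothetical, and the use of UTF-8 for the octet count is an assumption, since the report does not say which encoding applies.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of CHARACTER_LENGTH semantics; not Drill or Calcite code.
final class CharLengthSketch {
    // Length in characters: counts Unicode code points, so a surrogate
    // pair counts as one character.
    static int characterLength(String s) {
        return s.codePointCount(0, s.length());
    }

    // Length in octets, assuming the value is encoded as UTF-8.
    static int octetLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

For pure ASCII input such as "hello" both measures agree; they diverge as soon as the string contains characters that need more than one byte in UTF-8.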
[jira] [Created] (DRILL-5565) Directory Query fails with Permission denied: access=EXECUTE if dirN name is 'year=2017' or 'month=201704'
ehur created DRILL-5565: --- Summary: Directory Query fails with Permission denied: access=EXECUTE if dirN name is 'year=2017' or 'month=201704' Key: DRILL-5565 URL: https://issues.apache.org/jira/browse/DRILL-5565 Project: Apache Drill Issue Type: Bug Components: Functions - Drill, SQL Parser Affects Versions: 1.6.0 Reporter: ehur running a query like this works fine, when the name dir0 contains numerics only: select * from all.my.records where dir0 >= '20170322' limit 10; if the dirN is named according to this convention: year=2017 we get one of the following problems: 1. Either "system error permission denied" in: select * from all.my.records where dir0 >= 'year=2017' limit 10; SYSTEM ERROR: RemoteException: Permission denied: user=myuser, access=EXECUTE, inode: /user/myuser/all/my/records/year=2017/month=201701/day=20170101/application_1485464650247_1917/part-r-0.gz.parquet":myuser:supergroup:-rw-r--r-- at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257) at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238) at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkTraverse(DefaultAuthorizationProvider.java:180) at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:137) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:138) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6609) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:4223) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:894) at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getFileInfo(AuthorizationProviderProxyClientProtocol.java:526) at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:822) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080) 2. OR, if the where clause only specifies numerics in the dirname, it does not blow up, but neither does it return the relevant data, since that where clause is not the correct path to our data: select * from all.my.records where dir0 >= '2017' limit 10; -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (DRILL-5564) IllegalStateException: allocator[op:21:1:5:HashJoinPOP]: buffer space (16674816) + prealloc space (0) + child space (0) != allocated (16740352)
Khurram Faraaz created DRILL-5564: - Summary: IllegalStateException: allocator[op:21:1:5:HashJoinPOP]: buffer space (16674816) + prealloc space (0) + child space (0) != allocated (16740352) Key: DRILL-5564 URL: https://issues.apache.org/jira/browse/DRILL-5564 Project: Apache Drill Issue Type: Bug Components: Execution - Flow Affects Versions: 1.11.0 Environment: 3 node CentOS cluster Reporter: Khurram Faraaz Run a concurrent Java program that executes TPCDS query11 while the above concurrent java program is under execution stop foreman Drillbit (from another shell, using below command) ./bin/drillbit.sh stop and you will see the IllegalStateException: allocator[op:21:1:5:HashJoinPOP]: and another assertion error, in the drillbit.log AssertionError: Failure while stopping processing for operator id 10. Currently have states of processing:false, setup:false, waiting:true. Drill 1.11.0 git commit ID: d11aba2 (with assertions enabled) details from drillbit.log from the foreman Drillbit node. {noformat} 2017-06-05 18:38:33,838 [26ca5afa-7f6d-991b-1fdf-6196faddc229:frag:23:1] INFO o.a.d.e.w.fragment.FragmentExecutor - 26ca5afa-7f6d-991b-1fdf-6196faddc229:23:1: State change requested RUNNING --> FAILED 2017-06-05 18:38:33,849 [26ca5afa-7f6d-991b-1fdf-6196faddc229:frag:23:1] INFO o.a.d.e.w.fragment.FragmentExecutor - 26ca5afa-7f6d-991b-1fdf-6196faddc229:23:1: State change requested FAILED --> FINISHED 2017-06-05 18:38:33,852 [26ca5afa-7f6d-991b-1fdf-6196faddc229:frag:23:1] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: AssertionError: Failure while stopping processing for operator id 10. Currently have states of processing:false, setup:false, waiting:true. Fragment 23:1 [Error Id: a116b326-43ed-4569-a20e-a10ba03d215e on centos-01.qa.lab:31010] org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: AssertionError: Failure while stopping processing for operator id 10. Currently have states of processing:false, setup:false, waiting:true. 
Fragment 23:1 [Error Id: a116b326-43ed-4569-a20e-a10ba03d215e on centos-01.qa.lab:31010] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544) ~[drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:295) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:264) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_91] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_91] at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91] Caused by: java.lang.RuntimeException: java.lang.AssertionError: Failure while stopping processing for operator id 10. Currently have states of processing:false, setup:false, waiting:true. at org.apache.drill.common.DeferredException.addThrowable(DeferredException.java:101) ~[drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.fail(FragmentExecutor.java:409) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:250) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] ... 4 common frames omitted Caused by: java.lang.AssertionError: Failure while stopping processing for operator id 10. Currently have states of processing:false, setup:false, waiting:true. 
at org.apache.drill.exec.ops.OperatorStats.stopProcessing(OperatorStats.java:167) ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:255) ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215) ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleR
[jira] [Comment Edited] (DRILL-5546) Schema change problems caused by empty batch
[ https://issues.apache.org/jira/browse/DRILL-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037290#comment-16037290 ]

Jinfeng Ni edited comment on DRILL-5546 at 6/5/17 6:40 PM:

Thanks for putting together a set of formal definitions of terms, which should clear up confusion in further discussion.

In the current Drill execution, as well as in this proposal, *NONE* simply means the end of input; there are no more batches coming. It should not be used to represent any batch. Drill's iterator framework has the code to handle 'NONE'.

This proposal just suggests that we return *NONE* directly if the data source does not have any schema/data. This is different from what Drill currently does: return an *OK_NEW_SCHEMA* with a trivial result set (injected with a nullable-int column), followed by a 'NONE'.

Using the notation you defined, previously Drill had

{{protocol : (OK_NEW_SCHEMA OK\*)\+ NONE}}

Now we have:

{{protocol : (OK_NEW_SCHEMA OK\*)\* NONE}}

Some operators seem to work fine under the protocol change; others, such as Join and UnionAll, may not.

was (Author: jni):
Thanks for putting a set of form definitions of terms, which would clear confusion. In current Drill execution and as well as in this proposal, *NONE* simply means the end of input; there is no more batch coming. It should not be used to represent any batch. The Drill's iterator framework has the code to handle 'NONE'. This proposal is just to suggest we return *NONE* directly, if the data source does not have any schema/data. This is different from what currently Drill is doing: return a *OK_NEW_SCHEMA* with a trivial result set (injected with nullable-int column), followed by a 'NONE'.

Using the notation you defined, previously Drill has

{{protocol : (OK_NEW_SCHEMA OK\*)\+ NONE}}

Now we have:

{{protocol : (OK_NEW_SCHEMA OK\*)\* NONE}}

Some operators seems to work fine under the protocol change, some operators such as Join, UnionAll may not, due to the above protocol changes.

> Schema change problems caused by empty batch
>
> Key: DRILL-5546
> URL: https://issues.apache.org/jira/browse/DRILL-5546
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Jinfeng Ni
> Assignee: Jinfeng Ni
>
> There have been a few JIRAs opened related to schema change failure caused by
> empty batch. This JIRA is opened as an umbrella for all those related JIRAs
> (such as DRILL-4686, DRILL-4734, DRILL-4476, DRILL-4255, etc).
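The difference between the two protocols in the comment above is just "+" versus "*" on the repeated group, which can be checked mechanically. The sketch below is a minimal illustration under stated assumptions: the Outcome enum is a hypothetical three-value simplification (Drill's real iterator outcomes include more states), and each outcome is encoded as one letter so that an ordinary regular expression can express the grammar.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical checker for the two protocol grammars; not Drill code.
final class BatchProtocolSketch {
    // Simplified outcome set, assumed for this example only.
    enum Outcome { OK_NEW_SCHEMA, OK, NONE }

    // S = OK_NEW_SCHEMA, O = OK, N = NONE
    private static final Pattern OLD_PROTOCOL = Pattern.compile("(SO*)+N");
    private static final Pattern NEW_PROTOCOL = Pattern.compile("(SO*)*N");

    // Encode a sequence of outcomes as a string of single letters.
    private static String encode(List<Outcome> sequence) {
        StringBuilder sb = new StringBuilder();
        for (Outcome o : sequence) {
            switch (o) {
                case OK_NEW_SCHEMA: sb.append('S'); break;
                case OK:            sb.append('O'); break;
                default:            sb.append('N'); break;
            }
        }
        return sb.toString();
    }

    // (OK_NEW_SCHEMA OK*)+ NONE : at least one schema must be announced.
    static boolean validOld(List<Outcome> sequence) {
        return OLD_PROTOCOL.matcher(encode(sequence)).matches();
    }

    // (OK_NEW_SCHEMA OK*)* NONE : a bare NONE ("fast NONE") is also allowed.
    static boolean validNew(List<Outcome> sequence) {
        return NEW_PROTOCOL.matcher(encode(sequence)).matches();
    }
}
```

A sequence consisting only of NONE is accepted by validNew and rejected by validOld, which is exactly the case that downstream operators such as Join and UnionAll would have to learn to handle.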
[jira] [Commented] (DRILL-5546) Schema change problems caused by empty batch
[ https://issues.apache.org/jira/browse/DRILL-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037290#comment-16037290 ] Jinfeng Ni commented on DRILL-5546: --- Thanks for putting a set of form definitions of terms, which would clear confusion. In current Drill execution and as well as in this proposal, *NONE* simply means the end of input; there is no more batch coming. It should not be used to represent any batch. The Drill's iterator framework has the code to handle 'NONE'. This proposal is just to suggest we return *NONE* directly, if the data source does not have any schema/data. This is different from what currently Drill is doing: return a *OK_NEW_SCHEMA* with a trivial result set (injected with nullable-int column), followed by a 'NONE'. Using the notation you defined, previously Drill has {{protocol : (OK_NEW_SCHEMA OK\*)\+ NONE}} Now we have: {{protocol : (OK_NEW_SCHEMA OK\*)\* NONE}} Some operators seems to work fine under the protocol change, some operators such as Join, UnionAll may not, due to the above protocol changes. > Schema change problems caused by empty batch > > > Key: DRILL-5546 > URL: https://issues.apache.org/jira/browse/DRILL-5546 > Project: Apache Drill > Issue Type: Bug >Reporter: Jinfeng Ni >Assignee: Jinfeng Ni > > There have been a few JIRAs opened related to schema change failure caused by > empty batch. This JIRA is opened as an umbrella for all those related JIRAS ( > such as DRILL-4686, DRILL-4734, DRILL4476, DRILL-4255, etc). > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (DRILL-5546) Schema change problems caused by empty batch
[ https://issues.apache.org/jira/browse/DRILL-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16036141#comment-16036141 ]

Paul Rogers edited comment on DRILL-5546 at 6/5/17 3:57 PM:

I believe we are saying basically the same thing. To be certain, [watch out, we're gonna try some theory|https://store.xkcd.com/products/try-science].

h3. Basics

We'll need some terminology, defined in the usual way:

* (a, b, c) is a set that contains a, b, c
* \{a:b} is a map from a to b
* \[a, b, c] is an ordered list of a, b, c, where each element has an index i = 0, 1, 2...
* empty = () or {} or \[] is the empty collection (of the proper type)
* null = SQL null: we don't know what the value is
* ∧ = Java, C, etc. null: we know the value, and it is nothing

Drill is, at its core, relational. A relation R can be defined as:

{{R = (S, T)}}

Where:

* S is the schema
* T is a set of tuples (t ~1~, t ~2~, t ~3~) (AKA a table)

{{S = (C, N)}}

Where:

* C is the list of column schemas:

{{C = \[ c ~0~, c ~1~, c ~2~, ...]}}
{{c = (name, type)}}

And:

* N is a map from name to column index:

{{N = \{name : i\} }} where i is the index of column c ∈ C

Drill defines the idea of _compatible_ schemas. Two schemas are compatible if we redefine the schema as a set of columns:

{{S’ = (C ~0~, C ~1~, …)}}

Two schemas are compatible iff the column sets are identical (same name and type for each column). This is a bit more forgiving than the traditional relational model, which requires that the ordered list of column schemas be identical. Let's assume this rule as we discuss schema below.

We'll also need the idea of _cardinality_:

{{\|S| = n}}

This says that the cardinality (number of items) in S is _n_. Later, we'll just use _n_ to mean a schema (or relation or whatever) that has n items.

h3. Relations and Multi-Relations

A relation can be:

{{R : ∧}} (programming null) -- the relation simply does not exist.
{{ | (0,0)}} -- the trivial relation of no columns and (by definition), no rows.
{{ | (s,0)}} -- a relation with some schema s, \|s| ≠ 0, and no rows
{{ | (s,n)}} -- a "normal" relation with schema and n tuples (rows) of data, \|s| ≠ 0, n ≠ 0

It is helpful to remember some basic relational algebra:

{{R(s,0) ⋃ R(s,n) = R(s,n)}}
{{R(s,n) ⋃ R(s,m) = R(s,n+m)}}

Drill works not just with relations R, but also "multi-relations" (to make up a term). That is, each schema change in Drill introduces a new relation, so that the whole result set from a query is a series of relations. Let's define a "multi-relation" M using semi-BNF as:

{{M : ∧}} -- undefined result set
{{ | R(0,0)}} -- trivial result set
{{ | R(s,0)}} -- empty result set, \|s| ≠ 0
{{ | R(s,n)}} -- normal, single result set, n ≠ 0
{{ | R ~1~(s ~1~,n), R ~2~(s ~2~,m), ...}} -- multi-relation if s ~i~ ≠ s ~j~

Normally when we say multi-relation, we mean the last case: two or more relations with distinct schemas. The condition above says that two adjacent relations R ~i~ and R ~j~ must have distinct schemas (or, by the rules above, two relations with the same schema just collapse into a single relation with that schema. A schema can repeat, but a different schema must occur between repetitions.)

h3. Multi-Relations in Drill

Now let's get to Drill. Drill uses the term "schema change" to mean the transition from s ~i~ to s ~j~, s ~i~ ≠ s ~j~. The essence of the proposal here, as I understand it, is to update the implementation to fully support the first three definitions of a multi-relation (undefined, trivial and empty), assuming that Drill already supports the other two definitions (single relation and multi-relation.)

In Drill, relations are broken into batches, which are, themselves, just relations.
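The collapse rule above (adjacent relations with the same schema merge, and their row counts add) can be sketched as follows. This is a toy model under stated assumptions: Rel and MultiRelationSketch are hypothetical names, a schema is modeled as a set of "name:type" strings so that column order is ignored (the "compatible schema" rule), and only row counts are tracked rather than actual tuples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of relations and multi-relations; not Drill code.
final class MultiRelationSketch {
    // A relation R(s, n): a schema plus a row count.
    static final class Rel {
        final Set<String> schema; // "name:type" entries; order is ignored
        final int rows;

        Rel(Set<String> schema, int rows) {
            this.schema = schema;
            this.rows = rows;
        }
    }

    // Build a schema from column descriptors such as "a:INT".
    static Set<String> schemaOf(String... columns) {
        return new HashSet<>(Arrays.asList(columns));
    }

    // Collapse adjacent relations with compatible (set-equal) schemas:
    // R(s,n) U R(s,m) = R(s,n+m). A schema change starts a new relation.
    static List<Rel> collapse(List<Rel> batches) {
        List<Rel> out = new ArrayList<>();
        for (Rel b : batches) {
            int last = out.size() - 1;
            if (last >= 0 && out.get(last).schema.equals(b.schema)) {
                out.set(last, new Rel(b.schema, out.get(last).rows + b.rows));
            } else {
                out.add(b);
            }
        }
        return out;
    }
}
```

Note that the same schema may reappear later in the output list, as long as a different schema sits between the two occurrences, which mirrors the "a schema can repeat" remark in the comment.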
Thus a batch, B, can be:

{{B : ∧}} -- undefined result set
{{ | R(0,0)}} -- trivial batch
{{ | R(s,0)}} -- empty batch, \|s| ≠ 0
{{ | R(s,n)}} -- normal batch, \|s| ≠ 0, n ≠ 0

And the whole result set D (for Drill) is an ordered list of batches:

{{D = \[B ~1~, B ~2~, ..., B ~n~]}}

Where

{{B ~i~ = (s ~i~, t ~i~)}}

As noted above, if adjacent batches have the same schema, then they are just sub-relations within a single larger (logical) relation. But, if the schemas differ, then the adjacent batches are the last and first of two distinct relations within a larger (logical) multi-relation. Said another way:

{{D = R ~1~, R ~2~}}
{{R ~i~ = B ~i,1~, B ~i,2~, ...}}

To clarify, let’s visualize the schema changes as ∆~i~:

{{D = \[B ~0~(s ~0~,t), B ~1~(s ~0~,t), … ∆~1~, B ~i~(s ~1~,t), B ~i+1~(s ~1~,t), …]}}

The above sequence can describe any series of batches. As I understand the proposal, we want to put some constraints on the sequence.

h3. Results of a Drill Query

Let’s start by defining how we should present the multi-relation to the client. The client also receives batches, but, following the rules in the proposal, we wan
[jira] [Created] (DRILL-5563) Stop non foreman Drillbit results in IllegalStateException: Allocator[ROOT] closed with outstanding child allocators.
Khurram Faraaz created DRILL-5563:
-------------------------------------

Summary: Stop non foreman Drillbit results in IllegalStateException: Allocator[ROOT] closed with outstanding child allocators.
Key: DRILL-5563
URL: https://issues.apache.org/jira/browse/DRILL-5563
Project: Apache Drill
Issue Type: Bug
Components: Execution - Flow
Affects Versions: 1.11.0
Environment: 3 node CentOS cluster
Reporter: Khurram Faraaz

Stopping the non-foreman Drillbit normally (as shown below) results in IllegalStateException: Allocator[ROOT] closed with outstanding child allocators.

/opt/mapr/drill/drill-1.11.0/bin/drillbit.sh stop

Drill 1.11.0 commit ID: d11aba2

Details from drillbit.log
{noformat}
Mon Jun 5 09:29:09 UTC 2017 Terminating drillbit pid 28182
2017-06-05 09:29:09,651 [Drillbit-ShutdownHook#0] INFO o.apache.drill.exec.server.Drillbit - Received shutdown request.
2017-06-05 09:29:11,691 [pool-6-thread-1] INFO o.a.drill.exec.rpc.user.UserServer - closed eventLoopGroup io.netty.channel.nio.NioEventLoopGroup@55511dc2 in 1004 ms
2017-06-05 09:29:11,691 [pool-6-thread-2] INFO o.a.drill.exec.rpc.data.DataServer - closed eventLoopGroup io.netty.channel.nio.NioEventLoopGroup@4078d750 in 1004 ms
2017-06-05 09:29:11,692 [pool-6-thread-1] INFO o.a.drill.exec.service.ServiceEngine - closed userServer in 1005 ms
2017-06-05 09:29:11,692 [pool-6-thread-2] INFO o.a.drill.exec.service.ServiceEngine - closed dataPool in 1005 ms
2017-06-05 09:29:11,701 [Drillbit-ShutdownHook#0] INFO o.a.drill.exec.compile.CodeCompiler - Stats: code gen count: 21, cache miss count: 7, hit rate: 67%
2017-06-05 09:29:11,709 [Drillbit-ShutdownHook#0] ERROR o.a.d.exec.server.BootStrapContext - Error while closing
java.lang.IllegalStateException: Allocator[ROOT] closed with outstanding child allocators.
Allocator(ROOT) 0/800/201359872/17179869184 (res/actual/peak/limit)
  child allocators: 4
    Allocator(frag:3:2) 200/0/0/200 (res/actual/peak/limit)
      child allocators: 0
      ledgers: 0
      reservations: 0
    Allocator(frag:4:2) 200/0/0/200 (res/actual/peak/limit)
      child allocators: 0
      ledgers: 0
      reservations: 0
    Allocator(frag:1:2) 200/0/0/200 (res/actual/peak/limit)
      child allocators: 0
      ledgers: 0
      reservations: 0
    Allocator(frag:2:2) 200/0/0/200 (res/actual/peak/limit)
      child allocators: 0
      ledgers: 0
      reservations: 0
  ledgers: 0
  reservations: 0
	at org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:492) ~[drill-memory-base-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
	at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) [drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
	at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) [drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
	at org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:247) ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
	at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) [drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
	at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) [drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
	at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:159) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
	at org.apache.drill.exec.server.Drillbit$ShutdownThread.run(Drillbit.java:253) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
2017-06-05 09:29:11,709 [Drillbit-ShutdownHook#0] INFO o.apache.drill.exec.server.Drillbit - Shutdown completed (2057 ms).
{noformat}

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (DRILL-5546) Schema change problems caused by empty batch
[ https://issues.apache.org/jira/browse/DRILL-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16036141#comment-16036141 ]

Paul Rogers edited comment on DRILL-5546 at 6/5/17 7:31 AM:

I believe we are saying basically the same thing. To be certain, [watch out, we're gonna try some theory|https://store.xkcd.com/products/try-science].

h3. Basics

We'll need some terminology, defined in the usual way:

* (a, b, c) is a set that contains a, b, c
* \{a:b} is a map from a to b
* \[a, b, c] is an ordered list of a, b, c, where each element has an index i = 0, 1, 2...
* empty = () or {} or \[] is the empty collection (of the proper type)
* null = SQL null: we don't know what the value is
* ∧ = Java, C, etc. null: we know the value, and it is nothing

Drill is, at its core, relational. A relation R can be defined as:

{{R = (S, T)}}

Where:

* S is the schema
* T is a set of tuples (t ~1~, t ~2~, t ~3~) (AKA a table)

{{S = (C, N)}}

Where:

* C is the list of column schemas:

{{C = \[ c ~0~, c ~1~, c ~2~, ...]}}
{{c = (name, type)}}

And:

* N is a map from name to column index:

{{N = \{name : i\} }} where i is the index of column c ∈ C

Drill defines the idea of _compatible_ schemas. To state it, we redefine the schema as a set of columns:

{{S’ = (c ~0~, c ~1~, …)}}

Two schemas are compatible iff the column sets are identical (same name and type for each column). This is a bit more forgiving than the traditional relational model, which requires that the ordered list of column schemas be identical. Let's assume this rule as we discuss schema below.

We'll also need the idea of _cardinality_:

{{\|S| = n}}

says that the cardinality (number of items) in S is _n_. Later, we'll just use _n_ to mean a schema (or relation or whatever) that has n items.

h3. Relations and Multi-Relations

A relation can be:

{{R : ∧}} (programming null) -- the relation simply does not exist.
{{ | (0,0)}} -- the trivial relation of no columns and (by definition) no rows.
{{ | (s,0)}} -- a relation with some schema s, \|s| ≠ 0, and no rows
{{ | (s,n)}} -- a "normal" relation with schema and n tuples (rows) of data, \|s| ≠ 0, n ≠ 0

It is helpful to remember some basic relational algebra:

{{R(s,0) ⋃ R(s,n) = R(s,n)}}
{{R(s,n) ⋃ R(s,m) = R(s,n+m)}}

Drill works not just with relations R, but also "multi-relations" M. That is, a multi-relation (the entire result set from a single query) can be defined (using semi-BNF) as:

{{M : ∧}} -- undefined result set
{{ | R(0,0)}} -- trivial result set
{{ | R(s,0)}} -- empty result set, \|s| ≠ 0
{{ | R(s,n)}} -- normal, single result set, n ≠ 0
{{ | R ~1~(s ~1~,n), R ~2~(s ~2~,m), ...}} -- multi-relation if s ~i~ ≠ s ~j~

Normally when we say multi-relation, we mean the last case: two or more relations with distinct schemas. The condition above says that two adjacent relations R ~i~ and R ~j~ must have distinct schemas (otherwise, by the rules above, two relations with the same schema just collapse into a single relation with that schema). A schema can repeat, but a different schema must occur between repetitions.

h3. Multi-Relations in Drill

Now let's get to Drill. Drill uses the term "schema change" to mean the transition from s ~i~ to s ~j~, s ~i~ ≠ s ~j~. The essence of the proposal here, as I understand it, is to update the implementation to fully support the first three definitions of a multi-relation (undefined, trivial and empty), assuming that Drill already supports the other two definitions (single relation and multi-relation).

In Drill, relations are broken into batches, which are, themselves, just relations. Thus a batch, B, can be:

{{B : ∧}} -- undefined result set
{{ | R(0,0)}} -- trivial batch
{{ | R(s,0)}} -- empty batch set, \|s| ≠ 0
{{ | R(s,n)}} -- normal batch, \|s| ≠ 0, n ≠ 0

And the whole result set D (for Drill) is an ordered list of batches:

{{D = \[B ~1~, B ~2~, ..., B ~n~]}}

Where

{{B ~i~ = (s ~i~, t ~i~)}}

As noted above, if adjacent batches have the same schema, then they are just sub-relations within a single larger (logical) relation. But, if the schemas differ, then the adjacent batches are the last and first of two distinct relations within a larger (logical) multi-relation. Said another way:

{{D = R ~1~, R ~2~}}
{{R ~i~ = B ~i,1~, B ~i,2~, ...}}

To clarify, let’s visualize the schema changes as ∆~i~:

{{D = \[B ~0~(s ~0~,t), B ~1~(s ~0~,t), … ∆~1~, B ~i~(s ~1~,t), B ~i+1~(s ~1~,t), …]}}

The above sequence can describe any series of batches. As I understand the proposal, we want to put some constraints on the sequence.

h3. Results of a Drill Query

Let’s start by defining how we should present the multi-relation to the client. The client also receives batches, but, following the rules in the proposal, we want to constrain the output:

{{D ~e~ : B(0,0)}} -- Null result
{{ | B(s,0)}} -- Empty result
{{ | B
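The compatibility rule and the two union identities from the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Drill's implementation: the names {{Column}}, {{compatible}} and {{union}} are made up for this example, a relation is modelled as a (schema, rows) pair, and rows are dicts keyed by column name so that column order does not matter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """One column schema c = (name, type). Hypothetical type for this sketch."""
    name: str
    type: str


def compatible(s1, s2):
    """Schemas are compatible iff their column *sets* are identical.

    Column order is ignored, which is the 'more forgiving' rule
    described in the comment.
    """
    return set(s1) == set(s2)


def union(r1, r2):
    """R(s,n) U R(s,m) = R(s,n+m) for two compatible relations.

    Each relation is a (schema, rows) pair; rows are dicts keyed by
    column name. Raises ValueError when the schemas differ, which is
    what the comment calls a schema change.
    """
    (s1, rows1), (s2, rows2) = r1, r2
    if not compatible(s1, s2):
        raise ValueError("schema change: relations are not compatible")
    return (s1, rows1 + rows2)


# Same columns in a different order are still compatible.
s_a = [Column("id", "INT"), Column("name", "VARCHAR")]
s_b = [Column("name", "VARCHAR"), Column("id", "INT")]
empty = (s_a, [])
normal = (s_b, [{"id": 1, "name": "x"}])
merged = union(empty, normal)  # R(s,0) U R(s,n) = R(s,n)
```

Comparing schemas as sets is what lets an empty batch R(s,0) merge into a neighbouring batch with the same columns instead of being treated as a schema change.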