Sahil Takiar created IMPALA-9737:
------------------------------------

             Summary: DCHECK in buffer-pool.cc - min_bytes_to_write <= dirty_unpinned_pages_.bytes()
                 Key: IMPALA-9737
                 URL: https://issues.apache.org/jira/browse/IMPALA-9737
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
            Reporter: Sahil Takiar
            Assignee: Tim Armstrong


Saw this recently in a dockerised pre-commit test run against a seemingly 
unrelated change: 
https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/10499/#showFailuresLink 
(triggered by https://gerrit.cloudera.org/#/c/14666/)

The error message from the logs is:

{code}
Error Message
DCHECK found in log file: /home/ubuntu/Impala/logs/ee_tests/impalad_node1.FATAL
Standard Error
Log file created at: 2020/05/07 18:07:14
Running on machine: ip-172-31-3-33
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
F0507 18:07:14.606797 88747 buffer-pool.cc:711] 3f4ad52d42fef180:55b3458000000004] Check failed: min_bytes_to_write <= dirty_unpinned_pages_.bytes() (262144 vs. 0) <BufferPool::Client> 0xedf5d680 name: HASH_JOIN_NODE id=2 ptr=0x25b0e400 write_status:  buffers allocated 262144 num_pages: 0 pinned_bytes: 0 dirty_unpinned_bytes: 0 in_flight_write_bytes: 0 reservation: {<ReservationTracker>: reservation_limit 9223372036854775807 reservation 524288 used_reservation 262144 child_reservations 0 parent:
<ReservationTracker>: reservation_limit 9223372036854775807 reservation 524288 used_reservation 0 child_reservations 524288 parent:
<ReservationTracker>: reservation_limit 175112192 reservation 132120576 used_reservation 0 child_reservations 132120576 parent:
<ReservationTracker>: reservation_limit 10952163328 reservation 326664192 used_reservation 0 child_reservations 326664192 parent:
NULL}
  0 pinned pages: 
  0 dirty unpinned pages: 
  0 in flight write pages: 
{code}

The minidump stack is:

{code}
Operating system: Linux
                  0.0.0 Linux 4.4.0-1081-aws #91-Ubuntu SMP Tue Apr 16 08:21:03 UTC 2019 x86_64
CPU: amd64
     family 6 model 79 stepping 1
     16 CPUs

GPU: UNKNOWN

Crash reason:  SIGABRT
Crash address: 0x3e8000010e6
Process uptime: not available

Thread 418 (crashed)
 0  libc-2.23.so + 0x35428
    rax = 0x0000000000000000   rdx = 0x0000000000000006
    rcx = 0x00007f948e3aa428   rbx = 0x00000000073e2300
    rsi = 0x0000000000015aab   rdi = 0x00000000000010e6
    rbp = 0x00007f9394558c60   rsp = 0x00007f93945588f8
     r8 = 0x0000000000000000    r9 = 0x0000000000000020
    r10 = 0x0000000000000008   r11 = 0x0000000000000202
    r12 = 0x00000000073e2380   r13 = 0x000000000000039a
    r14 = 0x00000000073e9cc4   r15 = 0x00000000073e2300
    rip = 0x00007f948e3aa428
    Found by: given as instruction pointer in context
 1  libc-2.23.so + 0x3702a
    rbp = 0x00007f9394558c60   rsp = 0x00007f9394558900
    rip = 0x00007f948e3ac02a
    Found by: stack scanning
 2  impalad!google::DumpStackTraceAndExit() + 0x24
    rbp = 0x00007f9394558c60   rsp = 0x00007f9394558a30
    rip = 0x0000000005010014
    Found by: stack scanning
 3  impalad!google::LogMessage::Fail() + 0xd
    rbx = 0x00000000073e2300   rbp = 0x00007f9394558c60
    rsp = 0x00007f9394558ae0   rip = 0x0000000005006a6d
    Found by: call frame info
 4  impalad!google::LogMessage::SendToLog() + 0x2b2
    rbx = 0x00000000073e2300   rbp = 0x00007f9394558c60
    rsp = 0x00007f9394558af0   rip = 0x0000000005008312
    Found by: call frame info
 5  impalad!google::LogMessage::Flush() + 0x157
    rbx = 0x00007f9394558ca0   rbp = 0x00007f948ef675a0
    rsp = 0x00007f9394558c70   r12 = 0x00007f9394558c8f
    r13 = 0x0000000000000001   r14 = 0x00007f9394558db0
    r15 = 0x0000000000000001   rip = 0x0000000005006447
    Found by: call frame info
 6  impalad!google::LogMessageFatal::~LogMessageFatal() + 0xe
    rbx = 0x00007f9394558db0   rbp = 0x00007f9394558f40
    rsp = 0x00007f9394558cf0   r12 = 0x0000000000000000
    r13 = 0x0000000000000001   r14 = 0x0000000000000001
    r15 = 0x0000000000000001   rip = 0x0000000005009a0e
    Found by: call frame info
 7  impalad!impala::BufferPool::Client::WriteDirtyPagesAsync(long) [buffer-pool.cc : 711 + 0xf]
    rbx = 0x0000000000000000   rbp = 0x00007f9394558f40
    rsp = 0x00007f9394558d10   r12 = 0x0000000000000000
    r13 = 0x0000000000000001   r14 = 0x0000000000000001
    r15 = 0x0000000000000001   rip = 0x0000000002701732
    Found by: call frame info
 8  impalad!impala::BufferPool::Client::CleanPages(std::unique_lock<std::mutex>*, long, bool) [buffer-pool.cc : 691 + 0x16]
    rbx = 0x0000000000000000   rbp = 0x00007f9394559170
    rsp = 0x00007f9394558f50   r12 = 0x0000000000000000
    r13 = 0x0000000000000001   r14 = 0x0000000000000001
    r15 = 0x0000000000000001   rip = 0x0000000002701147
    Found by: call frame info
 9  impalad!impala::BufferPool::Client::TransferReservationTo(impala::ReservationTracker*, long, bool*) [buffer-pool.cc : 648 + 0x1e]
    rbx = 0x0000000000000000   rbp = 0x00007f9394559200
    rsp = 0x00007f9394559180   r12 = 0x0000000000000000
    r13 = 0x0000000000000001   r14 = 0x0000000000000001
    r15 = 0x0000000000000001   rip = 0x0000000002700a21
    Found by: call frame info
10  impalad!impala::BufferPool::ClientHandle::TransferReservationTo(impala::ReservationTracker*, long, bool*) [buffer-pool.cc : 347 + 0x22]
    rbx = 0x0000000000000000   rbp = 0x00007f9394559240
    rsp = 0x00007f9394559210   r12 = 0x0000000000000000
    r13 = 0x0000000000000001   r14 = 0x0000000000000001
    r15 = 0x0000000000000001   rip = 0x00000000026fce68
    Found by: call frame info
11  impalad!impala::BufferPool::ClientHandle::TransferReservationTo(impala::BufferPool::ClientHandle*, long, bool*) [buffer-pool.cc : 353 + 0x33]
    rbx = 0x0000000000000000   rbp = 0x00007f93945592c0
    rsp = 0x00007f9394559250   r12 = 0x0000000000000000
    r13 = 0x0000000000000001   r14 = 0x0000000000000001
    r15 = 0x0000000000000001   rip = 0x00000000026fcf55
    Found by: call frame info
12  impalad!impala::PhjBuilder::ReturnReservation(impala::BufferPool::ClientHandle*, long) [partitioned-hash-join-builder.cc : 1155 + 0x35]
    rbx = 0x0000000000000000   rbp = 0x00007f93945593b0
    rsp = 0x00007f93945592d0   r12 = 0x0000000000000000
    r13 = 0x0000000000000001   r14 = 0x0000000000000001
    r15 = 0x0000000000000001   rip = 0x0000000002860a8f
    Found by: call frame info
13  impalad!impala::PartitionedHashJoinNode::Close(impala::RuntimeState*) [partitioned-hash-join-node.cc : 305 + 0x54]
    rbx = 0x0000000000080000   rbp = 0x00007f93945593f0
    rsp = 0x00007f93945593c0   r12 = 0x0000000000000000
    r13 = 0x0000000000000001   r14 = 0x0000000000000001
    r15 = 0x0000000000000001   rip = 0x00000000028726fc
    Found by: call frame info
14  impalad!impala::ExecNode::Close(impala::RuntimeState*) [exec-node.cc : 314 + 0x37]
    rbx = 0x0000000000000000   rbp = 0x00007f93945594e0
    rsp = 0x00007f9394559400   r12 = 0x0000000000000000
    r13 = 0x0000000000000001   r14 = 0x0000000000000001
    r15 = 0x0000000000000001   rip = 0x000000000277437c
    Found by: call frame info
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
