yknoya commented on PR #12395:
URL: https://github.com/apache/trafficserver/pull/12395#issuecomment-3142000024

   # Problem
   A bug was discovered where the cache loaded from host.db is not being used.
   Upon debugging, it was found that the following logic in the `probe` 
function is executed, resulting in the cache being marked as stale:
   
https://github.com/apache/trafficserver/blob/9f6ababfed6c9d0266ee1d7e1703a09413ca48cb/iocore/hostdb/HostDB.cc#L567-L570
   
   # Cause
   The cause of the stale determination was identified in `HostDB::unmarshall`, 
specifically in the following section:
   
https://github.com/apache/trafficserver/blob/9f6ababfed6c9d0266ee1d7e1703a09413ca48cb/iocore/hostdb/I_HostDBProcessor.h#L180-L183
   
   This code first populates `HostDBInfo` using memcpy, then calls the 
constructor via placement new.
   As described in the comment, the constructor is invoked to reset the 
reference count of the base class `RefCountObj` to zero.
   However, calling the constructor also reinitializes members that use default 
member initializers:
   
https://github.com/apache/trafficserver/blob/9f6ababfed6c9d0266ee1d7e1703a09413ca48cb/iocore/hostdb/I_HostDBProcessor.h#L322-L339
   
   Debug logs added before and after the constructor call showed that fields 
such as `ip_timestamp` were unintentionally reset, causing the cache to be 
incorrectly judged as stale:
   ```
   [Aug  1 10:29:01.022] traffic_server DEBUG: <I_HostDBProcessor.h:182 
(unmarshall)> (hostdb) HostDBInfo before new: key=17900179173150759707 
hostname_offset=80, ip_timestamp=1754011708, ip_timeout_interval=900, 
_iobuffer_index=0, m_refcount=2
   [Aug  1 10:29:01.022] traffic_server DEBUG: <I_HostDBProcessor.h:192 
(unmarshall)> (hostdb) HostDBInfo after new: key=0 hostname_offset=0, 
ip_timestamp=0, ip_timeout_interval=0, _iobuffer_index=0, m_refcount=0
   ```
   
   The patch used to add the debug logs is shown below:
   <details>
   <summary>patch</summary>
   
   ```diff
   diff --git a/iocore/hostdb/I_HostDBProcessor.h 
b/iocore/hostdb/I_HostDBProcessor.h
   index 0be0a81b4..2356337d7 100644
   --- a/iocore/hostdb/I_HostDBProcessor.h
   +++ b/iocore/hostdb/I_HostDBProcessor.h
   @@ -178,10 +178,22 @@ struct HostDBInfo : public RefCountObj {
        HostDBInfo *ret = HostDBInfo::alloc(size - sizeof(HostDBInfo));
        int buf_index   = ret->_iobuffer_index;
        memcpy((void *)ret, buf, size);
   +
   +    Debug("hostdb", "HostDBInfo before new: key=%lu hostname_offset=%u, "
   +      "ip_timestamp=%u, ip_timeout_interval=%u, _iobuffer_index=%d, 
m_refcount=%d",
   +      ret->key, ret->hostname_offset, ret->ip_timestamp, 
ret->ip_timeout_interval,
   +      ret->_iobuffer_index, ret->refcount());
   +
        // Reset the refcount back to 0, this is a bit ugly-- but I'm not sure 
we want to expose a method
        // to mess with the refcount, since this is a fairly unique use case
        ret                  = new (ret) HostDBInfo();
        ret->_iobuffer_index = buf_index;
   +
   +    Debug("hostdb", "HostDBInfo after new: key=%lu hostname_offset=%u, "
   +      "ip_timestamp=%u, ip_timeout_interval=%u, _iobuffer_index=%d, 
m_refcount=%d",
   +      ret->key, ret->hostname_offset, ret->ip_timestamp, 
ret->ip_timeout_interval,
   +      ret->_iobuffer_index, ret->refcount());
   +
        return ret;
      }
    
   diff --git a/iocore/hostdb/RefCountCache.cc b/iocore/hostdb/RefCountCache.cc
   index af9cf0fd2..8f58c3767 100644
   --- a/iocore/hostdb/RefCountCache.cc
   +++ b/iocore/hostdb/RefCountCache.cc
   @@ -49,5 +49,5 @@ RefCountCacheHeader::operator==(RefCountCacheHeader const 
&that) const
    bool
    RefCountCacheHeader::compatible(RefCountCacheHeader *that) const
    {
   -  return this->magic == that->magic && this->version == that->version && 
this->object_version == that->version;
   +  return this->magic == that->magic && this->version == that->version && 
this->object_version == that->object_version;
    };
   ```
   </details>
   
   
   # Fix
   To fix this issue, the construction of `HostDBInfo` in `HostDB::unmarshall` 
must satisfy the following conditions:
   - The reference counter (`m_refcount`) should be initialized.
   - All other member variables should not be reinitialized.
   
   I considered ways to initialize the reference counter without calling the 
constructor.
   However, I concluded that invoking the constructor is also necessary to 
reassign the virtual function table pointer (vptr):
   
https://github.com/apache/trafficserver/blob/9f6ababfed6c9d0266ee1d7e1703a09413ca48cb/iocore/hostdb/I_HostDBProcessor.h#L180-L183
   
   As a result, I chose the following workaround:
   Back up the member variable values before calling the constructor, then 
restore them after the constructor has been executed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to