PSUdaemon opened a new pull request, #12567:
URL: https://github.com/apache/trafficserver/pull/12567

   # Configurable Hash Algorithm for Consistent Hash Parent Selection
   
     ## Overview
   
     Makes the hash algorithm used in consistent hash parent selection configurable at startup, adding two faster alternatives to the existing SipHash-2-4 implementation.
   
     ## Motivation
   
     The current implementation hard-codes SipHash-2-4 for consistent hash parent selection. While secure and DoS-resistant, it may not be optimal for all deployments. This change allows operators to choose faster algorithms based on their specific performance requirements and threat models.
   
     ## Changes
   
     ### New Hash Implementations (Zero Dependencies)
   
     - **SipHash Template** (`include/tscore/HashSip.h`)
       - Template-based implementation: `ATSHashSip<c_rounds, d_rounds>`
       - **SipHash-1-3** (type alias `ATSHash64Sip13`): ~50% faster than SipHash-2-4
       - **SipHash-2-4** (type alias `ATSHash64Sip24`): Existing algorithm, now template-based
       - License: CC0 (public domain, ASF Category A)
       - Zero code duplication between variants
       - Header-only template with compile-time optimization
   
     - **HashWyhash** (`include/tscore/HashWyhash.h`, `src/tscore/HashWyhash.cc`)
       - Wyhash v4.1: ~3-5x faster than SipHash-2-4
       - License: Unlicense (public domain, ASF Category A)
       - DoS-resistant, processes 32-byte blocks
       - Uses `seed=0` for deterministic behavior
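
As a rough illustration of the round-parameterized design listed above, here is a minimal, self-contained sketch of SipHash with the compression and finalization round counts as template parameters. It is not the code in `include/tscore/HashSip.h`: the `sketch` namespace, the free function `siphash`, and the one-shot (non-incremental) interface are assumptions made for brevity. Only the algorithm itself follows the published SipHash specification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace sketch
{
inline uint64_t
rotl(uint64_t x, int b)
{
  return (x << b) | (x >> (64 - b));
}

// One-shot SipHash-c-d over a byte buffer, keyed with (k0, k1).
// c_rounds / d_rounds mirror the template parameters named in the PR text.
template <int c_rounds, int d_rounds>
uint64_t
siphash(const uint8_t *in, size_t len, uint64_t k0, uint64_t k1)
{
  assert(in != nullptr || len == 0);

  // Initialization constants from the SipHash specification.
  uint64_t v0 = 0x736f6d6570736575ULL ^ k0;
  uint64_t v1 = 0x646f72616e646f6dULL ^ k1;
  uint64_t v2 = 0x6c7967656e657261ULL ^ k0;
  uint64_t v3 = 0x7465646279746573ULL ^ k1;

  auto sip_round = [&]() {
    v0 += v1; v1 = rotl(v1, 13); v1 ^= v0; v0 = rotl(v0, 32);
    v2 += v3; v3 = rotl(v3, 16); v3 ^= v2;
    v0 += v3; v3 = rotl(v3, 21); v3 ^= v0;
    v2 += v1; v1 = rotl(v1, 17); v1 ^= v2; v2 = rotl(v2, 32);
  };

  // Compress every full 8-byte little-endian word.
  const size_t full = len & ~static_cast<size_t>(7);
  for (size_t i = 0; i < full; i += 8) {
    uint64_t m = 0;
    for (int j = 0; j < 8; ++j) {
      m |= static_cast<uint64_t>(in[i + j]) << (8 * j);
    }
    v3 ^= m;
    for (int r = 0; r < c_rounds; ++r) {
      sip_round();
    }
    v0 ^= m;
  }

  // Final word: remaining 0..7 bytes, with the length in the top byte.
  uint64_t b = static_cast<uint64_t>(len) << 56;
  for (size_t j = 0; j < (len & 7); ++j) {
    b |= static_cast<uint64_t>(in[full + j]) << (8 * j);
  }
  v3 ^= b;
  for (int r = 0; r < c_rounds; ++r) {
    sip_round();
  }
  v0 ^= b;

  // Finalization.
  v2 ^= 0xff;
  for (int r = 0; r < d_rounds; ++r) {
    sip_round();
  }
  return v0 ^ v1 ^ v2 ^ v3;
}
} // namespace sketch
```

In this sketch `sketch::siphash<2, 4>` and `sketch::siphash<1, 3>` share every line except the loop bounds, which is the point of the template. Passing `k0 = k1 = 0` is one plausible reading of the fixed `seed=0` mentioned above, though how the patch maps its seed onto the SipHash key is not stated here.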
   
     ### Configuration Infrastructure
   
     - **New Configuration Variable:**
       ```yaml
       proxy.config.http.parent_proxy.consistent_hash_algorithm: siphash24
       ```
     - Values: `siphash24` (default), `siphash13`, `wyhash`
     - Only affects `round_robin=consistent_hash` in `parent.config`
     - Requires restart to take effect
     - **Implementation:**
       - Added `ParentHashAlgorithm` enum to `ParentSelection.h`
       - Factory pattern in `ParentConsistentHash::createHashInstance()`
       - Config reading in `ParentRecord::Init()`
       - Registered in `RecordsConfig.cc`
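
The enum-plus-factory arrangement listed above might look roughly like the following. Only `ParentHashAlgorithm` and the three configuration strings come from this description; the enumerator names, `parse_hash_algorithm`, `create_hash_instance`, and the `Hasher` stubs are hypothetical stand-ins, and exact, case-sensitive matching is an assumption rather than a statement about what the patch does.

```cpp
#include <memory>
#include <string_view>

// Enum name from the PR description; enumerator names are assumed.
enum class ParentHashAlgorithm { SIPHASH24, SIPHASH13, WYHASH };

// Hypothetical helper: map the config string to an enum value.
// Anything unrecognized falls back to SipHash-2-4, the documented default.
constexpr ParentHashAlgorithm
parse_hash_algorithm(std::string_view name)
{
  if (name == "siphash13") {
    return ParentHashAlgorithm::SIPHASH13;
  }
  if (name == "wyhash") {
    return ParentHashAlgorithm::WYHASH;
  }
  return ParentHashAlgorithm::SIPHASH24;
}

// Compile-time checks of the mapping and of the fallback.
static_assert(parse_hash_algorithm("wyhash") == ParentHashAlgorithm::WYHASH);
static_assert(parse_hash_algorithm("no-such-hash") == ParentHashAlgorithm::SIPHASH24);

// Stand-in hash interface; the real classes do actual hashing.
struct Hasher {
  virtual ~Hasher()                     = default;
  virtual std::string_view name() const = 0;
};
struct Sip24Stub final : Hasher {
  std::string_view name() const override { return "siphash24"; }
};
struct Sip13Stub final : Hasher {
  std::string_view name() const override { return "siphash13"; }
};
struct WyhashStub final : Hasher {
  std::string_view name() const override { return "wyhash"; }
};

// Hypothetical factory: one hash object per request for the chosen algorithm.
inline std::unique_ptr<Hasher>
create_hash_instance(ParentHashAlgorithm algo)
{
  switch (algo) {
  case ParentHashAlgorithm::SIPHASH13:
    return std::make_unique<Sip13Stub>();
  case ParentHashAlgorithm::WYHASH:
    return std::make_unique<WyhashStub>();
  case ParentHashAlgorithm::SIPHASH24:
    break;
  }
  return std::make_unique<Sip24Stub>();
}
```

Parsing once at startup and passing the enum around keeps string comparisons off the request path, which fits the "requires restart" behavior noted above.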
   
     ## Testing
   
     745 assertions, all passing:
   
     - **Unit Tests** (44 assertions)
       - `test_HashAlgorithms.cc`: Comprehensive tests for HashSip13 and Wyhash
       - Tests: determinism, empty input, single byte, block boundaries, incremental updates, URL patterns, clear/reuse
     - **Integration Tests** (14 assertions)
       - `test_ParentHashConfig.cc`: Config parsing and validation
       - Tests: valid inputs, invalid input fallback, case sensitivity, backward compatibility
     - **Related Tests** (687 assertions)
       - `test_NextHopConsistentHash`: 111 assertions
       - `test_NextHopRoundRobin`: 55 assertions
       - `test_NextHopStrategyFactory`: 521 assertions
   
     ## Documentation
   
     - `records.yaml.en.rst`
       - Full configuration documentation
       - Performance characteristics for each algorithm
       - Migration warning about request redistribution
     - `parent.config.en.rst`
       - Hash algorithm reference in `consistent_hash` section
       - Cross-reference to `records.yaml` for details
   
     ## Backward Compatibility
   
     Fully backward compatible:
   
     - Default remains `siphash24` (existing behavior unchanged)
     - All hash implementations use `seed=0` for deterministic behavior across restarts
     - Existing tests pass with no regressions
     - No changes to parent selection logic, only hash implementations
   
     **Migration Consideration:**
   
     Changing the hash algorithm will cause requests to be redistributed differently across parent proxies. This can lead to cache churn and increased origin load during the transition. Plan migrations carefully and consider doing them during low-traffic periods.
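
To make the redistribution effect concrete, the sketch below builds a generic consistent-hash ring twice over the same parents, once per hash function, and counts how many keys land on a different parent. Everything here (`Ring`, `toy_hash`, `count_moved`, the replica count of 64) is illustrative and unrelated to the actual ATS classes. `toy_hash` is FNV-1a followed by the splitmix64 finalizer, and a different salt stands in for "a different algorithm".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

using HashFn = std::function<uint64_t(const std::string &)>;

// Illustrative hash only: salted FNV-1a plus the splitmix64 finalizer.
inline uint64_t
toy_hash(const std::string &s, uint64_t salt)
{
  uint64_t h = 0xcbf29ce484222325ULL ^ salt; // FNV-1a offset basis, salted
  for (unsigned char c : s) {
    h ^= c;
    h *= 0x100000001b3ULL; // FNV-1a prime
  }
  h ^= h >> 30;
  h *= 0xbf58476d1ce4e5b9ULL;
  h ^= h >> 27;
  h *= 0x94d049bb133111ebULL;
  h ^= h >> 31;
  return h;
}

// Generic consistent-hash ring: each parent appears at `replicas` positions.
class Ring
{
public:
  Ring(const std::vector<std::string> &parents, HashFn hash, int replicas = 64) : hash_(std::move(hash))
  {
    for (const auto &p : parents) {
      for (int i = 0; i < replicas; ++i) {
        ring_[hash_(p + "#" + std::to_string(i))] = p;
      }
    }
  }

  // First ring position at or after the key's hash, wrapping around.
  const std::string &
  lookup(const std::string &key) const
  {
    assert(!ring_.empty());
    auto it = ring_.lower_bound(hash_(key));
    if (it == ring_.end()) {
      it = ring_.begin();
    }
    return it->second;
  }

private:
  HashFn hash_;
  std::map<uint64_t, std::string> ring_;
};

// Number of keys whose chosen parent differs between the two rings.
inline size_t
count_moved(const Ring &before, const Ring &after, const std::vector<std::string> &keys)
{
  size_t moved = 0;
  for (const auto &k : keys) {
    if (before.lookup(k) != after.lookup(k)) {
      ++moved;
    }
  }
  return moved;
}
```

With N equally weighted parents, two unrelated hash functions would be expected to agree on only about 1/N of keys, so most cached objects change parent after a switch. That is why the change behaves more like a cold start of the parent tier than like adding or removing a single parent.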
   
     ## Performance Characteristics
   
     | Algorithm | Speed vs SipHash-2-4 | Compression Rounds     | Finalization Rounds | DoS Resistant |
     |-----------|----------------------|------------------------|---------------------|---------------|
     | siphash24 | Baseline (1.0x)      | 2                      | 4                   | ✅ Yes        |
     | siphash13 | ~1.5x faster         | 1                      | 3                   | ✅ Yes        |
     | wyhash    | ~3-5x faster         | N/A (different design) | N/A                 | ✅ Yes        |
   
     ## Implementation Highlights
   
     - **Template-Based SipHash:** Uses the `ATSHashSip<c_rounds, d_rounds>` template to eliminate code duplication between SipHash variants. Type aliases `ATSHash64Sip24` and `ATSHash64Sip13` provide convenient access. Because the round counts are compile-time constants, the compiler can unroll the round loops, so the template adds no runtime overhead.
     - **Deterministic Seeding:** All hash implementations use `seed=0` to ensure consistent parent selection across server restarts, preventing cache churn.
     - **Factory Pattern:** `ParentConsistentHash::createHashInstance()` selects the hash algorithm based on config, falling back to SipHash-2-4 for unknown values.
   
     ## Future Work
   
     - Phase 2: Add XXH3 if an external dependency is acceptable
     - Phase 3: Implement per-parent-set hash configuration with configurable seed values
   
     ## Testing Instructions
   
     1. Build with changes: `cmake --build build`
     2. Run hash tests: `./build/src/tscore/test_tscore "[HashSip13]" "[HashWyhash]"`
     3. Run config tests: `./build/src/proxy/unit_tests/test_proxy`
     4. Verify default: Check that `proxy.config.http.parent_proxy.consistent_hash_algorithm` defaults to `siphash24` in `configs/records.yaml.default.in`
   
     ## Configuration Example
   
     **Global hash algorithm setting (in records.yaml):**
     ```yaml
     http:
       parent_proxy:
         consistent_hash_algorithm: wyhash  # or siphash24, siphash13
     ```
   
     **Parent selection rule (in parent.config):**
     ```
     # The hash algorithm configured in records.yaml will be used
     dest_domain=example.com parent=p1:80,p2:80 round_robin=consistent_hash
     ```
   
     **Note:** The hash algorithm is a global setting that affects all parent selections using `round_robin=consistent_hash`. It cannot be configured per-parent-set in this implementation. Future work (Phase 3) may add per-parent-set hash configuration.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.