Re: gr-sleipnir Digital Voice Protocol - Test Results and Feedback Request

Håken Hveem Fri, 12 Dec 2025 08:11:57 -0800

On 12.12.2025 16:59, Håken Hveem wrote:

Great questions! Let me address each:


    Synchronization
You're right to be curious - the ±500 Hz tolerance at 4 kSym/s (~12.5% of symbol rate) is better than expected. Here's what's happening:
*Current implementation:*

  * GNU Radio's |digital.clock_recovery_mm_ff| for symbol timing
  * |digital.fll_band_edge_cc| for coarse frequency correction
  * Frame sync via correlation with known preamble (64-symbol
    Barker-like sequence)
The FLL band edge tracker is doing most of the heavy lifting for frequency offset. It's designed for continuous-phase modulations and works reasonably well with GFSK (which is what I'm actually using, not raw FSK - should have been clearer about that).

*But you've identified a weak point:* I haven't implemented fine carrier tracking after frame sync. The coarse FLL + preamble correlation is surprisingly robust in simulation, but I'm skeptical it'll hold up in real hardware with drift during the frame. This is on the Q2 2025 roadmap - probably need a pilot tone or decision-directed tracking.

The ±1 kHz failure is likely the FLL's tracking range limit. Beyond that, it loses lock entirely.
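
A minimal pure-Python sketch of the preamble-correlation step, to make the idea concrete. It is not the gr-sleipnir code: the actual 64-symbol sequence is not given here, so the 13-symbol Barker code stands in for it, and `find_preamble`, `make_test_signal` and the 0.7 threshold are hypothetical names and values chosen for illustration.

```python
import random

# Stand-in preamble: the 13-symbol Barker code. The real preamble is a
# 64-symbol Barker-like sequence whose values are not given in this thread.
BARKER_13 = [+1, +1, +1, +1, +1, -1, -1, +1, +1, -1, +1, -1, +1]


def find_preamble(soft_symbols, preamble=BARKER_13, threshold=0.7):
    """Return the offset with the highest normalised correlation against
    the preamble, or None if the peak stays below `threshold`
    (an assumed value, to be tuned on real signals)."""
    n = len(preamble)
    best_offset, best_score = None, 0.0
    for offset in range(len(soft_symbols) - n + 1):
        window = soft_symbols[offset:offset + n]
        score = sum(s * p for s, p in zip(window, preamble)) / n
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset if best_score >= threshold else None


def make_test_signal(offset=7, tail=10, noise=0.2, seed=1):
    """Noise-only samples, then the noisy preamble, then more noise."""
    rng = random.Random(seed)

    def jitter():
        return rng.uniform(-noise, noise)

    return ([jitter() for _ in range(offset)]
            + [p + jitter() for p in BARKER_13]
            + [jitter() for _ in range(tail)])


print(find_preamble(make_test_signal(offset=7)))
```

The sketch works on soft symbol values after timing recovery and assumes any residual frequency offset is small over the preamble. A longer preamble raises the correlation peak relative to the sidelobes, which is one reason a 64-symbol sequence is more forgiving than this 13-symbol stand-in.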
    LDPC Code Choice
You're exactly right about the trade-off. I'm using custom codes from the DVB-S2/T2 family:
  * 4FSK: Rate 3/4, n=1000, k=750 (4FSK needs higher rate for 6 kbps Opus)
  * 8FSK: Rate 2/3, n=1125, k=750 (8FSK can afford lower rate for
    better protection)
These are *short* by LDPC standards (DVB-S2 uses 64800 bits!). I chose them because:
  * Target latency: <80ms total (40ms Opus frame is non-negotiable)
  * Belief propagation still converges in ~20-30 iterations with these
    sizes
  * ~2 dB from Shannon limit (not optimal, but acceptable for voice)
*Could I do better?* Probably with optimized irregular LDPC codes designed specifically for 750-1000 bit blocks, but that's research territory. I borrowed proven codes from DVB to avoid reinventing the wheel.
    ChaCha20 and Frame Loss
*You've identified a real issue!* ChaCha20 is indeed a stream cipher, and losing synchronization is catastrophic.
*Current approach:*

  * Each 40ms frame is encrypted *independently*
  * Frame number used as nonce (increments each frame)
  * Key stays constant for the transmission
  * Format: |ChaCha20(frame_data, key, nonce=frame_number)|

*How frame loss is handled:*

 1. Receiver knows expected frame sequence (from superframe counter)
 2. If frame N is lost (FER), receiver knows to skip that nonce
 3. Frame N+1 arrives → use nonce=N+1 → decrypts correctly
 4. No keystream reinitialization needed
*The trick:* Frame numbers are transmitted in a *separate authenticated header* (not encrypted), protected by its own LDPC code. This header has stronger protection (rate 1/3) than the voice payload (rate 2/3 or 3/4).

*Failure mode:* If the header is corrupted, the entire frame is discarded. This is why crypto overhead appears as increased FER - it's not the encryption itself, but the additional header that can fail.
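
The per-frame scheme above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the gr-sleipnir code: the ChaCha20 block function follows RFC 8439, but the nonce layout (frame number as a 96-bit little-endian integer), the block counter starting at 0, and the helper names `frame_nonce` and `crypt_frame` are all hypothetical. Header authentication and LDPC coding are left out.

```python
import struct


def _rotl32(v, n):
    return ((v << n) & 0xFFFFFFFF) | (v >> (32 - n))


def _quarter_round(s, a, b, c, d):
    s[a] = (s[a] + s[b]) & 0xFFFFFFFF; s[d] = _rotl32(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & 0xFFFFFFFF; s[b] = _rotl32(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & 0xFFFFFFFF; s[d] = _rotl32(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & 0xFFFFFFFF; s[b] = _rotl32(s[b] ^ s[c], 7)


def chacha20_block(key, counter, nonce):
    """One 64-byte keystream block (32-byte key, 12-byte nonce), RFC 8439."""
    init = list(struct.unpack("<4I", b"expand 32-byte k"))
    init += struct.unpack("<8I", key)
    init.append(counter)
    init += struct.unpack("<3I", nonce)
    s = init[:]
    for _ in range(10):  # 10 double rounds = 20 rounds
        _quarter_round(s, 0, 4, 8, 12)   # column rounds
        _quarter_round(s, 1, 5, 9, 13)
        _quarter_round(s, 2, 6, 10, 14)
        _quarter_round(s, 3, 7, 11, 15)
        _quarter_round(s, 0, 5, 10, 15)  # diagonal rounds
        _quarter_round(s, 1, 6, 11, 12)
        _quarter_round(s, 2, 7, 8, 13)
        _quarter_round(s, 3, 4, 9, 14)
    return struct.pack("<16I", *[(a + b) & 0xFFFFFFFF for a, b in zip(s, init)])


def chacha20_xor(key, nonce, data, counter=0):
    """XOR data with the keystream; the same call encrypts and decrypts."""
    out = bytearray()
    for i in range(0, len(data), 64):
        keystream = chacha20_block(key, counter + i // 64, nonce)
        out += bytes(x ^ k for x, k in zip(data[i:i + 64], keystream))
    return bytes(out)


def frame_nonce(frame_number):
    # Assumed layout: frame number as a 96-bit little-endian integer.
    return frame_number.to_bytes(12, "little")


def crypt_frame(key, frame_number, payload):
    """Encrypt or decrypt one frame independently of all other frames."""
    return chacha20_xor(key, frame_nonce(frame_number), payload)


key = bytes(range(32))  # demo key only
frames = {n: f"voice frame {n}".encode() for n in range(5)}
on_air = {n: crypt_frame(key, n, p) for n, p in frames.items()}
del on_air[2]  # frame 2 is lost or its header failed: just skip that nonce
received = {n: crypt_frame(key, n, c) for n, c in on_air.items()}
print(sorted(received))
```

Because every frame restarts the keystream from its own nonce, the receiver only needs the frame number from the header to decrypt frame N+1 after losing frame N. The usual stream-cipher caveat applies: a (key, frame number) pair should never be used for two different frames.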
*Alternative considered:* Your multi-instance round-robin idea is clever! I didn't implement it because:
  * Added complexity (state management)
  * Frame numbers solve it more simply
  * Voice can tolerate 5% loss (Opus error concealment)
For data applications (where 5% loss is unacceptable), your approach might be necessary.
    Soft-Decision LDPC

*You're correct - I'm NOT using soft-decision decoding yet!*
Current implementation uses |gr-fec|'s |ldpc_decoder| in *hard-decision mode*:
  * FSK demod → hard bits (0/1) → LDPC decoder
  * This creates the 4-5% FER floor

*Why not soft-decision?*

  * Honestly: Implementation complexity
  * |gr-fec|'s sum-product decoder exists, but I couldn't get it
    working reliably in time for Phase 3 testing
  * Hard-decision was "good enough" to validate the protocol design

*Roadmap (2025?):* Implement soft-decision properly:

  * FSK demod → log-likelihood ratios (LLRs) → sum-product algorithm
  * Expected improvement: 1-2 dB waterfall, FER floor <0.1%
  * This should bring me closer to theoretical LDPC performance
You're right that I'm leaving performance on the table. Hard-decision was a pragmatic choice for "get it working first."
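
To illustrate the difference between hard bits and LLRs, here is a small sketch of one common way to get per-bit LLRs out of an M-FSK demodulator: a max-log approximation over the tone-detector energies. It is not the planned gr-sleipnir design. The natural-binary tone-to-bits mapping, the `noise_power` scaling and the function names are assumptions made for the example.

```python
def fsk_bit_llrs(tone_energies, noise_power=1.0):
    """Max-log bit LLRs for one M-FSK symbol.

    tone_energies[i] is the detector energy for tone i. Tone i is assumed
    to carry the natural-binary pattern of i, MSB first (hypothetical
    mapping). A positive LLR means bit value 0 is more likely.
    """
    m = len(tone_energies)
    bits = m.bit_length() - 1
    if m < 2 or (1 << bits) != m:
        raise ValueError("number of tones must be a power of two")
    llrs = []
    for b in range(bits):
        shift = bits - 1 - b
        # Strongest tone among those whose bit b is 0, and whose bit b is 1.
        best0 = max(e for i, e in enumerate(tone_energies)
                    if not (i >> shift) & 1)
        best1 = max(e for i, e in enumerate(tone_energies)
                    if (i >> shift) & 1)
        llrs.append((best0 - best1) / noise_power)
    return llrs


def hard_bits(llrs):
    """What a hard-decision demod keeps: only the sign of each LLR."""
    return [0 if llr >= 0 else 1 for llr in llrs]


clean = fsk_bit_llrs([0.1, 0.2, 0.9, 0.3])      # tone 2 clearly strongest
marginal = fsk_bit_llrs([0.1, 0.5, 0.52, 0.1])  # tones 1 and 2 nearly tied
print(hard_bits(clean), clean)
print(hard_bits(marginal), marginal)
```

Both symbols give the same hard bits, but the second one has LLR magnitudes close to zero. A sum-product decoder can use that to treat those bits as unreliable, which is the information a hard-decision front end throws away.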
    FSK vs Other Modulations

*You're absolutely right - FSK is suboptimal for power efficiency!*

*Why I chose GFSK:*

 1. *Simplicity:* Easy to implement, debug, and test
 2. *Constant envelope:* Good for non-linear amplifiers (typical ham
    radios)
 3. *Narrow bandwidth:* 9-12 kHz fits in 12.5 kHz channels
 4. *Phase continuity:* GFSK (not raw FSK) maintains phase across symbols

*Better alternatives:*

  * *QPSK/OQPSK:* 2 bits/symbol, better power efficiency, needs linear amp
  * *APSK:* Even better, but more complex equalization
  * *GMSK:* Similar to GFSK, proven (used in GSM)

*Why not QPSK?*

  * Requires linear amplifier (not always available in ham radios)
  * More sensitive to phase noise
  * More complex synchronization
*Future consideration:* A "high-performance mode" with QPSK for fixed stations with linear amps could be interesting. Would gain 2-3 dB over GFSK.

But for initial deployment targeting typical ham gear (Class C finals), constant-envelope modulation seemed safer.
    Target Frequencies

*Primary target:* VHF/UHF (144-148 MHz, 430-440 MHz)

  * Where digital voice is most active
  * Hardware available (MCM-iMX93 with SX1255 covers 400-930 MHz)

*Possible extension:* 6m, 2m, 70cm, 23cm

  * Protocol is frequency-agnostic
  * RF hardware is the limiting factor


    Back-of-Envelope Check

Your math is correct:

  * 8FSK: 8 kbps audio → 12 kbps coded → 4 kSym/s
  * Symbol rate: 4000 symbols/second
  * ±500 Hz tolerance: 12.5% of symbol rate
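
The same arithmetic as a short script, using the figures quoted in this mail for both modes. It counts voice payload bits only; header and preamble overhead are ignored, and the function names are made up for the example.

```python
from fractions import Fraction


def code_rate(n, k):
    """LDPC code rate k/n as an exact fraction."""
    return Fraction(k, n)


def symbol_rate(audio_bps, rate, m_ary):
    """Symbols per second for M-ary FSK carrying coded audio bits."""
    bits_per_symbol = m_ary.bit_length() - 1  # log2(M) for M a power of two
    coded_bps = audio_bps / rate
    return coded_bps / bits_per_symbol


modes = {
    "4FSK": dict(audio_bps=6000, n=1000, k=750, m_ary=4),
    "8FSK": dict(audio_bps=8000, n=1125, k=750, m_ary=8),
}

for name, m in modes.items():
    rate = code_rate(m["n"], m["k"])
    rs = symbol_rate(m["audio_bps"], rate, m["m_ary"])
    tolerance = Fraction(500) / rs  # ±500 Hz relative to the symbol rate
    print(f"{name}: rate {rate}, {int(rs)} Sym/s, "
          f"±500 Hz = {float(tolerance):.1%} of symbol rate")
```

Both modes come out at 4000 symbols per second, so the ±500 Hz figure is 12.5% of the symbol rate in either case.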

*This IS suspiciously good!* You're right to question it.

*Possible explanations:*

 1. *GFSK bandwidth:* I'm using BT=0.5, which spreads the spectrum
    more than minimum-shift FSK. This might make the band-edge FLL
    more robust.
 2. *Preamble length:* 64 symbols gives the FLL time to converge
    before frame sync.
 3. *Simulation optimism:* GNU Radio's FLL might be more ideal than
    real hardware. Hardware testing will reveal the truth.
*Action item:* I should characterize the FLL's actual tracking range experimentally, not just trust simulation. This is hardware validation work.
    Summary

You've identified several areas for improvement:

 1. *Synchronization:* Need fine tracking after frame sync
 2. *LDPC codes:* Could be optimized for block size, but DVB codes work
 3. *ChaCha20:* Solved with frame numbers, but header overhead creates FER
 4. *Soft-decision:* It's tricky to implement, so whether to do it
    depends on the cost versus the benefit.
 5. *Modulation:* GFSK is suboptimal but practical for ham gear
 6. *Frequency sync:* Need to validate ±500 Hz claim with real hardware
Really appreciate the technical depth here - these are exactly the questions that improve the design. The honest answer is: simulation shows promise, but hardware will reveal where the weaknesses are. That's why on-air testing is critical.

I am waiting for the LinHT to be released some time next year.

73
