aglinxinyuan opened a new issue, #5550:
URL: https://github.com/apache/texera/issues/5550

   ## Background
   
   Two modules in `engine/architecture/logreplay` currently lack a dedicated unit spec:
   
   | Source class | Purpose |
   | --- | --- |
   | `EmptyReplayLogger` | Null-object `ReplayLogger` whose `logCurrentStepWithMessage` / `markAsReplayDestination` are no-ops and `drainCurrentLogRecords` returns an empty array |
   | `ReplayLogGenerator` | Pure decoder: walks a `SequentialRecordStorage[ReplayLogRecord]` and partitions the records into a `(steps, messages)` queue pair, stopping at the requested `ReplayDestination` |
   
   Both sit on the fault-tolerance / replay hot path. The `ReplayLogger` / `OrderEnforcer` traits and `ReplayOrderEnforcer` are out of scope here: they have heavier dependencies and are covered separately.
   
   ## Behavior to pin
   
   ### `EmptyReplayLogger`
   
   | Surface | Contract |
   | --- | --- |
   | `logCurrentStepWithMessage` | no-op (returns Unit, no exception) |
   | `markAsReplayDestination` | no-op |
   | `drainCurrentLogRecords` | returns an empty `Array[ReplayLogRecord]` regardless of step |
   | typing | is a `ReplayLogger` (compile-time enforced) |
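
The contract above can be sketched as a null-object implementation plus the checks a spec would make. This is a minimal, self-contained illustration, not the Texera code: the `ReplayLogger` trait, its method signatures, and the parameter types (`Long` step, `String` destination id) are simplified stand-ins assumed for this sketch.

```scala
object EmptyReplayLoggerSketch {
  // Hypothetical stand-in for the real record type.
  trait ReplayLogRecord

  // Hypothetical, simplified trait; the real signatures may differ.
  trait ReplayLogger {
    def logCurrentStepWithMessage(step: Long, message: Option[String]): Unit
    def markAsReplayDestination(id: String): Unit
    def drainCurrentLogRecords(step: Long): Array[ReplayLogRecord]
  }

  // Null object: every operation is a no-op and draining yields nothing.
  class EmptyReplayLogger extends ReplayLogger {
    override def logCurrentStepWithMessage(step: Long, message: Option[String]): Unit = ()
    override def markAsReplayDestination(id: String): Unit = ()
    override def drainCurrentLogRecords(step: Long): Array[ReplayLogRecord] =
      Array.empty[ReplayLogRecord]
  }

  def main(args: Array[String]): Unit = {
    // The type ascription is the compile-time "is a ReplayLogger" check.
    val logger: ReplayLogger = new EmptyReplayLogger
    logger.logCurrentStepWithMessage(1L, Some("msg")) // must not throw
    logger.markAsReplayDestination("dest")            // must not throw
    // Draining is empty regardless of the step passed in.
    assert(logger.drainCurrentLogRecords(0L).isEmpty)
    assert(logger.drainCurrentLogRecords(42L).isEmpty)
  }
}
```

In the actual spec, each assertion would presumably become its own test case against the production class rather than a stand-in.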
   
   ### `ReplayLogGenerator.generate`
   
   | Surface | Contract |
   | --- | --- |
   | empty storage (`getStorage(None)`) | returns `(empty queue, empty queue)` |
   | storage containing only `ProcessingStep` records | enqueues all into `steps`; `messages` queue is empty |
   | storage containing only `MessageContent` records | enqueues all into `messages`; `steps` queue is empty |
   | storage containing both kinds, interleaved | partitions correctly by type, preserving per-type insertion order |
   | `ReplayDestination(id)` matching `replayTo` | short-circuits: records after that point are NOT enqueued |
   | `ReplayDestination(id)` NOT matching `replayTo` | is silently skipped (does NOT end iteration) |
   | unknown record type | throws `RuntimeException` |
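
The rows above describe a single pass over the log that partitions records by type. The sketch below models that behavior over a plain `Iterator` so it runs without any storage or serde setup. The record case classes, their fields, and the `generate` signature are assumptions made for illustration, not the production definitions.

```scala
import scala.collection.mutable

object ReplayLogGeneratorSketch {
  // Hypothetical stand-ins; the trait is left unsealed so an unknown
  // record type can exist for the error case.
  trait ReplayLogRecord
  final case class ProcessingStep(channel: String, step: Long) extends ReplayLogRecord
  final case class MessageContent(payload: String) extends ReplayLogRecord
  final case class ReplayDestination(id: String) extends ReplayLogRecord
  case object UnknownRecord extends ReplayLogRecord

  // Walks the records once, splitting them into (steps, messages) and
  // stopping at the first ReplayDestination whose id equals replayTo.
  def generate(
      records: Iterator[ReplayLogRecord],
      replayTo: String
  ): (mutable.Queue[ProcessingStep], mutable.Queue[MessageContent]) = {
    val steps = mutable.Queue.empty[ProcessingStep]
    val messages = mutable.Queue.empty[MessageContent]
    var reached = false
    while (!reached && records.hasNext) {
      records.next() match {
        case s: ProcessingStep     => steps.enqueue(s)
        case m: MessageContent     => messages.enqueue(m)
        case ReplayDestination(id) => if (id == replayTo) reached = true // else skip
        case other                 => throw new RuntimeException(s"cannot handle $other")
      }
    }
    (steps, messages)
  }

  def main(args: Array[String]): Unit = {
    val log = Iterator[ReplayLogRecord](
      ProcessingStep("c1", 0L), MessageContent("a"), ReplayDestination("other"),
      ProcessingStep("c2", 1L), ReplayDestination("target"), MessageContent("b")
    )
    val (steps, messages) = generate(log, "target")
    // The non-matching destination is skipped; the matching one ends the walk.
    assert(steps.toList == List(ProcessingStep("c1", 0L), ProcessingStep("c2", 1L)))
    assert(messages.toList == List(MessageContent("a")))
  }
}
```

The real spec would feed records through the storage writer and reader rather than an in-memory iterator, so it also exercises serialization; the partitioning assertions would keep the same shape.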
   
   ## Scope
   
   - New spec files (one per source class per the spec-filename convention):
     - `EmptyReplayLoggerSpec.scala`
     - `ReplayLogGeneratorSpec.scala`
   - No production-code changes.
   - The `ReplayLogGenerator` spec uses an in-memory `SequentialRecordStorage` (the `VFSRecordStorage` backed by a temp dir; same pattern as the existing checkpoint specs) and the production `AmberRuntime.serde` for round-trip.

