aglinxinyuan opened a new pull request, #5554:
URL: https://github.com/apache/texera/pull/5554

   ### What changes were proposed in this PR?
   
   Pin behavior of two previously-uncovered modules in 
`engine/architecture/logreplay`. No production-code changes.
   
   | Spec | Source class | Tests |
   | --- | --- | --- |
   | `EmptyReplayLoggerSpec` | `EmptyReplayLogger` | 9 |
   | `ReplayLogGeneratorSpec` | `ReplayLogGenerator` | 10 |
   
   Both spec files follow the `<srcClassName>Spec.scala` one-to-one convention.
   
   **Behavior pinned — `EmptyReplayLogger`**
   
   | Surface | Contract |
   | --- | --- |
   | `drainCurrentLogRecords(step)` | returns an empty `Array[ReplayLogRecord]` 
regardless of `step` (positive, zero, `Long.MaxValue`, negative); returns a 
non-null array; element type is `ReplayLogRecord` (compile-time enforced) |
   | `markAsReplayDestination` | no-op for any 
`EmbeddedControlMessageIdentity`; does not accumulate state |
   | `logCurrentStepWithMessage` | no-op for any `(step, channelId, msg)` 
triple, including `msg = None` and 100 successive calls |
   | Trait conformance | usable through the `ReplayLogger` interface |
   
   **Behavior pinned — `ReplayLogGenerator.generate`**
   
   | Surface | Contract |
   | --- | --- |
   | `getStorage(None)` (EmptyRecordStorage) | returns `(empty queue, empty 
queue)` |
   | empty `VFSRecordStorage` file | returns `(empty queue, empty queue)` |
   | only `ProcessingStep` records | enqueues all into the steps queue, 
preserving insertion order |
   | only `MessageContent` records | enqueues all into the messages queue, 
preserving insertion order |
   | interleaved steps + messages | partitions correctly by type, preserving 
per-type order |
   | `ReplayDestination(id)` matching `replayTo` | short-circuits — records 
after the cut are NOT enqueued |
   | `ReplayDestination(id)` NOT matching `replayTo` | is silently skipped, 
iteration continues |
   | multiple matching `ReplayDestination` | stops at the FIRST one |
   | unknown record type (e.g. `TerminateSignal`) | throws `RuntimeException` 
with a diagnostic message |
   
   **Notes**
   
   While writing `ReplayLogGeneratorSpec` I discovered a **production 
stream-leak** in `ReplayLogGenerator.generate`: when it hits the matching 
`ReplayDestination` it short-circuits via a non-local `return`, leaving the 
`SequentialRecordReader`'s underlying `Input` stream open. On Windows the 
leaked file handle blocks subsequent temp-dir deletion. The spec's `cleanup` 
helper tolerates the resulting `FileSystemException` so the case still passes; 
the real fix is at the source and is out of scope for a test-coverage PR (will 
file a follow-up issue).
   
   `ReplayLogGeneratorSpec` uses a temp-dir-backed 
`VFSRecordStorage[ReplayLogRecord]` and the production `AmberRuntime.serde` 
(suite-local `ActorSystem` injected via reflection, torn down in `afterAll`) — 
same harness pattern as `CheckpointSubsystemSpec` / `ClientEventSpec` / 
`VFSRecordStorageSpec`.
   
   ### Any related issues, documentation, discussions?
   
   Closes #5550.
   
   ### How was this PR tested?
   
   Pure unit-test additions; verified locally with:
   
   - `sbt "WorkflowExecutionService/testOnly 
org.apache.texera.amber.engine.architecture.logreplay.EmptyReplayLoggerSpec 
org.apache.texera.amber.engine.architecture.logreplay.ReplayLogGeneratorSpec"` 
— 19 tests, all green
   - `sbt scalafmtCheckAll` — clean
   - CI to confirm
   
   ### Was this PR authored or co-authored using generative AI tooling?
   
   Generated-by: Claude Code (Sonnet 4.5)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to