rrwright commented on code in PR #744:
URL: https://github.com/apache/pekko-http/pull/744#discussion_r2361409926


##########
http/src/main/scala/org/apache/pekko/http/scaladsl/unmarshalling/sse/ServerSentEventParser.scala:
##########
@@ -110,18 +120,57 @@ private final class ServerSentEventParser(
       import shape._
 
       private val builder = new Builder()
+      private lazy val log = Logging(materializer.system, 
classOf[ServerSentEventParser])
+      @volatile private var shouldSkipUntilEventEnd = false
 
       setHandlers(in, out, this)
 
-      override def onPush() = {
+      override def onPush(): Unit = {
         val line = grab(in)
-        if (line == "") { // An event is terminated with a new line
-          if (builder.hasData) // Events without data are ignored according to 
the spec
-            push(out, builder.build())
-          else
-            pull(in)
+        if (shouldSkipUntilEventEnd) { // Max event size was previously 
reached. Skip successive lines until event ends
+          if (line.isEmpty) shouldSkipUntilEventEnd = false // Stop skipping 
when end of event (empty line) is reached
+          pull(in) // Already reported oversized event (below). Drop and 
continue to next line.
+        } else if (maxEventSize > 0 && builder.size + line.length > 
maxEventSize) { // Next line exceeds the size limit
+          shouldSkipUntilEventEnd = true
+          oversizedStrategy match {
+            case OversizedSseStrategy.FailStream =>
+              builder.appendData(line)
+              val event = builder.build()
+              failStage(new IllegalStateException(
+                s"Oversized SSE Event ${event.id.fold("") { id => s"at ID: $id 
" }}" +

Review Comment:
   Hmm, I'm not sure what you mean by "user-controlled". The exception would 
occur in the SSE client. The `id` value comes from the SSE `id:` field which is 
data sent from the server. The meaning of the ID field from the [SSE 
spec](https://html.spec.whatwg.org/multipage/server-sent-events.html) is that 
it "sets the last event ID string," which means it identifies and distinguishes 
each semantically-whole SSE event (which will be spread across multiple lines), 
and is used for tracking events and resuming connections (e.g. browsers 
automatically send `Last-Event-ID` header on reconnect). The spec only limits 
it to a UTF-8 string of non-null characters. No length constraint (hence the 
need for a line limit which is the topic of this PR). In general, it's usually 
a counter that behaves like the numeric offset in Kafka.
   
   I'm not really familiar with the intended use of `ExceptionWithErrorInfo` 
and its docstring isn't helpful. Looking through the code, I see 13 total uses. 
Those uses look like they're all for processing HTTP entities and contexts. I 
don't really have an opinion on whether that's the right exception choice to 
use here because I don't understand the intention of that exception class.
   
   The choice of failing the stage with an `IllegalStateException` is actually 
just preserving what the code used to do previously and keeps the default 
behavior (and exception type) unless the user opts into one of the new choices. 
I'm happy to change it though if there's clear guidance.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to