mcvsubbu commented on code in PR #9163:
URL: https://github.com/apache/pinot/pull/9163#discussion_r938998080
##########
pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/LLRealtimeSegmentDataManager.java:
##########
@@ -611,6 +616,19 @@ private boolean processStreamEvents(MessageBatch
messagesAndOffsets, long idlePi
return prematureExit;
}
+ /**
+ * This method decides whether decoder exceptions are to be ignored. If the
decoder exception cannot be ignored then
+ * the exception is rethrown.
+ * @param e Exception thrown whilst decoding a message from the batch
+ */
+ private void handleDecoderException(RuntimeException e) {
+ if (_partitionLevelStreamConfig.getDecoderErrorsIgnore()) {
+ _segmentLogger.warn("Ignoring exception while decoding message.", e);
Review Comment:
please dont log a warning here. Incrementing a stat is good enough. We seem
to have multiple choices to do that, though. I think the error handling here
need some cleanup.
The choices are:
- Bump numRowsErrored (currently done if transform throws exceptions OR we
are not able to index the row for some reason. We also add an error stack and
bump two metrics later ONCE when we complete completion)
- Increment the metric `INVALID_REALTIME_ROWS_DROPPED` (we do that in the
case when decoder returns null but we don't add it to error stack)
Clearly, some cleanup needs to happen, and we need to think through how we
want errors/exceptions handled.
One requirement is that we need an option to retry on certain cases as
decided by the decoder.
Here is one proposal.
- Move the call to `processStream Events()` up under the try/catch block.
- Change `processStreamEvents() to throw whatever exception we want --
permanent or temporary depending on what behavior we desire. The contract is
that you can do ONE of the two. On temporary, a retry will happen.
Also, cleanup the stats bumping. Don't any of the other errors (transform
errors) warrant the bumping of `INVALID_REALTIME_ROWS_DROPPED` ?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]