liran-funaro commented on a change in pull request #10336:
URL: https://github.com/apache/druid/pull/10336#discussion_r484402253



##########
File path: 
processing/src/test/java/org/apache/druid/segment/incremental/OnheapIncrementalIndexBenchmark.java
##########
@@ -227,9 +224,7 @@ protected AddToFactsResult addToFacts(
           }
           catch (ParseException e) {
             // "aggregate" can throw ParseExceptions if a selector expects 
something but gets something else.
-            if (getReportParseExceptions()) {
-              throw e;
-            }
+            throw e;

Review comment:
       If we never intend to log/report this exception, we can remove the 
try/catch clause entirely.

##########
File path: 
processing/src/main/java/org/apache/druid/segment/incremental/OnheapIncrementalIndex.java
##########
@@ -186,7 +185,7 @@ protected AddToFactsResult addToFacts(
       } else {
         // We lost a race
         aggs = concurrentGet(prev);
-        parseExceptionMessages = doAggregate(metrics, aggs, rowContainer, row);
+        doAggregate(metrics, aggs, rowContainer, row, parseExceptionMessages);

Review comment:
       Currently (before and after this PR), a row that contains partially 
corrupted/incorrect data is aggregated anyway (partially).
   This may result in wrong statistics inference by the user due to 
inconsistencies between the number of rows each aggregator accumulated.
   IMO, the decision of allowing this scenario should be of the user. 
   But since currently, there is no way to roll-back aggregation, I don't see 
how this can be easily implemented.
   Maybe adding parse as a preliminary step to aggregation, but this is an 
entirely different subject.
   "return early without the second aggregation" would only aggravate this 
scenario, so I don't think it is the way to go either. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org

Reply via email to