gerlowskija commented on code in PR #1040:
URL: https://github.com/apache/solr/pull/1040#discussion_r980343005


##########
solr/core/src/java/org/apache/solr/update/processor/IgnoreLargeDocumentProcessorFactory.java:
##########
@@ -59,15 +71,25 @@ public void init(NamedList<?> args) {
   public UpdateRequestProcessor getInstance(
       SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next) {
     return new UpdateRequestProcessor(next) {
+
       @Override
       public void processAdd(AddUpdateCommand cmd) throws IOException {
         long docSize = ObjectSizeEstimator.estimate(cmd.getSolrInputDocument());
         if (docSize / 1024 > maxDocumentSize) {
+          handleViolatingDoc(cmd, docSize);
+        } else {
+          super.processAdd(cmd);
+        }
+      }
+
+      private void handleViolatingDoc(AddUpdateCommand cmd, long estimatedSizeBytes) {
+        if (onlyLogErrors) {
+          log.warn("Skipping doc {} bc size {} exceeds limit {}", cmd.getPrintableId(), estimatedSizeBytes / 1024, maxDocumentSize);

Review Comment:
   I actually ended up keeping the estimated size in bytes here, with the 
"limit" value still in KB.
   
   It makes sense to keep "limit" in KB, because that's the unit users 
specify when configuring this URP.  I considered using KB for the estimated 
size as well, but the integer division needed to convert bytes to KB rounds 
down, so a document just over the limit could produce a message in which the 
estimated size and the limit appear equal, which might confuse users.
   
   In short, I didn't want us printing log messages that looked like:
   
   > Skipping doc asdf bc size 2kb exceeds limit 2kb
   
   But I've added explicit units to the variables and log messages involved 
here, so hopefully that's good enough to address the core of your issue.
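
The rounding concern described above can be illustrated with a small standalone sketch. It is not code from the PR: the class `SizeLogDemo` and both helper methods are hypothetical, and the example assumes a document whose estimated size is one byte over a 2 KB limit.

```java
public class SizeLogDemo {

    // Hypothetical helper: reports the estimated size in KB.
    // Integer division truncates, so 2049 bytes is displayed as 2kb.
    static String messageInKb(String docId, long estimatedSizeBytes, long limitKb) {
        return "Skipping doc " + docId + " bc size " + (estimatedSizeBytes / 1024)
            + "kb exceeds limit " + limitKb + "kb";
    }

    // Hypothetical helper: reports the estimated size in bytes with an explicit
    // unit, so the message stays unambiguous even just above the limit.
    static String messageInBytes(String docId, long estimatedSizeBytes, long limitKb) {
        return "Skipping doc " + docId + " bc size " + estimatedSizeBytes
            + " bytes exceeds limit " + limitKb + "kb";
    }

    public static void main(String[] args) {
        long limitKb = 2;               // assumed configured limit, in KB
        long estimatedSizeBytes = 2049; // assumed estimate, one byte over 2 KB

        System.out.println(messageInKb("asdf", estimatedSizeBytes, limitKb));
        System.out.println(messageInBytes("asdf", estimatedSizeBytes, limitKb));
    }
}
```

With these assumed values, the first message shows the size and the limit as the same number, while the second keeps the byte count and makes the overage visible.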



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

