[jira] [Commented] (SOLR-13699) maxChars no longer working as designed on CopyField

Chris Troullis (JIRA) Thu, 15 Aug 2019 12:21:58 -0700


    [ 
https://issues.apache.org/jira/browse/SOLR-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908422#comment-16908422
 ]


Chris Troullis commented on SOLR-13699:
---------------------------------------

[~erickerickson] So after looking through the CopyFieldTest unit tests, I found 
that we are already testing the maxChars functionality via the 
testCopyFieldFunctionality() test, and the maxChars functionality is working 
properly when the test runs! 

After further digging, it seems that the issue only occurs when docs are indexed 
in binary format using the JavaBinCodec, as this is where the change was 
made to read strings as a ByteArrayUtf8CharSequence instead of a String. It 
appears that the test framework indexes docs in XML format, which does not use 
the JavaBinCodec, so the fields are read as Strings, and maxChars works as 
designed. 
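
To illustrate the mechanism, here is a minimal, self-contained sketch. It does 
not use Solr's actual classes: Utf8Chars is a hypothetical stand-in for a 
non-String CharSequence such as ByteArrayUtf8CharSequence, and both truncate 
methods are simplified assumptions, not the real DocumentBuilder code. It only 
shows why a check guarded by instanceof String skips truncation for such 
values, and how a check against CharSequence would catch both cases.

```java
import java.nio.charset.StandardCharsets;

public class MaxCharsSketch {

    /** Hypothetical stand-in for a CharSequence that is not a String. */
    static final class Utf8Chars implements CharSequence {
        private final String decoded;

        Utf8Chars(byte[] utf8) {
            this.decoded = new String(utf8, StandardCharsets.UTF_8);
        }

        @Override public int length() { return decoded.length(); }
        @Override public char charAt(int index) { return decoded.charAt(index); }
        @Override public CharSequence subSequence(int start, int end) {
            return decoded.subSequence(start, end);
        }
        @Override public String toString() { return decoded; }
    }

    /** Simplified check in the style described in the issue: only Strings are truncated. */
    static Object truncateIfString(Object val, int maxChars) {
        if (val instanceof String && maxChars > 0) {
            String s = (String) val;
            return s.length() > maxChars ? s.substring(0, maxChars) : s;
        }
        return val; // any non-String value passes through untouched
    }

    /** One possible alternative (an assumption, not the committed fix): test for CharSequence. */
    static Object truncateIfCharSequence(Object val, int maxChars) {
        if (val instanceof CharSequence && maxChars > 0) {
            CharSequence cs = (CharSequence) val;
            return cs.length() > maxChars
                    ? cs.subSequence(0, maxChars).toString()
                    : cs.toString();
        }
        return val;
    }

    public static void main(String[] args) {
        Object asString = "abcdefghij";
        Object asBinary = new Utf8Chars("abcdefghij".getBytes(StandardCharsets.UTF_8));

        // The String value is truncated; the non-String CharSequence is not.
        System.out.println(truncateIfString(asString, 5));
        System.out.println(truncateIfString(asBinary, 5));

        // Checking against CharSequence truncates both.
        System.out.println(truncateIfCharSequence(asString, 5));
        System.out.println(truncateIfCharSequence(asBinary, 5));
    }
}
```

This mirrors the difference between the two indexing paths only conceptually. 
A real unit test would still need to send documents through the binary codec 
to reproduce the problem.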

So, in other words, it's still an issue, but it looks like it only affects docs 
indexed in binary format. Since it looks like the test framework only supports 
indexing in XML format (although I didn't look that hard), do you have any 
suggestions on how to properly unit test this?

> maxChars no longer working as designed on CopyField
> ---------------------------------------------------
>
>                 Key: SOLR-13699
>                 URL: https://issues.apache.org/jira/browse/SOLR-13699
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 7.7, 7.7.1, 7.7.2, 8.0, 8.0.1, 8.1, 8.2, 7.7.3, 8.1.1, 
> 8.1.2
>            Reporter: Chris Troullis
>            Priority: Major
>
> We recently upgraded from Solr 7.3 to 8.1, and noticed that the maxChars 
> property on a copy field is no longer functioning as designed, while indexing 
> via SolrJ. Per the most recent documentation it looks like there have been no 
> intentional changes as to the functionality of this property, so I assume 
> this is a bug.
>   
>  In debugging the issue, it looks like the bug was caused by SOLR-12992. In 
> DocumentBuilder, where the maxChars limit is applied, it first checks if the 
> value is instanceof String. As of SOLR-12992, string values are now coming in 
> as ByteArrayUtf8CharSequence (unless they are above a certain size as defined 
> by JavaBinCodec.MAX_UTF8_SZ), so they are failing the instanceof String 
> check, and the maxChars truncation is not being applied. I am currently not 
> sure if this issue is limited to indexing via SolrJ or if it applies to 
> documents indexed via any means.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

