[ https://issues.apache.org/jira/browse/SOLR-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16954664#comment-16954664 ]
David Smiley commented on SOLR-13850:
-------------------------------------

I'm not even sure it's meaningful to have a pre-analyzed field be "stored".

> Atomic Updates with PreAnalyzedField
> ------------------------------------
>
>                 Key: SOLR-13850
>                 URL: https://issues.apache.org/jira/browse/SOLR-13850
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>    Affects Versions: 7.7.2, 8.2
>         Environment: Ubuntu 16.04 LTS / Java 8 (Zulu), Windows 10 / Java 11 (Oracle)
>            Reporter: Oleksandr Drapushko
>            Priority: Critical
>              Labels: AtomicUpdate
>
> If you use an atomic update to change fields that are not pre-analyzed,
> any data in the document's pre-analyzed fields is lost.
>
> *Steps to reproduce*
>
> 1. Index this document into techproducts
> {code:json}
> {
>   "id": "a",
>   "n_s": "s1",
>   "pre": "{\"v\":\"1\",\"str\":\"Alaska\",\"tokens\":[{\"t\":\"alaska\",\"s\":0,\"e\":6,\"i\":1}]}"
> }
> {code}
>
> 2. Query the document
> {code:json}
> {
>   "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
>       {
>         "id":"a",
>         "n_s":"s1",
>         "pre":"Alaska",
>         "_version_":1647475215142223872}]
>   }}
> {code}
>
> 3. Update using atomic syntax
> {code:json}
> {
>   "add": {
>     "doc": {
>       "id": "a",
>       "n_s": {"set": "s2"}
>     }}}
> {code}
>
> 4. Observe the warning in the Solr log
>
> UI:
> {noformat}
> WARN x:techproducts_shard2_replica_n6 PreAnalyzedField Error parsing pre-analyzed field 'pre'
> {noformat}
>
> solr.log:
> {noformat}
> WARN (qtp1384454980-23) [c:techproducts s:shard2 r:core_node8 x:techproducts_shard2_replica_n6] o.a.s.s.PreAnalyzedField Error parsing pre-analyzed field 'pre' => java.io.IOException: Invalid JSON type java.lang.String, expected Map
>     at org.apache.solr.schema.JsonPreAnalyzedParser.parse(JsonPreAnalyzedParser.java:86)
> {noformat}
>
> 5. Query the document again
> {code:json}
> {
>   "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
>       {
>         "id":"a",
>         "n_s":"s2",
>         "_version_":1647475461695995904}]
>   }}
> {code}
>
> *Result*: There is no 'pre' field in the document anymore.
>
> _My thoughts on it_
>
> 1. Data loss can be prevented if the warning is replaced with an error
> (by re-throwing the exception). Atomic updates for such documents still
> won't work, but they will be explicitly rejected.
> 2. Solr tries to read the document from the index, merge it with the input
> document, and re-index the result. The stored form of a pre-analyzed field
> differs from its input format, though, so Solr cannot parse and re-index
> those fields properly.
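
The failure mode described in the report can be illustrated with a small toy model. This is only a sketch of the read, merge, and re-index cycle as the reporter describes it, not Solr's actual code: the names `parse_pre_analyzed`, `index_document`, and the `strict` flag are hypothetical, and the atomic `{"set": ...}` syntax is reduced to a plain dictionary merge. It assumes that only the `str` part of a pre-analyzed value is kept as the stored value, which matches the query output in step 2.

```python
import json


def parse_pre_analyzed(raw):
    """Parse a pre-analyzed value; it must decode to a JSON object (a map)."""
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        value = raw  # a bare stored string such as "Alaska" is not JSON
    if not isinstance(value, dict):
        raise ValueError(
            f"Invalid JSON type {type(value).__name__}, expected Map")
    return value


def index_document(doc, pre_analyzed_fields, strict):
    """Return the stored view of `doc` after (re-)indexing it.

    strict=False models the reported behaviour: a parse failure is
    swallowed and the field is silently dropped.
    strict=True models the suggested fix: the failure is re-raised,
    so the whole update is rejected.
    """
    stored = {}
    for name, value in doc.items():
        if name in pre_analyzed_fields:
            try:
                parsed = parse_pre_analyzed(value)
            except ValueError:
                if strict:
                    raise
                continue  # warn-and-skip: the field is lost
            if "str" in parsed:
                stored[name] = parsed["str"]  # only the stored part survives
        else:
            stored[name] = value
    return stored


original = {
    "id": "a",
    "n_s": "s1",
    "pre": json.dumps({
        "v": "1",
        "str": "Alaska",
        "tokens": [{"t": "alaska", "s": 0, "e": 6, "i": 1}],
    }),
}

# Steps 1-2: initial indexing keeps only "Alaska" as the stored value.
stored = index_document(original, {"pre"}, strict=False)

# Steps 3-5: the update merges the stored document with the change and
# re-indexes it; "Alaska" is no longer a JSON map, so 'pre' is dropped.
updated = index_document({**stored, "n_s": "s2"}, {"pre"}, strict=False)
```

With `strict=True`, the same update raises instead of returning a document without `pre`, which corresponds to the reporter's first suggestion: the atomic update still fails, but no data is silently lost.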