[ https://issues.apache.org/jira/browse/LUCENE-8830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855964#comment-16855964 ]
Adrien Grand commented on LUCENE-8830:
--------------------------------------

The logic looks fine to me; the omitNorms flag is just set later in the process. That said, the logic is a bit complicated, so I could have missed something. I tried to reproduce the bug you are describing, without success. Can you provide us with a test case?

> DefaultIndexingChain.getOrAddField method ignores omitNorms from FieldType
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-8830
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8830
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: 6.6.1
>            Reporter: Ishan Sri
>            Priority: Major
>
> Norms are being computed and written even when *omitNorms is set to true* in
> the fieldTypes. I chased the issue and found that the method *getOrAddField*
> creates a *FieldInfo* object in the first pass. By default this object
> has omitNorms set to false. The method sets the *indexOptions* as specified in
> the fieldType on this newly created object but doesn't do the same for
> *omitNorms*. This effectively overrides the flag, which creates issues down
> the line.
>
> Here's the code snippet for the method with the *fieldInfos.getOrAdd* call:
>
> {code:java}
> private PerField getOrAddField(String name, IndexableFieldType fieldType,
>     boolean invert) {
>   // Make sure we have a PerField allocated
>   final int hashPos = name.hashCode() & hashMask;
>   PerField fp = fieldHash[hashPos];
>   while (fp != null && !fp.fieldInfo.name.equals(name)) {
>     fp = fp.next;
>   }
>   if (fp == null) {
>     // First time we are seeing this field in this segment
>     FieldInfo fi = fieldInfos.getOrAdd(name);
>     // Messy: must set this here because e.g. FreqProxTermsWriterPerField looks at the
>     // initial IndexOptions to decide what arrays it must create). Then, we also must
>     // set it in PerField.invert to allow for later downgrading of the index options:
>     fi.setIndexOptions(fieldType.indexOptions());
>     fp = new PerField(fi, invert);
>     ... {code}
>
> The *getOrAdd* method below instantiates a new object with omitNorms set to
> false as the 4th parameter.
>
> {code:java}
> /** Create a new field, or return existing one. */
> public FieldInfo getOrAdd(String name) {
>   FieldInfo fi = fieldInfo(name);
>
>   if (fi == null) {
>     // This field wasn't yet added to this in-RAM
>     // segment's FieldInfo, so now we get a global
>     // number for this field. If the field was seen
>     // before then we'll get the same name and number,
>     // else we'll allocate a new one:
>     final int fieldNumber = globalFieldNumbers.addOrGet(name, -1,
>         DocValuesType.NONE, 0, 0);
>
>     fi = new FieldInfo(name, fieldNumber, false, false, false, IndexOptions.NONE,
>         DocValuesType.NONE, -1, new HashMap<>(), 0, 0);
>     assert !byName.containsKey(fi.name);
>     globalFieldNumbers.verifyConsistent(Integer.valueOf(fi.number), fi.name,
>         DocValuesType.NONE);
>     byName.put(fi.name, fi);
>   }
>   return fi;
> }{code}
>
> This will cause norms to always be computed, which not only produces incorrect
> scores but also increases disk usage when many documents have multiple fields
> with this flag set to true but ignored.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
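
For readers following the thread, the two positions above can be pictured with a small toy model. This is only a sketch and does not use Lucene's classes: ToyFieldInfo, ToyIndexingChain, invert(String, boolean), and wouldWriteNorms are all hypothetical names made up for illustration. It shows how a flag that defaults to false when the field entry is first created can still end up correct if a later step copies it from the field type, which is the behaviour described in the comment. Whether Lucene 6.6.1 actually behaves this way for the reporter's setup is what the requested test case would need to show.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for per-field metadata. Hypothetical; NOT Lucene's FieldInfo. */
final class ToyFieldInfo {
    final String name;
    // Defaults to false at creation, like the value passed in the quoted getOrAdd.
    private boolean omitNorms = false;

    ToyFieldInfo(String name) {
        this.name = name;
    }

    void setOmitsNorms() {
        omitNorms = true;
    }

    boolean omitsNorms() {
        return omitNorms;
    }
}

/** Toy two-phase chain. Hypothetical; NOT Lucene's DefaultIndexingChain. */
final class ToyIndexingChain {
    private final Map<String, ToyFieldInfo> byName = new HashMap<>();

    /** Phase 1: allocate the field entry; the omitNorms flag is left at its default. */
    ToyFieldInfo getOrAddField(String name) {
        return byName.computeIfAbsent(name, ToyFieldInfo::new);
    }

    /** Phase 2 (assumed): while processing the field, copy the flag from the field type. */
    void invert(String name, boolean fieldTypeOmitsNorms) {
        ToyFieldInfo fi = getOrAddField(name);
        if (fieldTypeOmitsNorms) {
            fi.setOmitsNorms();
        }
    }

    /** Norms are written only for known fields whose flag is still false. */
    boolean wouldWriteNorms(String name) {
        ToyFieldInfo fi = byName.get(name);
        return fi != null && !fi.omitsNorms();
    }
}

class OmitNormsFlowSketch {
    public static void main(String[] args) {
        ToyIndexingChain chain = new ToyIndexingChain();

        // Right after phase 1 the flag is still the default, as the reporter observed.
        ToyFieldInfo fi = chain.getOrAddField("title");
        System.out.println("after getOrAddField: omitsNorms=" + fi.omitsNorms());

        // After phase 2 the flag reflects the field type.
        chain.invert("title", true);
        System.out.println("after invert: omitsNorms=" + fi.omitsNorms());
        System.out.println("would write norms: " + chain.wouldWriteNorms("title"));
    }
}
```

A real reproduction would instead index a document whose FieldType has omitNorms enabled and then check whether norms were written for that field.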