During indexing, an inverted index is built from the terms of the documents,
and statistics such as term frequency and document frequency are stored. If I
understand correctly, the exact document length is not stored in the index, in
order to reduce its size. Instead, a normalized length is stored for each
document. Since document length is a necessary component of most retrieval
functions, the normalized document length is used in those functions instead.
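
To make my understanding concrete, here is a minimal sketch of what I imagine
happens. Everything in it is an assumption on my part: the names build_index,
encode_norm and decode_length are hypothetical, and the 1/sqrt(length) scaling
squeezed into one byte is only a guess at a lossy encoding, not the actual
implementation.

```python
import math
from collections import Counter, defaultdict


def encode_norm(length):
    """Hypothetical lossy encoding: scale 1/sqrt(length) into the range 0..255."""
    if length == 0:
        return 0
    return round(255 / math.sqrt(length))


def decode_length(norm_byte):
    """Recover an approximate document length from the stored byte."""
    if norm_byte == 0:
        return 0.0
    return (255 / norm_byte) ** 2


def build_index(docs):
    """Build postings (term -> {doc_id: term frequency}) and per-document norms."""
    postings = defaultdict(dict)
    norms = {}
    for doc_id, text in docs.items():
        terms = text.lower().split()
        for term, tf in Counter(terms).items():
            postings[term][doc_id] = tf
        # Only the one-byte norm is kept, not the exact length.
        norms[doc_id] = encode_norm(len(terms))
    return postings, norms


docs = {1: "a b a", 2: "b c d e f g h i j"}
postings, norms = build_index(docs)
doc_freq_b = len(postings["b"])            # document frequency of "b"
approx_len_1 = decode_length(norms[1])     # close to 3, but not exact
```

In this sketch the document frequency is just the size of a term's postings
list, and the length recovered from the stored byte is only approximate.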

I would like to ask how exactly this normalization is performed. The
question may have been answered already, but I was unable to find the
relevant response. Your help is much appreciated.

Thanks.
