Number of terms in a SOLR field
Hi all,

I am attempting to test some changes I made to my DIH-based indexing process. The changes only affect the way I describe my fields in data-config.xml; there should be no changes to the way the data is indexed or stored. As a QA check, I wanted to compare the results of indexing the same data before and after the change, so I was looking for a way of getting counts of terms in each field. I guess Luke etc. must allow this, but how?

Regards Fergus.
Re: Number of terms in a SOLR field
Fergus McMenemie wrote:
> I was looking for a way of getting counts of terms in each field. I guess Luke etc. must allow this, but how?

Luke uses a brute-force approach: it traverses all terms and counts the terms per field. This is easy to implement yourself: just get the IndexReader.terms() enumeration and traverse it.

-- 
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
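The traversal described above can be sketched roughly as follows. This is only an illustration of the counting idea, not Luke's actual code: IndexReader.terms() is a Lucene Java API, and here a plain Python list of (field, text) pairs stands in for that enumeration. The names count_terms_per_field and sample_terms are hypothetical, and the sketch assumes each distinct term appears exactly once in the enumeration.

```python
from collections import Counter
from typing import Iterable, Tuple


def count_terms_per_field(terms: Iterable[Tuple[str, str]]) -> Counter:
    """Walk every (field, text) pair once and tally distinct terms per field."""
    counts: Counter = Counter()
    for field, _text in terms:
        counts[field] += 1
    return counts


# Hypothetical stand-in for the term enumeration: each distinct term
# appears exactly once, ordered by field and then by text.
sample_terms = [
    ("body", "index"),
    ("body", "luke"),
    ("body", "solr"),
    ("title", "solr"),
    ("title", "terms"),
]

per_field = count_terms_per_field(sample_terms)
for field in sorted(per_field):
    print(field, per_field[field])
```

The cost is one pass over every term in the index, which is why it is described as brute force; for a one-off QA check that is usually acceptable.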
Re: Number of terms in a SOLR field
Andrzej Bialecki wrote:
> Luke uses a brute-force approach: it traverses all terms and counts the terms per field. This is easy to implement yourself: just get the IndexReader.terms() enumeration and traverse it.

Thanks Andrzej. This is just a one-off QA check. How do I get Luke to display terms and counts?

Fergus.
Re: Number of terms in a SOLR field
Fergus McMenemie wrote:
> Thanks Andrzej. This is just a one-off QA check. How do I get Luke to display terms and counts?

1. Get Luke 0.9.9.
2. Open the index with Luke.
3. Look at the Overview panel; you will see a list titled "Available fields and term counts per field".

-- 
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
Re: Number of terms in a SOLR field
Andrzej Bialecki wrote:
> 1. Get Luke 0.9.9.
> 2. Open the index with Luke.
> 3. Look at the Overview panel; you will see a list titled "Available fields and term counts per field".

Thanks, that got me going, and I felt a little stupid after stumbling across http://wiki.apache.org/solr/LukeRequestHandler

Regards Fergus
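For the before/after QA comparison, the LukeRequestHandler output could be diffed with a small script along these lines. This is a sketch under several assumptions: that the handler is reachable at /admin/luke under the core's base URL, that it is asked for JSON with wt=json, and that each entry under "fields" carries a "distinct" term count. The function names and the sample data are hypothetical and only illustrate the shape of the check.

```python
import json
from typing import Dict, Optional, Tuple
from urllib.request import urlopen


def fetch_luke(base_url: str) -> dict:
    """Fetch the Luke handler response as JSON.

    Assumption: the handler is mapped to /admin/luke under base_url.
    """
    with urlopen(base_url + "/admin/luke?wt=json") as resp:
        return json.load(resp)


def distinct_per_field(luke: dict) -> Dict[str, int]:
    """Extract the distinct-term count for every field that reports one."""
    fields = luke.get("fields", {})
    return {
        name: info["distinct"]
        for name, info in fields.items()
        if "distinct" in info
    }


def diff_counts(
    before: Dict[str, int], after: Dict[str, int]
) -> Dict[str, Tuple[Optional[int], Optional[int]]]:
    """Return {field: (before, after)} for fields whose counts differ."""
    changed = {}
    for name in sorted(set(before) | set(after)):
        b, a = before.get(name), after.get(name)
        if b != a:
            changed[name] = (b, a)
    return changed


# Made-up responses standing in for two real fetch_luke() results.
sample_before = {"fields": {"title": {"distinct": 120}, "body": {"distinct": 4500}}}
sample_after = {"fields": {"title": {"distinct": 120}, "body": {"distinct": 4498}}}

changes = diff_counts(
    distinct_per_field(sample_before), distinct_per_field(sample_after)
)
for field, (b, a) in changes.items():
    print(f"{field}: {b} -> {a}")
```

In real use, the two sample dictionaries would be replaced by calls to fetch_luke() against the index built before the data-config.xml change and the one built after it. An empty result from diff_counts means the per-field term counts match.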