Thanks for your help! Here is an example, I have 100 items, each with a set of potentially unique attributes. Attributes could be color, size, length, density, etc. So an example document could be:
Id: 1 ItemType: foo Blob-field: all sorts of text handled normally Outer-Color: Red Size: Large Temperature: hot Etc: Etc: I would like to get the sum of frequency counts for each term in the fields I specify across the search results. I can just iterate through the documents and use getTermFreqVector() for each desired field on each document, then sum that; but this seems slow to me. -----Original Message----- From: Michael D. Curtin [mailto:[EMAIL PROTECTED] Sent: Wednesday, November 15, 2006 11:35 AM To: java-user@lucene.apache.org Subject: Re: term vectors Phil Rosen wrote: > I am building an application that requires I index a set of documents on > the scale of hundreds of thousands. > > A document can have a varying number of attribute fields with an unknown > set of potential values. I realize that just indexing a blob of fields > would be much faster, however I need to bin the search results based on > common attributes; as different types of attributes could potentially have > overlapping values a single blob for all attributes wont work. > > My question is this, is there a way to get term frequencies for a set of > documents or hits, without using getTermFreqVector() on each document and > each attribute field? As I could have hundreds of results, each with > dozens of attribute fields, looping getTermFreqVector() would be very > slow. If there isn't something inherent to lucene, has anyone seen an > extension that could accomplish this? Could you give an example of what you're starting with, what a search looks like, and what you want out? It sounds almost like you're looking for a custom statistical analysis of hits, which I doubt Lucene is going to have for you, out of the box ... --MDC --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]