Re: Zip Files
Luke,

Look at the javadocs for java.io.ByteArrayInputStream - it wraps a byte array and makes it accessible as an InputStream. Also see java.util.zip.ZipFile. You should be able to read and parse all contents of the zip file in memory.

http://java.sun.com/j2se/1.4.2/docs/api/java/io/ByteArrayInputStream.html

On Tue, 1 Mar 2005 12:39:17 -0500, Luke Shannon [EMAIL PROTECTED] wrote:

Thanks Ernesto. I'm struggling with how I can work with an array of bytes instead of a Java File. It would be easier to unzip the zip to a temp directory, parse the files, and then delete the directory. But this would greatly slow indexing and use up disk space.

Luke

- Original Message -
From: Ernesto De Santis [EMAIL PROTECTED]
To: Lucene Users List lucene-user@jakarta.apache.org
Sent: Tuesday, March 01, 2005 10:48 AM
Subject: Re: Zip Files

Hello,

First, you need a parser for each file type: pdf, txt, word, etc. Then use the Java API to iterate over the zip content; see:

http://java.sun.com/j2se/1.4.2/docs/api/java/util/zip/ZipInputStream.html

Use the getNextEntry() method. A little example:

    ZipInputStream zis = new ZipInputStream(fileInputStream);
    ZipEntry zipEntry;
    while ((zipEntry = zis.getNextEntry()) != null) {
        // use zipEntry to get the name, etc.
        // get the proper parser for the current entry
        // use that parser with zis (the ZipInputStream)
    }

Good luck,
Ernesto

Luke Shannon wrote:

Hello;

Anyone have any ideas on how to index the contents within zip files?

Thanks,

Luke

--
Ernesto De Santis - Colaborativa.net
Córdoba 1147 Piso 6, Oficinas 3 y 4
(S2000AWO) Rosario, SF, Argentina.

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
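Luke's in-memory approach can be sketched end to end with only the JDK: read each zip entry fully into a byte array, then hand a ByteArrayInputStream over each array to whatever parser matches the entry name. The class and method names below are illustrative, not Lucene API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class InMemoryZipReader {

    /** Reads every file entry of a zip stream into memory, keyed by entry name. */
    public static Map<String, byte[]> readAll(InputStream zipSource) throws IOException {
        Map<String, byte[]> contents = new LinkedHashMap<String, byte[]>();
        ZipInputStream zis = new ZipInputStream(zipSource);
        ZipEntry entry;
        while ((entry = zis.getNextEntry()) != null) {
            if (entry.isDirectory()) {
                continue; // directories carry no bytes to parse
            }
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = zis.read(chunk)) != -1) {
                buf.write(chunk, 0, n); // accumulate the current entry's bytes
            }
            contents.put(entry.getName(), buf.toByteArray());
        }
        zis.close();
        return contents;
    }
}
```

Each byte[] can then be wrapped in a new ByteArrayInputStream and fed to the parser for that file type, so nothing ever touches a temp directory.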
Re: Search Performance
I should have mentioned the reason for not doing this the obvious, simple way (just close the Searcher and reopen it if a new version is available): some threads could be in the middle of iterating through the search Hits. If you close the Searcher, they get a "Bad file descriptor" IOException. As I found out the hard way :)

On Fri, 18 Feb 2005 15:03:29 -0600, Chris Lamprecht [EMAIL PROTECTED] wrote:

I recently dealt with the issue of re-using a Searcher with an index that changes often. I wrote a class that allows my searching classes to check out a Lucene Searcher, perform a search, and then return the Searcher. It's similar to a database connection pool, except that
Re: Search Performance
Wouldn't this leave open file handles? I had a problem where there were lots of open file handles for deleted index files, because the old searchers were not being closed.

On Fri, 18 Feb 2005 13:41:37 -0800 (PST), Otis Gospodnetic [EMAIL PROTECTED] wrote:

Or you could just open a new IndexSearcher, forget the old one, and have GC collect it when everyone is done with it.

Otis
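The thread's two constraints - don't close a searcher mid-iteration, but do close it eventually so file handles are freed - suggest reference counting. A JDK-only sketch (PooledSearcher and SearcherManager are made-up names; a real version would wrap org.apache.lucene.search.IndexSearcher and call its close()):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Stand-in for an IndexSearcher wrapper: just tracks the closed state.
class PooledSearcher {
    private final AtomicInteger refCount = new AtomicInteger(1); // 1 = the manager's own reference
    volatile boolean closed = false;

    void acquire() { refCount.incrementAndGet(); }

    void release() {
        if (refCount.decrementAndGet() == 0) {
            closed = true; // real code: indexSearcher.close();
        }
    }
}

/** Hands out the current searcher; swapping in a new one closes the old
 *  searcher only after the last borrower has returned it, avoiding both the
 *  "Bad file descriptor" problem and the leaked-handle problem above. */
class SearcherManager {
    private PooledSearcher current = new PooledSearcher();

    synchronized PooledSearcher borrow() {
        current.acquire();
        return current;
    }

    synchronized void swap(PooledSearcher fresh) {
        PooledSearcher old = current;
        current = fresh;
        old.release(); // drop the manager's reference; closes once borrowers finish
    }
}
```

A borrower calls borrow(), searches, iterates its Hits, then calls release() in a finally block.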
Re: Subversion conversion
One thing about Subversion branches (from "Key Concepts Behind Branches" in chapter 4 of the Subversion book):

2. Subversion has no internal concept of a branch - only copies. When you copy a directory, the resulting directory is only a branch because you attach that meaning to it. You may think of the directory differently, or treat it differently, but to Subversion it's just an ordinary directory that happens to have been created by copying.

On Wed, 2 Feb 2005 19:49:53 -0500, Chakra Yadavalli [EMAIL PROTECTED] wrote:

Hello all. This might not be the right place for it, but as we are talking about SCM, I have a quick question. First, I haven't used CVS/SVN on any project; I am a ClearCase/PVCS guy. I just would like to know which configuration management plan you follow in Lucene development.

PLAN A: DEVELOP IN TRUNK AND BRANCH OFF ON RELEASE. Recently I had a discussion with a friend about developing in the trunk (which is /main in ClearCase speak), which my friend claims is what is done in Apache/open source projects. The main advantage he pointed out was that merging can be avoided if you are developing in the trunk. When there is a release, they create a new branch (say a LUCENE_1.5 branch) and label it. That branch is used for maintenance, and any code deltas are merged back to the trunk as needed.

PLAN B: BRANCH OFF BEFORE A PLANNED RELEASE AND MERGE BACK TO MAIN/TRUNK. As I am from the private workspace/isolated development school of thought promoted by ClearCase, I am used to creating a branch at project/release initiation and developing in that branch (say /main/dev). Similarly, we have /main/int for making changes when the project goes to the integration phase, and a /main/acp branch for acceptance. In this school, /main will always have fewer versions of files, and the difference between any two consecutive versions is the net change of that SCM element (either file or dir) between two releases (say Lucene 1.4 and 1.5).

Thanks in advance for your time.
Chakra Yadavalli
http://jroller.com/page/cyblogue

-Original Message-
From: aurora [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, February 02, 2005 4:25 PM
To: lucene-user@jakarta.apache.org
Subject: Re: Subversion conversion

Subversion rocks! I have just set up the Windows svn client TortoiseSVN with my favourite file manager, Total Commander 6.5. The svn status and commands are readily integrated with the file manager. Offline diff and revert are two things I really like from svn.

The conversion to Subversion is complete. The new repository is available to users read-only at:

http://svn.apache.org/repos/asf/lucene/java/trunk

Besides /trunk, there are also /branches and /tags. /tags contains all the CVS tags made, so that you can grab a snapshot of a previous version. /trunk is analogous to CVS HEAD. You can learn more about the Apache repository configuration, and how to use the command-line client to check out the repository, here:

http://www.apache.org/dev/version-control.html

Learn about Subversion, including the complete O'Reilly Subversion book in electronic form for free, here:

http://subversion.tigris.org

For committers: check out the repository using https and your Apache username/password. The Lucene sandbox has been integrated into our single Subversion repository, under /java/trunk/sandbox:

http://svn.apache.org/repos/asf/lucene/java/trunk/sandbox/

The Lucene CVS repositories have been locked read-only. If there are any issues with this conversion, let me know and I'll bring them to the Apache infrastructure group.

Erik

--
Visit my weblog: http://www.jroller.com/page/cyblogue
Re: Adding Fields to Document (with same name)
Hi Karl,

From _Lucene in Action_, section 2.2: when you add the same field with different values, internally "Lucene appends all the words together and indexes them in a single Field", allowing you to use any of the given words when searching. See also:

http://www.lucenebook.com/search?query=appendable+fields

-chris

On Tue, 1 Feb 2005 11:42:23 +0100 (MET), [EMAIL PROTECTED] wrote:

Hi, what happens when I add two fields with the same name to one Document?

    Document doc = new Document();
    doc.add(Field.Text("bla", "this is my first text"));
    doc.add(Field.Text("bla", "this is my second text"));

Will the second text overwrite the first, because only one field with a given name can be held in one document? Or will the first and the second text be merged when I search in the field "bla" (e.g. with the query bla:text)? I am working on XML indexing and did not get an error when having repeated XML fields. Now I am wondering...

Karl
Re: Searching with words that contain % , / and the like
Without looking at the source, my guess is that StandardAnalyzer (and StandardTokenizer) is the culprit. The StandardAnalyzer grammar (in StandardTokenizer.jj) is probably defined so that "x/y" parses into two tokens, "x" and "y". "s" is a default stopword (see StopAnalyzer.ENGLISH_STOP_WORDS), so it gets filtered out, while "p" does not.

To get what you want, you can use a WhitespaceAnalyzer, write your own custom Analyzer or Tokenizer, or modify the StandardTokenizer.jj grammar to suit your needs. WhitespaceAnalyzer is much simpler than StandardAnalyzer, so you may see some other things being tokenized differently.

-Chris

On Thu, 27 Jan 2005 12:12:16 +0530, Robinson Raju [EMAIL PROTECTED] wrote:

Hi, is there a way to search for words that contain "/" or "%"? If my query is "test/s", it is just taken as "test". If my query is "test/p", it is just taken as "test p". Has anyone done this / faced such an issue?

Regards,
Robin
Re: Reloading an index
I just ran into a similar issue. When you close an IndexSearcher, it doesn't necessarily close the underlying IndexReader; it depends on which constructor you used to create the IndexSearcher. See the constructors' javadocs or the source for the details.

In my case, we were updating and optimizing the index from another process, and reopening IndexSearchers. We would eventually run out of disk space because the old searchers were leaving open file handles to deleted files, so the disk space was never being made available until the JVM processes ended. If you're on Linux, try running the 'lsof' command to see if there are any handles to files marked (deleted).

-Chris

On Thu, 27 Jan 2005 08:28:30 -0800 (PST), Greg Gershman [EMAIL PROTECTED] wrote:

I have an index that is frequently updated. When indexing is completed, an event triggers a new Searcher to be opened. When the new Searcher is opened, incoming searches are redirected to it; the old Searcher is closed and nulled, but I still see about twice the amount of memory in use well after the original Searcher has been closed. Is there something else I can do to get this memory reclaimed? Should I explicitly call garbage collection? Any ideas? Thanks.

Greg Gershman
Re: rackmount lucene/nutch - Re: google mini? who needs it when Lucene is there
As they say, nothing lasts forever ;) I like the idea. If a project like this gets going, I think I'd be interested in helping.

The Google Mini looks very well done (they have two demos on the web page). For $5,000, it's probably a very good solution for many businesses. If the demos are accurate, it seems like you almost literally plug it in, configure a few things using the web interface, and you're in business. Demos are at:

http://www.google.com/enterprise/mini/product_tours_demos.html

-chris

On Thu, 27 Jan 2005 17:40:53 -0800 (PST), Otis Gospodnetic [EMAIL PROTECTED] wrote:

I discuss this with myself a lot inside my head... :) Seriously, I agree with Erik. I think this is a business opportunity. How many people are hating me now and going shhh? Raise your hands!

Otis

--- David Spencer [EMAIL PROTECTED] wrote:

This reminds me: has anyone ever discussed something similar?

- a rackmount server (or, for coolness factor, that Mac mini)
- a web interface for config/control
- of course the server would have the following software:
  -- web server
  -- Lucene / Nutch

Part of the work here, I think, is having a decent web interface to configure the thing and to customize the look and feel of the search results.

jian chen wrote:

Hi, I was searching using Google and just found that there is a new product called the Google Mini. Initially I thought it was another free service for small companies. Then I realized that it costs quite some money ($4,995) for the hardware and software. (I guess the proprietary software costs a whole lot more than the actual hardware.)

The "nice" feature is that you can only index up to 50,000 documents at this price. If you need to index more, sorry, send in the check... It seems to me that any small business will be ripped off if they install this Google Mini thing, compared to using Lucene to implement an easy-to-use search application, which could search up to whatever number of documents you could imagine.
I hope the Lucene project gets exposed more to the enterprise, so that people know they have not only cheaper but, more importantly, BETTER alternatives.

Jian
Re: LUCENE + EXCEPTION
Hi Karthik,

If you are talking about SingleThreadModel (i.e. your servlet implements javax.servlet.SingleThreadModel), this does not guarantee that two different instances of your servlet won't run at the same time. It only guarantees that each instance of your servlet will be run by only one thread at a time. See:

http://java.sun.com/j2ee/sdk_1.3/techdocs/api/javax/servlet/SingleThreadModel.html

If you are accessing a shared resource (a Lucene index), you'll have to prevent concurrent modifications with something other than SingleThreadModel. I think they've finally deprecated SingleThreadModel in the latest (maybe not even out yet) servlet spec.

-chris

On standalone usage of update/deletion/addition of documents into the merged index, my code runs perfectly without any problems. But when the same code is plugged into a webapp on Tomcat, with a servlet running in single-thread mode, I sometimes get the error below
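Since SingleThreadModel won't serialize access across servlet instances, one common pattern is a single JVM-wide lock that every code path mutating the index must take. A sketch under assumptions (IndexWriteGate is an illustrative name; the real body would open an IndexWriter, add/delete documents, and close it inside the synchronized block):

```java
/** All index mutations in the webapp funnel through this one object,
 *  regardless of how many servlet instances the container creates. */
class IndexWriteGate {
    private static final Object INDEX_LOCK = new Object();
    private static int updatesApplied = 0; // stands in for real add/delete calls

    static void applyUpdate(Runnable mutation) {
        synchronized (INDEX_LOCK) {
            // real code: open the IndexWriter, apply the change, close it here
            mutation.run();
            updatesApplied++;
        }
    }

    static int updateCount() {
        synchronized (INDEX_LOCK) {
            return updatesApplied;
        }
    }
}
```

Because INDEX_LOCK is static, even multiple servlet instances (or multiple servlets) in the same JVM serialize their writes; it does not help if a second JVM also writes the index.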
Re: Stemming
Also, if you can't wait, see page 2 of http://www.onjava.com/pub/a/onjava/2003/01/15/lucene.html or the LIA e-book ;)

On Fri, 21 Jan 2005 09:27:42 -0500, Kevin L. Cobb [EMAIL PROTECTED] wrote:

OK, OK... I'll buy the book. I guess it's about time, since I am deeply and forever in love with Lucene. Might as well take the final plunge.

-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
Sent: Friday, January 21, 2005 9:12 AM
To: Lucene Users List
Subject: Re: Stemming

Hi Kevin,

Stemming is an optional operation and is done in the analysis step. Lucene comes with a Porter stemmer and a Filter that you can use in an Analyzer:

./src/java/org/apache/lucene/analysis/PorterStemFilter.java
./src/java/org/apache/lucene/analysis/PorterStemmer.java

You can find more about it here:

http://www.lucenebook.com/search?query=stemming

You can also see mentions of SnowballAnalyzer in those search results, and you can find an adapter for Snowball analyzers in the Lucene Sandbox.

Otis

--- Kevin L. Cobb [EMAIL PROTECTED] wrote:

I want to understand how Lucene uses stemming but can't find any documentation on the Lucene site. I'll continue to google, but I hope this list can help narrow my search. I have several questions on the subject currently, but hesitate to list them here since finding a good document on the subject may answer most of them.

Thanks in advance for any pointers,
Kevin
Re: How do I unlock?
What about a shutdown hook?

    Runtime.getRuntime().addShutdownHook(new Thread() {
        public void run() { /* whatever */ }
    });

See also http://www.onjava.com/pub/a/onjava/2003/03/26/shutdownhook.html

On Tue, 11 Jan 2005 13:21:42 -0800, Doug Cutting [EMAIL PROTECTED] wrote:

Joseph Ottinger wrote:

As one for whom the question's come up recently, I'd say that locks need to be terminated gracefully instead. I've noticed a number of cases where the locks get abandoned in exceptional conditions, which is almost exactly what you don't want.

The problem is that this is hard to do from Java. A typical approach is to put the process id in the lock file; then, if that process is dead, ignore the lock file. But Java does not let one know process ids. Java 1.4 provides a FileLock mechanism which should mostly solve this, but Lucene 1.4.3 does not yet require Java 1.4 and hence cannot use that feature. Lucene 2.0 is likely to require Java 1.4 and should be able to do a better job of automatically unlocking indexes when processes die.

Doug
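A concrete version of the shutdown-hook idea, using only the JDK (LockCleanup and the lock-file handling are illustrative, not how Lucene manages its locks): register a hook that deletes the lock file so an orderly JVM exit doesn't leave the index locked. As Doug's reply notes, this can't help when the process dies abruptly (kill -9, power loss), which is why the process-id approach still matters.

```java
import java.io.File;

class LockCleanup {
    /** Registers a hook that deletes the given lock file on normal JVM exit.
     *  Returns the hook thread so callers (or tests) can inspect it. */
    static Thread installFor(final File lockFile) {
        Thread hook = new Thread() {
            public void run() {
                lockFile.delete(); // best effort; the JVM is going down anyway
            }
        };
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

Hooks run on System.exit() and on Ctrl-C (SIGINT), but never on SIGKILL, so a belt-and-braces deployment still needs a way to detect and clear stale locks at startup.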
Re: Incremental Search experiment with Lucene, sort of like the new Google Suggestion page
Very cool, thanks for posting this!

Google's feature doesn't seem to do a search on every keystroke, necessarily. Instead, it waits until you haven't typed a character for a short period (I'm guessing about 100 or 150 milliseconds). So if you type fast, it doesn't hit the server until you pause. There are some more detailed postings on Slashdot about how it works.

On Fri, 10 Dec 2004 16:36:27 -0800, David Spencer [EMAIL PROTECTED] wrote:

Google just came out with a page that gives you feedback as to how many pages will match your query and variations on it:

http://www.google.com/webhp?complete=1&hl=en

I had an unexposed experiment I had done with Lucene a few months ago that this has inspired me to expose - it's not the same, but it's similar in that as you type in a query you're given *immediate* feedback as to how many pages match. Try it here:

http://www.searchmorph.com/kat/isearch.html

This is my SearchMorph site, which has an index of ~90k pages of open source javadoc packages. As you type in a query, on every keystroke it does at least one Lucene search to show results in the bottom part of the page. It also gives spelling corrections (using my NGramSpeller contribution) and also suggests popular tokens that start the same way as your search query. For one way to see corrections in action, type in "rollback" character by character (don't do a cut and paste).
Note that:

-- this is not how the Google page works - just similar to it
-- I do single-word suggestions while Google does the more useful whole-phrase suggestions (TBD: I'll try to copy them)
-- they do lots of JavaScript magic, whereas I use old-school frames mostly
-- this is relatively expensive, as it does one query per character, and when it's doing spelling correction there is even more work going on
-- this is just an experiment, and the page may be unstable as I fool with it

What's nice is that when you get used to immediate results, going back to the batch way of searching seems backward, slow, and old-fashioned. There are too many idle CPUs in the world - this is one way to keep them busier :)

-- Dave

PS: Weblog entry updated too: http://www.searchmorph.com/weblog/index.php?id=26
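The pause-before-searching behavior Chris describes is a classic debounce. A server-side sketch with java.util.Timer (DebouncedSearch is an illustrative name, and the ~150 ms threshold is Chris's guess from the post, not a documented value):

```java
import java.util.Timer;
import java.util.TimerTask;

/** Fires the search only after the user has stopped typing for quietMillis. */
class DebouncedSearch {
    private final Timer timer = new Timer(true); // daemon: won't block JVM exit
    private final long quietMillis;
    private final Runnable search;
    private TimerTask pending;

    DebouncedSearch(long quietMillis, Runnable search) {
        this.quietMillis = quietMillis;
        this.search = search;
    }

    /** Call on every keystroke; cancels any pending search and restarts the countdown. */
    synchronized void keystroke() {
        if (pending != null) {
            pending.cancel();
        }
        pending = new TimerTask() {
            public void run() { search.run(); }
        };
        timer.schedule(pending, quietMillis);
    }
}
```

With this in place, a burst of fast typing costs one search instead of one per character, which directly addresses the "one query per character is expensive" point above.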
Re: Too many open files issue
A useful resource for increasing the number of file handles on various operating systems is the Volano Report:

http://www.volano.com/report/

I had requested help on an issue we have been facing with the "Too many open files" exception garbling the search indexes and crashing the search on the web site.
Re: Considering intermediary solution before Lucene question
John,

It actually should be pretty easy to use just the parts of Lucene you want (the analyzers, etc.) without using the rest. See the example of the PorterStemmer in this article:

http://www.onjava.com/pub/a/onjava/2003/01/15/lucene.html?page=2

You could feed a Reader to the tokenStream() method of PorterStemAnalyzer and get back a TokenStream, from which you pull the tokens using the next() method.

On Wed, 17 Nov 2004 18:54:07 -0500, [EMAIL PROTECTED] wrote:

Is there a way to use Lucene's stemming and stop-word removal without using the rest of the tool? I am downloading the code now, but I imagine the answer might be deeply buried. I would like to be able to send in a phrase and get back a collection of keywords, if possible.

I am thinking of using an intermediary solution before moving fully to Lucene. I don't have time to spend a month making a carefully tested, administrable Lucene solution for my site yet, but I intend to do so over time. The funny thing is that the Lucene code would likely only take up a couple hundred lines, but integration and administration would take me much more time. In the meantime, I am thinking I could use Lucene's stemming and parsing of words, then stick each search word along with the associated primary key in an indexed MySQL table. Each record I would need to do this to is small, with maybe only 15 useful words on average. I would have an in-database solution, though ranking, etc. would not exist. This is better than the exact-word searching I have currently, which is really bad.

By the way, MySQL 4.1.1 has some Lucene-type handling, but it too does not have stemming, and I am sure it is very slow compared to Lucene. Cpanel is still stuck on MySQL 4.0.*, so many people would not have access to even this basic ability in production systems for some time yet.
JohnE
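The "phrase in, keywords out" pipeline JohnE wants can be sketched with the JDK alone: lowercase, split into tokens, drop stopwords. The class name, the tiny stopword set, and the regex tokenizer are all illustrative stand-ins; in the real thing, tokenization and stemming would come from Lucene's analyzers (e.g. PorterStemFilter), as Chris's reply describes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** JDK-only sketch of a keyword extractor: lowercase, tokenize, drop stopwords. */
class KeywordExtractor {
    private static final Set<String> STOPWORDS = new HashSet<String>(
        Arrays.asList("a", "an", "and", "the", "of", "to", "is", "in"));

    static List<String> keywords(String phrase) {
        List<String> out = new ArrayList<String>();
        for (String tok : phrase.toLowerCase().split("[^a-z0-9]+")) {
            if (tok.length() > 0 && !STOPWORDS.contains(tok)) {
                out.add(tok); // real code would stem(tok) here before adding
            }
        }
        return out;
    }
}
```

Each returned keyword, paired with the record's primary key, is exactly what JohnE would insert into the indexed MySQL table as his interim solution.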
Re: Index Locking Issues Resolved...I hope
MySQL does offer a basic fulltext search (with MyISAM tables), but it doesn't really approach the functionality of Lucene, such as pluggable tokenizers, stemming, etc. I think MS SQL Server has fulltext search as well, but I have no idea if it's any good. See:

http://www.google.com/search?hl=en&lr=&safe=off&c2coff=1&q=mysql+fulltext

I have not seen it all clearly yet because it is all new. I wish a database text field could have this sort of mechanism built into it. MySQL does not do this (which is what I am using), but I am going to check into other databases now. OJB will work with most all of them, so that would help if there is a database type of solution that will allow that sleep-at-night thing to happen!!!
How to efficiently get # of search results, per attribute
I'd like to implement a search across several types of entities - let's say classes, professors, and departments. I want the user to be able to enter a simple, single query and not have to specify what they're looking for. Then I want the search results to be something like this:

    Search results for: philosophy boyer
    Found: 121 classes - 5 professors - 2 departments
    ...search results here...

I know I could iterate through every hit returned and count them up myself, but that seems inefficient if there are lots of results. Is there some other way to get this kind of information from the search result set? My other ideas are: doing a separate search for each result type, or storing different types in different indexes. Any suggestions? Thanks for your help!

-Chris
Re: How to efficiently get # of search results, per attribute
Nader and Chuck,

Thanks for the responses; they're both helpful. My index sizes will begin on the order of 200,000 classes and 20,000 instructors (and many fewer departments), and grow over time to maybe a few million classes. Compared to some of the numbers I've seen on this mailing list, my dataset is fairly small. I think I'll not worry about performance for now, unless it becomes an issue.

-Chris

On Sat, 13 Nov 2004 15:36:11 -0800, Chuck Williams [EMAIL PROTECTED] wrote:

My Lucene application includes multi-faceted navigation that does a more complex version of the below. I've got 5 different taxonomies into which every indexed item is classified. The largest of the taxonomies has over 15,000 entries, while the other 4 are much smaller. For every search query, I determine the best small set of nodes from each taxonomy to present to the user as drill-down options, and provide counts of how many results fall under each of these nodes.

At present I have only about 25,000 indexed objects and usually no more than 1,000 results from the initial query. To determine the drill-down options and counts, I scan up to 1,000 results, computing the counts for all nodes into which these results classify. Then for each taxonomy I pick the best drill-down options available (an orthogonal set with a reasonable branching factor that covers all results) and present them with their counts. If there are more than 1,000 results, I extrapolate the computed counts to estimate the actual counts on the entire set of results.

This is all done with a single index and a single search. The total time required for performing this computation for the one large taxonomy is under 10 ms, running in full debug mode in my IDE. The query response time overall is subjectively instantaneous at the UI (Google-speed or better). So, unless some dimension of your problem is much bigger than mine, I doubt performance will be an issue.
Chuck

-Original Message-
From: Nader Henein [mailto:[EMAIL PROTECTED]]
Sent: Saturday, November 13, 2004 2:29 AM
To: Lucene Users List
Subject: Re: How to efficiently get # of search results, per attribute

It depends on how many results they're looking through. Here are two scenarios I see:

1] If you don't have that many records, you can fetch all the results and then do a post-parsing step to determine totals.

2] If you have a lot of entries in each category and you're worried about fetching thousands of records every time, you can have separate indices per category and search them in parallel (not Lucene parallel search); you can get up to 100 hits from each one (for efficiency), but you'll also have the total from each search to display.

Either way you can boost speed using a RAMDirectory if you need more from the search, but whichever approach you choose, I would recommend that you sit down and do some number crunching to figure out which way to go.

Hope this helps,

Nader Henein

Chris Lamprecht wrote:

I'd like to implement a search across several types of entities - let's say classes, professors, and departments. I want the user to be able to enter a simple, single query and not have to specify what they're looking for. Then I want the search results to be something like this:

    Search results for: philosophy boyer
    Found: 121 classes - 5 professors - 2 departments
    ...search results here...

I know I could iterate through every hit returned and count them up myself, but that seems inefficient if there are lots of results. Is there some other way to get this kind of information from the search result set? My other ideas are: doing a separate search for each result type, or storing different types in different indexes. Any suggestions? Thanks for your help!
-Chris
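Chuck's scan-and-count step reduces to a tally over the category labels of the first N hits, plus his extrapolation trick when the scan is truncated. A JDK-only sketch (FacetCounter is an illustrative name, and the category array stands in for reading a keyword field from each Lucene hit):

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Counts results per category from a capped scan of hits, then extrapolates
 *  to the full result count if the scan was truncated. */
class FacetCounter {
    static Map<String, Integer> count(String[] hitCategories, int scanLimit, int totalHits) {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        int scanned = Math.min(hitCategories.length, scanLimit);
        for (int i = 0; i < scanned; i++) {
            String cat = hitCategories[i]; // real code: read a stored field from the hit
            Integer c = counts.get(cat);
            counts.put(cat, c == null ? 1 : c + 1);
        }
        if (totalHits > scanned && scanned > 0) {
            // extrapolate: scale each observed count up to the full result set
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                e.setValue((int) Math.round(e.getValue() * (double) totalHits / scanned));
            }
        }
        return counts;
    }
}
```

For Chris's example output ("121 classes - 5 professors - 2 departments"), the map's entries are exactly the numbers to render, with no second query per category.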