Re: Range Query Somebody HELP please
On Thursday 27 May 2004 07:00, Karthik N S wrote:

Hi Lucene developers
Is it possible to search and retrieve relevant information on an indexed document within a specific range, similar to a query in SQL: select * from BOOKSHELF where book1 between 100 and 200
e.g. search_word, Book between 100 AND 200
[Note: Book is a unique field whose hit info is already indexed]

The query parser can construct this query for you (assuming search_word is in the query's default field):

+search_word +(book:[100 TO 200])

See also: http://jakarta.apache.org/lucene/docs/queryparsersyntax.html

One problem you might run into is that Lucene does not support numbers directly; only strings are indexed. You can index these numbers with sufficient zeros prefixed, and add the same prefix zeros in the query. Erik Hatcher wrote an article on how to make such a query: http://today.java.net/pub/a/today/2003/11/07/QueryParserRules.html You'll need to override the getRangeQuery() method.

Have fun,
Ype

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
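The zero-padding workaround Ype describes can be sketched in plain Java, with no Lucene API involved. The WIDTH value is an assumption; pick one at least as wide as your largest number and use the same width at index and query time.

```java
// Sketch of the zero-padding workaround for numeric range queries.
// WIDTH is an assumed maximum digit count -- choose it for your data.
public class NumberPadding {
    static final int WIDTH = 10;

    // Left-pad a non-negative number with zeros so that lexicographic
    // (string) order matches numeric order.
    static String pad(long n) {
        StringBuilder sb = new StringBuilder(Long.toString(n));
        while (sb.length() < WIDTH) sb.insert(0, '0');
        return sb.toString();
    }

    public static void main(String[] args) {
        // Unpadded, "9" sorts after "100" as strings; padded, the order is numeric.
        System.out.println(pad(9));   // 0000000009
        System.out.println(pad(100)); // 0000000100
        System.out.println(pad(9).compareTo(pad(100)) < 0); // true
    }
}
```

Index the padded form (e.g. book:0000000100) and pad the range endpoints the same way when building the query.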
RE: Range Query Somebody HELP please
Hi Lucene developers,

My main intention is to search for a word hit in a unique field between ranges, say book100 - book200, of indexed numbers. It's something like creating a subsearch within the search index. This is similar to SQL: select * from BOOKSHELF, or select * from BOOKSHELF where book1 between 100 and 200.

with regards
Karthik

-Original Message-
From: Ype Kingma [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 27, 2004 12:46 PM
To: [EMAIL PROTECTED]
Subject: Re: Range Query Somebody HELP please

[quoted message snipped]
Memo: Re: RE: RE: Query parser and minus signs
Thanks Erik :) We are using 1.3, so it looks like an upgrade should be made asap. Whilst hacking around I found an alternative solution: I went back to using a Keyword field, but instead of using the minus symbol in the query I just used -language:en*, which has the desired effect. Now that I know about the upgrade to 1.4, I'll have a look at some alternative solutions. Thanks for everyone's suggestions on this problem.

Alex B.

Erik Hatcher [EMAIL PROTECTED] on 26 May 2004 17:24
Please respond to Lucene Users List [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Subject: Re: RE: RE: Query parser and minus signs

On May 26, 2004, at 10:48 AM, [EMAIL PROTECTED] wrote:

Query: hsbc -language:zh-HK
Parsed query: (contents:hsbc -language:zh -contents:hk) (keywords:hsbc -language:zh -keywords:hk) (title:hsbc -language:zh -title:hk) (language:hsbc -language:zh -language:HK)
Hits: 169

Not quite what I was expecting from the parsed query - the zh and HK are now separated.

I think I can safely say that you are not running the latest version of Lucene. This has been corrected in the 1.4 versions. I've tested this with Wal-Mart (without the quotes) and QueryParser, and it works as expected.

Query: hsbc -language:zh\-HK
Parsed query: (contents:hsbc -language:zh\-HK) (keywords:hsbc -language:zh\-HK) (title:hsbc -language:zh\-HK) (language:hsbc -language:zh\-HK)
Hits: 206

And I'm guessing here, but I don't think the backslash is escaping - does it just become part of the query?

Now that is odd. QueryParser is an awkward beast at times, and combining it with MultiFieldQueryParser (which I'd recommend against, as you can see from the odd queries it built for you) gets even more confusing. Hopefully the latest Lucene 1.4 RC release will fix up your situation.

Erik

** This message originated from the Internet. 
Its originator may or may not be who they claim to be and the information contained in the message and any attachments may or may not be accurate. ** _ This transmission has been issued by a member of the HSBC Group (HSBC) for the information of the addressee only and should not be reproduced and / or distributed to any other person. Each page attached hereto must be read in conjunction with any disclaimer which forms part of it. This transmission is neither an offer nor the solicitation of an offer to sell or purchase any investment. Its contents are based on information obtained from sources believed to be reliable but HSBC makes no representation and accepts no responsibility or liability as to its completeness or accuracy.
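On the escaping question in the thread above: the backslash can be added programmatically before handing the term to the query parser. This is a hypothetical helper, not a Lucene API of the time; the SPECIAL character set is an assumption based on the query parser syntax documentation (&& and || are also special but are two-character operators), and whether the parser honors the escape depends on the Lucene version, as the thread notes.

```java
// Hypothetical helper that backslash-escapes single characters the query
// parser treats specially. The SPECIAL set is an assumption, not an
// exhaustive guarantee for every Lucene version.
public class QueryEscaper {
    static final String SPECIAL = "+-!(){}[]^\"~*?:\\";

    static String escape(String term) {
        StringBuilder sb = new StringBuilder();
        for (char c : term.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) sb.append('\\');
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("zh-HK")); // zh\-HK
    }
}
```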
Memo: Re: Asian languages
Hi Christophe,

We're currently indexing Chinese pages with little difficulty. You can use the standard analyzer to index the documents, and it will tokenize the content into individual characters. If you want to create a list of 'stop' words, you will need to create your own analyzer and supply it with a list of Unicode characters to stop. We are indexing HTML pages using a spider to traverse the site and have subclassed Document into HTML_Document. This allows us to set the content encoding for the input stream reader - as our system default is ISO-8859-1, in common with most Western machines - which enables it to correctly process the Unicode characters. You may need to do this too.

Hope this helps,
Alex

Christophe Lombart [EMAIL PROTECTED] on 26 May 2004 19:16
Please respond to Lucene Users List [EMAIL PROTECTED]
Subject: Asian languages

Which Asian languages are supported by Lucene? What about Korean, Japanese, Thai, ...? If they are not yet supported, what do I need to do?

Thanks,
Christophe
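Alex's point about setting the content encoding explicitly (rather than relying on a Latin-1 platform default) can be sketched like this; the class and method names are illustrative, not from Lucene.

```java
import java.io.*;

// Read a byte stream with an explicit charset so multi-byte characters
// survive intact; a Latin-1 platform default would mangle them.
public class EncodedReading {
    static String read(InputStream in, String charset) throws IOException {
        BufferedReader r = new BufferedReader(new InputStreamReader(in, charset));
        StringBuilder sb = new StringBuilder();
        for (int c; (c = r.read()) != -1; ) sb.append((char) c);
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] utf8 = "\u4e2d\u6587".getBytes("UTF-8"); // two Chinese characters
        System.out.println(read(new ByteArrayInputStream(utf8), "UTF-8").length()); // 2
    }
}
```

The resulting String can then be handed to the analyzer, which works on characters, not bytes.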
Memo: Re: Asian languages
Sorry Christophe, I mis-informed you. We did NOT subclass Document; we simply created an HTMLDocument class with methods that return Lucene Documents with the required fields added, and that is where the content encoding was set.

Alex

Alex BOURNE/IBEU/[EMAIL PROTECTED] on 27 May 2004 09:05
Please respond to Lucene Users List [EMAIL PROTECTED]
Subject: Re: Asian languages

[earlier message snipped]
Re: Range Query Somebody HELP please
On May 27, 2004, at 3:37 AM, Karthik N S wrote:

Hi Lucene developers
My main intention is to search for a word hit in a unique field between ranges, say book100 - book200, of indexed numbers. It's something like creating a subsearch within the search index. This is similar to SQL: select * from BOOKSHELF, or select * from BOOKSHELF where book1 between 100 and 200.

Karthik - I'm having a hard time understanding your questions, unfortunately. Ype replied with a suggested solution: overriding getRangeQuery() on a custom QueryParser subclass. You also need to ensure you are indexing numbers in a padded fashion: http://wiki.apache.org/jakarta-lucene/SearchNumericalFields

Erik
Re: classic scenario
Hello,

Answers inlined.

--- Adrian Dumitru [EMAIL PROTECTED] wrote:

I am (also) building a web crawler - a topic-specific one, to be more precise - for a vortal. I recently learned about Lucene and I'd very much like to use it to handle keyword-specific searches on the information that I collect. I suspect this is a classic project, at least for Lucene; probably something like this has been addressed already on this discussion list. I'm interested to hear any experience anyone might have with this subject.

See http://www.nutch.org/ - it may make sense to join Nutch, contribute patches that help you, etc., instead of building your own crawler from scratch.

My crawler goes on the internet, extracts/parses/ranks and saves websites. Most of the information is also categorized and stored in the database, but I also save about 10 top pages from each site in the filesystem. The first question is: should I care about indexing these files at the time I extract them from the internet? Or should I index them later, when I make them available for search?

Lucene does not care about files and is not limited to indexing files. It sounds like you tried the Lucene demo that indexes files in the file system. However, indexing in batch instead of as you crawl may be a more scalable, cleaner, more manageable approach. Nutch uses that approach for a reason. :)

If yes, then can I still name my files the way I want? (i.e. are there any constraints on the filenames from Lucene's perspective?)

No constraints.

Is it an OK idea to have the same files repository (or index) where the crawler writes (indexes files) and the search function searches?

Not a good idea. Keep your Lucene index directory clean, and use it only as an index directory. Write your files elsewhere, I would suggest. I guess performance issues are important here.

Can I still organize the files that I save the way I want? 
(I planned to write all the files from a given website in different folders, and the folders will have as their name the id from my database.)

That is up to you and your application. I just suggest you keep that outside the index directory, in order to keep things clean, well organized, and such.

I maintain a taxonomy (list of categories). Each website will fall into one or more of these categories; also, each website will have a rank. Does Lucene have something that I should be aware of related to what I said?

Lucene ranks search result items. Look at the Similarity and DefaultSimilarity classes. It sounds like you may benefit from having a custom Similarity that is aware of your categories.

I guess that's it for now... this is more like a pet project for me, a pet which keeps growing :) I wouldn't mind any help and opinions you can provide, source code samples, etc.

If this is really a pet project, perhaps joining Nutch will also be fun for you. Some recent Nutch contributors are also Lucene users.

Otis
Re: Memory usage
Sorry if I'm stating the obvious. Is this happening in some stand-alone unit tests, or are you running things from some application and in some environment, like Tomcat, Jetty, or some non-web app? Your queries are pretty big (although I recall some people using even bigger ones... but it all depends on the hardware they had), but are you sure running out of memory is due to Lucene, or could it be a leak in the app from which you are running queries?

Otis

--- James Dunn [EMAIL PROTECTED] wrote:

Doug, we only search on analyzed text fields. There are a couple of additional fields in the index, like OBJECT_ID, that are keywords, but we don't search against those; we only use them once we get a result back, to find the thing that document represents.

Thanks, Jim

--- Doug Cutting [EMAIL PROTECTED] wrote:

It is cached by the IndexReader and lives until the index reader is garbage collected. 50-70 searchable fields is a *lot*. How many are analyzed text, and how many are simply keywords?

Doug

James Dunn wrote:

Doug, thanks! I just asked a question regarding how to calculate the memory requirements for a search. Does this memory get used only during the search operation itself, or is it referenced by the Hits object or anything else after the actual search completes?

Thanks again, Jim

--- Doug Cutting [EMAIL PROTECTED] wrote:

James Dunn wrote: Also, I search across about 50 fields but I don't use wildcard or range queries.

Lucene uses one byte of RAM per document per searched field, to hold the normalization values. So if you search a 10M document collection with 50 fields, then you'll end up using 500MB of RAM. If you're using unanalyzed fields, then an easy workaround to reduce the number of fields is to combine many into a single field. So, instead of, e.g., using an f1 field with value abc and an f2 field with value efg, use a single field named f with values 1_abc and 2_efg. We could optimize this in Lucene. 
If no values of an indexed field are analyzed, then we could store no norms for the field and hence read none into memory. This wouldn't be too hard to implement...

Doug
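Doug's field-combining workaround amounts to tagging each value with its original field before writing it into the single shared field. A minimal sketch of just that string step (class and method names are illustrative, not Lucene API):

```java
import java.util.*;

// Collapse several keyword fields into one by prefixing each value with
// its field's tag, so terms from different original fields stay
// distinguishable inside the single combined field.
public class CombinedField {
    static List<String> combine(Map<String, String> fields) {
        List<String> values = new ArrayList<String>();
        for (Map.Entry<String, String> e : fields.entrySet())
            values.add(e.getKey() + "_" + e.getValue());
        return values;
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        fields.put("1", "abc"); // was field f1 with value abc
        fields.put("2", "efg"); // was field f2 with value efg
        System.out.println(combine(fields)); // [1_abc, 2_efg]
    }
}
```

Queries then target the single field f with the prefixed terms (f:1_abc instead of f1:abc), cutting the per-field norms memory to one field's worth.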
Re: Range Query Somebody HELP please
Karthik, namaste! I seem to be getting multiple copies of your email - I received 4 copies of this one. Could you please limit things to 1 message per subject? I get hundreds of messages every day as it is. :(

Thank you,
Otis

--- Karthik N S [EMAIL PROTECTED] wrote:

Hi Lucene developers
Is it possible to search and retrieve relevant information on an indexed document within a specific range, similar to a query in SQL: select * from BOOKSHELF where book1 between 100 and 200
e.g. search_word, Book between 100 AND 200
[Note: Book is a unique field whose hit info is already indexed]
Somebody please help me :(

with regards
Karthik
Re: Memory usage
Otis,

My app does run within Tomcat. But when I started getting these OutOfMemoryErrors, I wrote a little unit test to watch the memory usage without Tomcat in the middle, and I still see the same memory usage.

Thanks,
Jim

--- Otis Gospodnetic [EMAIL PROTECTED] wrote:

[quoted thread snipped]
Re: Range Query Somebody HELP please
On Thursday 27 May 2004 09:37, Karthik N S wrote:

Hi Lucene developers
My main intention is to search for a word hit in a unique field between ranges, say book100 - book200, of indexed numbers. It's something like creating a subsearch within the search index.

You don't need to shout (uppercase); I've been teaching SQL. Could you explain what you mean by subsearch? I suppose you might want to have a look at the various filter classes in the org.apache.lucene.search package.

Regards,
Ype
Re: Number query not working
Thanks Erik! That showed me the problem right away.

-Reece

--- Lucene Users List [EMAIL PROTECTED] wrote:

On May 26, 2004, at 6:38 PM, [EMAIL PROTECTED] wrote:

It looks like it's because I'm using the SimpleAnalyzer instead of the StandardAnalyzer. What is the SimpleAnalyzer doing to this query to make it not work?

http://wiki.apache.org/jakarta-lucene/AnalysisParalysis - it is a good idea to analyze the analyzer. Do a .toString() output of the Query and you'll see clearly what happened.

Erik

--- Lucene Users List [EMAIL PROTECTED] wrote:

Hi, I have a bunch of digits in a field. When I do this search it returns nothing:

myField:001085609805100

It returns the correct document when I add a * to the end, like this:

myField:001085609805100* -- added the *

I'm not sure what is happening here. I'm thinking that Lucene is doing some number conversion internally when it sees only digits. When I add the *, maybe it presumes it is still a string. How do I get a string of digits to work without adding a *?

Thanks,
Reece
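For context on the fix: SimpleAnalyzer tokenizes on letters only (and lowercases), so a term consisting entirely of digits produces no tokens at all, and the resulting query matches nothing. This pure-Java sketch mimics that tokenization for illustration; it is an approximation, not Lucene code.

```java
import java.util.*;

// Approximation of letter-only tokenization with lowercasing: runs of
// letters become tokens, and everything else (including digits) is
// treated as a separator -- so an all-digit term yields no tokens.
public class LetterTokens {
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder cur = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetter(c)) {
                cur.append(Character.toLowerCase(c));
            } else if (cur.length() > 0) {
                tokens.add(cur.toString());
                cur.setLength(0);
            }
        }
        if (cur.length() > 0) tokens.add(cur.toString());
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("001085609805100")); // [] -- the digits vanish
        System.out.println(tokenize("Book 42"));         // [book]
    }
}
```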
Hits object
At one point I thought I'd read that a Hits object doesn't actually contain Documents, but rather references to them. However, in that case I wouldn't expect I could save a Hits object past the closing of its originating Searcher (in this case a MultiSearcher: Hits hits = myMultiSearcher.search()). Yet later, when I access the same Hits object (having instantiated a new MultiSearcher, myMultiSearcher2, but *not* performing a new search), I can retrieve documents from the Hits object without complaint. Is this just my good fortune that things haven't been garbage-collected yet? Or does the Hits object contain the full document set?

--David
Tool for analyzing analyzers
I've knocked together this tool, which automatically discovers Analyzers on the classpath and provides a GUI to let you try out different Analyzers and see their effects: http://www.inperspective.com/lucene/Viewer.zip

It needs JDK 1.4, and you'll need to define the classpath to include Lucene and any of your custom analyzers. Paste in some example text, take your pick of analyzer, and hit the Analyze button to see the results.

Cheers,
Mark
Re: Hits object
Hits caches up to 200 HitDocs, which may contain the underlying Document. I suspect you accessed a Document that had already been accessed and thus found something in the cache, so it did not have to get back to the underlying searcher.

Erik

On May 27, 2004, at 4:51 PM, [EMAIL PROTECTED] wrote:

[quoted message snipped]
Re: Tool for analyzing analyzers
Mark,

Nice idea! (I've had this type of thing on my to-do list for the Lucene demo refactoring that I *promise* I'll eventually get around to.) I tried to get it to work, though, and was unsuccessful: it did not show me any Analyzers in the drop-down (I have the latest CVS version of Lucene in my classpath).

Maybe this could be added into Luke as a new tab? You can sort of fake this with Luke now, by entering your text as a query, selecting an Analyzer, and seeing what it parses to.

Erik

On May 27, 2004, at 6:45 PM, [EMAIL PROTECTED] wrote:

[quoted message snipped]
Re: Hits object
So it sounds like I shouldn't rely on documents still being there, in general.

--D

- Original Message -
From: Erik Hatcher [EMAIL PROTECTED]
Date: Thursday, May 27, 2004 5:04 pm
Subject: Re: Hits object

[quoted message snipped]
RE: Range Query Somebody HELP please
Hey Ype,

Apologies for the misconduct. When we do a search in SQL using '*', we all know that the result would be the total number of records in the table; but when we want to limit our records, we apply a range between 2 specific row records [which we call a subsearch]. Similarly, on an indexed record I would like to perform the same technique as above.

In fact, I was looking at the URLs you sent me in the last mail on using range queries, and was working with http://jakarta.apache.org/lucene/docs/queryparsersyntax.html and http://today.java.net/pub/a/today/2003/11/07/QueryParserRules.html, but without results for the last 12 hrs. If you could spare a few minutes, please explain or provide a simple [full] example using and overriding the getRangeQuery() method.

with regards
Karthik

-Original Message-
From: Ype Kingma [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 27, 2004 11:03 PM
To: [EMAIL PROTECTED]
Subject: Re: Range Query Somebody HELP please

[quoted message snipped]
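A real getRangeQuery() override needs the Lucene QueryParser class on the classpath, so only the term-normalization step such an override would apply is sketched here, in plain Java. The width, class, and method names are hypothetical.

```java
// Hypothetical sketch of the endpoint normalization an overridden
// getRangeQuery(field, part1, part2, inclusive) would perform before
// building its RangeQuery: numeric endpoints get left-padded with zeros
// to match the padded form used at index time.
public class RangePadding {
    static final int WIDTH = 10; // assumed field width

    static String normalize(String part) {
        if (!part.matches("\\d+")) return part; // leave non-numeric terms alone
        StringBuilder sb = new StringBuilder(part);
        while (sb.length() < WIDTH) sb.insert(0, '0');
        return sb.toString();
    }

    public static void main(String[] args) {
        // book:[100 TO 200] would effectively become
        // book:[0000000100 TO 0000000200] inside the override.
        System.out.println(normalize("100")); // 0000000100
        System.out.println(normalize("200")); // 0000000200
    }
}
```

Erik Hatcher's article linked above shows where this hook lives in a QueryParser subclass; the padding width must match whatever was used when the numbers were indexed.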