Caching and paging search results
Hi all,

Could someone describe their experience implementing caching, sorting and paging of search results? Is a stateful session bean appropriate for this? I would like to obtain all search hits on the first call only, and after that iterate through the hit collection and display the cached results. I have checked SearchBean in the contribution section, but it does not provide real caching and paging.

Regards and thanks in advance!
Milan

___
Yahoo! Messenger - Communicate instantly..."Ping" your friends today! Download Messenger Now http://uk.messenger.yahoo.com/download/index.html
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
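A minimal sketch of the "search once, page from the cache" idea, independent of any Lucene or EJB API. The class name CachedSearchResults is hypothetical, and the sketch assumes the hits have already been copied into plain values (for example document ids or titles) and sorted once before caching. In a stateful session bean, an instance like this could be kept as conversational state.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Holds the hits of one search so that later page requests do not re-run the query. */
final class CachedSearchResults<T> {
    private final List<T> hits;
    private final int pageSize;

    CachedSearchResults(List<T> allHits, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.hits = new ArrayList<>(allHits); // defensive copy of the first (and only) search
        this.pageSize = pageSize;
    }

    /** Number of pages, rounding up so that a partial last page is counted. */
    int pageCount() {
        return (hits.size() + pageSize - 1) / pageSize;
    }

    /** Returns the zero-based page, or an empty list when the index is past the end. */
    List<T> page(int pageIndex) {
        if (pageIndex < 0) {
            throw new IllegalArgumentException("pageIndex must not be negative");
        }
        long from = (long) pageIndex * pageSize; // long avoids int overflow
        if (from >= hits.size()) {
            return Collections.emptyList();
        }
        int start = (int) from;
        int end = Math.min(start + pageSize, hits.size());
        return Collections.unmodifiableList(new ArrayList<>(hits.subList(start, end)));
    }
}
```

The trade-off is memory: every cached result set lives as long as the session does, so a cap on the number of cached hits may be worth considering.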
Highlighting problem
Hi all,

I have incorporated the highlighting package (http://home.clara.net/markharwood/lucene/highlight.htm), but I am worried about the following issue. If I want to display the best segments of the "body" field contents, with the query terms highlighted, I have to define the "body" field as stored. So the complete process would be:

Index-related work:
1. Parse the uploaded document into a temporary ASCII file.
2. Read the ASCII file and append its content to a String.
3. Create the field with Field.Text(String name, String value).

Search-related work:
1. Retrieve the String value of the body field from the hit (again, as far as I understand, the only way to do this is to declare the body field as stored).
2. Pass the String value to the Highlighter methods.

However, I have read in the Lucene FAQ that body fields are not good candidates for being stored. Index size is one obvious reason, but I am wondering how storing the body affects Lucene search performance in general. Does anybody have an idea how to provide highlighting for an unstored field?

Regards and thanks in advance,
Milan
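One possible direction, sketched under assumptions that are not in the post: keep the parsed ASCII file on disk, index the body unstored, and re-read the text from that file when a hit is displayed. The class ExternalTextHighlighting and both method names are hypothetical, and markTerm is only a naive stand-in for the highlighting package, not its API.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ExternalTextHighlighting {

    /** Re-reads the parsed text that was kept outside the index (assumed to be ASCII). */
    static String loadBody(Path parsedTextFile) throws IOException {
        return new String(Files.readAllBytes(parsedTextFile), StandardCharsets.US_ASCII);
    }

    /** Naive stand-in for a real highlighter: wraps whole-word, case-insensitive matches in <B> tags. */
    static String markTerm(String text, String term) {
        Pattern wholeWord = Pattern.compile("\\b" + Pattern.quote(term) + "\\b",
                Pattern.CASE_INSENSITIVE);
        Matcher matcher = wholeWord.matcher(text);
        return matcher.replaceAll("<B>$0</B>"); // $0 keeps the original casing of each match
    }
}
```

With this layout the index would only need a small stored field that identifies the file (or database row), at the cost of one extra read per displayed hit.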
Multilanguage and wildcard support
Hi all,

I would like to describe my dilemma about analysis.

2. Multilanguage and wildcard support

In Lucene 1.3 Final I found the very useful class PerFieldAnalyzerWrapper, which lets me specify a separate analyzer for each field. However, the full-text content (obtained by parsing a Word, Excel, XML or other kind of document) should be searchable with stemming and should also support wildcard queries.

I implemented this solution: the full content is indexed in two separate fields, because wildcard queries do not pass through the analyzer, as I have read in this mailing list archive.
- Field 1 (stemmingbody): the matching Snowball analyzer is used.
- Field 2 (plainbody): the whitespace analyzer is used.

So, when a user searches for a term in an item's content, I parse the query: if it contains a wildcard character, the search is performed on "plainbody"; otherwise I search "stemmingbody", expecting better results that way.

Is there a better way to do this, e.g. indexing the full content in only one field instead of two (I tokenize and index it, but do not store it)?

Thanks in advance for any opinion or suggestion!

Best regards,
Milan Agatonovic
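The routing step described above can be sketched as follows. The field names come from the post; the class BodyFieldRouter is hypothetical, and the check assumes the only wildcard characters are `*` and `?` and that they are never escaped in the user's query.

```java
/** Chooses which body field a user query should be run against. */
final class BodyFieldRouter {
    static final String STEMMED_FIELD = "stemmingbody"; // Snowball-analyzed copy
    static final String PLAIN_FIELD = "plainbody";      // whitespace-analyzed copy

    /** True if the raw query text contains a wildcard character. */
    static boolean hasWildcard(String userQuery) {
        return userQuery.indexOf('*') >= 0 || userQuery.indexOf('?') >= 0;
    }

    /** Wildcard queries go to the unstemmed field; everything else uses stemming. */
    static String fieldFor(String userQuery) {
        return hasWildcard(userQuery) ? PLAIN_FIELD : STEMMED_FIELD;
    }
}
```

For example, `fieldFor("run*")` selects "plainbody", while `fieldFor("running")` selects "stemmingbody".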
Lucene and Message Driven Bean
Hi all,

I am new to this mailing list, although I have been using Lucene for quite a long time. I have integrated the Lucene API into a fairly large multi-language groupware application, but I still have some problems and dilemmas.

I cannot run Lucene indexing as a scheduled job (which seems to be the common way of using Lucene), because each item (document, meeting, forum article, etc.) must be searchable as soon as it is uploaded. So I built the solution described below and would like to hear from experts in this field whether it is a good or a bad one in general, along with any suggestions and opinions.

1. Indexing process

After the upload (stored in parallel in the DB and the file system), I call my stateless session bean, which puts the uploaded item (wrapped in a JMS message) into a queue. A message-driven bean (configured as a single instance in the pool under JBoss) receives the message and calls the Lucene methods that perform the indexing.

Dilemma: Is there a better way to do this that provides the same functionality?

Problem: An IOException is raised by the IndexWriter constructor IndexWriter(Directory d, Analyzer a, final boolean create), with different messages:
- Index locked for write
- Lock obtain timed out
- Other messages if the index is corrupted (e.g. no segments file; I deleted it on purpose)

What I would like to do is:
- If the index is locked for any reason, roll back the transaction and return the message to the queue.
- If the index is corrupted, discard the messages in the queue and send mail to the administrator.

Would it be appropriate to subclass IOException so that a locked index can be handled differently from a corrupted one?

Thanks a lot in advance. The next dilemma concerns content analysis and will follow in a separate message.

Best regards,
Milan Agatonovic
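As an alternative to subclassing IOException, the two cases could be told apart where the exception is caught. The sketch below is only an illustration: IndexFailureClassifier and its Action enum are hypothetical, it relies on the message texts quoted in the post (which may differ between Lucene versions), and it makes the simplifying assumption that any message not mentioning a lock indicates a damaged index.

```java
import java.io.IOException;
import java.util.Locale;

/** Maps an IOException from index opening to the action the message consumer should take. */
final class IndexFailureClassifier {

    enum Action {
        RETRY_LATER,       // index is locked: roll back so the message returns to the queue
        DISCARD_AND_ALERT  // index looks corrupted: drop the message and notify the administrator
    }

    static Action classify(IOException e) {
        String message = e.getMessage() == null
                ? ""
                : e.getMessage().toLowerCase(Locale.ROOT);
        // Assumption: both lock-related messages from the post contain the word "lock".
        if (message.contains("lock")) {
            return Action.RETRY_LATER;
        }
        return Action.DISCARD_AND_ALERT;
    }
}
```

In a message-driven bean with container-managed transactions, RETRY_LATER would typically translate into marking the transaction for rollback, for instance via setRollbackOnly() on the bean's context. Matching on message text is brittle, so this is a stopgap rather than a robust design.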