Simon, I took a quick look at the UML PDF. It seems to me that the various *Services are overly complicated. Since you can have only one thread modifying the Lucene index, perhaps you should go the same route as IndexModifier (I have never used it, but it looks like people are using it to manage write/delete/search concurrency). So perhaps all you need are an IndexStorageService and a SearchService for the searchable Lucene index(es), and a DataStorageService for storing and reading data from the BDB store or whatever you end up using.
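Since I have never used IndexModifier myself, treat the following as a rough, untested sketch of what I have in mind, assuming the Lucene 1.9/2.0 API (the class name matches the IndexStorageService idea above; the field names and method signatures are just placeholders):

import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexModifier;
import org.apache.lucene.index.Term;

// Sketch only: one service wrapping a single IndexModifier, which internally
// switches between IndexWriter and IndexReader as needed for adds and deletes.
public class IndexStorageService {

    private final IndexModifier modifier;

    public IndexStorageService(String indexDir) throws IOException {
        // 'true' creates the index if it does not exist yet
        this.modifier = new IndexModifier(indexDir, new StandardAnalyzer(), true);
    }

    public synchronized void storeEntry(String feedId, String entryId, String entryXml)
            throws IOException {
        Document doc = new Document();
        // feed/entry IDs as untokenized, indexed fields (the old Field.Keyword)
        doc.add(new Field("feedId", feedId, Field.Store.YES, Field.Index.UN_TOKENIZED));
        doc.add(new Field("entryId", entryId, Field.Store.YES, Field.Index.UN_TOKENIZED));
        // the entry itself is only stored (compressed), never indexed
        doc.add(new Field("content", entryXml, Field.Store.COMPRESS, Field.Index.NO));
        modifier.addDocument(doc);
    }

    public synchronized void deleteEntry(String entryId) throws IOException {
        modifier.deleteDocuments(new Term("entryId", entryId));
    }

    public synchronized void close() throws IOException {
        modifier.close();
    }
}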
Regarding the naming of StorageCache - this confused me at first. Seeing "cache" makes me think "previously retrieved/found data stored in a cache for faster subsequent requests/searches". But from what I can tell, that is not what StorageCache is about. It looks like StorageCache is really a buffer of entries that are scheduled to be written to or deleted from the index+storage. If that's so, I would consider renaming it "StorageBuffer" or some such.

Otis

----- Original Message ----
From: Simon Willnauer <[EMAIL PROTECTED]>
To: java-dev@lucene.apache.org
Sent: Thursday, June 1, 2006 7:37:44 PM
Subject: GData Server - Lucene storage

Hello folks,

as I'm the only developer on the project due to the Summer of Code program, it is quite a tough task to discuss all of the architecture with you on the mailing list. For this reason I decided to create UML diagrams to discuss the main components. I will not attach the UML to the mails but rather upload it to a server so you can download and study it.

Well, the next thing I have to implement is a storage to store the entries in. I will provide two kinds of storage (Lucene-based and Berkeley DB-based). The first will be a Lucene index that stores the entries, identified by the entry ID and feed ID stored in the index as a Keyword (used to be Field.Keyword). The underlying Lucene storage will only be used to store the entries in compressed form. Which feed entries to retrieve from the Lucene storage will be determined by the results of the indexing/search component, as every client request to a GData server is a query to the index. So the results of a search are entry IDs and a corresponding feed. These entries will be retrieved from the storage and sent back to the client. The storage component also provides delete, update and insert functionality (it wouldn't be a storage without these).

The biggest problem with the Lucene storage is achieving a transactional state. Imagine the following scenario: an update request comes in -> the entry to update is handed to the Lucene writer, which writes the update. But another delete request has locked the index, so an IOException is thrown. The update request then queues the entry and retries to obtain the lock. No problem so far. But if the index writer cannot open the index due to some other error (e.g. the index could not be found), the exception will also be an IOException. Is there any way to figure out whether the IOException is caused by a lock (which would be alright) or by some other, more serious problem?

I added some comments to the UML to describe the architecture to you in more detail. So please download the file and have a look at it.

http://www.javawithchopsticks.de/webaccess/lucenestorage.pdf

I would appreciate all your comments!

regards,
Simon
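To the IOException question in Simon's mail: as far as I can tell there is no dedicated lock exception in Lucene 1.9/2.0 (a lock timeout and a missing index both surface as a plain IOException), so the best I can think of is to check the lock state of the directory when opening the writer fails. A rough, untested sketch, with illustrative names only:

import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Illustrative sketch: try to open the writer; if that fails while the
// directory is locked, treat it as "retry later", otherwise rethrow.
public class WriterOpenHelper {

    /**
     * Returns an IndexWriter, or null if the index is merely locked
     * (caller should queue the entry and retry). Rethrows anything
     * that does not look like a locking problem.
     */
    public static IndexWriter openOrNull(String indexDir) throws IOException {
        Directory dir = FSDirectory.getDirectory(indexDir, false);
        try {
            return new IndexWriter(dir, new StandardAnalyzer(), false);
        } catch (IOException e) {
            if (IndexReader.isLocked(dir)) {
                // another writer/reader currently holds a lock -> retry later
                return null;
            }
            // not a lock problem (index missing, disk error, ...) -> serious
            throw e;
        }
    }
}

Checking the exception message for "Lock obtain timed out" would also work, but that is brittle; a dedicated exception type on the Lucene side would be cleaner.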