Hello, I am using Lucene to build an index from roughly 10 million documents in number. The documents are about 4 TB in total.
After some trial runs, indexing a subset of the documents I am trying to figure out a hosting service configuration to create a full index from the entire 10 TB of data. As I am still unsure how this project will turn out I am not purchasing hardware/ram but considering a web host. for the purpose of : 1) download the data and to start indexing it. 2) The web front end to access this index will be a python framework ( eg. Django etc) I am seriously contemplating signing up with Joyent for this plan: AMD Opteron x64 multi-core servers with 4GiB RAM per core 1/16 (Burstable up to 95%) 1 TB - Bandwidth/month, 1 GB RAM, + as such as NAS storage as I can afford to pay for. My QUESTION is - Will this RAM and CPU be sufficient during development of the search application and building the index, etc. or is it so abysmal and under-equipped in terms of hardware that the development version of my application will not work. I understand that having more RAM is always good, but is 1GB as good as nothing? This setup is NOT for production but for for development so I can get my hands dirty with lucene which will require plenty of tweaks as the project moves along. What initial configuration would you recommend for a development version given the corpus size. I am not even sure how large my index will look like at this point. I hope to build an my indexes this way and once the search infrastructure is working and the web-front end complete, I plan to worry about Redundancy, availability and scalability for the many users I hope to provide this free service for :-) Many of you in this forum have built successful products with Lucene. To name a few I am aware of - Ken Krugle, James Ryley, Dennis Kubes Some of you must have started with small machines,test set-ups etc where you built your initial search apps. I hope to receive some advise about my plan and approach to start building an infrastructure to support my Lucene app. Thank you. Venkat --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]