Hi Bruce, welcome to Lucene. First off: your questions would be better suited for one of the [EMAIL PROTECTED] lists ... not java-dev ( http://people.apache.org/~hossman/#java-dev )
The question becomes, which list would be a good place for you to start? ...

: -i'm trying to gather data/words/terms from a number of
: different test web sites to build a database of terms
: for a test app
: -i'd like to put the resulting data into some sort of
: database. is lucene sufficient for doing this, or will
: i need some sort of additional toolset?

One thing I'm not clear on is whether you are looking for something to handle the crawling of these websites and extracting the terms from various file types (in which case you should start with [EMAIL PROTECTED]) or if you want to do that yourself and have very specific control in your own code over where the data comes from before it gets indexed (in which case you should email [EMAIL PROTECTED]).

Once you have a Lucene index built (either by Nutch or by yourself using the Java APIs) you can write code to use that index in a variety of ways -- including extracting data about the set of known terms and their frequencies.

If you really want this data put into a relational database, I suspect you'd need to do that conversion process yourself -- I don't know of any general purpose tools to do that.

-Hoss
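
For the relational database route, a minimal sketch of the conversion step might look like the following. It assumes the term/frequency pairs have already been read out of the index (that part is not shown, since the exact Lucene calls depend on the version in use) and simply writes them out as CSV, which most databases can bulk-load. The class name `TermCsvExport`, the `term,doc_freq` header, and the sample data are all made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: turns term -> document-frequency
// pairs into CSV text suitable for a database's bulk-load tool.
public class TermCsvExport {

    // Quote a field using the usual CSV convention: wrap it in double
    // quotes and double any embedded quote characters.
    public static String quote(String field) {
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Build CSV with a header row and one line per term, in map iteration order.
    public static String toCsv(Map<String, Integer> termFreqs) {
        StringBuilder sb = new StringBuilder("term,doc_freq\n");
        for (Map.Entry<String, Integer> e : termFreqs.entrySet()) {
            sb.append(quote(e.getKey())).append(',').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for what was read from the index.
        Map<String, Integer> termFreqs = new LinkedHashMap<>();
        termFreqs.put("lucene", 12);
        termFreqs.put("index", 7);
        System.out.print(toCsv(termFreqs));
    }
}
```

Going through a flat file keeps the sketch independent of any particular database or JDBC driver; inserting rows directly with prepared statements would work just as well if you already have a connection set up.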
