Thank you, Mark Bennett and Ted Dunning, (a) for the advice to use Solr rather than Lucene Core and (b) for the advice to use JSON, or perhaps XML or CSV. I can transform my data files to JSON quite easily. As for Solr, I was indeed confused by all the references to its role as a cloud platform; I had not recognized it as a tool for working with a simple database stored on one's own server.

-Bas Braams
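
For readers following the thread, the BibTeX-to-JSON step mentioned above might look roughly like the Python sketch below. It is only an illustration under stated assumptions: the sample records, the function name bibtex_to_docs, and the output field names (id, type, author, and so on) are hypothetical. The regular expressions handle only flat entries with one brace-delimited field per line, and the field names would have to match whatever Solr schema is actually configured.

```python
import json
import re

# Hypothetical sample records; a real BibTeX file may contain nested braces,
# quoted values, and @string macros, none of which this sketch handles.
SAMPLE = """
@article{smith2001,
  author = {Smith, J. and Jones, K.},
  title = {An Example Title},
  journal = {Journal of Examples},
  year = {2001},
  keywords = {search, indexing}
}
@article{doe1999,
  author = {Doe, A.},
  title = {Another Title},
  year = {1999}
}
"""

# One entry: "@type{key," followed by the body, closed by "}" on its own line.
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,(.*?)\n\}", re.DOTALL)
# One field: name = {value}, where the value contains no nested braces.
FIELD_RE = re.compile(r"(\w+)\s*=\s*\{([^{}]*)\}")


def bibtex_to_docs(text):
    """Convert simple BibTeX entries into a list of flat dictionaries."""
    docs = []
    for entry_type, key, body in ENTRY_RE.findall(text):
        # "id" and "type" are assumed field names, not fixed by Solr.
        doc = {"id": key, "type": entry_type.lower()}
        for name, value in FIELD_RE.findall(body):
            # Collapse internal whitespace so wrapped values become one line.
            doc[name.lower()] = " ".join(value.split())
        docs.append(doc)
    return docs


docs = bibtex_to_docs(SAMPLE)
# A JSON array of documents, one object per bibliography entry.
solr_json = json.dumps(docs, indent=2)
print(solr_json)
```

A JSON array of flat documents like this is the general shape that Solr's JSON update handler accepts; how it is posted and which fields are indexed depend on the local installation and schema.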

On Mon, Aug 12, 2013 at 5:11 PM, Mark Bennett <[email protected]> wrote:

> Hello Bastiaan,
>
> On Aug 12, 2013, at 4:24 AM, Bastiaan Braams <[email protected]> wrote:
>
> > Greetings. I am a newcomer looking for advice about getting started with
> > Lucene Core and/or Solr in order to present to the world a searchable
> > bibliographical database.
>
> Excellent.
>
> > I have the database in my filespace in a plain text format; let us say as a
> > BibTeX file. So the data is quite well structured, with fields such as
> > Author, Title, Journal and Year, but also some less structured fields:
> > Abstract, Notes, Keywords. I don't have the article full texts.
>
> The trick will be to get this data into one of the formats that Solr can
> digest (XML, JSON or CSV), or write a Java client that uses SolrJ that
> reads the file and submits it.
>
> > There are about 100 000 entries in the database; the total size is less
> > than 1 GB.
>
> That's fine, that's a reasonable amount of data.
>
> > I have access to a server that already provides web pages to the world. Now
> > I want to provide these bibliographical data to the world, with some search
> > functionality for the visitors.
>
> Good.
>
> > Would Lucene Core be a good building block for this? Would I have any use
> > for Lucene Solr?
>
> I would strongly suggest Solr over Lucene.
>
> > I have the impression that I should consider Solr only if
> > the data were distributed over the web, ...
>
> This is not correct, although I'm curious how you got that impression?
> The "cloud" in SolrCloud refers to Solr itself being able to run on
> multiple machines for larger datasets, although I think other people are
> sometimes confused about what the "cloud" really means.
>
> > but in my case the data are all in
> > one place that is under my control.
>
> Solr can run on one machine, that's fine.
>
> > The quick tutorial for Lucene Core shows how I may create a Lucene database
> > and query it on my system through the command line. Could someone please
> > recommend a tutorial about creating a web interface for the prospective
> > world-wide users of this database?
>
> You really want Solr for this.
>
> You can customize the Solr interface with the Velocity templates. Here's
> an article that discusses several options:
>
> http://searchhub.org/2010/01/14/solr-search-user-interface-examples/
>
> Welcome on board!
