I'll add that I _guarantee_ you'll want to re-index the data as you change your schema and the like. You'll be able to do that much more quickly if the data is stored locally somehow.
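
As a rough sketch of what keeping the data locally could look like without a database: the scrapers append each listing to a JSON Lines file, and a re-index just replays that file into Solr's JSON update handler. The core name `listings`, the URL, the field names, and the helper names are all assumptions for illustration, not anything from the original setup.

```python
import json
import urllib.request
from pathlib import Path
from typing import Iterable, Iterator

# Hypothetical Solr core URL; adjust host, port, and core name to your setup.
# commit=true on every batch is simple but not the fastest way to bulk load.
SOLR_UPDATE_URL = "http://localhost:8983/solr/listings/update?commit=true"


def append_listings(path: Path, listings: Iterable[dict]) -> None:
    """Append scraped listings to a JSON Lines file, one document per line."""
    with path.open("a", encoding="utf-8") as f:
        for doc in listings:
            f.write(json.dumps(doc) + "\n")


def read_batches(path: Path, batch_size: int = 500) -> Iterator[list]:
    """Yield the stored documents in lists of at most batch_size."""
    batch = []
    with path.open(encoding="utf-8") as f:
        for line in f:
            if line.strip():
                batch.append(json.loads(line))
            if len(batch) >= batch_size:
                yield batch
                batch = []
    if batch:
        yield batch


def reindex(path: Path, url: str = SOLR_UPDATE_URL) -> int:
    """Re-send every stored document to Solr; returns the number sent."""
    sent = 0
    for batch in read_batches(path):
        req = urllib.request.Request(
            url,
            data=json.dumps(batch).encode("utf-8"),  # JSON array of documents
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            resp.read()
        sent += len(batch)
    return sent
```

After a schema change you would clear or recreate the core and call `reindex(Path("listings.jsonl"))`, with no need to scrape the source sites again.
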
An RDBMS is not necessary, however. You could simply store the data on disk in some format you could re-read and send to Solr.

Best,
Erick

On Tue, Feb 21, 2017 at 5:17 PM, Dave <hastings.recurs...@gmail.com> wrote:
> B is a better option long term. Solr is meant for retrieving flat data, fast,
> not hierarchical. That's what a database is for, and trust me, you would rather
> have a real database on the end point. Each tool has a purpose: Solr can
> never replace a relational database, and a relational database could not
> replace Solr. Start with the slow model (database) for control/display and
> enhance with the fast model (Solr) for retrieval/search.
>
>> On Feb 21, 2017, at 7:57 PM, Robert Hume <rhum...@gmail.com> wrote:
>>
>> To learn how to properly use Solr, I'm building a little experimental
>> project with it to search for used car listings.
>>
>> Car listings appear in a variety of different places ... central places
>> like Craigslist and also many, many individual used car dealership websites.
>>
>> I am wondering, should I:
>>
>> (a) deploy a Solr search engine and build individual indexers for every
>> type of web site I want to find listings on?
>>
>> or
>>
>> (b) build my own database to store car listings, and then build services
>> that scrape data from different sites and feed entries into the database;
>> then point my Solr search to my database, one simple source of listings?
>>
>> My concerns are:
>>
>> With (a) ... I have to be smart enough to understand all those different
>> data sources and remove/update listings when they change; will this be
>> harder to do with custom Solr indexers than writing something from scratch?
>>
>> With (b) ... I'm maintaining a huge database of all my listings, which seems
>> redundant; Google doesn't make a *copy* of everything on the internet, it
>> just knows it's there. Is maintaining my own database a bad design?
>>
>> Thanks for reading!