Some questions.

1. I have about 3-5 tables. Designing schema.xml for a single table looks OK, but I am not sure what the direction is for handling multiple table structures. Would it be one big XML file, in which those three tables (assuming it's three) show up as three different tag-trees, all nullable?
My source provides me a single flat file per table (tab delimited). Do you think having multiple indexes could be a solution for this case? Or do I really need to spend effort in denormalizing the data?

2. Further, loading into Solr could use some perf tuning. Any tips? Best practices?

3. Also, is there a way to specify an XSLT at the server side and make it the default, i.e. whenever a response is returned, that XSLT is applied to the response automatically?

4. And last question for the day :) There was one post saying that the spatial support is really basic in Solr and is going to be improved in the next versions. Can you people help me get a definitive yes or no on spatial support? In the current form, does it work or not? I would store lat and long, and would need to make them searchable.

--raghav..

-----Original Message-----
From: Sharma, Raghvendra [mailto:[email protected]]
Sent: Tuesday, September 28, 2010 11:45 AM
To: [email protected]
Subject: RE: Is Solr right for my business situation ?

Thanks for the responses people.

@Grant

1. Can you show me some direction on that, loading data from an incoming stream? Do I need some third party tools, or do I need to build something myself?

4. I am basically attempting to build a very fast search interface for the existing data. The volume I mentioned is more like a static one (the data is already there). The SQL statements I mentioned are the daily updates coming in. The good thing is that no history is kept, so the overall volume is not growing, but I need to apply the update statements. One workaround I had in mind (though not great for performance) is to apply the updates to a copy of the RDBMS, and then feed the RDBMS extract to Solr. Sounds like overkill, but I don't have another idea right now. Perhaps business discussions would yield something.

@All - Some more questions guys.

1. I have about 3-5 tables.
Now designing schema.xml for a single table looks OK, but I am not sure what the direction is for handling multiple table structures. Would it be one big XML file, in which those three tables (assuming it's three) show up as three different tag-trees, all nullable? My source provides me a single flat file per table (tab delimited).

2. Further, loading into Solr could use some perf tuning. Any tips? Best practices?

3. Also, is there a way to specify an XSLT at the server side and make it the default, i.e. whenever a response is returned, that XSLT is applied to the response automatically?

4. And last question for the day :) There was one post saying that the spatial support is really basic in Solr and is going to be improved in the next versions. Can you people help me get a definitive yes or no on spatial support? In the current form, does it work or not? I would store lat and long, and would need to make them searchable.

Looks like I'm close to my solution.. :)

--raghav

-----Original Message-----
From: Grant Ingersoll [mailto:[email protected]]
Sent: Tuesday, September 28, 2010 1:05 AM
To: [email protected]
Subject: Re: Is Solr right for my business situation ?

Inline.

On Sep 27, 2010, at 1:26 PM, Walter Underwood wrote:

> When do you need to deploy?
>
> As I understand it, the spatial search in Solr is being rewritten and is
> slated for Solr 4.0, the release after next.

It will be in 3.x, the next release.

> The existing spatial search has some serious problems and is deprecated.
>
> Right now, I think the only way to get spatial search in Solr is to deploy a
> nightly snapshot from the active development on trunk. If you are deploying a
> year from now, that might change.
>
> There is not any support for SQL-like statements or for joins. The best
> practice for Solr is to think of your data as a single table, essentially
> creating a view from your database. The rows become Solr documents, the
> columns become Solr fields.

There are now group-by capabilities in trunk as well, which may or may not help.

> wunder
>
> On Sep 27, 2010, at 9:34 AM, Sharma, Raghvendra wrote:
>
>> I am sure these kind of questions keep coming to you guys, but I want to
>> raise the same question in a different context...my own business situation.
>> I am very very new to solr and though I have tried to read through the
>> documentation, I have nowhere near completing the whole read.
>>
>> The need is like this -
>>
>> We have a huge rdbms database/table. A single table perhaps houses 100+
>> million rows. Though oracle is doing a fine job of handling the insertion
>> and updation of data, the querying is where our main concerns lie. Since we
>> have spatial data, the index building takes hours and hours for such tables.
>>
>> That's when we thought of moving away from standard rdbms and thought of
>> trying something different and fast.
>> My last week has been spent in a journey reading through bigtable to hadoop
>> to hbase, to hive and then finally landed on solr. As far as I am in my
>> tests, it looks pretty good, but I have a few unanswered questions still.
>> Trying this group for them :) (I am sure I can find some answers if I
>> read/google more on the topic, but now I m being lazy and feel asking the
>> people who are already using it/or perhaps developing it is a better bet).
>>
>> 1. Can I get my solr instance to load data (fresh data for indexing) from a
>> stream (imagine a mq kind of queue, or similar) ?

Yes, with a little bit of work.

>> 2. Can I host my solr instance to use hbase as the database/file system
>> (read HDFS) ?

Probably, but I doubt it will be fast. Local disk is usually the best. 100+ M rows is large but not unreasonable.

>> 3. are there somewhere any reports available (as in benchmarks ) for a solr
>> instance's performance ?

You can probably search the web for these. I've personally seen several installs w/ 1B+ docs and subsecond search and faceting and heard of others.
You might look at the stuff the HathiTrust has put up.

>> 4. are there any APIs available which might help me apply ANSI sql kind of
>> statements to my solr data ?

No. Question back: what kinds of things are you trying to do?

>>
>> It would be great if people could help share their experience in the area...
>> if it's too much trouble writing all of it, perhaps url would be easier... I
>> welcome all kinds of help here... any advice/suggestions are good ...
>>
>> Looking forward to your viewpoints..
>>
>> --raghav..
>> ******************************************************************************************
>>
>> This message may contain confidential or proprietary information intended
>> only for the use of the
>> addressee(s) named above or may contain information that is legally
>> privileged. If you are
>> not the intended addressee, or the person responsible for delivering it to
>> the intended addressee,
>> you are hereby notified that reading, disseminating, distributing or copying
>> this message is strictly
>> prohibited. If you have received this message by mistake, please immediately
>> notify us by
>> replying to the message and delete the original message and any copies
>> immediately thereafter.
>>
>> Thank you.
>> ******************************************************************************************
>>
>> CLLD
>>

--------------------------
Grant Ingersoll
http://lucenerevolution.org Apache Lucene/Solr Conference, Boston Oct 7-8
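
On the multi-table question, Walter's advice in the thread (treat the data as a single table, where each row becomes one Solr document and each column one field) could be sketched roughly as below. This is only an illustration under stated assumptions: the two sample tables, their column names, and the `read_tsv` and `denormalize` helpers are all hypothetical and are not part of Solr or of anything posted in this thread. It only shows the flattening step for tab-delimited input; sending the resulting documents to Solr is left out.

```python
import csv
import io


def read_tsv(text):
    """Parse tab-delimited text (header row first) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def denormalize(parents, children, key, prefix):
    """Flatten a parent/child table pair into one dict per parent row.

    Child columns become multi-valued fields (lists) named prefix + column,
    so each returned dict could be indexed as a single flat document.
    Assumes prefix + column does not collide with a parent column name.
    """
    # Group child rows by the shared key column.
    by_key = {}
    for child in children:
        by_key.setdefault(child[key], []).append(child)

    docs = []
    for parent in parents:
        doc = dict(parent)
        for child in by_key.get(parent[key], []):
            for column, value in child.items():
                if column == key:
                    continue  # the key is already present on the parent
                doc.setdefault(prefix + column, []).append(value)
        docs.append(doc)
    return docs


# Hypothetical sample data standing in for two of the per-table flat files.
PLACES_TSV = "id\tname\tlat\tlon\n1\tDepot A\t40.7\t-74.0\n2\tDepot B\t34.1\t-118.2\n"
VISITS_TSV = "id\tdate\n1\t2010-09-01\n1\t2010-09-15\n"

docs = denormalize(read_tsv(PLACES_TSV), read_tsv(VISITS_TSV), key="id", prefix="visit_")
for doc in docs:
    print(doc)
```

Parent rows without matching child rows simply come out without the multi-valued fields, which matches the "nullable" idea raised in the first question. Whether this is preferable to separate indexes per table depends on the queries needed, which the thread leaves open.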
