Thanks for the responses, people.


1. Can you point me in some direction on that, i.e. loading data from an 
incoming stream? Do I need third-party tools, or do I need to build something 
myself?
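
For illustration only, here is a minimal Python sketch of the kind of thing I 
imagine for point 1: drain a queue in batches and turn each batch into a Solr 
XML <add> message. The in-process queue standing in for the MQ, the batch size, 
and the names build_add_xml and drain_batches are all assumptions made up for 
this sketch; actually posting the XML to the update handler is left out.

```python
import queue
import xml.etree.ElementTree as ET


def build_add_xml(docs):
    """Build a Solr XML <add> message from a list of {field: value} dicts."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", {"name": name})
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def drain_batches(q, batch_size=2):
    """Yield lists of up to batch_size documents until the queue is empty."""
    batch = []
    while True:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield batch


# Stand-in for a message queue (hypothetical data).
incoming = queue.Queue()
for i in range(3):
    incoming.put({"id": i, "name": "row %d" % i})

messages = [build_add_xml(b) for b in drain_batches(incoming)]
```

Batching is the point here: one message per batch rather than one per row, 
with a commit only after many batches, is what I would try first.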

4. I am basically trying to build a very fast search interface for the 
existing data. The volume I mentioned is mostly static (the data is already 
there). The SQL statements I mentioned are the daily updates coming in. The 
good thing is that there is no history, so the overall volume is not growing, 
but I still need to apply the update statements. 

One workaround I had in mind (though the performance would not be great) is to 
apply the updates to a copy of the RDBMS and then feed the RDBMS extract to 
Solr. It sounds like overkill, but I don't have another idea right now. Perhaps 
business discussions will yield something.
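
A rough Python sketch of that workaround, just to make the idea concrete. An 
in-memory SQLite database stands in for the RDBMS copy, and the table, the 
columns, and the update statements are hypothetical; the real thing would run 
against Oracle and hand the extracted rows to whatever loads Solr.

```python
import sqlite3

# In-memory SQLite stands in for the RDBMS copy (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    "CREATE TABLE places (id INTEGER PRIMARY KEY, name TEXT, lat REAL, lon REAL)"
)
conn.executemany(
    "INSERT INTO places VALUES (?, ?, ?, ?)",
    [(1, "depot", 42.36, -71.06), (2, "office", 40.71, -74.00)],
)

# Hypothetical daily update statements as they might arrive.
daily_updates = [
    "UPDATE places SET name = 'main depot' WHERE id = 1",
    "DELETE FROM places WHERE id = 2",
]


def apply_updates(connection, statements):
    """Apply all statements in one transaction; roll back if any fails."""
    with connection:
        for sql in statements:
            connection.execute(sql)


def extract_rows(connection):
    """Return every row as a plain dict, ready to become a Solr document."""
    cursor = connection.execute(
        "SELECT id, name, lat, lon FROM places ORDER BY id"
    )
    return [dict(row) for row in cursor]


apply_updates(conn, daily_updates)
rows = extract_rows(conn)
```

A full re-extract like this picks up deletes for free, at the cost of 
rebuilding the whole index each day.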

@All -

Some more questions, guys.

1. I have about 3-5 tables. Designing schema.xml for a single table looks OK, 
but I am not sure what the approach is for handling multiple table structures. 
Would it be one big XML file, in which those three tables (assuming it's 
three) show up as three different tag trees?

My source provides me a single flat file per table (tab delimited).
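
If the answer turns out to be one flat document type rather than several tag 
trees, I picture the pre-processing roughly like the Python sketch below. The 
two tables, their columns, and the function names are invented for 
illustration; in-memory strings stand in for the real tab-delimited files.

```python
import csv
import io

# Stand-ins for two tab-delimited extract files (hypothetical columns).
places_tsv = "id\tname\towner_id\n1\tdepot\t10\n2\toffice\t11\n"
owners_tsv = "owner_id\towner_name\n10\tAcme\n11\tGlobex\n"


def read_tsv(text):
    """Parse tab-delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def flatten(places, owners):
    """Join each place to its owner so every place becomes one flat document."""
    owners_by_id = {o["owner_id"]: o for o in owners}
    docs = []
    for place in places:
        owner = owners_by_id.get(place["owner_id"], {})
        docs.append({
            "id": place["id"],
            "name": place["name"],
            # Empty string when no matching owner row exists.
            "owner_name": owner.get("owner_name", ""),
        })
    return docs


docs = flatten(read_tsv(places_tsv), read_tsv(owners_tsv))
```

Each resulting dict would then map onto the fields of a single schema.xml.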

2. Further, loading into Solr could use some performance tuning. Any tips or 
best practices?

3. Also, is there a way to specify an XSLT on the server side and make it the 
default, i.e. whenever a response is returned, that XSLT is applied to the 
response automatically?

4. And the last question for the day :) There was one post saying that the 
spatial support in Solr is really basic and is going to be improved in the next 
versions. Can you people help me get a definitive yes or no on spatial 
support? In its current form, does it work or not? I would store lat and 
long, and would need to make them searchable.
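
In case it helps frame the question, here is the kind of fallback I could live 
with if proper spatial support is not usable yet: a plain bounding-box filter 
expressed as two range clauses. This is a Python sketch under assumptions, not 
Solr's spatial feature. It assumes lat and lon are indexed as numeric fields 
that support range queries, the field names are hypothetical, and the maths is 
a rough approximation that ignores the poles and the 180-degree meridian.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def bounding_box(lat, lon, radius_km):
    """Return (min_lat, max_lat, min_lon, max_lon) around a point.

    Rough approximation: ignores the poles and the 180-degree meridian.
    """
    dlat = math.degrees(radius_km / EARTH_RADIUS_KM)
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = math.degrees(
        radius_km / (EARTH_RADIUS_KM * math.cos(math.radians(lat)))
    )
    return (lat - dlat, lat + dlat, lon - dlon, lon + dlon)


def range_query(lat, lon, radius_km, lat_field="lat", lon_field="lon"):
    """Build a Lucene-style range query string over two numeric fields."""
    min_lat, max_lat, min_lon, max_lon = bounding_box(lat, lon, radius_km)
    return "%s:[%.4f TO %.4f] AND %s:[%.4f TO %.4f]" % (
        lat_field, min_lat, max_lat, lon_field, min_lon, max_lon)


# Roughly one degree in each direction around the equator/prime meridian.
q = range_query(0.0, 0.0, 111.195)
```

A box is not a circle, so results near the corners would still need a distance 
check on the client side.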

Looks like I'm close to my solution :)


-----Original Message-----
From: Grant Ingersoll
Sent: Tuesday, September 28, 2010 1:05 AM
Subject: Re: Is Solr right for my business situation ?


On Sep 27, 2010, at 1:26 PM, Walter Underwood wrote:

> When do you need to deploy?
> As I understand it, the spatial search in Solr is being rewritten and is 
> slated for Solr 4.0, the release after next.

It will be in 3.x, the next release

> The existing spatial search has some serious problems and is deprecated.
> Right now, I think the only way to get spatial search in Solr is to deploy a 
> nightly snapshot from the active development on trunk. If you are deploying a 
> year from now, that might change.
> There is not any support for SQL-like statements or for joins. The best 
> practice for Solr is to think of your data as a single table, essentially 
> creating a view from your database. The rows become Solr documents, the 
> columns become Solr fields.

There are now group-by capabilities in trunk as well, which may or may not help.

> wunder
> On Sep 27, 2010, at 9:34 AM, Sharma, Raghvendra wrote:
>> I am sure these kinds of questions keep coming to you guys, but I want to 
>> raise the same question in the context of my own business situation.
>> I am very, very new to Solr, and though I have tried to read through the 
>> documentation, I am nowhere near finishing it.
>> The need is like this - 
>> We have a huge RDBMS database/table. A single table houses perhaps 100+ 
>> million rows. Though Oracle is doing a fine job of handling the inserting 
>> and updating of data, the querying is where our main concerns lie. Since we 
>> have spatial data, the index building takes hours and hours for such tables.
>> That's when we thought of moving away from a standard RDBMS and trying 
>> something different and fast. 
>> My last week has been spent on a journey reading through Bigtable, Hadoop, 
>> HBase, and Hive, before finally landing on Solr. As far as I have got in my 
>> tests, it looks pretty good, but I still have a few unanswered questions. 
>> Trying this group for them  :)  (I am sure I can find some answers if I 
>> read/google more on the topic, but now I'm being lazy and feel that asking 
>> the people who are already using it, or perhaps developing it, is a better 
>> bet.)
>> 1. Can I get my solr instance to load data (fresh data for indexing) from a 
>> stream (imagine a mq kind of queue, or similar) ?

Yes, with a little bit of work.

>> 2. Can I host my solr instance to use hbase as the database/file system 
>> (read HDFS) ?

Probably, but I doubt it will be fast.  Local disk is usually the best.  100+ M 
rows is large but not unreasonable.

>> 3. are there somewhere any reports available (as in benchmarks ) for a solr 
>> instance's performance ? 

You can probably search the web for these.  I've personally seen several 
installs w/ 1B+ docs and subsecond search and faceting and heard of others.  
You might look at the stuff the Hathi trust has put up.  

>> 4. are there any APIs available which might help me apply ANSI sql kind of 
>> statements to my solr data ? 

No.  A question back: what kinds of things are you trying to do?

>> It would be great if people could help share their experience in the area... 
>> if it's too much trouble writing all of it, perhaps url would be easier... I 
>> welcome all kinds of help here... any advice/suggestions are good ...
>> Looking forward to your viewpoints..
>> --raghav..

Grant Ingersoll Apache Lucene/Solr Conference, Boston Oct 7-8
