Wow, that is a relief! I was going to have to look at ElasticSearch instead.
Dennis Gearon

Signature Warning
----------------
EARTH has a Right To Life, otherwise we all die.

Read 'Hot, Flat, and Crowded'
Laugh at http://www.yert.com/film.php

--- On Mon, 9/27/10, Grant Ingersoll <[email protected]> wrote:

> From: Grant Ingersoll <[email protected]>
> Subject: Re: Is Solr right for my business situation ?
> To: [email protected]
> Date: Monday, September 27, 2010, 12:35 PM
>
> Inline.
>
> On Sep 27, 2010, at 1:26 PM, Walter Underwood wrote:
>
> > When do you need to deploy?
> >
> > As I understand it, the spatial search in Solr is being rewritten and is slated for Solr 4.0, the release after next.
>
> It will be in 3.x, the next release
>
> > The existing spatial search has some serious problems and is deprecated.
> >
> > Right now, I think the only way to get spatial search in Solr is to deploy a nightly snapshot from the active development on trunk. If you are deploying a year from now, that might change.
> >
> > There is not any support for SQL-like statements or for joins. The best practice for Solr is to think of your data as a single table, essentially creating a view from your database. The rows become Solr documents, the columns become Solr fields.
>
> There is now group-by capabilities in trunk as well, which may or may not help.
>
> > wunder
> >
> > On Sep 27, 2010, at 9:34 AM, Sharma, Raghvendra wrote:
> >
> >> I am sure these kind of questions keep coming to you guys, but I want to raise the same question in a different context...my own business situation.
> >> I am very very new to solr and though I have tried to read through the documentation, I have nowhere near completing the whole read.
> >>
> >> The need is like this -
> >>
> >> We have a huge rdbms database/table. A single table perhaps houses 100+ million rows. Though oracle is doing a fine job of handling the insertion and updation of data, the querying is where our main concerns lie.
> >> Since we have spatial data, the index building takes hours and hours for such tables.
> >>
> >> That's when we thought of moving away from standard rdbms and thought of trying something different and fast.
> >> My last week has been spent in a journey reading through bigtable to hadoop to hbase, to hive and then finally landed on solr. As far as I am in my tests, it looks pretty good, but I have a few unanswered questions still. Trying this group for them :) (I am sure I can find some answers if I read/google more on the topic, but now I m being lazy and feel asking the people who are already using it/or perhaps developing it is a better bet).
> >>
> >> 1. Can I get my solr instance to load data (fresh data for indexing) from a stream (imagine a mq kind of queue, or similar) ?
>
> Yes, with a little bit of work.
>
> >> 2. Can I host my solr instance to use hbase as the database/file system (read HDFS) ?
>
> Probably, but I doubt it will be fast. Local disk is usually the best. 100+ M rows is large but not unreasonable.
>
> >> 3. are there somewhere any reports available (as in benchmarks ) for a solr instance's performance ?
>
> You can probably search the web for these. I've personally seen several installs w/ 1B+ docs and subsecond search and faceting and heard of others. You might look at the stuff the Hathi trust has put up.
>
> >> 4. are there any APIs available which might help me apply ANSI sql kind of statements to my solr data ?
>
> No. Question back? What kinds of things are you trying to do?
>
> >>
> >> It would be great if people could help share their experience in the area... if it's too much trouble writing all of it, perhaps url would be easier... I welcome all kinds of help here... any advice/suggestions are good ...
> >>
> >> Looking forward to your viewpoints..
> >>
> >> --raghav..
> >> ******************************************************************************************
> >> This message may contain confidential or proprietary information intended only for the use of the
> >> addressee(s) named above or may contain information that is legally privileged. If you are
> >> not the intended addressee, or the person responsible for delivering it to the intended addressee,
> >> you are hereby notified that reading, disseminating, distributing or copying this message is strictly
> >> prohibited. If you have received this message by mistake, please immediately notify us by
> >> replying to the message and delete the original message and any copies immediately thereafter.
> >>
> >> Thank you.
> >> ******************************************************************************************
> >> CLLD
> >>
>
> --------------------------
> Grant Ingersoll
> http://lucenerevolution.org Apache Lucene/Solr Conference, Boston Oct 7-8
>
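
For anyone reading this thread later, here is a rough sketch of two points quoted above: the "single table" advice (rows become Solr documents, columns become Solr fields) and feeding documents in from a stream. It is only an illustration under assumptions. The column names, the sample rows, and the helper names `row_to_doc` and `batches` are made up for the example, and how each batch actually gets sent to Solr is deliberately left out.

```python
import json
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List, Sequence


def row_to_doc(columns: Sequence[str], row: Sequence[Any]) -> Dict[str, Any]:
    """Turn one row of the flattened view into one document (column -> field)."""
    if len(columns) != len(row):
        raise ValueError("row width does not match column list")
    # Drop NULL columns so the document simply omits those fields.
    return {name: value for name, value in zip(columns, row) if value is not None}


def batches(stream: Iterable[Dict[str, Any]], size: int) -> Iterator[List[Dict[str, Any]]]:
    """Group documents arriving from a stream into lists of at most `size`."""
    it = iter(stream)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    # Hypothetical view columns and rows; a real feed would come from the
    # database view or a message queue instead of an in-memory list.
    columns = ["id", "name", "lat", "lon"]
    rows = [
        (1, "depot", 42.36, -71.06),
        (2, "yard", None, None),
        (3, "dock", 40.71, -74.01),
    ]
    docs = (row_to_doc(columns, r) for r in rows)
    for batch in batches(docs, size=2):
        # Each batch would go to the index in a single update request.
        print(json.dumps(batch))
```

Batching is a design choice here, not something the thread prescribes: grouping documents keeps the number of update requests down when the source is a steady stream.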
