Solr can handle all of your pain points. You can sort on any indexed field.
It returns the correct total count even when you page past the last result.
Faceting is trivial. OR conditions are fine, and queries can nest boolean
clauses, e.g. (x AND y) OR (a AND b).
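
To make that concrete, here is a rough sketch of what such a request could
look like. It only builds the select URL and does not contact a server. The
host, the handler path and the field names (type, client, priority) are
made up for illustration, so adjust them to your own setup and schema. The
parameters themselves (q, sort, start, rows, facet, facet.field, wt) are
standard Solr query parameters.

```python
from urllib.parse import urlencode

# Hypothetical Solr location; replace with your own host and core.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_search_url(page, rows=10):
    """Build a Solr select URL for the given 1-based result page."""
    params = [
        # Nested boolean logic of the form (x AND y) OR (a AND b).
        # Field names and values here are assumptions, not a real schema.
        ("q", "(type:article AND client:acme) OR (type:report AND client:globex)"),
        # Sort on an indexed, single-valued metadata field instead of relevancy.
        ("sort", "priority desc"),
        # Paging: "start" is a zero-based offset into the full result set.
        ("start", str((page - 1) * rows)),
        ("rows", str(rows)),
        # Field faceting: ask for a count per distinct value of "type".
        ("facet", "true"),
        ("facet.field", "type"),
        # Ask for a JSON response.
        ("wt", "json"),
    ]
    return SOLR_SELECT_URL + "?" + urlencode(params)


url = build_search_url(page=3)
print(url)
```

In the JSON response, response.numFound should carry the total number of
matches independently of start, so a request for page 3 of a two-page
result comes back with an empty docs list but a non-zero numFound.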

On 26-Sep-2012, at 12:48 AM, Matthew Shapiro <m...@mshapiro.net> wrote:

> Hi all,  I don't know if this is the correct mailing list, so I apologize
> if it isn't.  I wasn't sure what other list it would go to.
> 
> Anyways, my company a while back (before I started) got Google envy and
> decided to purchase a GSA system to store our searchable data.  While the
> GSA seems ok for a web-crawler it seems woefully inadequate for quick
> searching of application/non-web data.  Unfortunately, since they already
> purchased a GSA license and support I am trying to put together a
> non-direct cost argument on why I need to switch our search infrastructure
> from GSA to Solr.  To preface this, while I have used Lucene in various
> projects in the past (though not too extensively, just for basic search
> implementations) I have never used Solr.
> 
> I was hoping someone could comment on some of the areas below where I have
> encountered friction with the GSA and let me know if / how Solr is an
> improvement.
> 
> 1) Sorting by anything other than last modified date or relevancy is
> impossible with the GSA.  I need to be able to sort results based on a
> specific piece of metadata
> 
> 2) When performing a search outside of the page bounds (e.g. there are only
> 2 pages of results but the user queries for data on page 3) the GSA returns
> a total results count of zero, making it impossible to know if you have
> paged too far or if there were actually zero results
> 
> 3) No insight into data being fed into the GSA.  When I send data to the
> GSA it lists the data feed in the "feeds" page, but it's impossible to know
> which feed contained what data, and if an error occurs (depending on the
> error) you have no idea which piece of data was rejected or caused the
> failure.  Due to this I had to cut down and only send data to the system in
> very small chunks, just so one bad entry doesn't hold back too many records
> being updated.
> 
> 4) The GSA does not allow searching for data between two dates.  The most
> it lets you do is define a numerical data field with the dates (e.g.
> 20120901) but the GSA only supports numerical searching up to 6 significant
> digits, which means it only gives month accuracy but not day.
> 
> 5) The GSA does not allow operations nested within OR statements.  For
> example, you cannot do (x and y) or (a and b).
> 
> 6) No way to selectively flush mass data.  If I need to flush all the data
> in a collection to re-index it I have to deny a whole URL so the indexer
> clears the data out, then re-enable that URL.  Sometimes I need to flush
> only data flagged as articles or data for a specific client.
> 
> 7) Setting up facet groups is a very manual process in the GSA.  Also
> there's no easy way to have date ranges as search facets (date ranges all
> have to be explicitly defined through the web interface and
> manually maintained, I'd rather be able to have it give me facets on a year
> by year basis, or month by month).
> 
> Those are the main pain points.  There are others, such as community
> support (which between the mailing list and stack overflow I'm not worried
> about) but if anyone can give me a quick rundown on whether Solr addresses
> any of these issues I would be immensely thankful.
