Hey,

Your system sounds similar to the work done by Stu Hood at Rackspace in their
Mailtrust unit. See
http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data
for more details and inspiration.

Regards,
Jeff

On Thu, Jun 4, 2009 at 4:58 PM, <silentsurfe...@yahoo.com> wrote:

> Hi,
> It is encouraging to know that a Solr/Lucene solution may work.
> Can anyone using Solr/Lucene for such a scenario confirm that the
> solution is in use and working fine? That would be really helpful, as I
> started looking into Solr/Lucene only a couple of days ago, and it
> might be difficult to be 100% confident before proposing the solution
> approach in the next couple of days.
> Thanks,
> Surfer
>
> --- On Thu, 6/4/09, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:
>
> From: Otis Gospodnetic <otis_gospodne...@yahoo.com>
> Subject: Re: Questions regarding IT search solution
> To: solr-user@lucene.apache.org
> Date: Thursday, June 4, 2009, 10:26 PM
>
>
> My guess is Solr/Lucene would work.  Not sure how well/fast, but it would,
> esp. if you avoid range queries (or use tdate), and esp. if you
> shard/segment indices smartly, so that at query time you send (or distribute
> if you have to) the query to only those shards that have the data (if your
> query is for a limited time period).
>
>  Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> ----- Original Message ----
> > From: Silent Surfer <silentsurfe...@yahoo.com>
> > To: solr-user@lucene.apache.org
> > Sent: Thursday, June 4, 2009 5:52:21 PM
> > Subject: Re: Questions regarding IT search solution
> >
> > Hi,
> > As Alex correctly pointed out, my main intention is to figure out whether
> > Solr/Lucene offers the functionality to replicate what Splunk does in
> > terms of building indexes etc. to enable search capabilities.
> > We evaluated Splunk, but it is not a very cost-effective solution for us:
> > we may have logs running into a few GBs per day, as there can be around
> > 25-30 servers running, and the Splunk licensing model is based on the size
> > of logs per day. On top of that, the license is valid for only 1 year.
> > With this background, any further input on this is greatly appreciated.
> > Thanks,
> > Surfer
> >
> > --- On Thu, 6/4/09, Alexandre Rafalovitch wrote:
> >
> > From: Alexandre Rafalovitch
> > Subject: Re: Questions regarding IT search solution
> > To: solr-user@lucene.apache.org
> > Date: Thursday, June 4, 2009, 9:27 PM
> >
> > I would also be interested to know what other solutions exist.
> >
> > Splunk's advantage is that it does extraction of the fields with
> > advanced searching functionality (it has lexers/parsers for multiple
> > content types). I believe that is the Solr functionality desired in the
> > original posting. At the time they came out (2004), I was not aware of
> > any good open source solutions to do what they did. And I would have
> > loved one, as I was analyzing multi-gigabyte logs.
> >
> > Hadoop might be a way to process the files, but what would do the
> > indexing and searching?
> >
> > Regards,
> >     Alex.
> >
> > On Thu, Jun 4, 2009 at 11:56 AM, Walter Underwood wrote:
> > > Why build one? Don't those already exist?
> > >
> > > > Personally, I'd start with Hadoop instead of Solr. Putting logs in a
> > > > search index is guaranteed to not scale. People were already trying
> > > different approaches ten years ago.
> > >
> > > wunder
> > >
> > > On 6/4/09 8:41 AM, "Silent Surfer" wrote:
> > >
> > >> Hi,
> > > >> Any help/pointers on the following message would really help me.
> > > >> Thanks,
> > > >> Surfer
> > >>
> > >> --- On Tue, 6/2/09, Silent Surfer wrote:
> > >>
> > >> From: Silent Surfer
> > >> Subject: Questions regarding IT search solution
> > >> To: solr-user@lucene.apache.org
> > >> Date: Tuesday, June 2, 2009, 5:45 PM
> > >>
> > >> Hi,
> > > >> I am new to the Lucene forum and this is my first question. I need a
> > > >> clarification from you.
> > > >>
> > > >> Requirement:
> > > >> ------------------
> > > >> 1. Build an IT search tool for logs similar to that of Splunk (only
> > > >>    with respect to searching logs, not reporting, graphs, etc.) using
> > > >>    Solr/Lucene. The log files are mainly server logs such as JBoss and
> > > >>    custom application server logs (which may or may not be log4j logs),
> > > >>    and the file size can potentially go up to 100 MB.
> > > >> 2. The logs are spread across multiple servers (25 to 30 servers).
> > > >> 3. Capability to search in almost real time.
> > > >> 4. Support distributed search.
> > >>
> > > >> Our search criterion can be based on a keyword, timestamp, IP address,
> > > >> etc.
> > > >> Can anyone throw some light on whether Solr/Lucene is the right
> > > >> solution for this?
> > > >> Appreciate any quick help in this regard.
> > > >> Thanks,
> > > >> Surfer

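The shard-routing idea from Otis's reply above (send a time-limited query only to the shards that hold data for that period) could be sketched roughly as follows. This is a minimal Python sketch, not something taken from the thread: it assumes one Solr core per day of logs named logs_YYYYMMDD on a hypothetical host solr.example.com, and the function names are made up for illustration. The only Solr feature it relies on is the `shards` request parameter of distributed search.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

# Assumption: one Solr core per day of logs, named logs_YYYYMMDD, all served
# from this hypothetical host. Neither name comes from the thread.
SHARD_HOST = "solr.example.com:8983/solr"


def shards_for_range(start: date, end: date) -> list[str]:
    """Return the daily shards that can hold data in [start, end], inclusive."""
    if end < start:
        raise ValueError("end must not be before start")
    days = (end - start).days + 1
    return [
        f"{SHARD_HOST}/logs_{(start + timedelta(days=i)):%Y%m%d}"
        for i in range(days)
    ]


def build_query_params(keyword: str, start: date, end: date) -> str:
    """Build a URL query string that targets only the relevant shards."""
    params = {
        "q": keyword,
        # Distributed search takes a comma-separated list of shard addresses.
        "shards": ",".join(shards_for_range(start, end)),
    }
    return urlencode(params)


if __name__ == "__main__":
    # A search limited to three days touches three shards, not the whole index.
    print(build_query_params("OutOfMemoryError", date(2009, 6, 2), date(2009, 6, 4)))
```

With daily shards, a search limited to a few days never touches older indices, which is the point of segmenting by time. The shard granularity (day, week, server) is a design choice that depends on log volume.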