Re: tomcat solr logs

2010-07-07 Thread Jeff Hammerbacher
Hey Robert,

You may want to check out Flume for log file collection:
http://github.com/cloudera/flume. We don't currently allow Flume to populate
a Solr index, but that would be quite an interesting use case!

Later,
Jeff

On Wed, Jun 30, 2010 at 3:06 PM, Robert Petersen rober...@buy.com wrote:

 Sorry if this is at all off topic.  Our Solr log files need grooming, and we
 would also like to analyze them, perhaps pulling various data points into a
 DB table. Is there a preferred app for doing log file analysis and/or an
 easy way to delete the old log files?
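
For the "delete the old log files" part of the question, a minimal Python sketch follows. It is not from the thread and not tied to any Tomcat or Solr feature; the function name `prune_old_logs`, the `*.log*` pattern, and the 30-day default are all assumptions chosen for illustration. It defaults to a dry run so nothing is removed until the list of candidates has been checked.

```python
import time
from pathlib import Path


def prune_old_logs(log_dir, max_age_days=30, pattern="*.log*", dry_run=True):
    """Return log files older than max_age_days; delete them unless dry_run.

    Age is judged by modification time. The pattern and the default age
    are illustrative assumptions, not Tomcat or Solr settings.
    """
    cutoff = time.time() - max_age_days * 86400  # seconds per day
    stale = []
    for path in sorted(Path(log_dir).glob(pattern)):
        if path.is_file() and path.stat().st_mtime < cutoff:
            stale.append(path)
            if not dry_run:
                path.unlink()
    return stale
```

Calling `prune_old_logs("/path/to/tomcat/logs")` (a placeholder path) only lists candidates; passing `dry_run=False` actually deletes them, for example from a daily cron job.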



Re: Questions regarding IT search solution

2009-06-04 Thread Jeff Hammerbacher
Hey,

Your system sounds similar to the work done by Stu Hood at Rackspace in their
Mailtrust unit. See
http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data
for more details and inspiration.

Regards,
Jeff

On Thu, Jun 4, 2009 at 4:58 PM, silentsurfe...@yahoo.com wrote:

 Hi,
 It is encouraging to know that a Solr/Lucene solution may work.
 Can anyone who uses Solr/Lucene for such a scenario confirm that the
 solution is in use and working fine? That would be really helpful, as I
 started looking into Solr/Lucene only a couple of days back, and it
 might be difficult to be 100% confident before proposing the solution
 approach in the next couple of days.
 Thanks,
 Surfer

 --- On Thu, 6/4/09, Otis Gospodnetic otis_gospodne...@yahoo.com wrote:

 From: Otis Gospodnetic otis_gospodne...@yahoo.com
 Subject: Re: Questions regarding IT search solution
 To: solr-user@lucene.apache.org
 Date: Thursday, June 4, 2009, 10:26 PM


 My guess is Solr/Lucene would work.  Not sure how well/fast, but it would,
 esp. if you avoid range queries (or use tdate), and esp. if you
 shard/segment indices smartly, so that at query time you send (or distribute
 if you have to) the query to only those shards that have the data (if your
 query is for a limited time period).

  Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
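
Otis's suggestion of sending a time-limited query only to the shards that hold that period can be sketched as below. This is an illustration under stated assumptions, not how any poster's system works: it assumes one index shard per calendar month, and the `logs_YYYY_MM` naming scheme and the function name `shards_for_range` are hypothetical.

```python
from datetime import date


def shards_for_range(start, end, prefix="logs"):
    """List monthly shard names covering the inclusive range [start, end].

    Assumes one shard per calendar month, named <prefix>_YYYY_MM
    (a hypothetical naming scheme).
    """
    if end < start:
        raise ValueError("end must not be before start")
    names = []
    year, month = start.year, start.month
    # Walk month by month until the month containing `end` is included.
    while (year, month) <= (end.year, end.month):
        names.append(f"{prefix}_{year:04d}_{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return names
```

A query for the week ending June 4, 2009 would then touch only two shards rather than the whole archive. The names would still have to be mapped to actual shard locations, for instance for Solr's `shards` request parameter in distributed search.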



 ----- Original Message -----
  From: Silent Surfer silentsurfe...@yahoo.com
  To: solr-user@lucene.apache.org
  Sent: Thursday, June 4, 2009 5:52:21 PM
  Subject: Re: Questions regarding IT search solution
 
  Hi,
  As Alex correctly pointed out, my main intention is to figure out whether
  Solr/Lucene offers the functionality to replicate what Splunk does in
  terms of building indexes etc. to enable search.
  We evaluated Splunk, but it is not a very cost-effective solution for us:
  we may have logs running into a few GB per day, as there can be around
  25-30 servers running, and Splunk's licensing model is based on the size
  of logs per day. On top of that, the license is valid for only 1 year.
  With this background, any further input on this is greatly appreciated.
  Thanks,
  Surfer
 
  --- On Thu, 6/4/09, Alexandre Rafalovitch wrote:
 
  From: Alexandre Rafalovitch
  Subject: Re: Questions regarding IT search solution
  To: solr-user@lucene.apache.org
  Date: Thursday, June 4, 2009, 9:27 PM
 
  I would also be interested to know what other solutions exist.
 
  Splunk's advantage is that it does extraction of the fields with
  advanced searching functionality (it has lexers/parsers for multiple
  content types). I believe that is the function the original posting
  wants from Solr. At the time they came out (2004), I was not aware of
  any good open source solutions to do what they did. And I would have
  loved one, as I was analyzing multi-gigabyte logs.
 
  Hadoop might be a way to process the files, but what would do the
  indexing and searching?
 
  Regards,
  Alex.
 
  On Thu, Jun 4, 2009 at 11:56 AM, Walter Underwood wrote:
   Why build one? Don't those already exist?
  
   Personally, I'd start with Hadoop instead of Solr. Putting logs in a
   search index is guaranteed to not scale. People were already trying
   different approaches ten years ago.
  
   wunder
  
   On 6/4/09 8:41 AM, Silent Surfer wrote:
  
   Hi,
   Any help/pointers on the following message would really help me.
   Thanks,
   Surfer
  
   --- On Tue, 6/2/09, Silent Surfer wrote:
  
   From: Silent Surfer
   Subject: Questions regarding IT search solution
   To: solr-user@lucene.apache.org
   Date: Tuesday, June 2, 2009, 5:45 PM
  
   Hi,
   I am new to the Lucene forum and this is my first question. I need a
   clarification from you.
  
   Requirement:
   1. Build an IT search tool for logs similar to that of Splunk (only with
      respect to searching logs, not reporting, graphs etc.) using
      Solr/Lucene. The log files are mainly server logs such as JBoss and
      custom application server logs (which may or may not be log4j logs),
      and file sizes can potentially go up to 100 MB.
   2. The logs are spread across multiple servers (25 to 30 servers).
   3. Capability to search in almost real time.
   4. Support distributed search.
  
   Our search criteria can be based on a keyword, timestamp, IP address
   etc.
   Can anyone throw some light on whether Solr/Lucene is the right solution
   for this?
   Appreciate any quick help in this regard.
   Thanks,
   Surfer
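
The search criteria in the original question (keyword, timestamp, IP address) imply that each log line is split into fields before it is indexed. A rough Python sketch of that extraction step is below. Nothing in it comes from the thread: the line layout (`YYYY-MM-DD HH:MM:SS,mmm LEVEL message`, a common log4j-style pattern), the field names, and the function name `parse_line` are assumptions, and timestamps are treated as UTC purely for illustration.

```python
import re
from datetime import datetime

# Assumed log4j-style layout: "2009-06-04 17:52:21,123 ERROR some message"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\s+"
    r"(?P<level>[A-Z]+)\s+(?P<message>.*)$"
)
# Loose dotted-quad match; does not validate that octets are <= 255.
IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def parse_line(line):
    """Turn one log line into a dict of searchable fields, or None."""
    m = LINE_RE.match(line.rstrip("\n"))
    if m is None:
        return None  # e.g. a stack-trace continuation line
    ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
    return {
        # ISO 8601 form; assumes the log timestamps are already UTC.
        "timestamp": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "level": m.group("level"),
        "ips": IP_RE.findall(m.group("message")),
        "message": m.group("message"),
    }
```

Each resulting dict could be sent to whatever indexer is chosen, with `message` as the keyword-searchable text and `timestamp` and `ips` as separate fields. Lines that do not match (multi-line stack traces, custom formats) return `None` and would need their own handling.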