Well, I think you'll have the same problem. Lucene and Solr (since
it's built on Lucene) both expect structured documents as input. Once
you've sent in a bunch of documents, you can query them for whatever
you want to find.
A quick search turned up an Apache Labs project called Pinpoint. It's
designed to take log data in and build an index out of it. I'm not
sure how developed it is, but it might be a good starting point for
you. There are probably other projects out there along the same lines.
Here's Pinpoint: http://svn.apache.org/repos/asf/labs/pinpoint/trunk/
Why do you want to use Solr / Lucene to look through your files? If
you have a huge dataset, some people are using Hadoop (an open-source
implementation of Google's MapReduce) to look through very large sets
of logfiles: http://www.lexemetech.com/2008/01/hadoop-and-log-file-analysis.html
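To give a feel for the MapReduce idea mentioned above, here is a rough single-machine sketch in plain Python. It is not Hadoop code and uses no Hadoop API. It assumes (hypothetically) that each log line starts with the client IP followed by whitespace; the function names are made up for illustration.

```python
from collections import defaultdict


def map_line(line):
    """Map step: emit an (ip, 1) pair for one log line.

    Assumes the IP address is the first whitespace-separated field.
    Blank lines emit nothing.
    """
    fields = line.split()
    if fields:
        yield fields[0], 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each IP."""
    totals = defaultdict(int)
    for ip, count in pairs:
        totals[ip] += count
    return dict(totals)


def count_hits(lines):
    """Run the map step over every line, then reduce the results."""
    return reduce_counts(pair for line in lines for pair in map_line(line))
```

On a real cluster the map and reduce steps would run on many machines over pieces of the file; the sketch only shows how the work divides.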
Thanks for your time!
Matthew Runo
Software Engineer, Zappos.com
mr...@zappos.com - 702-943-7833
On Mar 24, 2009, at 10:28 AM, nga pham wrote:
Do you think Lucene is better for filtering out a particular IP
address from a txt file?
Thank you Runo,
Nga
On Tue, Mar 24, 2009 at 10:21 AM, Matthew Runo <mr...@zappos.com> wrote:
I don't think that Solr is the best thing to use for searching a text
file. I'd use grep myself, if you're on a unix-like system.
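If grep isn't available, the same kind of filtering can be sketched in a few lines of Python. This is only an illustration: the function name is made up, and the log format is assumed to be one event per line with the IP appearing somewhere in the text.

```python
import re
from typing import Iterable, Iterator


def lines_with_ip(lines: Iterable[str], ip: str) -> Iterator[str]:
    """Return the lines that contain `ip` as a complete address.

    The lookarounds stop 10.206.158.15 from matching inside
    10.206.158.154 or 110.206.158.15.
    """
    pattern = re.compile(
        r"(?<![\d.])" + re.escape(ip) + r"(?!\d)(?!\.\d)"
    )
    return (line for line in lines if pattern.search(line))
```

Because it accepts any iterable of strings, an open file object can be passed in directly, so the whole file never has to be loaded into memory.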
To use Solr, you'd need to turn each network 'event' (GET, POST, etc.)
into an XML document and post those to Solr so it could build the
index. You could then run queries like ip:10.206.158.154 to find a
specific IP address, or even ip:10.206.158* to get a subnet.
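As a rough sketch of that conversion, the snippet below builds a Solr XML update message (an add element containing doc and field elements) from already-parsed events. The field names (id, ip, method) are assumptions for illustration and would have to be defined in your Solr schema; parsing the raw text file into dictionaries is left out because the log format isn't known.

```python
import xml.etree.ElementTree as ET


def events_to_solr_add(events):
    """Build a Solr <add> message from a list of event dictionaries.

    Each dictionary becomes one <doc>; each key/value pair becomes a
    <field name="key">value</field> element. Field names are
    hypothetical and must match the Solr schema in use.
    """
    add = ET.Element("add")
    for event in events:
        doc = ET.SubElement(add, "doc")
        for name, value in event.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)
    # encoding="unicode" returns a str rather than bytes
    return ET.tostring(add, encoding="unicode")
```

The resulting string would then be posted to Solr's update handler, followed by a commit, before the ip: queries return anything.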
Perhaps the thing that's building your text file could post to Solr
instead?
Thanks for your time!
Matthew Runo
Software Engineer, Zappos.com
mr...@zappos.com - 702-943-7833
On Mar 24, 2009, at 9:32 AM, nga pham wrote:
Hi All,
I have a txt file that captured all of my network traffic. How can I
use Solr to filter out a particular IP address?
Thank you,
Nga.