Well, I think you'll have the same problem. Lucene and Solr (which is built on Lucene) both expect structured documents as input. Once you send in a bunch of documents, you can then query them for whatever you want to find.

A quick search turned up an Apache Labs project called Pinpoint. It's designed to take in log data and build an index from it. I'm not sure how developed it is, but it might be a good starting point for you. There are probably other projects out there along the same lines. Here's Pinpoint: http://svn.apache.org/repos/asf/labs/pinpoint/trunk/

Why do you want to use Solr / Lucene to look through your files? If you have a huge dataset, some people are using Hadoop (an open-source implementation of Google's MapReduce model) to look through very large sets of logfiles: http://www.lexemetech.com/2008/01/hadoop-and-log-file-analysis.html

Thanks for your time!

Matthew Runo
Software Engineer, Zappos.com
mr...@zappos.com - 702-943-7833

On Mar 24, 2009, at 10:28 AM, nga pham wrote:

Do you think Lucene is better for filtering out a particular IP address from a
txt file?

Thank you Runo,
Nga

On Tue, Mar 24, 2009 at 10:21 AM, Matthew Runo <mr...@zappos.com> wrote:

I don't think that Solr is the best thing to use for searching a text file.
I'd use grep myself, if you're on a unix-like system.
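If grep isn't available, the same kind of filtering can be sketched in a few lines of Python. This is only an illustration: the function name lines_with_ip and the sample log lines are made up, and the real capture file may be laid out differently.

```python
import re

def lines_with_ip(lines, ip):
    """Return the lines that mention the given IP address (hypothetical helper)."""
    # Reject matches that are part of a longer address, so that
    # 10.206.158.15 does not also match 10.206.158.154.
    pattern = re.compile(r"(?<![\d.])" + re.escape(ip) + r"(?!\d)")
    return [line for line in lines if pattern.search(line)]

# Made-up sample lines; the real file format is an assumption.
sample = [
    "10.206.158.154 GET /index.html",
    "10.206.158.15 POST /login",
    "192.168.0.1 GET /",
]
matches = lines_with_ip(sample, "10.206.158.15")
```

For a real file, passing the open file object (for example open("traffic.txt"), a hypothetical file name) in place of the sample list should work the same way, since the function only iterates over lines.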

To use Solr, you'd need to put each network 'event' (GET, POST, etc.) into an XML document, and post those to Solr so it could generate the
index. You could then do things like
ip:10.206.158.154 to find a specific IP address, or even ip:10.206.158* to
get a subnet.
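As a rough sketch of what "one XML document per event" could look like, the Python below turns log lines into Solr's <add><doc><field name="...">...</field></doc></add> update message. The line layout ("ip method path") and the field names id, ip, method and path are assumptions for illustration; they would have to match both the real file and the fields defined in your Solr schema.

```python
import xml.etree.ElementTree as ET

def events_to_solr_xml(lines):
    """Build a Solr <add> message from log lines (hypothetical helper).

    Assumes each line looks like "ip method path"; shorter lines are skipped.
    """
    add = ET.Element("add")
    for number, line in enumerate(lines, start=1):
        parts = line.split()
        if len(parts) < 3:
            continue  # not enough columns to be an event under this assumption
        fields = (
            ("id", str(number)),  # line number used as a stand-in unique key
            ("ip", parts[0]),
            ("method", parts[1]),
            ("path", parts[2]),
        )
        doc = ET.SubElement(add, "doc")
        for name, value in fields:
            field = ET.SubElement(doc, "field", name=name)
            field.text = value
    return ET.tostring(add, encoding="unicode")

solr_xml = events_to_solr_xml(["10.206.158.154 GET /index.html", "garbage"])
```

The resulting string would then be sent as the body of an HTTP POST to the Solr update handler, followed by a commit; the exact URL depends on how Solr is installed, so it is left out here.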

Perhaps the thing that's building your text file could post to Solr
instead?

Thanks for your time!

Matthew Runo
Software Engineer, Zappos.com
mr...@zappos.com - 702-943-7833


On Mar 24, 2009, at 9:32 AM, nga pham wrote:

Hi All,

I have a txt file, that captured all of my network traffic. How can I use
Solr to filter out a particular IP address?

Thank you,
Nga.



