Re: Question

Erik Hatcher Thu, 07 Jan 2010 08:32:42 -0800

Ed - that's a reasonable critique, but the API is practically the same between Lucene.Net and Lucene Java. There is a section contributed by George in the upcoming 2nd edition of Lucene in Action - it's short and says basically that.


But, rather than buy a commercial search engine, consider Solr!

I don't want to come here and steal any of Lucene.Net's thunder by mentioning Solr, as no doubt Lucene.Net is the right fit for many projects. Solr, though, is so much more than just Lucene, providing enterprisey features (replication, distributed search, facets, and more) that just can't be trivially/naively built on top of any flavor of Lucene. And Solr is easily interfaced with .NET as a client. Of course the hurdle then is "does Solr, a Java-based app, fit into the operations of your deployment environment?". It's another technology to add if the shop is purely .NET currently. But then again, it literally does run everywhere quite easily.


        Erik

On Jan 7, 2010, at 4:27 AM, Ed Jones wrote:

My problem with Lucene in Action and all the examples on the internet is
that they were all in Java and you have to understand exactly what Java
is doing to understand it all properly. It's for this very reason we had
to shun using Lucene.net in major projects. I wanted dearly to use it
but the learning curve was far too steep and there appear to be very,
very few .NET code examples or other help.

Instead we have invested a significant amount of money in buying in a
much more commercial search engine.

I am keeping an eye on the Lucene.net project though in case it can be
used in other parts of our business, but again the same will apply: we
will need more non-Java examples.

Ed

-----Original Message-----
From: Roger Chapman [mailto:ro...@stormid.com]
Sent: 07 January 2010 09:21
To: lucene-net-dev@lucene.apache.org
Subject: RE: Question

From what I can remember the book Lucene in Action has a good section on
indexing documents and PDFs http://www.manning.com/hatcher2/

Roger.

-----Original Message-----
From: Ben Martz [mailto:benma...@gmail.com]
Sent: 06 January 2010 19:51
To: lucene-net-dev@lucene.apache.org
Cc: <lucene-net-dev@lucene.apache.org>
Subject: Re: Question

Todd,

I would definitely take Michael's advice to learn more about the
overall issue before you get too far.

A quick answer that may help is Windows does not ship with an iFilter
for PDF built-in. Installing Adobe Reader 8 or higher will install a
decent PDF iFilter.

I am a little surprised by your question though - I assume that you
have access to your own source code and could examine the result from
the iFilter that's being fed to the IndexWriter and compare the
behavior in the TXT case with the behavior in the PDF case?

Cheers,
Ben

Sent from my iPhone



On Jan 6, 2010, at 10:13, Michael Garski <mgar...@myspace-inc.com> wrote:

Todd,

You'll need some way to extract the text from the PDF prior to
indexing.  I'm not familiar with any packages that can do that but I
have heard of them.  You may want to try searching the mailing list
to see if there has been mention of one previously.  Lucid
Imagination hosts a great mailing list search tool at
http://www.lucidimagination.com/search/

Michael

-----Original Message-----
From: Todd McIndoo [mailto:tmcin...@speedyscan.biz]
Sent: Wednesday, January 06, 2010 10:11 AM
To: lucene-net-dev@lucene.apache.org
Subject: Question

Sorry if this is a duplicate.

We are using Lucene.net version 2.0.0.4. I am trying to search a document
which contains lots of PDFs. I want to search for a document which contains
a specific word, using Lucene.net. We are getting results in text documents
but not in PDF. Is there something we have to do to be able to search in PDF
documents? All iFilters have been installed on the computer, so I do not
think that is the issue.

Regards,

SPEEDY SOLUTIONS
Todd McIndoo
