RE: Question

Nicholas Paldino [.NET/C# MVP] Thu, 07 Jan 2010 18:18:29 -0800

Erik,

        It's the fact that the API is exactly the same (as are the lines
of code, practically) that causes many of the issues in Lucene.NET (not
only in use but in implementation): while Java and C# are very similar,
that doesn't guarantee the same results.


        But that's an issue for another email, one which many (including
myself) have dealt with.

        That being said, I would go with a Solr provider if such a thing
existed.  I'm debating whether to use the Lucene.NET library in my
application or to try to find a preexisting Solr provider.  However, it
doesn't seem that there are many, and I really don't have the luxury of
setting up the environment myself (although I'm interested in using it,
since I can very easily speak whatever protocol it uses over the wire from
.NET).
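
        As a rough illustration of what "over the wire" means here: Solr
answers plain HTTP requests, so a client only has to build a URL and parse
the response.  The sketch below is in Python purely for brevity and only
builds the query URL.  The host and port, the "contents" field name and the
build_solr_query_url helper are all assumptions made up for this example,
not part of any existing Solr provider.

```python
from urllib.parse import urlencode


def build_solr_query_url(base_url, query, rows=10):
    """Build a URL for Solr's standard /select search handler.

    base_url is assumed to point at the Solr web application root,
    for example "http://localhost:8983/solr" (hypothetical host).
    """
    params = {
        "q": query,    # the Lucene-syntax query string
        "rows": rows,  # maximum number of hits to return
        "wt": "json",  # ask Solr for a JSON response
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# "contents" is a hypothetical field name; use whatever the schema defines.
print(build_solr_query_url("http://localhost:8983/solr", "contents:lucene"))
```

A .NET client would do the same thing with its own HTTP classes and then
parse the response body.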

                - Nick

-----Original Message-----
From: Erik Hatcher [mailto:[email protected]] 
Sent: Thursday, January 07, 2010 4:49 AM
To: [email protected]
Subject: Re: Question

Ed - that's a reasonable critique, but the API is practically the same
between Lucene.Net and Lucene Java.  There is a section contributed by
George in the upcoming 2nd edition of Lucene in Action - it's short and
says basically that.

But, rather than buy a commercial search engine, consider Solr!

I don't want to come here and steal any of Lucene.Net's thunder by  
mentioning Solr, as no doubt Lucene.Net is the right fit for many  
projects.   Solr, though, is so much more than just Lucene, providing  
enterprisey features (replication, distributed search, facets, and  
more) that just can't be trivially/naively built on top of any flavor  
of Lucene.  And Solr is easily interfaced with .NET as a client.  Of
course the hurdle then is "does Solr, a Java-based app, fit into the  
operations of your deployment environment?".  It's another technology  
to add if the shop is purely .NET currently.  But then again, it  
literally does run everywhere quite easily.

        Erik

On Jan 7, 2010, at 4:27 AM, Ed Jones wrote:

> My problem with Lucene in Action and all the examples on the internet
> is that they were all in Java, and you have to understand exactly what
> Java is doing to understand it all properly. It's for this very reason
> we had to shun using Lucene.net in major projects. I dearly wanted to
> use it, but the learning curve was far too steep and there appear to be
> very few .net code examples or other help.
>
> Instead we have invested a significant amount of money in buying in a
> much more commercial search engine.
>
> I am keeping an eye on the Lucene.net project though, in case it can be
> used in other parts of our business, but again the same will apply: we
> will need more non-Java examples.
>
> Ed
>
> -----Original Message-----
> From: Roger Chapman [mailto:[email protected]]
> Sent: 07 January 2010 09:21
> To: [email protected]
> Subject: RE: Question
>
> From what I can remember, the book Lucene in Action has a good section
> on indexing documents and PDFs: http://www.manning.com/hatcher2/
>
> Roger.
>
> -----Original Message-----
> From: Ben Martz [mailto:[email protected]]
> Sent: 06 January 2010 19:51
> To: [email protected]
> Cc: <[email protected]>
> Subject: Re: Question
>
> Todd,
>
> I would definitely take Michael's advice to learn more about the
> overall issue before you get too far.
>
> A quick answer that may help is that Windows does not ship with a
> built-in iFilter for PDF. Installing Adobe Reader 8 or higher will
> install a decent PDF iFilter.
>
> I am a little surprised by your question though - I assume that you
> have access to your own source code and could examine the result from
> the iFilter that's being fed to the IndexWriter and compare the
> behavior in the TXT case with the behavior in the PDF case?
>
> Cheers,
> Ben
>
> Sent from my iPhone
>
> On Jan 6, 2010, at 10:13, Michael Garski <[email protected]> wrote:
>
>> Todd,
>>
>> You'll need some way to extract the text from the PDF prior to
>> indexing.  I'm not familiar with any packages that can do that but I
>> have heard of them.  You may want to try searching the mailing list
>> to see if there has been mention of one previously.  Lucid
>> Imagination hosts a great mailing list search tool at
>> http://www.lucidimagination.com/search/
>>
>> Michael
>>
>> -----Original Message-----
>> From: Todd McIndoo [mailto:[email protected]]
>> Sent: Wednesday, January 06, 2010 10:11 AM
>> To: [email protected]
>> Subject: Question
>>
>> Sorry if this is a duplicate.
>>
>> We are using Lucene.net version 2.0.0.4. I am trying to search a
>> document which contains lots of PDFs. I want to search for a document
>> which contains a specific word, using Lucene.net. We are getting
>> results from text documents but not from PDFs. Is there something we
>> have to do to be able to search in PDF documents? All iFilters have
>> been installed on the computer, so I do not think that is the issue.
>>
>> Regards,
>>
>> SPEEDY SOLUTIONS
>>
>> Todd McIndoo
>>
