date:20020305

Re: how to parse XHTML

2002-03-05 Thread Otis Gospodnetic

Terry, these are not really Lucene questions. Lucene will let you index text, but you need to figure out how to parse your XHTML files. Take a look at JTidy on sf.net; I think JTidy can help you with parsing XHTML, or perhaps Xerces from xml.apache.org can. Otis --- Terry McGregor <[EMAIL PROT

Lucene Consultant needed.

2002-03-05 Thread Eric Thoman

Hi, I will be working on a project with Lucene. I was wondering if anyone knows of a consultant with Lucene experience who might be interested in some work. If so, please give me a ring. Sincerely, Eric Thoman Director The Manchester Java Users' Group http://www.manjug.org 603.627.8419 --

how to parse XHTML

2002-03-05 Thread Terry McGregor

Hi, I'm new to Lucene, and I was wondering how I should parse XHTML files. Should I name them with the .HTML file extension and use org.apache.lucene.demo.IndexHTML or name them with the .XML file extension and use an XML parser? Also, I would like to keep my XHTML files with a .XHTML file e

Re: Re: Lucene Documentation

2002-03-05 Thread acoliver

Ahh. As it stands, Lucene is more of a *search API* than a drop-in indexing application like htDig. Some other developers and I will be working on adding application extensions to Lucene (still in the planning stage): http://jakarta.apache.org/lucene/docs/luceneplan.html At current there i

Re: Adding multiple paths to a document

2002-03-05 Thread Ype Kingma

Grim, >I am looking at using Lucene to index a large set of documents. In >order to be able to search a subset of documents, I've added a >"path"-field to each document (indexed, not stored, not tokenized). >Using a prefix-query seems to work fine. > >My problem: Our documents can have several d

Re: Lucene Documentation

2002-03-05 Thread Ryan Ogaard

Hello Andy, I have actually stepped through the getting started docs on Apache (with success!). I am fairly new to Java and find it difficult to configure Lucene to securely search my company's intranet (including .htm, .jsp, .pdf, .doc, ...). I'm just taking a shot in the dark here hoping fo

Re: Lucene Documentation

2002-03-05 Thread acoliver

http://jakarta.apache.org/lucene/docs/gettingstarted.html If this isn't good enough, please let me know what I can do to make it better. Documenting Lucene is something I have a big interest in. -Andy >On Tue, 05 Mar 2002 08:43:59 -0600 "Ryan Ogaard" <[EMAIL PROTECTED]> wrote. >Hello All, >

Re: TimeOut Exception when Indexing with EJB (Please Help)

2002-03-05 Thread Otis Gospodnetic

Hello, I think you should just try your two suggestions and see. The answer depends on how exactly you do it, OS configuration, etc. Does this happen on an optimized index, too? Otis --- Tihon One <[EMAIL PROTECTED]> wrote: > Hi all; > > I've tried to index a 100K text file on an empty Index fo

Lucene Documentation

2002-03-05 Thread Ryan Ogaard

Hello All, I am in the process of testing Lucene for our intranet, and having a difficult time finding good documentation. Any recommendations on good Web sites with tips, how-tos, code examples, etc. for Lucene? Thank you for your time and consideration... Ryan -- To unsubscribe, e-mail:

Re: Indexing HTML with Lucene

2002-03-05 Thread Erik Hatcher

You have to do it yourself, or at least find code that does this. The Lucene sample code has an HTML parser, and I've posted (to lucene-dev) an alternative way of using JTidy to do this. Erik - Original Message - From: "Melissa Mifsud" <[EMAIL PROTECTED]> To: "Lucene User" <[EMAIL P

Indexing HTML with Lucene

2002-03-05 Thread Melissa Mifsud

Hi, Is it necessary to strip the HTML tags from HTML documents BEFORE telling Lucene to index them? Does Lucene do this or will it index the tags too?! Melissa

What type of indexer is Lucene?

2002-03-05 Thread Melissa Mifsud

Hi! Can anyone tell me what kind of indexer Lucene is? Statistical, Probabilistic, Boolean, Extended Boolean? I can't seem to find the answer in any documentation or article and it's really important that I know the type before I use Lucene in my application! Thanks! Melissa

Adding multiple paths to a document

2002-03-05 Thread Grim Hegland Iversen

I am looking at using Lucene to index a large set of documents. In order to be able to search a subset of documents, I've added a "path"-field to each document (indexed, not stored, not tokenized). Using a prefix-query seems to work fine. My problem: Our documents can have several different pat

TimeOut Exception when Indexing with EJB (Please Help)

2002-03-05 Thread Tihon One

Hi all; I've tried to index a 100K text file on an empty Index folder (0 MB of indexed files) and it took 0.77 seconds. However, when my index folder gets larger (~20MB of indexed files) the same 100K text file would take up to 30 seconds. I'm using EJB to do the index processing and my SessionB

Different results on same criteria

2002-03-05 Thread Parag Dharmadhikari

Hi all, In my application, if I search on the letter "e" only, I get the right result, but if I search on "t" it does not show any result. There will be many common words like "to" or "the" which should be shown in the search. More interestingly, if I search with a wildcard like "t*" th

15 matches
