There's also a prebuilt WAR available at
http://www.brownsite.net/docsearch.htm
It works with OpenOffice documents, Word DOC, Excel, PDF, XML, RTF, TXT, etc.
It can run behind a servlet interface or as a standalone application.
Eric Anderson
LanRx Network Solu
t redeploy the war, just restart tomcat.
Eric Anderson
LanRx Network Solutions
815-505-6132
Quoting xx28 <[EMAIL PROTECTED]>:
> Try changing the permissions on the index directory to 777.
>
> >= Original Message From Lucene Users List
> <[EMAIL PROTECTED]> =
> &g
PDFBox does the PDFs, Textmining.org is supposed to work for the doc/xls stuff,
but I don't know anything about the PPT
Eric Anderson
LanRx Network Solutions
815-505-6132
Quoting Daniel Hunziker <[EMAIL PROTECTED]>:
> Are there any parsers for the following formats
> - do
ebapp will
look for it.
Eric Anderson
LanRx Network Solutions
Quoting Tian LUO <[EMAIL PROTECTED]>:
>
> Dear lucene-user group:
>
> in the lucene site,there are:
>
> "
> Now you're ready to roll. In your browser set the url to
> "http://localhost:808
issue) is that, by default, the
webapp looks in a location called /opt/lucene/index for the index. Because
you're on a Windows platform, you're obviously not going to have the index in
that location. Look at the Lucene configuration files, and define your index
as the c:\lucene-index
luceneweb
application to look for your index?
Make sure that it's readable by Tomcat.
Hopefully that helps...
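
To rule out a missing or unreadable index directory, a small stand-alone check along these lines may help. This is only a sketch: the class name IndexDirCheck and the method problemWith are made up for illustration, and /opt/lucene/index is used purely as an example default. It checks access for whichever user runs it, so run it as the account Tomcat runs under.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class IndexDirCheck {
    /** Returns null if the directory looks usable, otherwise a short description of the problem. */
    public static String problemWith(Path indexDir) {
        if (!Files.exists(indexDir)) return "does not exist";
        if (!Files.isDirectory(indexDir)) return "is not a directory";
        if (!Files.isReadable(indexDir)) return "is not readable by this user";
        return null;
    }

    public static void main(String[] args) {
        // Example default only; pass the real index path as the first argument.
        Path indexDir = Paths.get(args.length > 0 ? args[0] : "/opt/lucene/index");
        String problem = problemWith(indexDir);
        System.out.println(indexDir + (problem == null ? " looks usable" : " " + problem));
    }
}
```

On Windows, pass the index location as the argument instead, for example: java IndexDirCheck c:\lucene-index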
Eric Anderson
LanRx Network Solutions
Quoting Elsa Hernandez <[EMAIL PROTECTED]>:
> I have not been able to install Lucene correctly (Apache Tomcat 4.1), the
> de
for me. I have a repository of PDFs and HTML that I can index, search,
and serve.
Eric Anderson
LanRx Network Solutions
Quoting Samuel Alfonso Velázquez Díaz <[EMAIL PROTECTED]>:
>
> Hi! so far I have read all the docs of the lucene.apache.org site and some
> articl
When it throws the exception, the indexer fails, so I cannot continue indexing.
It appears to be related only to certain files: when I remove some of them,
indexing continues past that point, but as soon as it encounters another one
of these files, it fails again.
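
One possible workaround, sketched below, is to catch the exception per file so that a single bad document is logged and skipped instead of stopping the whole run. Everything here is hypothetical: the Extractor interface stands in for whatever parser is actually being called, and indexAll is not part of Lucene or PDFBox.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SkipBadFiles {
    /** Hypothetical stand-in for the code that extracts text from one file. */
    public interface Extractor {
        String extract(String path) throws Exception;
    }

    /** Extracts every path, stores the text by path, and returns the paths that failed. */
    public static List<String> indexAll(List<String> paths, Extractor extractor,
                                        Map<String, String> indexed) {
        List<String> failed = new ArrayList<>();
        for (String path : paths) {
            try {
                // A real indexer would add the extracted text to the index here.
                indexed.put(path, extractor.extract(path));
            } catch (Exception e) {
                // Log and move on so one bad file does not stop the run.
                System.err.println("Skipping " + path + ": " + e);
                failed.add(path);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Fake extractor that fails on one file, to simulate a parse error.
        Extractor fake = path -> {
            if (path.startsWith("bad")) throw new IOException("simulated parse failure");
            return "text of " + path;
        };
        Map<String, String> indexed = new LinkedHashMap<>();
        List<String> failed = indexAll(Arrays.asList("a.pdf", "bad.pdf", "b.pdf"), fake, indexed);
        System.out.println("indexed=" + indexed.keySet() + " failed=" + failed);
    }
}
```

The returned list of failed paths can be saved and sent along with a bug report, which also makes it easier to see what the problem files have in common.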
Eric Anderson
I'll go either way, but I still don't know how to implement the Word parser, as
opposed to the PDF parser or HTML parser.
Eric Anderson
LanRx Network Solutions
Quoting Ryan Ackley <[EMAIL PROTECTED]>:
> Eric,
>
> The problem with antiword is that it is a native appl
Ben-
In attempting to use the PDFBox-0.6.0, I rec'd the following error when
attempting to scan a reasonably sized PDF repository.
Any thoughts?
caught a class java.io.EOFException
with message: Unexpected end of ZLIB input stream
Eric Anderson
LanRx Network Solutions
Quotin
yword index, but leaving the DOC intact, as Mr. Litchfield did with PDFBox?
Your assistance is greatly appreciated.
Eric Anderson
815-505-6132
Quoting David Spencer <[EMAIL PROTECTED]>:
> FYI I tried the textmining.org/poi combo and on a collection of 350 word
> docs people have develo
I'm interested in using the textmining/text-extraction utilities, built on
Apache POI, that Ryan was discussing. However, I'm having some difficulty
determining where the insertion point would be to replace the default parser
with the Word parser.
Any assistance would be appreciated.
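
For illustration, one common shape for such an insertion point is a lookup keyed on file extension, sketched below. This is an assumption about how the indexer could be structured, not how the Lucene demo actually does it: ParserDispatch and TextParser are hypothetical names, and the lambdas are placeholders where calls into the real Word or PDF extraction libraries would go.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class ParserDispatch {
    /** Hypothetical interface: turns one file into plain text for indexing. */
    public interface TextParser {
        String parse(String path) throws Exception;
    }

    private final Map<String, TextParser> byExtension = new HashMap<>();
    private final TextParser fallback;

    public ParserDispatch(TextParser fallback) {
        this.fallback = fallback;
    }

    /** Registers a parser for an extension such as "doc" or "pdf" (case-insensitive). */
    public void register(String extension, TextParser parser) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), parser);
    }

    /** Picks the parser registered for the file's extension, or the fallback. */
    public TextParser parserFor(String path) {
        int dot = path.lastIndexOf('.');
        String ext = dot < 0 ? "" : path.substring(dot + 1).toLowerCase(Locale.ROOT);
        return byExtension.getOrDefault(ext, fallback);
    }

    public static void main(String[] args) throws Exception {
        // Placeholder parsers; real ones would call the extraction libraries.
        ParserDispatch dispatch = new ParserDispatch(path -> "default parser: " + path);
        dispatch.register("doc", path -> "word parser: " + path);
        dispatch.register("pdf", path -> "pdf parser: " + path);
        for (String path : new String[] {"report.DOC", "manual.pdf", "notes.txt"}) {
            System.out.println(dispatch.parserFor(path).parse(path));
        }
    }
}
```

With this layout, swapping the default parser for a Word parser is a single register call, and the indexing loop itself does not need to change.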
LanRx Netw
Samuel-
I'm basically using the software the same way you are. One thing to remember,
though, is that the documents you're indexing need to be in a location that is
published by your web server. What I did was use the Tomcat connectors and
mount my document repository insi
ething simple, but I have been unable to get this
component to work as of yet.
Any suggestions?
Thanks for your assistance!
Eric Anderson
LanRx Network Solutions, Inc.
Providing Enterprise Level Solutions...On A Small Business Budget