RE: lucene handling different document formats

2003-07-07 Thread Eric Anderson
There's also a prebuilt WAR available at http://www.brownsite.net/docsearch.htm. It works with OpenOffice documents, Word doc, Excel, PDF, XML, RTF, TXT, etc. It can work via a servlet interface or as a standalone application. Eric Anderson LanRx Network Solu

RE: Trouble running web demo

2003-06-07 Thread Eric Anderson
t redeploy the war, just restart tomcat. Eric Anderson LanRx Network Solutions 815-505-6132 Quoting xx28 <[EMAIL PROTECTED]>: > Try to change permission to 777 for the index directory. > > >= Original Message From Lucene Users List > <[EMAIL PROTECTED]> = >

Re: parser

2003-03-20 Thread Eric Anderson
PDFBox does the PDFs; Textmining.org is supposed to work for the doc/xls stuff, but I don't know anything about the PPT. Eric Anderson LanRx Network Solutions 815-505-6132 Quoting Daniel Hunziker <[EMAIL PROTECTED]>: > Are there any parsers for the following formats > - do

Re: I have encountered some problems in template Web Application,thanks in advance

2003-03-11 Thread Eric Anderson
ebapp will look for it. Eric Anderson LanRx Network Solutions Quoting Tian LUO <[EMAIL PROTECTED]>: > > Dear lucene-user group: > > in the lucene site,there are: > > " > Now you're ready to roll. In your browser set the url to > "http://localhost:808

Re: Help! Lucene for my site

2003-03-08 Thread Eric Anderson
issue), is that by default, the webapp looks in a location called /opt/lucene/index for the index. Because you're on a windows platform, you're obviously not going to have the index in that location. Look at the configuration files in lucene, and define your index as the c:\lucene-index
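A minimal sketch of a sanity check for the index location, in Java. It only tests that a directory exists and is readable by the current user; it does not touch Lucene or the webapp's configuration. The class and method names are hypothetical, and the fallback path is the default mentioned in the message above. To check what Tomcat can see, it would need to run as the same user as Tomcat.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class IndexLocationCheck {
    /** Returns null if the directory looks usable, otherwise a short description of the problem. */
    public static String describeProblem(Path indexDir) {
        if (!Files.exists(indexDir)) return "does not exist";
        if (!Files.isDirectory(indexDir)) return "is not a directory";
        if (!Files.isReadable(indexDir)) return "is not readable by this user";
        return null;
    }

    public static void main(String[] args) {
        // Pass the real index path as the first argument; the fallback is the
        // default location that the message says the webapp looks in.
        Path indexDir = Paths.get(args.length > 0 ? args[0] : "/opt/lucene/index");
        String problem = describeProblem(indexDir);
        System.out.println(problem == null
                ? "OK: " + indexDir
                : "Problem: " + indexDir + " " + problem);
    }
}
```

On Windows the path would be passed as an argument, for example `java IndexLocationCheck c:\lucene-index`.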

Re: Help! Lucene for my site

2003-03-07 Thread Eric Anderson
luceneweb application to look for your index? Make sure that it's readable by tomcat. Hopefully that helps... Eric Anderson LanRx Network Solutions Quoting Elsa Hernandez <[EMAIL PROTECTED]>: > I have not been able to install Lucene correctly (Apache Tomcat 4.1), the > de

Re: Lucene for my site

2003-03-07 Thread Eric Anderson
for me. I have a repository of PDFs and HTML that I can index, search, and serve. Eric Anderson LanRx Network Solutions Quoting Samuel Alfonso Velázquez Díaz <[EMAIL PROTECTED]>: > > Hi! so far I have read all the docs of the lucene.apache.org site and some > articl

Re: [ANN] PDFBox 0.6.0

2003-03-06 Thread Eric Anderson
When it throws the exception, the indexer fails, so I cannot continue the index. It appears that it's only related to some files, as I have been able to remove some of the files, and it will continue past that point, but if it encounters one of these files, the index fails. Eric Anderson
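One common way to keep an indexing run alive when a few files are unparseable is to catch the exception per file, record the file, and move on. The sketch below illustrates that pattern in plain Java; it is not the PDFBox or Lucene demo code. `TextExtractor` is a hypothetical stand-in for whatever turns a file into text, and the file names and the fake failure in `main` are made up for illustration.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class SkipBadFiles {
    /** Hypothetical stand-in for a parser that converts one file to text. */
    public interface TextExtractor {
        String extract(Path file) throws IOException;
    }

    /** Extracts each file in turn; files that throw are added to 'failed' and skipped. */
    public static List<String> extractAll(List<Path> files, TextExtractor extractor, List<Path> failed) {
        List<String> texts = new ArrayList<>();
        for (Path file : files) {
            try {
                texts.add(extractor.extract(file));
            } catch (IOException e) {
                // EOFException is a subclass of IOException, so truncated streams land here.
                failed.add(file);
                System.err.println("Skipping " + file + ": " + e.getMessage());
            }
        }
        return texts;
    }

    public static void main(String[] args) {
        List<Path> files = new ArrayList<>();
        files.add(Paths.get("good.pdf"));
        files.add(Paths.get("bad.pdf"));
        List<Path> failed = new ArrayList<>();
        // Fake extractor that fails on one file name to simulate a corrupt document.
        List<String> texts = extractAll(files, f -> {
            if (f.toString().startsWith("bad")) {
                throw new EOFException("Unexpected end of ZLIB input stream");
            }
            return "text of " + f;
        }, failed);
        System.out.println(texts.size() + " extracted, " + failed.size() + " skipped");
    }
}
```

Keeping the list of failed files makes it possible to report them or retry them later instead of removing them from the repository by hand.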

Re: my experiences - Re: Parsing Word Docs

2003-03-06 Thread Eric Anderson
I'll go either way, but I still don't know how to implement the word parser, as opposed to the PDF parser or HTM parser. Eric Anderson LanRx Network Solutions Quoting Ryan Ackley <[EMAIL PROTECTED]>: > Eric, > > The problem with antiword is that it is a native appl

Re: [ANN] PDFBox 0.6.0

2003-03-06 Thread Eric Anderson
Ben- In attempting to use the PDFBox-0.6.0, I rec'd the following error when attempting to scan a reasonably sized PDF repository. Any thoughts? caught a class java.io.EOFException with message: Unexpected end of ZLIB input stream Eric Anderson LanRx Network Solutions Quotin

Re: my experiences - Re: Parsing Word Docs

2003-03-05 Thread Eric Anderson
yword index, but leaving the DOC intact, as Mr. Litchfield did with PDFBox? Your assistance is greatly appreciated. Eric Anderson 815-505-6132 Quoting David Spencer <[EMAIL PROTECTED]>: > FYI I tried the textmining.org/poi combo and on a collection of 350 word > docs people have develo

Parsing Word Docs

2003-03-05 Thread Eric Anderson
I'm interested in using the textmining/textextraction utilities using Apache POI, that Ryan was discussing. However, I'm having some difficulty determining what the insertion point would be to replace the default parser with the word parser. Any assistance would be appreciated. LanRx Netw

Re: Regarding Setup Lucine for my site

2003-03-05 Thread Eric Anderson
Samuel- I'm basically using the software in a similar fashion to how you are. However, something to remember is that the documents that you're indexing need to be in a location that is published by your webserver. What I did was use the tomcat connectors, and mount my document repository insi

results URLs

2003-02-11 Thread Eric Anderson
ething simple, but I have been unable to get this component to work as of yet. Any suggestions? Thanks for your assistance! Eric Anderson LanRx Network Solutions, Inc. Providing Enterprise Level Solutions...On A Small Business Budget