RE: FileDocument.java -- Out of scope for Lucene Users list --

2003-11-28 Thread Philippe Laflamme
A few pointers that might help you out, but this is totally off topic for the Lucene Users list. This is a Java-related problem; if you're new to Java, please look to other mailing lists for some help... ... 1 Reader reader = new BufferedReader(new InputStreamReader(is)); 2 char [] buf = new char[5
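
The snippet in the message above is cut off, so the original buffer size and read loop are not known. As a rough sketch of the same idea (wrapping an InputStream in a BufferedReader and filling a char array), the following reads at most a fixed number of characters for use as a summary. The class name SummaryReader, the method readPrefix, and the choice of UTF-8 are assumptions made for illustration, not part of the original post.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not taken from the original message.
final class SummaryReader {

    /**
     * Reads at most maxChars characters from the stream.
     * Decoding as UTF-8 is an assumption; use the file's real encoding.
     */
    static String readPrefix(InputStream is, int maxChars) throws IOException {
        Reader reader = new BufferedReader(new InputStreamReader(is, StandardCharsets.UTF_8));
        char[] buf = new char[maxChars];
        int total = 0;
        // read() may return fewer characters than requested, so loop
        // until the buffer is full or the stream ends (read returns -1).
        while (total < maxChars) {
            int n = reader.read(buf, total, maxChars - total);
            if (n < 0) {
                break;
            }
            total += n;
        }
        return new String(buf, 0, total);
    }

    public static void main(String[] args) throws IOException {
        InputStream is = new ByteArrayInputStream(
                "Hello Lucene users".getBytes(StandardCharsets.UTF_8));
        System.out.println(readPrefix(is, 5)); // prints "Hello"
    }
}
```

Naming the charset explicitly avoids depending on the platform default, which is one common source of unreadable output when text files are printed.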

RE: FileDocument.java

2003-11-28 Thread Pleasant, Tracy
Can you give some info on the file type and how you are printing the results? Do the contents display correctly? -Original Message- From: Tun Lin [mailto:[EMAIL PROTECTED] Sent: Friday, November 28, 2003 10:59 AM To: Lucene user list Subject: FileDocument.java Hi Lucene experts, Can

FileDocument.java

2003-11-28 Thread Tun Lin
Hi Lucene experts, Can you help on this? I have included the following code in FileDocument to print out the summary but I have funny output like: The result after searching, the summary is displayed as below: ÐÏࡱá>þÿ UWþÿÿÿTÿ
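
The characters at the start of the output above resemble the OLE2 compound-file signature (bytes D0 CF 11 E0 A1 B1 1A E1) that legacy Microsoft Office files begin with. That is an inference from the snippet, not something the thread confirms. If it holds, the file is binary and cannot be read as plain text. A minimal sketch of a check that could be run before building a text summary follows; the class and method names are hypothetical.

```java
// Hypothetical helper, not part of FileDocument.java or Lucene.
final class BinaryCheck {

    // OLE2 compound-file signature used by legacy .doc/.xls/.ppt files.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    /** Returns true if the given leading bytes start with the OLE2 signature. */
    static boolean looksLikeOle2(byte[] head) {
        if (head.length < OLE2_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < OLE2_MAGIC.length; i++) {
            if (head[i] != OLE2_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] officeHeader = {
            (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
            (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1, 0x00
        };
        byte[] plainText = "Hi Lucene experts".getBytes();
        System.out.println(looksLikeOle2(officeHeader)); // prints "true"
        System.out.println(looksLikeOle2(plainText));    // prints "false"
    }
}
```

Files that match would need a format-specific text extractor before indexing; reading them through a Reader only yields the raw bytes decoded as characters.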

RE: Lucene refresh index function (incremental indexing).

2003-11-28 Thread Tun Lin
I have deleted one of the text files I indexed and ran the following command: java -Dlog4j.configuration=file:///c:/jarfiles/log4j.properties -Dlog4j.debug=true org.pdfbox.searchengine.lucene.IndexFiles -index c:\\index .. root=.. java.io.IOException: Lock obtain timed out at org.apache.lu

Re: New Lucene-powered Website

2003-11-28 Thread Ulrich Mayring
Lutz Horn wrote: Could you please tell us how large the amount of indexed documents is? Not very large, but growing daily ;-) At this point 289 German and 272 English documents (some are not translated yet). There are many more files on our website, but only these were deemed to contain useful

Re: New Lucene-powered Website

2003-11-28 Thread Lutz Horn
Hi, Ulrich Mayring wrote: we (DENIC) are the world's second largest domain registry (.de-zone has almost 6.9 million domains) and are using Lucene to index and search our website in a high-traffic scenario. Could you please tell us how large the amount of indexed documents is? Regards Lutz --

RE: New Lucene-powered Website

2003-11-28 Thread Dr. John Takacs
Ulrich, Well done! I too would love to know how you implemented the summarizer. If you are unable to provide the details, would you be able to steer a person in the right direction? I've experimented with a few applications that will do it, some my own, some found via searches, but none are as

Re: New Lucene-powered Website

2003-11-28 Thread Ulrich Mayring
Akmal Sarhan wrote: nice and fast ;-) would be interesting though to know how you implemented the "summarizer". Basically, the algorithm is statistics-based. It selects a configurable number of sentences from the text (in our case three) and, in case the sentences are too long, it cuts off after
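
The message above is cut off, so the actual scoring statistic and the cut-off length are not known. The following is only a sketch of one possible statistics-based approach: each sentence is scored by the average frequency of its words across the whole text, the top few sentences are kept in their original order, and overlong sentences are truncated. The word-frequency scoring, the class name FrequencySummarizer, and the maxChars parameter are all assumptions, not DENIC's implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; the real summarizer's statistic is not given in the thread.
final class FrequencySummarizer {

    static List<String> summarize(String text, int maxSentences, int maxChars) {
        // Naive sentence split: break after '.', '!' or '?' followed by whitespace.
        String[] sentences = text.trim().split("(?<=[.!?])\\s+");

        // Count how often each word occurs in the whole text.
        Map<String, Integer> freq = new HashMap<>();
        for (String s : sentences) {
            for (String w : words(s)) {
                freq.merge(w, 1, Integer::sum);
            }
        }

        // Score each sentence by the average frequency of its words.
        double[] score = new double[sentences.length];
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < sentences.length; i++) {
            String[] ws = words(sentences[i]);
            double sum = 0;
            for (String w : ws) {
                sum += freq.get(w);
            }
            score[i] = ws.length == 0 ? 0 : sum / ws.length;
            order.add(i);
        }

        // Take the highest-scoring sentences, then restore document order.
        order.sort(Comparator.comparingDouble((Integer i) -> -score[i]));
        List<Integer> picked =
                new ArrayList<>(order.subList(0, Math.min(maxSentences, order.size())));
        Collections.sort(picked);

        List<String> summary = new ArrayList<>();
        for (int i : picked) {
            String s = sentences[i];
            // Cut off sentences that exceed the (assumed) length limit.
            summary.add(s.length() > maxChars ? s.substring(0, maxChars) + "..." : s);
        }
        return summary;
    }

    // Lower-cases the sentence and splits it into letter/digit tokens.
    private static String[] words(String s) {
        String cleaned = s.toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{Nd}]+", " ").trim();
        return cleaned.isEmpty() ? new String[0] : cleaned.split(" ");
    }
}
```

A call such as FrequencySummarizer.summarize(text, 3, 200) would mirror the three-sentence setting mentioned in the message; the 200-character limit is an arbitrary placeholder.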

Re: New Lucene-powered Website

2003-11-28 Thread Akmal Sarhan
nice and fast ;-) would be interesting though to know how you implemented the "summarizer". regards Akmal - Original Message - From: "Ulrich Mayring" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Thursday, November 27, 2003 12:29 PM Subject: New Lucene-powered Website > Hello, > >

Re: indexing/searching a website

2003-11-28 Thread Michal S
Robert Taylor wrote: Check out http://www.searchblox.com/ . It's based on Lucene and extremely easy to use and set up. It basically crawls your website and creates the index. Search results are in XML and you can transform it using the XSL style sheet shipped with it or create your own. Great app,