Re: Regarding Setup Lucine for my site

2003-03-05 Thread Catalin
Hi there! We have almost the same configuration (site, index, paths, etc.) as you. We took another approach for our site search, e.g.: use a small crawler to index a supplied list of URLs, build the Lucene index, and have the web search app use that index. For crawling: http://cvs.cabanova.ro/viewc

Re: Regarding Setup Lucine for my site

2003-03-05 Thread Eric Anderson
Samuel, I'm using the software in much the same way you are. One thing to remember, however, is that the documents you're indexing need to be in a location published by your web server. What I did was use the Tomcat connectors and mount my document repository insi

Parsing Word Docs

2003-03-05 Thread Eric Anderson
I'm interested in using the textmining/textextraction utilities using Apache POI, that Ryan was discussing. However, I'm having some difficulty determining what the insertion point would be to replace the default parser with the word parser. Any assistance would be appreciated. LanRx Netw

RE: regarding Increamenta Indexing

2003-03-05 Thread Nellaiyappan Gomathinayagam
Hi Serge Knystautas, I need exactly the same functionality. Thanks for the information. If you don't mind, could you please send me sample code for implementing it? Thanks a ton, Nellai -Original Message- From: Serge Knystautas [mailto:[EMAIL PROTECTED]

Re: Regarding Setup Lucine for my site

2003-03-05 Thread Velázquez
Hi, I'd like to take a look at the webapp war file or zip tarball for wsearch and the indexer crawler. Catalin <[EMAIL PROTECTED]> wrote:.. for crawling: http://cvs.cabanova.ro/viewcvs.cgi/indexer/ for webapp: http://cvs.cabanova.ro/viewcvs.cgi/wsearch/ running online: http://www.anet.ro/searc

Re: Regarding Setup Lucine for my site

2003-03-05 Thread Pinky Iyer
Thanks for the info. I would also be interested to see the zip code, especially for the indexer. This discussion has been a wonderful source of info, especially for us starters. Thanks to one and all. I guess once in a while such a discussion helps us too, to get to the level the discussion is usually at! I w

Re: Regarding Setup Lucine for my site

2003-03-05 Thread maurits van wijland
Catalin, could you send me a zip file with your implementation? Thanks, maurits - Original Message - From: "Catalin" <[EMAIL PROTECTED]> To: "Lucene Users List" <[EMAIL PROTECTED]> Sent: Wednesday, March 05, 2003 10:26 AM Subject: Re: Regarding Setup Lucine for my site hi there ! we hav

i2a websearch application demo ???

2003-03-05 Thread Pinky Iyer
Could anybody tell me where on the Jakarta site this "i2a websearch application demo" is? Is this the demo under "getting started" under "lucene"? If so, I don't see it using any crawler. It would be nice if the Jakarta site itself had search incorporated into the site. Thanks! P Iyer ma

Re: Regarding Setup Lucine for my site

2003-03-05 Thread Catalin
hi there all ! the .zip is available (by request) at: http://dev.cabanova.ro/java/lucene/ have fun ! Catalin - Original Message - From: maurits van wijland To: Lucene Users List Sent: Wednesday, March 05, 2003 6:17 PM Subject: Re: Regarding Setup Lucine for my site Cat

Re: Regarding Setup Lucine for my site

2003-03-05 Thread Velázquez
Please point me to a web link where I can read more about Lucene. I have read all the documentation with the distribution (which is almost the same as the lucene.apache.org site). About the problem you mentioned with URL-to-file mappings, what if I issue a code line like myurl = URLEncode.e

Re: i2a websearch application demo ???

2003-03-05 Thread Catalin
http://jakarta.apache.org/lucene/docs/powered.html the 6th in the list is i2a Web Search Catalin - Original Message - From: Pinky Iyer To: Lucene Users List Sent: Wednesday, March 05, 2003 6:26 PM Subject: i2a websearch application demo ??? Could anybody tell me where in the Jakarta si

Re: i2a websearch application demo ???

2003-03-05 Thread Pinky Iyer
Thanks! Catalin <[EMAIL PROTECTED]> wrote:http://jakarta.apache.org/lucene/docs/powered.html the 6th in the list is i2a Web Search Catalin - Original Message - From: Pinky Iyer To: Lucene Users List Sent: Wednesday, March 05, 2003 6:26 PM Subject: i2a websearch application demo ??? C

Re: i2a websearch application demo ???

2003-03-05 Thread Velázquez
Wow, the features of i2a Web Search are just what I need! I have just added it to my servlet engine. I read the readme but could not find whether this application is GPL or LGPL. Is it? Pinky Iyer <[EMAIL PROTECTED]> wrote: Thanks! Catalin wrote:http://jakarta.apache.org/lucene/docs/powered.h

Re: i2a websearch application demo ???

2003-03-05 Thread Pinky Iyer
A license for the application has not been determined yet. It will most likely be BSD, ASL or GPL. Until then, there is a disclaimer. Thanks! Samuel Alfonso Velázquez Díaz <[EMAIL PROTECTED]> wrote: Wow the features of i2a Web Search are just what I need! I have just added to my s

IndexReader.delete(int) not working for me

2003-03-05 Thread Joseph Ottinger
I've got a versioning content system where I want to replace documents in a lucene repository. To do so, according to the FAQ and the mailing list archives, I need to open an IndexReader, look for the document in question, delete it via the IndexReader, and then add it. This shouldn't replace the

Re: i2a websearch application demo ???

2003-03-05 Thread Pinky Iyer
I am trying to set up the i2a websearch app. When I go to the admin section and choose the index with detail or any of the other options, I don't see any indexes being created under the main directory. Am I doing anything wrong? I did change the websearch.xml to point to the appropriate site (http://localhost:80

Re: IndexReader.delete(int) not working for me

2003-03-05 Thread Doug Cutting
Joseph Ottinger wrote: I've got a versioning content system where I want to replace documents in a lucene repository. To do so, according to the FAQ and the mailing list archives, I need to open an IndexReader, look for the document in question, delete it via the IndexReader, and then add it. This

Re: IndexReader.delete(int) not working for me

2003-03-05 Thread Joseph Ottinger
Then this means that my IndexReader.delete(i) isn't working properly. What would be the common causes for this? My log shows the documents being deleted, so something's going wrong at that point. On Wed, 5 Mar 2003, Doug Cutting wrote: > Joseph Ottinger wrote: > > This shouldn't replace the docum

Re: IndexReader.delete(int) not working for me

2003-03-05 Thread Doug Cutting
Joseph Ottinger wrote: Then this means that my IndexReader.delete(i) isn't working properly. What would be the common causes for this? My log shows the documents being deleted, so something's going wrong at that point. Are you closing the IndexReader after doing the deletes? This is required for

Re: IndexReader.delete(int) not working for me

2003-03-05 Thread Joseph Ottinger
Okay, I think I've done something stupid here: on closer examination, it looks like my comparison to find the specific documents to delete is failing. Let me look further at that. On Wed, 5 Mar 2003, Doug Cutting wrote: > Joseph Ottinger wrote: > > Then this means that my IndexReader.delete(i) isn

Re: IndexReader.delete(int) not working for me

2003-03-05 Thread Joseph Ottinger
Okay, I found the problem: it was a stupid coder. To wit, here's the salient code: Document d=indexReader.document(i); if(d.getField("key").equals(node.getKey())) { ... } The error, of course, is that getField().equals() is comparing FIELDS and not string values. When I changed this to pull the st
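
The bug described above can be reproduced without Lucene on the classpath. The following is only a sketch: the nested `Field` and `Document` classes are hypothetical stand-ins, not the Lucene classes, and the key value "node-42" is made up. In Lucene itself the string value of a field should be reachable through `Field.stringValue()` or `Document.get(String)`.

```java
import java.util.HashMap;
import java.util.Map;

public class FieldCompareSketch {
    // Hypothetical stand-in for a Lucene Field: a name plus a string value.
    static final class Field {
        private final String name;
        private final String value;
        Field(String name, String value) { this.name = name; this.value = value; }
        String stringValue() { return value; }
    }

    // Hypothetical stand-in for a Lucene Document: fields looked up by name.
    static final class Document {
        private final Map<String, Field> fields = new HashMap<>();
        void add(Field f) { fields.put(f.name, f); }
        Field getField(String name) { return fields.get(name); }
    }

    // The mistake from the thread: a Field object is compared with a String,
    // so equals() is false even when the stored value matches.
    static boolean buggyMatch(Document d, String key) {
        return d.getField("key").equals(key);
    }

    // The fix: compare the field's string value with the key.
    static boolean fixedMatch(Document d, String key) {
        Field f = d.getField("key");
        return f != null && key.equals(f.stringValue());
    }

    public static void main(String[] args) {
        Document d = new Document();
        d.add(new Field("key", "node-42"));
        System.out.println("buggy: " + buggyMatch(d, "node-42"));
        System.out.println("fixed: " + fixedMatch(d, "node-42"));
    }
}
```

Because the buggy comparison compiles cleanly and simply never matches, no document is ever selected for deletion, which fits the symptom reported earlier in the thread.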

Re: Regarding Setup Lucine for my site

2003-03-05 Thread Leo Galambos
On Tue, 4 Mar 2003, Otis Gospodnetic wrote: > Even if you could replace C:\. with http:// it wouldn't be a > good solution, as directory structures and file paths do not always map > directly to URLs. Yes, but it is not the case of Samuel's configuration and 99.99% of others. The fact i
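
For the simple case discussed here, where every indexed file sits under one document root that the web server publishes one-to-one under a base URL, the mapping might be sketched as below. That one-to-one layout is an assumption and, as noted above, does not always hold. The class name, method name, paths and base URL are illustrative only, and `URLEncoder.encode(String, Charset)` needs Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PathToUrlSketch {
    // Maps a file under docRoot to a URL under baseUrl, assuming a 1:1 layout.
    static String toUrl(Path docRoot, Path file, String baseUrl) {
        Path root = docRoot.toAbsolutePath().normalize();
        Path target = file.toAbsolutePath().normalize();
        if (!target.startsWith(root)) {
            throw new IllegalArgumentException(file + " is not under " + docRoot);
        }
        Path relative = root.relativize(target);
        StringBuilder url = new StringBuilder(baseUrl.endsWith("/") ? baseUrl : baseUrl + "/");
        for (int i = 0; i < relative.getNameCount(); i++) {
            if (i > 0) {
                url.append('/');
            }
            // URLEncoder does form encoding, so turn '+' (space) into %20 for a path segment.
            String segment = relative.getName(i).toString();
            url.append(URLEncoder.encode(segment, StandardCharsets.UTF_8).replace("+", "%20"));
        }
        return url.toString();
    }

    public static void main(String[] args) {
        System.out.println(toUrl(
                Paths.get("/var/www/docs"),
                Paths.get("/var/www/docs/a/my file.html"),
                "http://example.com/docs"));
    }
}
```

Storing the resulting URL in the index alongside the file path would let the search page link to results without redoing the mapping at query time.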

Re: Regarding Setup Lucine for my site

2003-03-05 Thread Leo Galambos
> org.apache.lucene.demo.IndexHTML which was provided with the > documentation. Is there any problem using this demo class for a web > production site? I'm an application developer and it would be hard to > understand the whole Lucene code to use it. It would be almost impossible You can use it, but:

Re: i2a websearch application demo ???

2003-03-05 Thread Velázquez
I downloaded and installed the i2a websearch application. It looks fine, but I have a problem: my site contains a lot of Macromedia Flash objects, and many of my site's links are inside these Flash objects. Clearly these links won't be crawled easily. Is there a way to create an index for i2a we

Re: Regarding Setup Lucine for my site

2003-03-05 Thread Otis Gospodnetic
> On the other hand, if you extend Lucene with your hacks, you will > find out > that the model of Lucene is unknown and many parts are hard-coded. It > boosts speed, but it disallows future enhancements (I could name the > parts, I hope we do not start flamewar here). I'm all eyes and I'm a serio

Re: i2a websearch application demo ???

2003-03-05 Thread Otis Gospodnetic
For all i2a questions please contact its author. i2a websearch application just _uses_ Lucene, it is not a part of Lucene. Otis --- Pinky Iyer <[EMAIL PROTECTED]> wrote: > > I am trying to setup the i2a websearch app, when i go to admin > section and choose the index with detail or any of the op

Re: i2a websearch application demo ???

2003-03-05 Thread Velázquez
Yeah sorry! Otis Gospodnetic <[EMAIL PROTECTED]> wrote:For all i2a questions please contact its author. i2a websearch application just _uses_ Lucene, it is not a part of Lucene. Otis --- Pinky Iyer wrote: > Samuel Alfonso Velázquez Díaz http://www.geocities.com/samuelvd [EMAIL PROTECTED]

Re: Regarding Setup Lucine for my site

2003-03-05 Thread Leo Galambos
> > On the other hand, if you extend Lucene with your hacks, you will > > find out > > that the model of Lucene is unknown and many parts are hard-coded. It > > boosts speed, but it disallows future enhancements (I could name the > > parts, I hope we do not start flamewar here). > > I'm all eyes a

multi query with boost AND multiple terms

2003-03-05 Thread Martin . Rademacher
Hi there, I am trying to do a search on multiple terms inclusive using boosting. I extended the MultiFieldQueryParser like this: public static org.apache.lucene.search.Query parse(String query, String[] fields, float[] boost, Analyzer analyzer) throws ParseException {

my experiences - Re: Parsing Word Docs

2003-03-05 Thread David Spencer
FYI, I tried the textmining.org/POI combo on a collection of 350 Word docs people have developed here over the years, and it failed on 33% of them, throwing exceptions about the formats being invalid. I tried "antiword" ( http://www.winfield.demon.nl/ ), a native & free *.exe, and it wo

[ANN] PDFBox 0.6.0

2003-03-05 Thread Ben Litchfield
I would like to announce the next release of PDFBox. PDFBox allows for PDF documents to be indexed using lucene through a simple interface. Please take a look at org.pdfbox.searchengine.lucene.LucenePDFDocument, which will extract all text and PDF document summary properties as lucene fields. You

Re: my experiences - Re: Parsing Word Docs

2003-03-05 Thread Eric Anderson
Ok. Thanks for the tip. I downloaded and compiled Antiword, and would like to now add it to my indexing class. However, I'm not sure how the application would be called, and from where it would be called. How will I have the class parse the document through Antiword to create the keyword index
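
One way to call Antiword from an indexing class is to run it as an external process and capture the plain text it writes to standard output. This is a sketch only: the class and method names are made up, it assumes an `antiword` executable is on the PATH and that its output can be read as UTF-8 (which depends on how Antiword is configured), and it uses `InputStream.readAllBytes()`, so it needs Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class AntiwordSketch {
    // Builds the command line: the executable followed by the Word file.
    static List<String> buildCommand(String antiwordPath, String docPath) {
        return Arrays.asList(antiwordPath, docPath);
    }

    // Runs the command and returns whatever it printed on standard output.
    static String runAndCapture(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Send the child's stderr to our own stderr so it cannot fill a pipe and block.
        builder.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process process = builder.start();
        String text;
        try (InputStream in = process.getInputStream()) {
            // Assumption: the output is UTF-8; adjust the charset to match your setup.
            text = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Command " + command + " exited with code " + exitCode);
        }
        return text;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: AntiwordSketch <file.doc>");
            return;
        }
        System.out.println(runAndCapture(buildCommand("antiword", args[0])));
    }
}
```

The returned string would then be handed to the indexer in the same place where the text of an HTML or plain-text file is added to a document today.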

Re: Regarding Setup Lucine for my site

2003-03-05 Thread Tatu Saloranta
On Wednesday 05 March 2003 13:35, Leo Galambos wrote: > > I'm all eyes and I'm a serious grown-up with good manners :) > > Constructive suggestions for improvement are always welcome. > First a disclaimer: I don't mean to sound too negative. I'm genuinely curious about many of the issues you ment

Multi Language support

2003-03-05 Thread Günter Kukies
Hello, this is what I know about indexing international documents: 1. I have a language ID 2. with this ID I choose a special Analyzer for that language 3. I can use one index for all languages But what about searching for international documents? I don't have a language ID, because the user i