Hi there!
We have almost the same configuration (site, index, paths, etc.) as you.
We took a different approach for search on our site:
use a small crawler to index a supplied list of URLs,
build the Lucene index, and have the web search app use that index.
for crawling:
http://cvs.cabanova.ro/viewc
Samuel-
I'm basically using the software in much the same way you are. One thing to
remember, however, is that the documents you're indexing need to be in
a location that is published by your web server. What I did was use the Tomcat
connectors and mount my document repository insi
I'm interested in using the textmining/textextraction utilities using Apache
POI, that Ryan was discussing. However, I'm having some difficulty determining
what the insertion point would be to replace the default parser with the word
parser.
Any assistance would be appreciated.
LanRx Netw
Hi Serge Knystautas,
I need exactly the same functionality.
Thanks for the information. If you don't mind, could you please send me sample
code for implementing this?
Thanks a ton
Nellai
-Original Message-
From: Serge Knystautas [mailto:[EMAIL PROTECTED]
Hi, I'd like to take a look at the webapp war file or zip tarball for wsearch and
indexer crawling
Catalin <[EMAIL PROTECTED]> wrote:..
for crawling:
http://cvs.cabanova.ro/viewcvs.cgi/indexer/
for webapp:
http://cvs.cabanova.ro/viewcvs.cgi/wsearch/
running online:
http://www.anet.ro/searc
Thanks for the info. I would also be interested to see the zip of the code, especially
for the indexer. This discussion has been a wonderful source of info, especially for us
beginners. Thanks to one and all. I guess a discussion like this once in a while helps
us, too, to get up to the level the discussion here is usually at!
I w
Catalin,
could you send me a zip file with your implementation?
Thanks,
maurits
- Original Message -
From: "Catalin" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Wednesday, March 05, 2003 10:26 AM
Subject: Re: Regarding Setup Lucine for my site
hi there !
we hav
Could anybody tell me where on the Jakarta site this "i2a websearch application
demo" is? Is this the demo under "getting started" under "lucene"? If so, I don't see
it using any crawler.
It would be nice if the Jakarta site itself had a search incorporated into the site.
Thanks!
P Iyer
ma
hi there all !
the .zip is available (by request)
at:
http://dev.cabanova.ro/java/lucene/
have fun !
Catalin
- Original Message -
From: maurits van wijland
To: Lucene Users List
Sent: Wednesday, March 05, 2003 6:17 PM
Subject: Re: Regarding Setup Lucine for my site
Cat
Please point me to a web link where I can read more about Lucene. I have read all the
documentation that comes with the distribution (which is almost the same as the
lucene.apache.org site).
Regarding the problem you mentioned with URL-to-file mappings, what if I use a
line of code like
myurl = URLEncode.e
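For the simple case where the indexed files sit directly under a directory that the web server publishes, the file-to-URL mapping discussed here could look roughly like the sketch below. `PathToUrl`, `toUrl`, the document root and the base URL are made-up names for illustration, and the code is written against a current JDK rather than what was available in 2003. It uses `java.net.URI` instead of `URLEncoder` on purpose: `URLEncoder` does form encoding (spaces become `+`), which is not what a URL path needs.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.file.Path;
import java.nio.file.Paths;

class PathToUrl {
    // Maps a file under docRoot to a URL under baseUrl (both hypothetical).
    // Assumes the web server publishes docRoot one-to-one at baseUrl.
    static String toUrl(Path docRoot, Path file, URI baseUrl) {
        Path root = docRoot.normalize();
        Path target = file.normalize();
        if (!target.startsWith(root)) {
            throw new IllegalArgumentException(file + " is not under " + docRoot);
        }
        // Join the relative path elements with '/' regardless of the OS separator.
        StringBuilder relative = new StringBuilder();
        for (Path part : root.relativize(target)) {
            if (relative.length() > 0) {
                relative.append('/');
            }
            relative.append(part.toString());
        }
        try {
            // The multi-argument URI constructor percent-encodes characters
            // that are illegal in a path, such as spaces.
            URI relativeUri = new URI(null, null, relative.toString(), null);
            return baseUrl.resolve(relativeUri).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot map " + file, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toUrl(
                Paths.get("/var/www/docs"),
                Paths.get("/var/www/docs/my reports/q1.html"),
                URI.create("http://example.com/docs/")));
    }
}
```

This only holds while the directory layout really mirrors the URL layout; with rewrites, aliases or several document roots, a lookup table stored alongside each indexed document is probably safer. A first path element containing a colon would also need special handling, since it would be parsed as a URI scheme.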
http://jakarta.apache.org/lucene/docs/powered.html
the 6th in the list is i2a Web Search
Catalin
- Original Message -
From: Pinky Iyer
To: Lucene Users List
Sent: Wednesday, March 05, 2003 6:26 PM
Subject: i2a websearch application demo ???
Could anybody tell me where in the Jakarta si
Thanks!
Catalin <[EMAIL PROTECTED]> wrote:http://jakarta.apache.org/lucene/docs/powered.html
the 6th in the list is i2a Web Search
Catalin
- Original Message -
From: Pinky Iyer
To: Lucene Users List
Sent: Wednesday, March 05, 2003 6:26 PM
Subject: i2a websearch application demo ???
C
Wow the features of i2a Web Search are just what I need!
I have just added it to my servlet engine. I read the readme but could not
find out whether this application is GPL or LGPL. Is it?
Pinky Iyer <[EMAIL PROTECTED]> wrote:
Thanks!
Catalin wrote:http://jakarta.apache.org/lucene/docs/powered.h
A license for the application has not been determined yet. It will most
likely be BSD, ASL or GPL. Until then, there is a disclaimer.
Thanks!
Samuel Alfonso Velázquez Díaz <[EMAIL PROTECTED]> wrote:
Wow the features of i2a Web Search are just what I need!
I have just added to my s
I've got a versioning content system where I want to replace documents in
a lucene repository. To do so, according to the FAQ and the mailing list
archives, I need to open an IndexReader, look for the document in
question, delete it via the IndexReader, and then add it.
This shouldn't replace the
I am trying to set up the i2a websearch app. When I go to the admin section and choose
"index with detail" or any of the other options, I don't see any indexes being created
under the main directory. Am I doing anything wrong?
I did change websearch.xml to point to the appropriate site
(http://localhost:80
Joseph Ottinger wrote:
I've got a versioning content system where I want to replace documents in
a lucene repository. To do so, according to the FAQ and the mailing list
archives, I need to open an IndexReader, look for the document in
question, delete it via the IndexReader, and then add it.
This
Then this means that my IndexReader.delete(i) isn't working properly. What
would be the common causes for this? My log shows the documents being
deleted, so something's going wrong at that point.
On Wed, 5 Mar 2003, Doug Cutting wrote:
> Joseph Ottinger wrote:
> > This shouldn't replace the docum
Joseph Ottinger wrote:
Then this means that my IndexReader.delete(i) isn't working properly. What
would be the common causes for this? My log shows the documents being
deleted, so something's going wrong at that point.
Are you closing the IndexReader after doing the deletes? This is
required for
Okay, I think I've done something stupid here: on closer examination, it
looks like my comparison to find the specific documents to delete is
failing. Let me look further at that.
On Wed, 5 Mar 2003, Doug Cutting wrote:
> Joseph Ottinger wrote:
> > Then this means that my IndexReader.delete(i) isn
Okay, I found the problem: it was a stupid coder. To wit, here's the
salient code:
Document d=indexReader.document(i);
if(d.getField("key").equals(node.getKey())) {
...
}
The error, of course, is that getField.equals() is comparing FIELDS and
not string values. When I changed this to pull the st
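The Field-versus-String mix-up described here can be reproduced without Lucene at all. The sketch below uses a small stand-in `Field` class (hypothetical, not Lucene's) with a `stringValue()` accessor; `buggyMatch` and `fixedMatch` are likewise illustrative names. In Lucene itself the equivalent fix would presumably be to compare `d.getField("key").stringValue()`, or `d.get("key")`, against the key.

```java
import java.util.Objects;

class FieldCompareDemo {
    // Hypothetical stand-in for a Lucene Field: a name plus a string value.
    // Like any class that does not override equals(), it inherits
    // Object.equals(), which is an identity comparison.
    static final class Field {
        private final String name;
        private final String value;

        Field(String name, String value) {
            this.name = name;
            this.value = value;
        }

        String name() {
            return name;
        }

        String stringValue() {
            return value;
        }
    }

    // The mistake: compares the Field object itself to a String,
    // which can never be equal.
    static boolean buggyMatch(Field field, String key) {
        return field.equals(key);
    }

    // The fix: compare the field's string value to the key.
    static boolean fixedMatch(Field field, String key) {
        return field != null && Objects.equals(field.stringValue(), key);
    }

    public static void main(String[] args) {
        Field f = new Field("key", "doc-42");
        System.out.println("buggy: " + buggyMatch(f, "doc-42"));
        System.out.println("fixed: " + fixedMatch(f, "doc-42"));
    }
}
```

The null check in `fixedMatch` also covers documents that have no such field, which would otherwise throw a `NullPointerException` during the scan.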
On Tue, 4 Mar 2003, Otis Gospodnetic wrote:
> Even if you could replace C:\. with http:// it wouldn't be a
> good solution, as directory structures and file paths do not always map
> directly to URLs.
Yes, but it is not the case of Samuel's configuration and 99.99% of
others.
The fact i
> org.apache.lucene.demo.IndexHTML which was provided with the
> documentation. Is there any problem using this demo class for a web
> production site? I'm an application developer and it would be hard to
> understand the whole lucene code to use it. It would be almost impossible
You can use it, but:
I downloaded and installed the i2a websearch application. It looks fine, but I have a
problem: my site contains a lot of Macromedia Flash objects, and many of my site's
links are inside these Flash objects. Clearly these links won't be crawled easily.
Is there a way to create an index for i2a we
> On the other hand, if you extend Lucene with your hacks, you will
> find out
> that the model of Lucene is unknown and many parts are hard-coded. It
> boosts speed, but it disallows future enhancements (I could name the
> parts, I hope we do not start flamewar here).
I'm all eyes and I'm a serio
For all i2a questions please contact its author.
i2a websearch application just _uses_ Lucene, it is not a part of
Lucene.
Otis
--- Pinky Iyer <[EMAIL PROTECTED]> wrote:
>
> I am trying to setup the i2a websearch app, when i go to admin
> section and choose the index with detail or any of the op
Yeah sorry!
Otis Gospodnetic <[EMAIL PROTECTED]> wrote:For all i2a questions please contact its
author.
i2a websearch application just _uses_ Lucene, it is not a part of
Lucene.
Otis
--- Pinky Iyer
wrote:
>
Samuel Alfonso Velázquez Díaz
http://www.geocities.com/samuelvd
[EMAIL PROTECTED]
> > On the other hand, if you extend Lucene with your hacks, you will
> > find out
> > that the model of Lucene is unknown and many parts are hard-coded. It
> > boosts speed, but it disallows future enhancements (I could name the
> > parts, I hope we do not start flamewar here).
>
> I'm all eyes a
Hi there,
I am trying to do a search on multiple terms inclusive using boosting. I
extended the MultiFieldQueryParser like this:
public static org.apache.lucene.search.Query parse(String query, String[] fields,
        float[] boost, Analyzer analyzer)
        throws ParseException
{
FYI, I tried the textmining.org/POI combo on a collection of 350 Word
docs that people have developed here over the years, and it failed on 33% of
them, throwing exceptions about the formats being invalid.
I tried "antiword" ( http://www.winfield.demon.nl/ ), a native & free
*.exe, and
it wo
I would like to announce the next release of PDFBox. PDFBox allows for
PDF documents to be indexed using lucene through a simple interface.
Please take a look at org.pdfbox.searchengine.lucene.LucenePDFDocument,
which will extract all text and PDF document summary properties as lucene
fields.
You
Ok. Thanks for the tip.
I downloaded and compiled Antiword, and would like to now add it to my indexing
class. However, I'm not sure how the application would be called, and from
where it would be called.
How will I have the class parse the document through Antiword to create the
keyword index
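One way to call Antiword from an indexing class is to run it as an external process and capture what it prints. The sketch below assumes an `antiword` executable is on the PATH, that it writes the extracted text to standard output when given a file name, and that the output is UTF-8; `AntiwordExtractor`, `command`, `run` and `extractText` are made-up names, and the code targets a current JDK (Java 9 or later).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class AntiwordExtractor {
    // Assumption: the executable is called "antiword" and is on the PATH.
    static final String ANTIWORD = "antiword";

    // Builds the command line: antiword <file>
    static List<String> command(Path wordFile) {
        return Arrays.asList(ANTIWORD, wordFile.toString());
    }

    // Runs a command and returns everything it wrote to standard output.
    static String run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Send the child's stderr to our own stderr so warnings do not end up
        // in the extracted text and the child cannot block on a full pipe.
        builder.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process process = builder.start();
        String output;
        try (InputStream in = process.getInputStream()) {
            // Assumption: output is UTF-8; adjust to match your Antiword setup.
            output = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException(command + " exited with code " + exitCode);
        }
        return output;
    }

    // Extracts the plain text of one Word document.
    static String extractText(Path wordFile) throws IOException, InterruptedException {
        return run(command(wordFile));
    }
}
```

The indexing loop would then call `extractText` for each .doc file it encounters and add the returned string to the Lucene document as a tokenized text field, in the same place where it currently hands HTML content to the default parser.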
On Wednesday 05 March 2003 13:35, Leo Galambos wrote:
> > I'm all eyes and I'm a serious grown-up with good manners :)
> > Constructive suggestions for improvement are always welcome.
>
First a disclaimer: I don't mean to sound too negative. I'm genuinely curious
about many of the issues you ment
Hello,
this is what I know about indexing international documents:
1. I have a language ID
2. with this ID I choose a special Analyzer for that language
3. I can use one index for all languages
But what about searching for international documents?
I don't have a language ID, because the user i