Hi all,
Purely due to a policy decision, we would like to host our Lucene search
application in a J2EE container, preferably by means of an EJB.
Since access to java.io is restricted by the EJB specification, what would
be the best way to design the application?
I have taken a look at
What would be the best way? Use Lucene outside of EJB. It's quite
silly to make such a decision "purely due to a policy decision" when
the technicalities of it show that it is an unwise decision.
You're going to navigate Hits through a session bean? And as you said,
the EJB spec says not to
Hi Erik,
Thanks for the warning and the code.
Let me rephrase the question:
I have an index generated by Lucene, and I need the search capability
to be highly available. Which solutions would be best?
Currently I have two scenarios in mind:
a) set up an RMI-based app that o
Hi,
I am a newbie to Lucene.
I have downloaded the zip file. Now how can I give my own list of words to Lucene?
In the demo I saw that Lucene automatically creates an index when we run the Java
program, but I want to give my own search words. How is that possible?
regards
Santosh kumar
SoftPro Systems
Hi,
How can I search through PDF?
- Original Message -
From: Santosh
To: Lucene Users List
Sent: Friday, August 20, 2004 5:59 PM
Subject: pdf search
Hi,
I am a newbie to Lucene.
I have downloaded the zip file. Now how can I give my own list of words to Lucene?
In the demo I saw that Lucene is au
In order to search through a PDF document the text must be extracted from
the PDF document. There are several libraries to do that, including
http://www.pdfbox.org . After you have the text from the PDF document you
just add it to the lucene index like any other text document. You should
go thr
Hi Santosh,
Lucene doesn't search PDFs per se. To make anything searchable you first have to
extract the content and then put it in Lucene in a form it understands (i.e. Document
objects). So in order to search your PDFs you first need to extract the info from the
PDFs using something like PDF
hi,
I have downloaded the PDFBox zip, but I am unsure where to start. How can
I try the demo? I don't see any help document with this download. Please help me.
regards
Santosh kumar
SoftPro Systems
Hyderabad
"The harder you train in peace, the lesser you bleed in war"
What are your intentions with PDFBox?
You want to use it to index PDF files?
Santosh wrote:
hi,
I have downloaded the PDFBox zip, but I am unsure where to start. How can I try the demo? I don't see any help document with this download. Please help me.
regards
Santosh kumar
hi
What is it that you intend to search, and what are these 'own search words'?
First explain your requirement properly to the forum to get the intended
results.
with regards
Karthik
-Original Message-
From: Santosh [mailto:[EMAIL PROTECTED]
Sent: Friday, August 20, 2004 5:59 PM
To: Lucene Users L
Option b) sounds simpler and sufficient to me. I don't see why you
would need to involve RMI for something as simple as this. I use
something similar to your b) option for some indices behind
http://www.simpy.com/ . I don't store IndexSearcher in the servlet
context, though - I just have some lo
Hello Jeff,
I don't have Debian to try this out, and this is going to be a stupid
question and suggestion, but where/how is the CLASSPATH set? Are any
of those commands actually using Lucene's build.xml?
I'm asking, because it looks like your compiler is not finding Reader
and IOException classe
Hi Karthik,
I have a website with some items, each containing HTML and PDF documents. I
have to store keywords against each item. Whenever a user enters a search
word that matches one of the existing keywords, the site should
show the link to that particular item.
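The requirement above (exact keyword match, then show the item's link) can be sketched in plain Java before Lucene is involved at all. This is only an illustration under stated assumptions: the class name `KeywordLinks` and the link strings are hypothetical, and matching is a case-insensitive exact comparison rather than Lucene's analyzed full-text search.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: maps each stored keyword to the links of the items that carry it. */
class KeywordLinks {
    private final Map<String, Set<String>> linksByKeyword = new HashMap<>();

    /** Registers an item link under each of its keywords. */
    void addItem(String link, String... keywords) {
        for (String keyword : keywords) {
            linksByKeyword
                .computeIfAbsent(normalize(keyword), k -> new LinkedHashSet<>())
                .add(link);
        }
    }

    /** Returns the links of all items whose keyword list contains the search word. */
    Set<String> lookup(String searchWord) {
        return linksByKeyword.getOrDefault(normalize(searchWord),
                                           Collections.<String>emptySet());
    }

    /** Trims and lower-cases so that "PDF" and " pdf " match the same keyword. */
    private static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }
}
```

If the keywords should instead be derived from the HTML and PDF text itself, the extracted text would go into a Lucene index as the other replies in this thread describe.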
- Original Message
Exactly, that is what I need.
- Original Message -
From: Don Vaillancourt
To: Lucene Users List
Sent: Friday, August 20, 2004 6:39 PM
Subject: Re: pdfboxhelp
What are your intentions with PDFBox?
You want to use it to index PDF files?
Santosh wrote:
hi,
I have
I'm a new Lucene user and I'm not too familiar with applets either, but I've
been doing a bit of testing on Java applet security, and if I'm correct in
saying that applets can read anything below their codebase, then my problem
is not a security restriction. The error reads
java.lang.NoClassDef
Here is the super simple code required.
import java.io.File;
import org.apache.lucene.document.Document;
import org.pdfbox.searchengine.lucene.*;
File pdfFile = new File("/path/to/the/file.pdf");
// Below returns a parsed PDF file as a Lucene Document object.
// Note that getDocument can throw an IOException.
Document doc = LucenePDFDocument.getDocument(pdfFile);
Santosh wrot
- Original Message -
From: Don Vaillancourt
To: Lucene Users List
Sent: Friday, August 20, 2004 7:37 PM
Subject: Re: pdfboxhelp
Here is the super simple code required.
import org.pdfbox.searchengine.lucene.*;
File pdfFile = new File("/path/to/the/file.pdf");
Hello,
I am currently working on a server app that will require the ability to
make index additions/deletions at any time. I want to cache/reuse index
searchers and readers. I know that once an index has changed only newly
opened readers will see the changes. Creating a new reader to see the
Did I leave you speechless!? :-)
Santosh wrote:
- Original Message -
From: Don Vaillancourt
To: Lucene Users List
Sent: Friday, August 20, 2004 7:37 PM
Subject: Re: pdfboxhelp
Here is the super simple code required.
import org.pdfbox.searchengine.lucene.*;
Fi
I am sorry, that mail was sent accidentally.
- Original Message -
From: Don Vaillancourt
To: Lucene Users List
Sent: Friday, August 20, 2004 8:02 PM
Subject: Re: pdfboxhelp
Did I leave you speechless!? :-)
Santosh wrote:
- Original Message -
From: Don Vailla
How can I index and search database values using the Lucene search engine?
By
T.Sivalingam.
Sivalingam T
Hi
Can we index and search a database with the Lucene search engine?
If anybody has done this, please reply.
With Warm Regards,
Sivalingam.T
Sai Eswar Innovations (P) Ltd,
Chennai-92
You need to create a lucene index from the database.
Just index the columns and the records from the database.
It is also useful to have a field in Lucene that contains the
database's primary key, so you can retrieve the actual record from the
database.
Aviran
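The primary-key idea above can be illustrated without Lucene on the classpath. The sketch below is a toy stand-in, not Lucene's API: `RowIndex` is a hypothetical class that tokenizes the text columns of each row and remembers the row's primary key, which is the role the extra key field plays in a real Lucene Document.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy stand-in for a Lucene index: maps each term to the primary keys of rows containing it. */
class RowIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Indexes the text columns of one database row under its primary key. */
    void addRow(int primaryKey, String... columns) {
        for (String column : columns) {
            // Split on non-word characters; a real analyzer does much more.
            for (String term : column.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!term.isEmpty()) {
                    postings.computeIfAbsent(term, t -> new TreeSet<>()).add(primaryKey);
                }
            }
        }
    }

    /** Returns the primary keys of all rows that contain the given term. */
    Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT),
                                     Collections.<Integer>emptySet());
    }
}
```

The keys returned by `search` are what the application would then use to fetch the full records from the database, for example with a query on the primary-key column.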
-Original Message-
From
Hi Otis,
>I'm asking, because it looks like your compiler is not finding Reader
>and IOException classes, both of which are in java.io.* package, which
>I see imported in StandardTokenizer.java as 'import java.io.*;'.
In my copy of StandardTokenizer.java, there is no 'import java.io.*;'
(and i
Funny thing is I was thinking of doing something like this just today.
This is especially good when you perform a lot of queries using the
LIKE statement. Lucene would increase search performance a great deal.
Aviran wrote:
You need to create a lucene index from the database.
Just index t
On Aug 20, 2004, at 7:54 AM, Rupinder Singh Mazara wrote:
hi erik
thanks for the warning and the code.
Let me re-phrase the question,
I have an index generated by Lucene, and I need the search capability
to be highly available. Which solutions would be best?
I'm guessing from y
On Aug 20, 2004, at 11:12 AM, Jeff Breidenbach wrote:
Hi Otis,
I'm asking, because it looks like your compiler is not finding Reader
and IOException classes, both of which are in java.io.* package, which
I see imported in StandardTokenizer.java as 'import java.io.*;'.
In my copy of StandardTokeniz
In fact, we do exactly the same thing. A session bean method called search()
delegates to a POJO SearchService. We lazy-load the IndexSearcher, cache it in
memory, and invalidate that object when someone else modifies the index. This
trick works wonderfully for us. The search has become faster after caching
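A minimal sketch of the lazy-load-and-invalidate pattern described above, written against a generic type so that it runs without Lucene. `LazyCache` and its `loader` are hypothetical names; in the setup described, the loader would open an IndexSearcher and `invalidate()` would be called by whatever code modifies the index.

```java
import java.util.function.Supplier;

/** Lazily creates one shared instance and rebuilds it after invalidate(). */
class LazyCache<T> {
    private final Supplier<T> loader; // e.g. opens a searcher over the index
    private T cached;                 // null until first use or after invalidation

    LazyCache(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Returns the cached instance, loading it on first use. */
    synchronized T get() {
        if (cached == null) {
            cached = loader.get();
        }
        return cached;
    }

    /** Call after the index has been modified; the next get() reloads. */
    synchronized void invalidate() {
        cached = null;
    }
}
```

This sketch does not close the discarded instance; whether and when to close an old searcher is a separate decision that depends on whether searches may still be running against it.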
>I don't understand this. StandardTokenizer.java hasn't changed since
>last year.
I have packaged Lucene such that 'ant javacc' is called at package
build time. I now see the problem - 'import java.io.*;' has been
removed from StandardTokenizer.jj in Lucene 1.4.1. When I put that
line back in
When querying Lucene I run into a memory problem, i.e. it looks like
memory is not being released after the query has been executed.
I use ParallelMultiSearcher and call its close method after the results are
displayed.
hits = null; // Hits class
if (ms != null) ms.close(); // ParallelMultiSearcher
Ok, Lucene 1.4.1 has been uploaded to Debian. Hopefully it will have
enough time to percolate before the sarge release.
Now that that is taken care of, I'm curious about the status of gcj
compilation. Packaging Lucene as a native library might be useful for
projects such as PyLucene, and it is al
On Aug 20, 2004, at 12:36 PM, Jeff Breidenbach wrote:
I don't understand this. StandardTokenizer.java hasn't changed since
last year.
I have packaged Lucene such that 'ant javacc' is called at package
build time. I now see the problem - 'import java.io.*;' has been
removed from StandardTokenizer.
I just create a new IndexSearcher, leave the old IndexSearcher alone,
and JVM's garbage collection cleans it up.
Otis
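One way to picture the approach above is a holder that opens a fresh instance whenever the index version changes and simply drops its reference to the old one. This is a sketch under assumptions, not Lucene code: `ReopeningHolder`, `currentVersion`, and `opener` are hypothetical names, and the version source stands in for whatever check the application uses to detect index changes.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hands out a shared instance and replaces it when the reported version changes. */
class ReopeningHolder<T> {
    /** Pairs an opened instance with the index version it was opened against. */
    private static final class Entry<T> {
        final long version;
        final T value;

        Entry(long version, T value) {
            this.version = version;
            this.value = value;
        }
    }

    private final LongSupplier currentVersion; // hypothetical: reports the index version
    private final Supplier<T> opener;          // hypothetical: opens a new searcher
    private final AtomicReference<Entry<T>> current = new AtomicReference<>();

    ReopeningHolder(LongSupplier currentVersion, Supplier<T> opener) {
        this.currentVersion = currentVersion;
        this.opener = opener;
    }

    /** Returns an instance that matches the current index version. */
    T get() {
        long version = currentVersion.getAsLong();
        Entry<T> entry = current.get();
        if (entry == null || entry.version != version) {
            entry = new Entry<>(version, opener.get());
            // The old entry is only dereferenced here, never closed; callers
            // that still hold the old instance can finish their searches.
            current.set(entry);
        }
        return entry.value;
    }
}
```

Two threads that notice a change at the same moment may each open an instance; the sketch accepts that for simplicity. Whether file handles of a dropped instance are released without an explicit close is exactly the follow-up question raised in this thread, and the sketch does not answer it.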
--- "Crump, Michael" <[EMAIL PROTECTED]> wrote:
> Hello,
>
>
>
> I am currently working on a server app that will require the ability
> to
> make index additions/deletions at
So the finalizer on the underlying reader closes file handles?
-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Friday, August 20, 2004 2:41 PM
To: Lucene Users List
Subject: Re: continuous index updates
I just create a new IndexSearcher, leave the old IndexSearc
Looks to me like you're using an older version of Lucene on your Linux
box. The code is back-compatible, it will read old indexes, but Lucene
1.3 cannot read indexes created by Lucene 1.4, and will fail in the way
you describe.
Doug
Sven wrote:
Hi!
I have a problem to port a Lucene based knowl
Hello,
I'm interested in any feedback from anyone who has worked through implementing
Internationalization (I18N) search with Lucene or has ideas for this requirement.
Currently, we're using Lucene with straight English and are looking to add Spanish to
the mix (with maybe more languages to fo
I can successfully use gcc 3.4.0 with Lucene as follows:
ant jar jar-demo
gcj -O3 build/lucene-1.5-rc1-dev.jar build/lucene-demos-1.5-rc1-dev.jar
-o indexer --main=org.apache.lucene.demo.IndexHTML
./indexer -create docs
It runs pretty snappy too! However I don't know if there's much mileage
in p
Hi,
I'm trying to figure out how to speed up queries to a large index.
I'm currently getting 133 req/sec, which isn't bad, but isn't too close
to MySQL, which is getting 500 req/sec on the same hardware with the
same set of documents.
Setup info & Stats:
- 4.3M documents, 12 keyword fields per do
Hi guys!
I was hoping someone here could help me out with a custom filter.
We have an index of emails and do some searches on the text of an email message and
also searches based on the email addresses in a To, From or CC.
Since we also do searches on a bunch of emails, we created a custom filt
Have you considered using the built-in QueryFilter for this? Why
isn't it sufficient for your needs?
Erik
On Aug 20, 2004, at 6:32 PM, [EMAIL PROTECTED] wrote:
Hi guys!
I was hoping someone here could help me out with a custom filter.
We have an index of emails and do some searches on t
>It's easy enough for folks to compile Lucene this way
I'm having trouble, warnings and error messages appended. This is for
Lucene 1.4.1. One of the few Debian specific changes was to call the
jarball 1.4 instead of the default 1.5-rc1-dev designation in
build.xml.
rode:~> gcj --version
gcj (G
We're currently on Lucene 1.2... haven't moved to 1.3 yet.
Roy.
On Fri, 20 Aug 2004 18:46:29 -0400, Erik Hatcher wrote
> Have you considered using the built-in QueryFilter for this? Why
> isn't it sufficient for your needs?
Are you calling ParallelMultiSearcher.search(Query query, Sort sort) to do your
search? If so, I am currently having a similar problem.
Terence
>
> When querying Lucene I run into a memory problem, i.e. it looks like
> memory is not being released after the query has been executed.
>
On Aug 20, 2004, at 6:48 PM, [EMAIL PROTECTED] wrote:
We're currently in lucene 1.2... haven't moved to 1.3 yet.
Skip 1.3 and go straight to 1.4.1 :)
Upgrade - why not?
Erik
The bottleneck seems to be disk IO.
Since this is a read-only index, why not spread some of the frequently
scanned index files over multiple disks, or put the index on SCSI disks
hooked up in a RAID. Maybe this is already the case, but you didn't
mention it.
Oh, I already answered a similar quest
I have Lucene working in an applet and I've seen this problem only when
the jar file really was not available (typo in the jar name), which is
what you'd expect. It's possible that the classpath for your
application is not the same as the classpath for the applet; perhaps
they're using differen
--- Otis Gospodnetic <[EMAIL PROTECTED]>
wrote:
> The bottleneck seems to be disk IO.
But it's not. Linux is caching the whole file, and
there really isn't any disk activity at all. Most of
the threads are blocked on InputStream.refill, not
waiting for the disk, but waiting for their turn into