Hi,
I'm trying to use IndexSearcher.explain(Query query, int doc) and am
getting an NPE. If I remove the "explain" call, the search works fine.
I poked a little at the TermQuery.java code, but I can't really tell
what's causing the exception.
This is with 1.3rc3
Exception in thread "main" java.lang.N
On Tuesday 02 December 2003 09:51, Tun Lin wrote:
> Anyone knows a search engine that supports xml formats?
There's no way to generally "support XML formats", as XML is just a
meta-language. However, building specific search engines on the Lucene core
should be reasonably straightforward to i
Hi,
In the Lucene demo, the summary that is displayed contains text from inside the HTML
tags (such as margin, top, left, and so on).
How can I display the actual page description instead?
Your help is appreciated.
Thank you,
Mahesh
Hi,
>> Do they produce same ranking results?
No; Lucene's operations on query weight and length normalization are not
equivalent to a vanilla cosine in vector space.
>> I guess the 2nd approach will be more precise but slow.
Query similarity
will indeed be faster, but may actually not be wor
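For reference, the "vanilla cosine in vector space" mentioned above can be sketched as follows. This is only an illustration on raw term counts, not Lucene's scoring; the function name `cosine` and the sample texts are assumptions made for the example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Plain cosine similarity between two term-count vectors (dict-like)."""
    # Dot product over the terms the two vectors share.
    dot = sum(a[t] * b[t] for t in a if t in b)
    # Euclidean length of each vector.
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical query and document, tokenized by whitespace.
query = Counter("lucene search".split())
doc = Counter("lucene is a search library".split())
print(cosine(query, doc))
```

Because both vectors are divided by their lengths, the result always lies between 0 and 1 for non-negative counts, which is one property that makes plain cosine values comparable across queries.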
I think I am missing the original question, but by most accepted definitions, the
tf/idf model in Lucene is a probabilistic model. It has strange normalizations,
though, that don't allow comparisons of rank values across queries.
It isn't terribly hard to make a normalized probabilistic model
Hi,
>>
I would highly appreciate it if the experts here (especially Karsten or
Chong) look at my idea and tell me if this would be possible.
>>
Sorry, I have no idea about how to use a probabilistic approach with
Lucene, but if anyone does so, I would like to know, too.
I am currently puzzled
There is LARM, there is Nutch, there is Egothor (doesn't use Lucene),
etc.
Otis
--- "Zhou, Oliver" <[EMAIL PROTECTED]> wrote:
> I think it is common task to index a jsp based web site. A lot of
> poeple
> ask how to do so on this mailing list. However, Lucene does not have
> a
> ready to use we
On Wed, Dec 03, 2003 at 02:49:12PM +, jt oob wrote:
> --- Dror Matalon <[EMAIL PROTECTED]> wrote: > On Tue, Dec 02, 2003 at
> 01:54:58PM +, jt oob wrote:
> > > Hi,
> > >
> > > I have just indexed a lot of news (nntp) postings.
> > > I now have an index for each topic (a topic can have man
You can try Capek (needs JDK1.4, because it uses NIO). It can crawl
whatever you like.
API:
http://www.egothor.org/api/robot/
Console - demo (*.dundee.ac.uk):
http://www.egothor.org/egothor/index.jsp?q=http%3A%2F%2Fwww.compbio.dundee.ac.uk%2F
Leo
Zhou, Oliver wrote:
I think it is common task to
I think it is a common task to index a JSP-based web site. A lot of people
ask how to do so on this mailing list. However, Lucene does not have a
ready-to-use web crawler. Has anybody used Spindle to
index a JSP-based web site, or are there any other tools out there?
Thanks,
Oli
You should ask Spindle author(s). The error doesn't look like
something that is related to Lucene, really.
Otis
--- "Zhou, Oliver" <[EMAIL PROTECTED]> wrote:
> What about Spindle? Has anybody used it to crawle a jsp based web
> site? Do I
> need to intall listlib.jar to do so?
>
> I got error
What about Spindle? Has anybody used it to crawl a JSP-based web site? Do I
need to install listlib.jar to do so?
I got the error message "Jsp Translate:Unable to find setter method for
attribue:class" when I tried to run listlib-example.jsp in WSAD.
Thanks,
Oliver
That was actually the answer. Originally I thought Hits provided a reference
to all documents. However, it seems logical that documents with a score of 0.0
should not be contained.
Thank you,
Ralf
> I'm a bit confused by what you're asking. Hits points to all documents
> that matched the query. A score > 0.
On Wednesday, December 3, 2003, at 10:16 AM, Ralph wrote:
Does this mean Hits points to ALL documents and the last one might have a
score of 0.0? If it does not contain all documents, where is the threshold
then? Or based on which condition does it stop pointing to certain documents?
I'm a bit con
Does this mean Hits points to ALL documents and the last one might have a
score of 0.0? If it does not contain all documents, where is the threshold
then? Or based on which condition does it stop pointing to certain documents?
Ralf
> On Wednesday, December 3, 2003, at 09:36 AM, Ralph wrote:
> > is
--- Dror Matalon <[EMAIL PROTECTED]> wrote: > On Tue, Dec 02, 2003 at
01:54:58PM +, jt oob wrote:
> > Hi,
> >
> > I have just indexed a lot of news (nntp) postings.
> > I now have an index for each topic (a topic can have many
> newsgroups)
> >
> > The index sizes are:
> >
> > 2.6G Current
On Wednesday, December 3, 2003, at 09:36 AM, Ralph wrote:
Is there a maximum number of documents that Hits provides, or is it unlimited
(that is, limited only by the heap size of the VM)? If there is a maximum,
what is the number?
Hits represents all documents that matched the query (and optionally
filtered).
But, Hits do
Hi,
Is there a maximum number of documents that Hits provides, or is it unlimited
(that is, limited only by the heap size of the VM)? If there is a maximum, what is the number?
Ralf
Hello group,
from the very inspiring conversations with Karsten I know that Lucene is
based on a Vector Space Model. I am just wondering if it would be possible to
turn this into a probabilistic Model approach. Of course I do know that I
cannot change the underlying indexing and searching principl
There is no direct support in Lucene for this. There are several strategies for
automatic query expansion and most of them rely on either extensive domain-specific
analysis of the top N documents on the assumption that the search engine performs well
enough to guarantee that the top N documents
Hello Group of Lucene users,
Query reformulation is understood as an effective way to improve retrieval
power significantly. The theory teaches us that it consists of two basic steps:
a) Query expansion (with new terms)
b) Reweighting of the terms in the expanded query
User relevance feedback is
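The two steps named above, (a) expansion and (b) reweighting, can be sketched in a few lines. This is a minimal Rocchio-style illustration, which is a swapped-in classical technique and not something the thread or Lucene prescribes; the function name, the `alpha`/`beta` parameters, and the sample data are all assumptions.

```python
from collections import Counter


def expand_and_reweight(query, relevant_docs, alpha=1.0, beta=0.5, max_new_terms=2):
    """Return {term: weight} for the expanded, reweighted query.

    query         -- dict mapping original query terms to weights
    relevant_docs -- list of token lists the user marked as relevant
    """
    # Average term frequency over the relevant documents (their centroid).
    centroid = Counter()
    for doc in relevant_docs:
        for term, tf in Counter(doc).items():
            centroid[term] += tf / len(relevant_docs)

    # (a) Expansion: add the strongest centroid terms not already in the query.
    new_terms = [t for t, _ in centroid.most_common() if t not in query]
    new_terms = new_terms[:max_new_terms]

    # (b) Reweighting: mix the original weight with the centroid weight.
    weights = {}
    for term in list(query) + new_terms:
        weights[term] = alpha * query.get(term, 0.0) + beta * centroid.get(term, 0.0)
    return weights


# Hypothetical feedback data: one query term, two documents judged relevant.
feedback_docs = [["lucene", "index", "search"], ["lucene", "search", "query"]]
print(expand_and_reweight({"lucene": 1.0}, feedback_docs, max_new_terms=1))
```

The resulting weights could then be applied as per-term boosts when the expanded query is built, but how best to do that in Lucene is left open here.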
Uh, I get to do this dirty job. :(
Lucene-user and lucene-dev are not the appropriate fora for questions
such as this one.
Please ask the original author of the text for help, or use an online
translation service, such as the one at http://babelfish.av.com
Also, for questions about Lucene usage,