NPE when using explain

2003-12-03 Thread Dror Matalon
Hi, I'm trying to use IndexSearcher.explain(Query query, int doc) and am getting an NPE. If I remove the "explain" the search works fine. I poked a little at the TermQuery.java code, but I can't really tell what's causing the exception. This is with 1.3rc3. Exception in thread "main" java.lang.N

Re: SearchBlox J2EE Search Component Version 1.1 released

2003-12-03 Thread Tatu Saloranta
On Tuesday 02 December 2003 09:51, Tun Lin wrote: > Anyone knows a search engine that supports xml formats? There's no way to generally "support xml formats", as xml is just a meta-language. However, when building specific search engines using the Lucene core, it should be reasonably straightforward to i

summary

2003-12-03 Thread uma mahesh rao
Hi, in the Lucene demo, the summary that is displayed contains text from inside HTML tags (like margin, top, left and so on). How can I display the text that actually relates to the page description instead? Your help is appreciated. Thanking you, mahesh

AW: Document Similarity

2003-12-03 Thread Karsten Konrad
Hi, >> Do they produce same ranking results? No; Lucene's operations on query weight and length normalization are not equivalent to a vanilla cosine in vector space. >> I guess the 2nd approach will be more precise but slow. Query similarity will indeed be faster, but may actually not be wor

RE: Probabilistic Model in Lucene - possible?

2003-12-03 Thread Chong, Herb
i think i am missing the original question, but by most accepted definitions, the tf/idf model in Lucene is a probabilistic model. it's got strange normalizations though that don't allow comparisons of rank values across queries. it isn't terribly hard to make a normalized probabilistic model

AW: Probabilistic Model in Lucene - possible?

2003-12-03 Thread Karsten Konrad
Hi, >> I would highly appreciate it if the experts here (especially Karsten or Chong) look at my idea and tell me if this would be possible. >> Sorry, I have no idea about how to use a probabilistic approach with Lucene, but if anyone does so, I would like to know, too. I am currently puzzled

RE: What about Spindle

2003-12-03 Thread Otis Gospodnetic
There is LARM, there is Nutch, there is Egothor (doesn't use Lucene), etc. Otis --- "Zhou, Oliver" <[EMAIL PROTECTED]> wrote: > I think it is a common task to index a jsp based web site. A lot of > people > ask how to do so on this mailing list. However, Lucene does not have > a > ready to use we

Re: Ways to search indexes

2003-12-03 Thread Dror Matalon
On Wed, Dec 03, 2003 at 02:49:12PM +, jt oob wrote: > --- Dror Matalon <[EMAIL PROTECTED]> wrote: > On Tue, Dec 02, 2003 at > 01:54:58PM +, jt oob wrote: > > > Hi, > > > > > > I have just indexed a lot of news (nntp) postings. > > > I now have an index for each topic (a topic can have man

Re: What about Spindle

2003-12-03 Thread Leo Galambos
You can try Capek (needs JDK1.4, because it uses NIO). It can crawl whatever you like. API: http://www.egothor.org/api/robot/ Console - demo (*.dundee.ac.uk): http://www.egothor.org/egothor/index.jsp?q=http%3A%2F%2Fwww.compbio.dundee.ac.uk%2F Leo Zhou, Oliver wrote: I think it is a common task to

RE: What about Spindle

2003-12-03 Thread Zhou, Oliver
I think it is a common task to index a jsp based web site. A lot of people ask how to do so on this mailing list. However, Lucene does not have a ready to use web crawler. My question is: has anybody used Spindle to index a jsp based web site, or are there any other tools out there? Thanks, Oli

Re: What about Spindle

2003-12-03 Thread Otis Gospodnetic
You should ask Spindle author(s). The error doesn't look like something that is related to Lucene, really. Otis --- "Zhou, Oliver" <[EMAIL PROTECTED]> wrote: > What about Spindle? Has anybody used it to crawl a jsp based web > site? Do I > need to install listlib.jar to do so? > > I got error

What about Spindle

2003-12-03 Thread Zhou, Oliver
What about Spindle? Has anybody used it to crawl a jsp based web site? Do I need to install listlib.jar to do so? I got the error message "Jsp Translate:Unable to find setter method for attribue:class" when I tried to run listlib-example.jsp in wsad. Thanks, Oliver

Re: Hits - how many documents?

2003-12-03 Thread ambiesense
That was actually the answer. Originally I thought Hits provides a reference to all documents. However, it seems logical that documents with a score of 0.0 should not be contained. Thank you, Ralf > I'm a bit confused by what you're asking. Hits points to all documents > that matched the query. A score > 0.

Re: Hits - how many documents?

2003-12-03 Thread Erik Hatcher
On Wednesday, December 3, 2003, at 10:16 AM, Ralph wrote: Does this mean Hits points to ALL documents and the last one might have a score of 0.0? If it does not contain all documents, where is the threshold then? Or based on which condition does it stop pointing to certain documents? I'm a bit con

Re: Hits - how many documents?

2003-12-03 Thread Ralph
Does this mean Hits points to ALL documents and the last one might have a score of 0.0? If it does not contain all documents, where is the threshold then? Or based on which condition does it stop pointing to certain documents? Ralf > On Wednesday, December 3, 2003, at 09:36 AM, Ralph wrote: > > is

Re: Ways to search indexes

2003-12-03 Thread jt oob
--- Dror Matalon <[EMAIL PROTECTED]> wrote: > On Tue, Dec 02, 2003 at 01:54:58PM +, jt oob wrote: > > Hi, > > > > I have just indexed a lot of news (nntp) postings. > > I now have an index for each topic (a topic can have many > newsgroups) > > > > The index sizes are: > > > > 2.6G Current

Re: Hits - how many documents?

2003-12-03 Thread Erik Hatcher
On Wednesday, December 3, 2003, at 09:36 AM, Ralph wrote: is there a maximum of documents Hits provide or is it unlimited (means limited to heap size of VM)? If there is a maximum, what is the number? Hits represents all documents that matched the query (and optionally filtered). But, Hits do

Hits - how many documents?

2003-12-03 Thread Ralph
Hi, is there a maximum of documents Hits provide or is it unlimited (means limited to heap size of VM)? If there is a maximum, what is the number? Ralf

Probabilistic Model in Lucene - possible?

2003-12-03 Thread ambiesense
Hello group, from the very inspiring conversations with Karsten I know that Lucene is based on a Vector Space Model. I am just wondering if it would be possible to turn this into a probabilistic Model approach. Of course I do know that I cannot change the underlying indexing and searching principl

RE: Query reformulation (Relevance Feedback) in Lucene?

2003-12-03 Thread Chong, Herb
there is no direct support in Lucene for this. there are several strategies for automatic query expansion and most of them rely on either extensive domain-specific analysis of the top N documents on the assumption that the search engine performs well enough to guarantee that the top N documents

Query reformulation (Relevance Feedback) in Lucene?

2003-12-03 Thread ambiesense
Hello Group of Lucene users, query reformulation is understood as an effective way to improve retrieval power significantly. The theory teaches us that it consists of two basic steps: a) Query expansion (with new terms) b) Reweighting of the terms in the expanded query User relevance feedback is

Re: Translation.

2003-12-03 Thread Otis Gospodnetic
Uh, I get to do this dirty job. :( Lucene-user and lucene-dev are not the appropriate fora for questions such as this one. Please ask the original author of the text for help, or use an online translation service, such as the one at http://babelfish.av.com Also, for questions about Lucene usage,