: It's funny, but I'm having a memory leak with Hibernate that I spent the
: whole of yesterday banging my head against a wall about and so when
: searching for emails with Leak in the title came across your message.
: I'm probably going to hit the same problem as you for long running
: multi-threa
On Friday 09 February 2007 17:14, Sairaj Sunil wrote:
> I have increased the merge factor from 10 to 50.
Please try increasing setMaxBufferedDocs() instead, does that help?
Regards
Daniel
--
http://www.danielnaber.de
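
To make the trade-off between the two settings a little more concrete, here is a toy model in plain Java. It is not Lucene code and says nothing about real timings. It assumes, as a simplification, that one segment is flushed every maxBufferedDocs documents and that mergeFactor equally sized segments are merged into one larger segment. The class and method names are made up for illustration.

```java
// Toy model of buffered flushes and level-based merges; NOT Lucene's implementation.
import java.util.ArrayList;
import java.util.List;

final class MergeSketch {
    /**
     * Returns {flushes, merges, docsRewrittenByMerges} for indexing totalDocs documents.
     * Assumptions: a flush happens every maxBufferedDocs docs (a trailing partial
     * buffer is ignored), and whenever mergeFactor segments exist at one level they
     * are merged into a single segment one level up.
     */
    static long[] simulate(int totalDocs, int maxBufferedDocs, int mergeFactor) {
        if (maxBufferedDocs < 1 || mergeFactor < 2) {
            throw new IllegalArgumentException("need maxBufferedDocs >= 1 and mergeFactor >= 2");
        }
        List<Integer> segmentsAtLevel = new ArrayList<>();
        long flushes = totalDocs / maxBufferedDocs;
        long merges = 0;
        long rewritten = 0;
        for (long f = 0; f < flushes; f++) {
            int level = 0;
            long segmentSize = maxBufferedDocs;
            bump(segmentsAtLevel, level);
            // Cascade: a merge at one level may fill up the next level too.
            while (segmentsAtLevel.get(level) == mergeFactor) {
                segmentsAtLevel.set(level, 0);
                segmentSize *= mergeFactor;   // size of the merged segment
                merges++;
                rewritten += segmentSize;     // every doc in the merge is written again
                level++;
                bump(segmentsAtLevel, level);
            }
        }
        return new long[] {flushes, merges, rewritten};
    }

    private static void bump(List<Integer> counts, int level) {
        while (counts.size() <= level) {
            counts.add(0);
        }
        counts.set(level, counts.get(level) + 1);
    }

    public static void main(String[] args) {
        int[][] settings = {{10, 10}, {10, 50}, {100, 10}};
        for (int[] s : settings) {
            long[] r = simulate(1000, s[0], s[1]);
            System.out.println("maxBufferedDocs=" + s[0] + " mergeFactor=" + s[1]
                + " flushes=" + r[0] + " merges=" + r[1] + " docsRewritten=" + r[2]);
        }
    }
}
```

In this model, raising maxBufferedDocs cuts the number of small segments flushed, while raising mergeFactor only delays merges. Whether that explains the slowdown reported in the thread would have to be measured on the real index.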
Thanks a lot for all your help. I guess this temporary fix will have to do
until I have clearance to post some code. For the current index (that was last
modified over a year ago), it works fine, but I know it's not properly done.
Thank you all very much, especially you Mr Erickson.
Xavier Tô
B
> For example, given terms "female", "John" and "London" - all 3 may
> have equal IDF but should a document representing a female in London
> be given equal weighting to a document representing the rarer example
> of a female who happens to be called "John"?
Not to mention multi-word phrase tokeni
Also, there's a default of 10,000 tokens per field at index time
Erick
On 2/9/07, mark harwood <[EMAIL PROTECTED]> wrote:
See Highlighter.setMaxDocBytesToAnalyze(int byteCount)
Its default setting is limited in order to avoid excessive response
times.
Cheers
Mark
- Original Messag
The query should be tokenized *by the query parser*. You shouldn't have to
do the tokenizing yourself. When you print out the results of the parsing,
you should see something like field:value1 field:value2, which are built up
under the covers to be a BooleanQuery with a bunch of clauses.
I think,
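
As a rough illustration of the description above, the sketch below turns raw user input into per-term clauses and prints them in the field:value form. It is a toy, not Lucene's QueryParser: the class name and the split-on-whitespace, lowercase "analysis" are assumptions made only for this example.

```java
// Toy stand-in for a query parser; NOT Lucene's QueryParser.
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class ToyQueryParser {
    /** Splits the input on whitespace and builds one "field:term" clause per token. */
    static List<String> parse(String defaultField, String userInput) {
        List<String> clauses = new ArrayList<>();
        for (String raw : userInput.trim().split("\\s+")) {
            if (raw.isEmpty()) {
                continue; // blank input yields no clauses at all
            }
            clauses.add(defaultField + ":" + raw.toLowerCase(Locale.ROOT));
        }
        return clauses;
    }

    /** Renders the clauses the way a parsed query is usually printed. */
    static String render(List<String> clauses) {
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        System.out.println(render(parse("field", "Value1 value2")));
    }
}
```

The caller hands over the whole input string and never tokenizes it, which is the point made in the message.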
See Highlighter.setMaxDocBytesToAnalyze(int byteCount)
Its default setting is limited in order to avoid excessive response times.
Cheers
Mark
- Original Message
From: Fred Eaker <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Friday, 9 February, 2007 4:28:36 PM
Subject: High
Is there a limit to how many characters a Highlighter or NullFragmenter will
return?
I have indexed an entire HTML document (145kb). When I use the highlighter with
a NullFragmenter, the getBestFragment and getBestFragments methods return the
text of the field up to 51316 characters.
I have tried
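
The answer quoted above points at Highlighter.setMaxDocBytesToAnalyze(int) as the setting to raise. The toy model below sketches why an analysis limit can return text that stops a little past the limit rather than exactly at it. It is plain Java, not the Highlighter's code, and it assumes that the cut-off is checked against each token's start offset; that assumption, and the class and method names, are for illustration only. A limit of around 50 KB would be consistent with the 51316 characters reported, but that is an inference, not something stated in the thread.

```java
// Toy model of an analysis limit; NOT the Lucene Highlighter implementation.
final class AnalysisLimitSketch {
    /**
     * Returns the prefix of text covered by whitespace-delimited tokens whose
     * start offset is below maxCharsToAnalyze (assumed cut-off rule).
     */
    static String analyzedPrefix(String text, int maxCharsToAnalyze) {
        int end = 0; // end offset of the last token that was analyzed
        int i = 0;
        while (i < text.length()) {
            // Skip whitespace between tokens.
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i >= text.length() || i >= maxCharsToAnalyze) {
                break; // the next token starts at or beyond the limit
            }
            // Consume one token in full, even if it runs past the limit.
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            end = i;
        }
        return text.substring(0, end);
    }

    public static void main(String[] args) {
        String text = "alpha beta gamma delta";
        System.out.println(analyzedPrefix(text, 8));
        System.out.println(analyzedPrefix(text, 1000));
    }
}
```

With a limit of 8 the sketch returns 10 characters, because the token that starts before the limit is kept whole. With a limit larger than the text, the whole text comes back.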
Solr provides an XML interface to everything: index adds, deletes, updates,
searches, highlights, explanations, facets, commits, and optimize
statements. I'm sure I've forgotten some :)
It also supports JSON, as well as some other formats, if you prefer that.
The Solr wiki explains how it works.
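
For a feel of what the XML interface looks like from the client side, here is a minimal sketch that builds an index "add" message as a string. The field names (id, title) are hypothetical and must match whatever the Solr schema defines. The message would be sent by HTTP POST to the update handler of the Solr installation, whose URL depends on the setup and is not shown here. Because it is only XML over HTTP, a .NET client could build the same message.

```java
// Builds a Solr-style <add> message as a string; sending it is left to the caller.
import java.util.LinkedHashMap;
import java.util.Map;

final class SolrAddXmlSketch {
    /** Escapes the characters that are special in XML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Wraps the given fields in <add><doc>...</doc></add>. */
    static String addDocument(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            xml.append("<field name=\"").append(escape(e.getKey())).append("\">")
               .append(escape(e.getValue()))
               .append("</field>");
        }
        return xml.append("</doc></add>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "42");                 // hypothetical field names
        doc.put("title", "Fish & Chips");
        System.out.println(addDocument(doc));
    }
}
```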
Hi all,
I have increased the merge factor from 10 to 50. I thought the indexing
performance would be better, but the time taken to index is more than
the time taken with a merge factor of 10. The documentation and some
articles say that the time taken to index will improve if the merge facto
On Feb 9, 2007, at 9:13 AM, Kainth, Sachin wrote:
What does solr provide and how can I use it with dotLucene?
Have a 10 minute dedicated look at http://lucene.apache.org/solr -
download the latest binary distribution, follow along with the
tutorial. After that, you'll know almost everythi
But would it still use the Java version of Lucene? Are you saying that
unlike Lucene Web Service, Solr can be used via .NET code? Do they both
still use the Java version of Lucene though?
Let me explain what I want to do. I want to be able to set up a
dedicated machine for dotLucene so that ind
Does anybody have any experience with setting up a Lucene RAMDirectory index
for replication across multiple WebSphere servers and taking advantage of
WebSphere's built-in Object Cache? We are currently re-building/refreshing
from the source the entire RAMDirectory index on each WebSphere server
Hi
You could try SOLR
http://lucene.apache.org/solr/
This is obviously Java but you can access it using .NET...
Hope this helps
Patrick
On 09/02/07, Kainth, Sachin <[EMAIL PROTECTED]> wrote:
Hello all,
Does anyone know if there is a .NET version of Lucene Web Service?
Cheers
Hello all,
Does anyone know if there is a .NET version of Lucene Web Service?
Cheers
This email and any attached files are confidential and copyright protected. If
you are not the addressee, any dissemination of this communication is strictly
prohibited. Unless otherwise expressly agreed in w
Hi Otis,
It's funny, but I'm having a memory leak with Hibernate that I spent the
whole of yesterday banging my head against a wall about and so when
searching for emails with Leak in the title came across your message.
I'm probably going to hit the same problem as you for long running
multi-thread
What does solr provide and how can I use it with dotLucene?
-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: 09 February 2007 14:11
To: java-user@lucene.apache.org
Subject: Re: categorisation
On Feb 9, 2007, at 9:08 AM, Kainth, Sachin wrote:
> Are you saying that w
On Feb 9, 2007, at 9:08 AM, Kainth, Sachin wrote:
Are you saying that without solr I will have caching problems under
load?
no, not at all. i'm saying you'll likely reinvent a lot of what solr
already provides, in order to _scale_ that is.
--
Are you saying that without solr I will have caching problems under
load?
-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: 09 February 2007 14:06
To: java-user@lucene.apache.org
Subject: Re: categorisation
On Feb 9, 2007, at 7:07 AM, Kainth, Sachin wrote:
> But doe
On Feb 9, 2007, at 7:07 AM, Kainth, Sachin wrote:
But does that not imply that a second search is made against the index
by the line:
BitSet all = (new QueryFilter(q)).bits(reader)
Yeah, if you want to return facet counts and results in the same
sweep, yes. If all you want are the counts,
Hey, thanks a lot for taking so much time here...
I did check the and they appear to be the same... at least they are the same class
and same package. I just noticed something: they are using LowerCaseFilter.
I was going to say "could it be the source of the numbers being ignored?" but
it shoul
But does that not imply that a second search is made against the index
by the line:
BitSet all = (new QueryFilter(q)).bits(reader)
-Original Message-
From: Kainth, Sachin [mailto:[EMAIL PROTECTED]
Sent: 09 February 2007 12:05
To: java-user@lucene.apache.org
Subject: RE: categorisation
A
Ahhh it all makes sense to me now :-)
-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: 09 February 2007 12:01
To: java-user@lucene.apache.org
Subject: Re: categorisation
On Feb 9, 2007, at 5:40 AM, Kainth, Sachin wrote:
> It makes sense to me only if you tell me th
On Feb 9, 2007, at 5:40 AM, Kainth, Sachin wrote:
It makes sense to me only if you tell me that all the bits in the
BitSet
"all" will be 1.
well, ok, so the "all" may be misleading. call it queryBits instead
then :)
"all" means *all documents that match the query*, though.
it wouldn't
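
The idea discussed in this thread, a bit set holding the documents that match the query and then being used for counting, can be sketched with java.util.BitSet alone. The example below is a minimal illustration, not code from the thread: it assumes each category also has its own bit set of document numbers, and the class and method names are made up.

```java
// Facet counting by bit set intersection; a sketch, not code from the thread.
import java.util.BitSet;

final class FacetCountSketch {
    /** Counts the documents that match the query AND belong to the category. */
    static int facetCount(BitSet queryBits, BitSet categoryBits) {
        BitSet both = (BitSet) queryBits.clone(); // keep the caller's bit set intact
        both.and(categoryBits);
        return both.cardinality();
    }

    public static void main(String[] args) {
        BitSet queryBits = new BitSet();   // documents matching the query
        for (int doc : new int[] {1, 3, 4, 7}) {
            queryBits.set(doc);
        }
        BitSet categoryBits = new BitSet(); // documents in one category (assumed precomputed)
        for (int doc : new int[] {0, 1, 4, 5}) {
            categoryBits.set(doc);
        }
        System.out.println("matches in category: " + facetCount(queryBits, categoryBits));
    }
}
```

Only the bits for documents that match the query are set in queryBits, so not every bit is 1, which is why the name queryBits is less misleading than "all".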
The distinguishing characteristics you mark out and put in a field may not be
so distinguishing as more content is added to an index (e.g. use of new
terminology like "podcast" becomes more prevalent). Maintaining/regenerating
this field in anything other than a static index then starts to look
You are right I didn't think about it at all to be honest.
-Original Message-
From: karl wettin [mailto:[EMAIL PROTECTED]
Sent: 09 February 2007 10:46
To: java-user@lucene.apache.org
Subject: Re: Empty search
9 feb 2007 kl. 11.34 skrev Kainth, Sachin:
> Yep it is the queryparser that
9 feb 2007 kl. 11.34 skrev Kainth, Sachin:
Yep it is the queryparser that I'm referring to. Just sounds odd
to me.
An empty string search should be handled properly I think. It should
simply do nothing.
I did not look any closer at this than reading your post, but what
about if you made
It makes sense to me only if you tell me that all the bits in the BitSet
"all" will be 1.
-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: 08 February 2007 18:37
To: java-user@lucene.apache.org
Subject: Re: categorisation
On Feb 8, 2007, at 12:36 PM, Kainth, Sachin
Yep it is the queryparser that I'm referring to. Just sounds odd to me.
An empty string search should be handled properly I think. It should
simply do nothing.
-Original Message-
From: karl wettin [mailto:[EMAIL PROTECTED]
Sent: 08 February 2007 18:05
To: java-user@lucene.apache.org
S
I just woke up thinking it would be cool to attempt reducing the data
of all documents using PCA (or so) and store the output in a new
field per dimension introduced in order to find similar documents by
placing a simple proximity query. Did anyone attempt something like
this?
I did not