l will help you see what
files are open, and you can validate that all of them really need to be open.
Best of luck.
Dmitry.
Neelam Bhatnagar wrote:
Hi,
I had requested help on an issue we have been facing with the "Too many
open files" Exception garbling the search indexes and crashing the
t.org/luke/) to look into them and see what documents
they contain?
Good luck!
Dmitry.
Thanks,
Ed
--- Dmitry Serebrennikov <[EMAIL PROTECTED]> wrote:
This is not a normal behavior, unless you are running on Windows and
have searchers open for that long that are still locking the segments
ay be the answer.
- look into "lsof" utility. It can display all file handles in use by a
given process. This is a good tool to
troubleshoot "too many open files" issues.
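As a quick in-process complement to lsof, a process on Linux can count its own open descriptors by listing /proc/self/fd (a sketch; the /proc path is Linux-specific and is an assumption about the platform):

```python
import os
import tempfile

def open_fd_count():
    # /proc/self/fd holds one entry per descriptor this process has open
    # (Linux only); the count includes the descriptor used by listdir itself,
    # consistently on both calls, so deltas are meaningful.
    return len(os.listdir("/proc/self/fd"))

before = open_fd_count()
tmp = tempfile.TemporaryFile()   # opens one new file descriptor
after = open_fd_count()
tmp.close()
print(before, after)             # 'after' exceeds 'before' while tmp is open
```

Sampling this counter around suspect code paths narrows down which component is leaking handles.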
Good luck.
Dmitry.
-
To unsubscribe, e-
snapshot would have had this
problem. If you are running from CVS, try the latest release and see if
this occurs again.
Dmitry.
Edwin Tang wrote:
Hello,
I'm seeing in my index directory some segment files that are not included in the
segments or
deletable files. These segment files show
of Sun's stack. Sun's handler is called sun.net.www.protocol.http.Handler.
Hope this helps.
Good luck!
Dmitry.
Natarajan.T wrote:
Hi FYI,
I am doing web crawling in my application using proxy settings, like the code below:
Properties systemSettings = System.getProperties();
systemSettings.put("http.proxyHost", "your.proxy.host"); // standard JVM proxy property; host and port values here are placeholders
systemSettings.put("http.proxyPort", "8080");
35% was, I think, to illustrate that index data structures
used for searching by Lucene are efficient. But Lucene does nothing
special about stored content - no compression or anything like that. So
you end up with the pure size of your data plus the 35% of the indexed
data.
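As a back-of-the-envelope illustration of that estimate (the 35% figure is from the explanation above; the byte counts below are made up):

```python
# Rough disk estimate: stored fields are kept as-is (no compression),
# and the inverted index adds roughly 35% of the *indexed* text size.
stored_bytes = 100_000_000   # hypothetical raw stored content
indexed_bytes = 80_000_000   # hypothetical portion that is tokenized/indexed

estimated_total = stored_bytes + 0.35 * indexed_bytes
print(int(estimated_total))  # 128000000
```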
Cheers.
Dmitry
good working order. If you are not sure if you saw problems with
pre-8/15 or post-8/15 version of the code, is it possible for you to try
the latest CVS and see if the problem exists now? If it does, it will of
course require urgent attention.
Thanks very much!
Dmitry.
Daniel Naber wrote:
References and the resume are available upon request. Payment hourly, or
as a fixed bid.
May the source be with you! :)
Thanks very much, and best wishes to everyone.
Dmitry Serebrennikov
o match. If it is neither (no prefix in the query parser), it is
not required to match for the query to match, provided some other
component of the query does. This last one may seem useless, except that
if this query component does match, the score will be boosted. So
documents that do match t
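The three clause types described above (required, prohibited, and optional) can be mimicked in a toy matcher like this (plain Python, not Lucene's actual scoring; the boost weights are invented for illustration):

```python
def matches(doc_terms, required, prohibited, optional):
    """Toy boolean-clause matcher in the spirit of the description above.

    A document matches only if every required term is present, no prohibited
    term is present, and (when there are no required terms) at least one
    optional term is present. Optional terms that do match boost the score.
    Returns a score, or None when the document does not match.
    """
    terms = set(doc_terms)
    if any(t not in terms for t in required):
        return None                    # a required clause failed to match
    if any(t in terms for t in prohibited):
        return None                    # a prohibited clause matched
    hits = sum(1 for t in optional if t in terms)
    if not required and hits == 0:
        return None                    # nothing in the query matched
    return 1.0 + 0.5 * hits            # invented boost per optional hit

# A "+lucene -php java" style query against three documents:
print(matches(["lucene", "java"], ["lucene"], ["php"], ["java"]))  # 1.5
print(matches(["lucene"], ["lucene"], ["php"], ["java"]))          # 1.0
print(matches(["php", "lucene"], ["lucene"], ["php"], ["java"]))   # None
```

The second document matches without any optional hit, so it scores lower than the first; the third is excluded by the prohibited clause.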
script I'm including will
parse that kind of data and produce a comma-separated output of GC stats
that can be graphed more easily.
Hope you guys find the above useful.
Good luck!
Dmitry.
#!/usr/local/bin/python
import sys
text = open(sys.argv[1], "r")
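The rest of the script is cut off in the archive; a minimal reconstruction of the same idea (assuming the classic HotSpot verbose-GC line format, e.g. `[GC 325407K->83000K(776768K), 0.2300771 secs]` -- the exact format is an assumption) might look like:

```python
import re

# Matches lines like: [GC 325407K->83000K(776768K), 0.2300771 secs]
GC_LINE = re.compile(r"\[(?:Full )?GC (\d+)K->(\d+)K\((\d+)K\), ([\d.]+) secs\]")

def gc_stats(lines):
    """Yield (before_kb, after_kb, heap_kb, secs) for each verbose-GC line."""
    for line in lines:
        m = GC_LINE.search(line)
        if m:
            yield (int(m.group(1)), int(m.group(2)),
                   int(m.group(3)), float(m.group(4)))

sample = ["[GC 325407K->83000K(776768K), 0.2300771 secs]"]
for before, after, heap, secs in gc_stats(sample):
    # comma-separated output suitable for graphing
    print("{},{},{},{}".format(before, after, heap, secs))
```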
close() was called twice on the same IndexWriter. Perhaps the demo has a
bug that ends up doing this in some cases?
Dmitry.
Subject:
RE: crash in Lucene
From:
"Chong, Herb" <[EMAIL PROTECTED]>
Date:
Tue, 4 Nov 2003 16:04:38 -0500
To:
"Lucene Users List" <[EMAIL PROTECTE
MultiSearcher). There, hits arrive in order
in which they are found, which is the insertion order. So I don't know
when a hit with the highest score will come about.
Dmitry.
Ype Kingma wrote:
>On Tuesday 15 October 2002 04:16, Dmitry Serebrennikov wrote:
>
>
>>Greetings,
>>
>>I know that the FAQ says that they are, but in at least one instance in
>>my index it appears to be equal to 1.94something. Are the scores
>>guaranteed
f it would, would I then have to do something during the indexing time
to set normalization / scoring factors for that field to something or
other?
Thanks.
Dmitry.
Greetings,
I know that the FAQ says that they are, but in at least one instance in
my index it appears to be equal to 1.94something. Are the scores
guaranteed to be between 0 and 1, and if not, what would it take to make
them such?
Thanks.
Dmitry.
but I can't see without a detailed study whether the mergeFactor applies
to merging from RAM to disk only or for merging on-disk segments as
well. If it applies to both, perhaps we could add a different field to
the IndexWriter to allow the two values to be different? Am I missing
something?
Could someone please change the Reply-To header on the digest messages
from the lucene-user list? Right now it goes back to the
lucene-user-digest address which bounces back. It's no big deal but it
bites me from time to time and I imagine it bites a few other people as well...
Thanks.
Dmitry
m.
>>
>>It's not a big deal, as my actual document collection is not this small. I'm just
>>curious.
>>
>>-- David Elworthy
>>
>There is no known problem, but there is buffering where 10 documents are
>indexed into memory and then are flushed to disk. The
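The buffering behavior described above (documents accumulate in memory, then get flushed in batches of 10) can be sketched as a toy model (the batch size matches the post; the class and everything else here are invented for illustration):

```python
class BufferedIndexer:
    """Toy model of buffered indexing: hold docs in RAM, flush in batches."""

    def __init__(self, batch_size=10):
        self.batch_size = batch_size
        self.buffer = []        # in-memory documents not yet on disk
        self.flushed = []       # stands in for on-disk segments

    def add(self, doc):
        self.buffer.append(doc)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.flushed.append(list(self.buffer))  # one "segment" per flush
            self.buffer.clear()

indexer = BufferedIndexer()
for i in range(25):
    indexer.add("doc%d" % i)
print(len(indexer.flushed), len(indexer.buffer))  # 2 5
```

After 25 adds, two batches of 10 have been flushed and 5 documents are still only in memory, which is why a very small collection may show nothing on disk until a flush or close happens.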
without knowing more about the PPT file
format. If you can find a program or library that will extract text from
a PPT file, you then should be able to easily use Lucene to index this
text. This might not be as elegant as the final solution of the project
I mentioned above, but this is the w
Subject:
Re: Homogeneous vs Heterogeneous indexes (was: FileNotFoundException)
From:
petite_abeille <[EMAIL PROTECTED]>
Date:
Wed, 1 May 2002 08:37:51 +0200
To:
"Lucene Users List" <[EMAIL PROTECTED]>
On Wednesday, May 1, 2002, at 12:41 AM, Dmitry Serebrennikov
le memory in a
particular NT kernel memory pool (not just the free memory on the
system). The pool size can probably be controlled, but I've found that
it is usually generous enough - more so than the Solaris settings.
If BSD is like NT in this regard (at least to some degree), the number
o
IndexWriter called infoStream.
If this is set to a PrintStream (such as System.out), various diagnostic
messages about the merging process will be printed to that stream. You
might find this helpful in tuning the merge parameters.
Hope this helps.
Good luck.
Dmitry.
>
>
>Subject:
>
>RE: Relevance Feedback
>From:
>
>Doug Cutting <[EMAIL PROTECTED]>
>Date:
>
>Sat, 30 Mar 2002 08:51:39 -0800
>To:
>
>Lucene Users List <[EMAIL PROTECTED]>
>
>
>Dmitry Serebrennikov [[EMAIL PROTECTED]] has implemented a
>
>
>>>Lex Lawrence wrote:
>>>
You miss my point. The value of an "unstored" Field is not stored in the
index, but its name most certainly is. That's what I'm interested in.
What I'd like to know is whether there is a way to get the names of all
searchable Fields in an index.
>
>
>
>>[1] There's no update so delete and then add is what you want.
>>[2] I have had the same problems w/ using an IndexWriter and IndexReader
>>at the same time and getting a locking problem when deleting. I think I
>>sent
>>mail to the list w/ a test case a week ago [disclaimer: this is not
>>
the other hand, if it is just a
String or a StringReader it would consume memory equal to (or probably
greater than) the size of the data. One way to fix this is to create your
own Reader class, say DelayedReader, which does not open a file upon
creation, but only upon the first read. That would help sa
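The DelayedReader idea described above could be sketched like this (in Python rather than Java; the class name comes from the post's suggestion, the details are invented):

```python
class DelayedReader:
    """Opens its underlying file only on the first read, not at construction.

    This mimics the suggestion above: many instances can be created up
    front without each one holding an OS file handle until it is actually
    consumed (e.g. by indexing).
    """

    def __init__(self, path):
        self.path = path
        self._file = None          # no handle acquired yet

    def read(self, size=-1):
        if self._file is None:     # lazy open on first read
            self._file = open(self.path, "r")
        return self._file.read(size)

    def close(self):
        if self._file is not None:
            self._file.close()
            self._file = None
```

Constructing thousands of these is cheap; file handles are only consumed one at a time as each reader is actually read and then closed.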
> The issue is that the set of features for queries on different types of
> contextual units (used to define Lucene documents) will be different.
> An example is that our XML and text documents need fuzzy-matching and porter
> stemming capabilities and on others (created and maintained from metad