Are there any performance test suites available in the Lucene codebase that we could
reuse to benchmark our Lucene infrastructure?
We are mainly looking at multithreaded indexing tests.
-Vidhya
Hi All
Thank you for all your suggestions. Some of the recommendations hadn't
yet been implemented, as our code base was using older versions of
Lucene with reduced capabilities. Thus far, all the recommendations
for fast search have been implemented (e.g. using pagination with
searchAfter,
Greetings Lucene Users
As a follow-up to my earlier mail:
We are also using Lucene segment warmers, as recommended; segments
per tier is now set to five, and buffer memory is set to
(Runtime.getRuntime().totalMemory()*.08)/1024/1024;
See below for code used to instantiate writer:
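The writer-instantiation code was cut off above; as a non-authoritative sketch, a Lucene 4.x writer configured with the settings just described (five segments per tier, a RAM buffer at 8% of the heap, and a merged-segment warmer) could look like the following. The class name `WriterFactory` and the index path are made up:

```java
import java.io.File;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.TieredMergePolicy;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class WriterFactory {
    public static IndexWriter open(String path) throws Exception {
        Directory dir = FSDirectory.open(new File(path));
        IndexWriterConfig iwc = new IndexWriterConfig(
            Version.LUCENE_47, new StandardAnalyzer(Version.LUCENE_47));
        TieredMergePolicy tmp = new TieredMergePolicy();
        tmp.setSegmentsPerTier(5);  // five segments per tier, as described above
        iwc.setMergePolicy(tmp);
        // RAM buffer: 8% of the current heap, in MB (formula from the mail)
        iwc.setRAMBufferSizeMB((Runtime.getRuntime().totalMemory() * .08) / 1024 / 1024);
        // iwc.setMergedSegmentWarmer(...);  // plug in the segment warmer here
        return new IndexWriter(dir, iwc);
    }
}
```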
Hi,
> Am I correct that a SearcherManager can't be used with a MultiReader and
> NRT? I would appreciate all suggestions on how to optimize our search
> performance further. Search time has become a usability issue.
Just have a SearcherManager for every index. MultiReader construction is cheap
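That suggestion could be sketched like this (Lucene 4.x; the class `MultiIndexSearch` and its setup are hypothetical). Each index keeps its own SearcherManager, and a throwaway MultiReader is built per search:

```java
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.SearcherManager;
import org.apache.lucene.store.Directory;

// One SearcherManager per index; rebuild the cheap MultiReader per search.
public final class MultiIndexSearch {
    private final SearcherManager[] managers;

    public MultiIndexSearch(Directory... dirs) throws Exception {
        managers = new SearcherManager[dirs.length];
        for (int i = 0; i < dirs.length; i++) {
            managers[i] = new SearcherManager(dirs[i], null); // null = default SearcherFactory
        }
    }

    /** Acquire one searcher per index and wrap their readers in a MultiReader. */
    public IndexSearcher acquire(IndexSearcher[] leases) throws Exception {
        IndexReader[] readers = new IndexReader[managers.length];
        for (int i = 0; i < managers.length; i++) {
            leases[i] = managers[i].acquire();
            readers[i] = leases[i].getIndexReader();
        }
        // closeSubReaders=false: the SearcherManagers own the sub-readers
        return new IndexSearcher(new MultiReader(readers, false));
    }

    public void release(IndexSearcher[] leases) throws Exception {
        for (int i = 0; i < managers.length; i++) managers[i].release(leases[i]);
    }
}
```

Because `closeSubReaders` is false, dropping the MultiReader leaves the managed sub-readers alive; they are handed back via `release`.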
I was planning to use ETSC (EarlyTerminatingSortingCollector) in conjunction with
SortingMergePolicy and got stuck.
In ETSC, we have
@Override
public void collect(int doc) throws IOException {
  in.collect(doc);
  if (++numCollected >= numDocsToCollect) {
    throw new CollectionTerminatedException();
  }
}
I und
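For reference, a typical Lucene 4.x wiring of the two pieces mentioned above might look as follows; the `Sort` over a hypothetical `ts` field and the helper names are assumptions, not the poster's code:

```java
import java.io.IOException;
import org.apache.lucene.index.MergePolicy;
import org.apache.lucene.index.sorter.EarlyTerminatingSortingCollector;
import org.apache.lucene.index.sorter.SortingMergePolicy;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.TopFieldCollector;

public class SortedEarlyTermination {
    // Index side: SortingMergePolicy keeps merged segments sorted by `sort`,
    // which is what makes per-segment early termination correct.
    public static SortingMergePolicy wrap(MergePolicy base, Sort sort) {
        return new SortingMergePolicy(base, sort);
    }

    // Search side: stop collecting in each segment once n sorted hits are seen;
    // ETSC throws CollectionTerminatedException internally (as in the snippet above).
    public static TopDocs searchTop(IndexSearcher searcher, Query query, Sort sort, int n)
            throws IOException {
        TopFieldCollector top = TopFieldCollector.create(sort, n, true, false, false, false);
        searcher.search(query, new EarlyTerminatingSortingCollector(top, sort, n));
        return top.topDocs();
    }
}
```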
Sorry for re-asking.
Has anyone implemented an AnalyzingSuggester which
- is fuzzy
- is case insensitive (or must/should this be implemented by the analyzer?)
- does infix search
[- has a small memory footprint]
-Original Message-
From: Clemens Wyss DEV [mailto:clemens...@mysign.ch
Hello,
I am using DrillSideways facets, and while getting children I get the
exception below:
17:02:10,496 ERROR [stderr:71] (Thread-2 (HornetQ-client-global-threads-790878673))
java.lang.IllegalArgumentException: dimension "CITY" was not indexed into field "$facets
Hi Clemens,
I haven't yet built a suggester which combines all three, and am not aware of
one. I'd love to have one though ;-)
Case- and diacritics insensitivity is supported out-of-the-box by the analyzing
suggesters, including the FuzzySuggester. The logic is in the Analyzer.
I haven't yet t
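To illustrate the analyzer-based case insensitivity mentioned above (a sketch using Lucene 4.x class names, not a tested recipe):

```java
import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.search.suggest.analyzing.FuzzySuggester;
import org.apache.lucene.util.Version;

// Lowercasing happens in the analyzer, so the suggester only ever sees
// lowercased terms at both build and lookup time.
public class LowercaseSuggestAnalyzer extends Analyzer {
    @Override
    protected TokenStreamComponents createComponents(String field, Reader reader) {
        StandardTokenizer tok = new StandardTokenizer(Version.LUCENE_47, reader);
        return new TokenStreamComponents(tok, new LowerCaseFilter(Version.LUCENE_47, tok));
    }

    public static FuzzySuggester newCaseInsensitiveFuzzySuggester() {
        return new FuzzySuggester(new LowercaseSuggestAnalyzer());
    }
}
```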
Lucene Experts -
Recently we upgraded to Lucene 4. We want to make use of the concurrent
flushing feature of Lucene.
Indexing for us includes certain DB operations and writing to Lucene, ended by
a commit. There may be multiple concurrent calls to the Indexer to publish
single or multiple records.
So far,
You could just avoid calling commit() altogether if your application's
semantics allow this (i.e. it's non-transactional in nature). This way,
Lucene will do commits when appropriate, based on the buffering settings
you chose. It's generally unnecessary and undesirable to call commit at the
end of
If you are using stored fields in your index, consider playing with
compression settings, or perhaps turning stored field compression off
altogether. Ways to do this have been discussed in this forum on numerous
occasions. This is highly use case dependent though, as your indexing
performance may o
How do you add facets to your documents? Did you play with the
FacetsConfig, such as altering the field under which the CITY dimension is
indexed?
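For illustration only, one hypothetical way to hit exactly that exception is a mismatch between the field named in FacetsConfig and the field the Facets implementation reads (the "$city" name here is made up):

```java
import org.apache.lucene.facet.FacetsCollector;
import org.apache.lucene.facet.FacetsConfig;
import org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts;
import org.apache.lucene.facet.taxonomy.TaxonomyReader;

public class FacetFieldMismatch {
    // Hypothetical misconfiguration: CITY is declared to live in "$city",
    // but the counts are read from the default "$facets" field.
    public static void demo(TaxonomyReader taxoReader, FacetsCollector fc) throws Exception {
        FacetsConfig config = new FacetsConfig();
        config.setIndexFieldName("CITY", "$city");
        // Constructed against the default "$facets" field:
        FastTaxonomyFacetCounts counts = new FastTaxonomyFacetCounts(taxoReader, config, fc);
        // Throws: IllegalArgumentException:
        //   dimension "CITY" was not indexed into field "$facets"
        counts.getTopChildren(10, "CITY");
    }
}
```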
If you can reproduce this failure in a simple program, I guess it will be
easy to spot the error. Looks like a configuration error to me...
Shai
On Fri
It is non-transactional. We first write the same data to a database in a
transaction and then call writer.addDocument. If Lucene fails, we still hold
the data to recover.
I can avoid the commit if we use an NRT reader. We do need this to be searchable
immediately.
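A sketch of that commit-free NRT setup under Lucene 4.x (the `NrtIndexer` wrapper is hypothetical):

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.TrackingIndexWriter;
import org.apache.lucene.search.ControlledRealTimeReopenThread;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.SearcherManager;

// NRT without commit(): changes become searchable via reopen, not commit.
public class NrtIndexer {
    final TrackingIndexWriter trackingWriter;
    final SearcherManager manager;
    final ControlledRealTimeReopenThread<IndexSearcher> reopener;

    public NrtIndexer(IndexWriter writer) throws Exception {
        trackingWriter = new TrackingIndexWriter(writer);
        manager = new SearcherManager(writer, true, null);
        // Reopen at most every 25ms, at least every second.
        reopener = new ControlledRealTimeReopenThread<IndexSearcher>(
            trackingWriter, manager, 1.0, 0.025);
        reopener.setDaemon(true);
        reopener.start();
    }

    /** Add a document and block until it is searchable. */
    public IndexSearcher addAndAcquire(Document doc) throws Exception {
        long gen = trackingWriter.addDocument(doc);
        reopener.waitForGeneration(gen);
        return manager.acquire(); // release via manager.release(...)
    }
}
```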
Another question. I did try removi
Hmm, I might have actually given you a slightly incorrect explanation wrt
what happens when internal buffers fill up. There will definitely be a
flush of the buffer, and segment files will be written to, but it's not
actually considered a full commit, i.e. an external reader will not see
these chan
Thanks for helping me.
Yes, I did a couple of things.
Below is the simple code I use for indexing:
TrackingIndexWriter nrtWriter
DirectoryTaxonomyWriter taxoWriter = ...
FacetsConfig config = new FacetsConfig();
config.setHierarchical("CITY", true);
config.setMultiValued("CITY", true);
config
Let me try NRT with a periodic commit, say every 5 minutes, in a committer
thread on an as-needed basis.
Is there a threshold limit on how long we can go without committing? I think
the buffers get flushed to disk but are not crash-proof on disk. So we should be
good on memory.
I should also verify
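The committer-thread idea can be sketched in plain Java; the wrapped `writer.commit()` call is the only Lucene-specific part, so it is left as a Runnable (an assumption, not the poster's code):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Periodically runs a commit action, e.g. writer.commit() wrapped with error handling. */
public final class PeriodicCommitter {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();

    public PeriodicCommitter(Runnable commitAction, long interval, TimeUnit unit) {
        // Fixed delay, so a slow commit never overlaps the next one.
        scheduler.scheduleWithFixedDelay(commitAction, interval, interval, unit);
    }

    public void shutdown() {
        scheduler.shutdown();
    }
}
```

For the 5-minute cadence above: `new PeriodicCommitter(commitTask, 5, TimeUnit.MINUTES)`.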
This is a better idea than what you had before, but I don't think there's
any point in doing any commits manually at all unless you have a way of
detecting and recovering exactly the data that hasn't been committed. In
other words, what difference does it make whether you lost 1 index record
or 1M,
We do have a way to recover partially, with a version number for each
transaction. The same version is maintained in Lucene as one document. During
startup these numbers define what has to be synced up. Unfortunately Lucene is
used in a webapp, so this happens "only" during a Jetty restart.
- Vidhy
Hmm, I'm not sure you want to rely on the presence or absence of a
particular document in the index to determine the recovery point. It may
work for inserts, but not likely for updates or removes. I would look into
driving the version numbers from the committer to the DB, and record them as
commit u
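That last suggestion maps to Lucene's commit userData; a sketch, assuming Lucene 4.x and a made-up `db_version` key:

```java
import java.util.Collections;
import java.util.List;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexCommit;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;

public class CommitVersions {
    /** Stamp the DB version onto the Lucene commit itself. */
    public static void commitWithVersion(IndexWriter writer, long version) throws Exception {
        writer.setCommitData(Collections.singletonMap("db_version", Long.toString(version)));
        writer.commit();
    }

    /** At startup: the recovery point is whatever the last commit recorded. */
    public static long lastCommittedVersion(Directory dir) throws Exception {
        List<IndexCommit> commits = DirectoryReader.listCommits(dir);
        String v = commits.get(commits.size() - 1).getUserData().get("db_version");
        return v == null ? -1L : Long.parseLong(v);
    }
}
```

Unlike a marker document, the userData travels atomically with the commit, so it survives updates and deletes.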