[
https://issues.apache.org/jira/browse/LUCENE-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121844#comment-13121844
]
Doron Cohen commented on LUCENE-3262:
-------------------------------------
Thanks for reviewing Shai!
* facets.alg: I agree, and added that specific facet source setting. With also
modified the algorithm to have several rounds, some with facets, some without,
so the effect of adding facets on performance is measured.
* PFD.readers.incRef() - once the task (e.g. OpenReaderTask) opened the reader
that task is gone and the reader really is maintained by the PFD, therefore IMO
it is important that the PFD maintains its ref count as it does today. I agree
that more documentation for this can help and added something in this spirit to
both setReader()'s:
{code}
/**
* Set the taxonomy reader. Takes ownership of that taxonomy reader, that is,
* internally performs taxoReader.incRef().
* @param indexReader The indexReader to set.
*/
public synchronized void setTaxonomyReader(TaxonomyReader taxoReader) throws
IOException {
}
/**
* Set the index reader. Takes ownership of that index reader, that is,
* internally performs indexReader.incRef().
* @param indexReader The indexReader to set.
*/
public synchronized void setIndexReader(IndexReader indexReader) throws
IOException {
}
{code}
* ItemSource and config props "source.xyz" - I am afraid there might be
algorithms out there that will silently stop working, as the props names are
not safe types or something like that. So I would rather to not alter these
property names. Instead, following your comment, I renamed *ItemSource* to
*ContentItemsSource* which is still extended by both ContentSource and
FacetSource. So it is not that bad now that the property names start with
"content." (these properties apply to all content-item-sources - that is both
doc-content and facets).
* ItemSource.resetInputs @SuppressWarnings("unused") - this is for the
IOException which is not really thrown by that code (though it might be thrown
by subclasses). Perhaps just a misconfiguration of my IDE? (I assume you have
that warning disabled or stg?)
* PerfRunData Class.forName on String rather than class.getName() - this is in
fact the code pattern for all reflection actions in PerfRunData, so I just
repeated for the new code whatever was there for the existing code. However, in
fact, I thought about the same thing when adding this code. I think the
difference is that if config contains a specification of another class, the
default class would not even be loaded when using strings, but it will be
loaded (though not used) when using Class.getName(). It is not a big
performance issue, but it does seem "cleaner" to not load classes that are not
going to be used. In any case, if we would like to change that, should revisit
all the .forName(String) of default impls in the benchmark (and I think there
are a few) and it seems to not belong to this issue.
Thanks for the review, working on a new patch - there are several copy/paste
errors in the code where a CloseTaxonomyReader by mistake sets the PFD
IndexReader to null...
> Facet benchmarking
> ------------------
>
> Key: LUCENE-3262
> URL: https://issues.apache.org/jira/browse/LUCENE-3262
> Project: Lucene - Java
> Issue Type: New Feature
> Components: modules/benchmark, modules/facet
> Reporter: Shai Erera
> Assignee: Doron Cohen
> Attachments: CorpusGenerator.java, LUCENE-3262.patch,
> TestPerformanceHack.java
>
>
> A spin off from LUCENE-3079. We should define few benchmarks for faceting
> scenarios, so we can evaluate the new faceting module as well as any
> improvement we'd like to consider in the future (such as cutting over to
> docvalues, implement FST-based caches etc.).
> Toke attached a preliminary test case to LUCENE-3079, so I'll attach it here
> as a starting point.
> We've also done some preliminary job for extending Benchmark for faceting, so
> I'll attach it here as well.
> We should perhaps create a Wiki page where we clearly describe the benchmark
> scenarios, then include results of 'default settings' and 'optimized
> settings', or something like that.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]