[ 
https://issues.apache.org/jira/browse/LUCENE-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123650#comment-13123650
 ] 

Gilad Barkai commented on LUCENE-3262:
--------------------------------------

Doron, great patch!

I ran it and was somewhat surprised at the large overhead of the facet 
indexing. Digging deeper, I found the number of random facets to be 1-120 per 
document, with depth of 1-8. I believe those are overkill requirements. I 
reduced those to 1-*20* per document with depth of 1-*3* and got results I 
could live with.
Those numbers are scenario dependent, but I think most cases I have encountered 
are closer to my proposed numbers. What do you think?
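
To make the two settings concrete, here is a minimal sketch of how a generator might draw 1-20 random facets per document, each with a depth of 1-3. It is only an illustration and not the code from the patch: the class name RandomFacetSketch, the parameter maxFacetsPerDoc, and the "c<level>_<n>" component naming are hypothetical, and maxFacetDepth follows the rename suggested further down in this comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch; not the generator used by the LUCENE-3262 patch. */
public class RandomFacetSketch {
    private final int maxFacetsPerDoc; // hypothetical name: upper bound on facets per document
    private final int maxFacetDepth;   // upper bound on the number of components in one facet path
    private final Random random;

    public RandomFacetSketch(int maxFacetsPerDoc, int maxFacetDepth, long seed) {
        if (maxFacetsPerDoc < 1 || maxFacetDepth < 1) {
            throw new IllegalArgumentException("both limits must be at least 1");
        }
        this.maxFacetsPerDoc = maxFacetsPerDoc;
        this.maxFacetDepth = maxFacetDepth;
        this.random = new Random(seed); // fixed seed keeps runs repeatable
    }

    /** Returns 1..maxFacetsPerDoc paths, each with 1..maxFacetDepth components. */
    public List<String[]> nextDocFacets() {
        int numFacets = 1 + random.nextInt(maxFacetsPerDoc);
        List<String[]> facets = new ArrayList<>(numFacets);
        for (int i = 0; i < numFacets; i++) {
            int depth = 1 + random.nextInt(maxFacetDepth);
            String[] path = new String[depth];
            for (int level = 0; level < depth; level++) {
                // Made-up component naming: level index plus a small random id.
                path[level] = "c" + level + "_" + random.nextInt(10);
            }
            facets.add(path);
        }
        return facets;
    }

    public static void main(String[] args) {
        // The reduced settings proposed above: 1-20 facets per document, depth 1-3.
        RandomFacetSketch gen = new RandomFacetSketch(20, 3, 42L);
        for (String[] path : gen.nextDocFacets()) {
            System.out.println(String.join("/", path));
        }
    }
}
```

Under these assumptions a document gets at most 20 * 3 = 60 path components, against 120 * 8 = 960 with the original limits, which is consistent with the lower indexing overhead I saw.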

Also, I changed the alg to consume the entire content source.

I would suggest renaming max.facet.length (in the alg) & maxFacetLengh (in the 
code) to max.facet.*depth* and maxFacetDepth. Depth seems more appropriate. 

Other than that - I'm thrilled to have a working benchmark with facets - thanks!
                
> Facet benchmarking
> ------------------
>
>                 Key: LUCENE-3262
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3262
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: modules/benchmark, modules/facet
>            Reporter: Shai Erera
>            Assignee: Doron Cohen
>         Attachments: CorpusGenerator.java, LUCENE-3262.patch, 
> LUCENE-3262.patch, LUCENE-3262.patch, TestPerformanceHack.java
>
>
> A spin-off from LUCENE-3079. We should define a few benchmarks for faceting 
> scenarios, so we can evaluate the new faceting module as well as any 
> improvements we'd like to consider in the future (such as cutting over to 
> docvalues, implementing FST-based caches, etc.).
> Toke attached a preliminary test case to LUCENE-3079, so I'll attach it here 
> as a starting point.
> We've also done some preliminary work on extending Benchmark for faceting, so 
> I'll attach it here as well.
> We should perhaps create a Wiki page where we clearly describe the benchmark 
> scenarios, then include results of 'default settings' and 'optimized 
> settings', or something like that.
