Lucene's test framework makes heavy use of randomization in order to explore more of the vast space of possible states. You might be familiar with this as "fuzz testing". There's a blog post about it here (from 2011!): https://blog.mikemccandless.com/2011/03/your-test-cases-should-sometimes-fail.html and an entire project spun out from Lucene (I think that's how it started, anyway) to support this: https://github.com/randomizedtesting/randomizedtesting/wiki - you can even watch a video about it (linked from that GitHub page).

The important point is that the randomness is controlled: each test run uses a seed, and when a test fails the framework reports that seed, so the exact same "random" choices can be replayed to reproduce the failure. So the tests are not non-deterministic in the problematic sense - any given run is fully reproducible from its seed.
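A minimal sketch of the idea in plain Java (the class and method names below are illustrative, not the actual Lucene API - the real framework manages the seed for you and prints it, e.g. via -Dtests.seed, when a test fails):

```java
import java.util.Random;

public class SeededRandomDemo {

    // Pick between two hypothetical attribute-factory flavors "randomly"
    // but reproducibly: the same seed always yields the same choice,
    // much like the coin flip in BaseTokenStreamTestCase.newAttributeFactory.
    static String pickAttributeFactory(long seed) {
        Random random = new Random(seed);
        return random.nextBoolean()
                ? "DEFAULT_ATTRIBUTE_FACTORY"   // illustrative names only
                : "MockAttributeFactory";
    }

    public static void main(String[] args) {
        long seed = 42L; // in a real run this would come from the framework

        // Two runs with the same seed make the same choice every time,
        // so a failure seen once can be replayed exactly.
        String first = pickAttributeFactory(seed);
        String second = pickAttributeFactory(seed);
        System.out.println(first.equals(second)); // prints "true"
    }
}
```

Across many CI runs the seed varies, so over time the tests cover both branches - more coverage than hard-coding either choice - while any single failing run stays reproducible.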
On Thu, Sep 26, 2024 at 12:36 PM Seunghan Jung <ajtwlstmd...@gmail.com> wrote:
>
> Hello Lucene Developers,
>
> I'm writing because I have a question regarding the Tokenizer-related test code.
>
> I was looking at the following code in the link below:
> https://github.com/apache/lucene/blob/7b4b0238d7048a0f8532ce55afb72f89dfd69b1c/lucene/test-framework/src/java/org/apache/lucene/tests/analysis/BaseTokenStreamTestCase.java#L1547-L1558
>
> I noticed that the `newAttributeFactory` method, which is used for Tokenizer
> testing, contains a random element. I was wondering why randomness was
> introduced here.
>
> From my understanding, random elements should not be included in tests
> because they can produce different results on multiple test runs. Was there a
> specific reason for this?
>
> If anyone is familiar with the history of this, I'd really appreciate your
> insight.
>
> Thank you.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org