[
https://issues.apache.org/jira/browse/LUCENE-4584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508292#comment-13508292
]
Uwe Schindler commented on LUCENE-4584:
---------------------------------------
I agree with Robert here. We don't need to test random data; for Lucene, only
two things are important:
- When you compress random data and decompress it again, exactly the same bytes
must come back. This should be tested and needs no external C code. This is the
"doesn't corrupt"™ Robert is talking about.
- The compressed content should never get significantly bigger than the input.
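
The two checks above can be sketched as a small self-contained test. This is
only an illustration under stated assumptions: it uses java.util.zip
(Deflater/Inflater) as a stand-in codec, not Lucene's LZ4, and the names
RoundTripCheck and roundTripsWithinBound as well as the expansion bound are
hypothetical choices made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.Random;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class RoundTripCheck {

  // Stand-in compressor (java.util.zip), NOT Lucene's LZ4.
  static byte[] compress(byte[] input) {
    Deflater deflater = new Deflater();
    deflater.setInput(input);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[4096];
    while (!deflater.finished()) {
      int n = deflater.deflate(buf);
      out.write(buf, 0, n);
    }
    deflater.end();
    return out.toByteArray();
  }

  // Stand-in decompressor; fails if fewer than originalLength bytes come back.
  static byte[] decompress(byte[] compressed, int originalLength) {
    Inflater inflater = new Inflater();
    try {
      inflater.setInput(compressed);
      byte[] restored = new byte[originalLength];
      int off = 0;
      while (off < originalLength && !inflater.finished()) {
        int n = inflater.inflate(restored, off, originalLength - off);
        if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
          break; // no progress possible: input is truncated or corrupt
        }
        off += n;
      }
      if (off != originalLength) {
        throw new IllegalStateException("decompressed " + off + " of " + originalLength + " bytes");
      }
      return restored;
    } catch (DataFormatException e) {
      throw new IllegalStateException("corrupt compressed data", e);
    } finally {
      inflater.end();
    }
  }

  // Check 1: the same bytes come back. Check 2: output is not significantly bigger.
  // maxExpansion and the 64-byte slack are assumptions made for this sketch.
  static boolean roundTripsWithinBound(byte[] input, double maxExpansion) {
    byte[] compressed = compress(input);
    byte[] restored = decompress(compressed, input.length);
    boolean sameBytes = Arrays.equals(input, restored);
    boolean bounded = compressed.length <= input.length * maxExpansion + 64;
    return sameBytes && bounded;
  }

  public static void main(String[] args) {
    Random random = new Random(42L); // fixed seed so failures are reproducible
    for (int size : new int[] {16, 1024, 1 << 16}) {
      byte[] data = new byte[size];
      random.nextBytes(data);
      System.out.println(size + " bytes: " + (roundTripsWithinBound(data, 1.01) ? "ok" : "FAILED"));
    }
  }
}
```

Neither check needs the C implementation: both compare the codec only with its
own input.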
There is no reason at all that Lucene's LZ4 must return the same compressed
output as the original implementation. E.g., if we find an algorithm that
performs better in HotSpot, although it compresses to a different byte array,
we are perfectly fine.
If we want to assert for now that both implementations create the same
compressed output, we should have three random byte files of different sizes
(e.g. generated from /dev/urandom) as test resources, plus the C-compressed
versions as test resources, and then we can compare the results. We should just
document how the test data was created. But keep in mind: we may change the
algorithm to produce different bytes, so this is not mandatory. I think we may
only assert that the compression ratio of the random data is identical, not the
actual bytes.
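
A minimal sketch of that weaker assertion, assuming the compressed output of
both implementations is already available as byte arrays (for example, loaded
from test resources). The class, enum, and method names here are hypothetical.
For the same input, equal compressed size means an equal compression ratio, so
only a size difference is treated as a failure.

```java
import java.util.Arrays;

final class ReferenceComparison {

  enum Outcome { IDENTICAL_BYTES, SAME_SIZE_ONLY, DIFFERENT_SIZE }

  // Classifies how our compressed output relates to the reference output.
  static Outcome compare(byte[] ours, byte[] reference) {
    if (Arrays.equals(ours, reference)) {
      return Outcome.IDENTICAL_BYTES;
    }
    if (ours.length == reference.length) {
      return Outcome.SAME_SIZE_ONLY;
    }
    return Outcome.DIFFERENT_SIZE;
  }

  // Different bytes are acceptable; only a different compressed size is not.
  static boolean acceptable(byte[] ours, byte[] reference) {
    return compare(ours, reference) != Outcome.DIFFERENT_SIZE;
  }

  public static void main(String[] args) {
    // Toy arrays standing in for real compressed files.
    byte[] reference = {1, 2, 3, 4};
    System.out.println(compare(new byte[] {1, 2, 3, 4}, reference));
    System.out.println(compare(new byte[] {4, 3, 2, 1}, reference));
    System.out.println(compare(new byte[] {1, 2, 3, 4, 5}, reference));
  }
}
```

Reporting byte identity separately from size equality keeps the test useful if
the algorithm later changes to produce different bytes.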
> Compare the LZ4 implementation in Lucene against the original impl
> ------------------------------------------------------------------
>
> Key: LUCENE-4584
> URL: https://issues.apache.org/jira/browse/LUCENE-4584
> Project: Lucene - Core
> Issue Type: Task
> Reporter: Adrien Grand
> Assignee: Adrien Grand
> Fix For: 4.1
>
>
> We should add tests to make sure that the LZ4 impl in Lucene compresses data
> the exact same way as the original impl.