This is an automated email from the ASF dual-hosted git repository.
tallison pushed a change to branch branch_1x
in repository https://gitbox.apache.org/repos/asf/tika.git.
from 499394e TIKA-3140 -- add the tika-eval metadata filter to a service file so that it loads automatically
new 08e9b08 TIKA-3146 -- add Nutch's TextProfileSignature to tika-eval
new 523cb85 TIKA-3145 -- add TextSha256Signature
new 1a0314f TIKA-3146 -- clean up text profile signature and add unit test for cjk
The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
Summary of changes:
...tatsCalculator.java => BytesRefCalculator.java} | 12 +-
.../textstats/CompositeTextStatsCalculator.java | 74 ++++++++++--
.../tika/eval/textstats/TextProfileSignature.java | 126 +++++++++++++++++++++
.../tika/eval/textstats/TextSha256Signature.java | 54 +++++++++
.../apache/tika/eval/textstats/TextStatsTest.java | 105 +++++++++++++++++
5 files changed, 357 insertions(+), 14 deletions(-)
copy tika-eval/src/main/java/org/apache/tika/eval/textstats/{StringStatsCalculator.java => BytesRefCalculator.java} (77%)
create mode 100644 tika-eval/src/main/java/org/apache/tika/eval/textstats/TextProfileSignature.java
create mode 100644 tika-eval/src/main/java/org/apache/tika/eval/textstats/TextSha256Signature.java
create mode 100644 tika-eval/src/test/java/org/apache/tika/eval/textstats/TextStatsTest.java