[ https://issues.apache.org/jira/browse/TIKA-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Allison resolved TIKA-4317.
-------------------------------
Resolution: Fixed
> Abusive content on https://corpora.tika.apache.org/
> ---------------------------------------------------
>
> Key: TIKA-4317
> URL: https://issues.apache.org/jira/browse/TIKA-4317
> Project: Tika
> Issue Type: Bug
> Components: site
> Reporter: Zoran Regvart
> Assignee: Tim Allison
> Priority: Major
>
> The Apache Camel team has been notified by Google of abusive content hosted
> on https://corpora.tika.apache.org/, with the assumption that this is somehow
> related to https://camel.apache.org. Google scans the whole apache.org
> domain, so the implication is that abusive content found on any subdomain of
> apache.org will be attributed to, and will affect, the other subdomains of
> apache.org.
> Learn about abusive experiences here:
> https://support.google.com/webtools/answer/7347327.
> Page singled out in the Google report (content and possibly a security warning):
> {code}https://corpora.tika.apache.org/base/docs/commoncrawl3/QK/QKKJTNDRIVLIPP7433IFC3EF3UVOSPIB{code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)