dependabot[bot] opened a new pull request #183: URL: https://github.com/apache/any23/pull/183
Bumps `tika.version` from 1.27 to 2.1.0.

Updates `tika-core` from 1.27 to 2.1.0
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a href="https://github.com/apache/tika/blob/main/CHANGES.txt">tika-core's changelog</a>.</em></p>
<blockquote>
<p>Release 2.1.1 - ???</p>
<ul>
<li><p>Improve robustness and features of the httpfetcher (TIKA-3543)</p></li>
<li><p>Add optional fetch ranges to FetchEmitTuple to allow range fetching from, e.g. http or s3 (TIKA-3542).</p></li>
<li><p>Exclude dependencies on jsoup and ehcache in ucar grib/cdm (TIKA-3003).</p></li>
</ul>
<p>Release 2.1.0 - 08/18/2021</p>
<p>MAJOR CHANGES in 2.1.0:</p>
<ul>
<li><p>Improved packaging for tika-parsers-extended. Use the tika-parser-scientific-package and tika-parser-sqlite3-package artifacts if you want fat jars with dependencies. (TIKA-3510)</p></li>
<li><p>Tika app writes UTF-8 when an encoding is not specified; the legacy behavior was UTF-8 on Mac OS, but System default on other OSs (TIKA-3515).</p></li>
<li><p>Change the default rendering strategy for PDFs from NO_TEXT to ALL (TIKA-3520).</p></li>
</ul>
<p>Other changes:</p>
<ul>
<li><p>Fixed bug that pointed to the wrong tessdata directory if the user specified a tesseract path but not also a tessdata path (TIKA-3518).</p></li>
<li><p>Fixed bug in Icu4j's encoding detector where it would return non-standard names for charsets, e.g. IBM424_rtl is now returned as IBM424 (TIKA-3516).</p></li>
<li><p>Add a simple UrlFetcher in tika-core as a basic alternative to tika-fetcher-http (TIKA-3527).</p></li>
<li><p>Add tika-pipes support for Google Cloud Storage (TIKA-3524).</p></li>
<li><p>Fix markup ordering errors in xhtml output for ODT files (TIKA-2242).</p></li>
<li><p>Fix serialization of embedded docs in OpenSearch emitter and fix embedded documents not being indexed in some use cases in the Solr emitter (TIKA-3490).</p></li>
<li><p>Add pipesClientId system property to PipesServer so that each forked process can log to its own logger (TIKA-3480).</p></li>
<li><p>Add DateNormalizingMetadataFilter let users ensure that all dates emitted to Solr/OpenSearch are in UTC. Users can configure which timezone they'd like to use in cases where the file format does not store a timezone (TIKA-3496).</p></li>
<li><p>Breaking change in the Solr and OpenSearch emitters. To achieve</p></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a href="https://github.com/apache/tika/commits">compare view</a></li>
</ul>
</details>
<br />

Updates `tika-parsers` from 1.27 to 2.1.0
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a href="https://github.com/apache/tika/blob/main/CHANGES.txt">tika-parsers's changelog</a>.</em></p>
<blockquote>
<p>Release 2.1.1 - ???</p>
<ul>
<li><p>Improve robustness and features of the httpfetcher (TIKA-3543)</p></li>
<li><p>Add optional fetch ranges to FetchEmitTuple to allow range fetching from, e.g. http or s3 (TIKA-3542).</p></li>
<li><p>Exclude dependencies on jsoup and ehcache in ucar grib/cdm (TIKA-3003).</p></li>
</ul>
<p>Release 2.1.0 - 08/18/2021</p>
<p>MAJOR CHANGES in 2.1.0:</p>
<ul>
<li><p>Improved packaging for tika-parsers-extended. Use the tika-parser-scientific-package and tika-parser-sqlite3-package artifacts if you want fat jars with dependencies. (TIKA-3510)</p></li>
<li><p>Tika app writes UTF-8 when an encoding is not specified; the legacy behavior was UTF-8 on Mac OS, but System default on other OSs (TIKA-3515).</p></li>
<li><p>Change the default rendering strategy for PDFs from NO_TEXT to ALL (TIKA-3520).</p></li>
</ul>
<p>Other changes:</p>
<ul>
<li><p>Fixed bug that pointed to the wrong tessdata directory if the user specified a tesseract path but not also a tessdata path (TIKA-3518).</p></li>
<li><p>Fixed bug in Icu4j's encoding detector where it would return non-standard names for charsets, e.g. IBM424_rtl is now returned as IBM424 (TIKA-3516).</p></li>
<li><p>Add a simple UrlFetcher in tika-core as a basic alternative to tika-fetcher-http (TIKA-3527).</p></li>
<li><p>Add tika-pipes support for Google Cloud Storage (TIKA-3524).</p></li>
<li><p>Fix markup ordering errors in xhtml output for ODT files (TIKA-2242).</p></li>
<li><p>Fix serialization of embedded docs in OpenSearch emitter and fix embedded documents not being indexed in some use cases in the Solr emitter (TIKA-3490).</p></li>
<li><p>Add pipesClientId system property to PipesServer so that each forked process can log to its own logger (TIKA-3480).</p></li>
<li><p>Add DateNormalizingMetadataFilter let users ensure that all dates emitted to Solr/OpenSearch are in UTC. Users can configure which timezone they'd like to use in cases where the file format does not store a timezone (TIKA-3496).</p></li>
<li><p>Breaking change in the Solr and OpenSearch emitters. To achieve</p></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a href="https://github.com/apache/tika/commits">compare view</a></li>
</ul>
</details>
<br />

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself.
You can also trigger a rebase manually by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

</details>

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@any23.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org