janhoy commented on code in PR #3780:
URL: https://github.com/apache/solr/pull/3780#discussion_r2437387130

##########
solr/solr-ref-guide/modules/indexing-guide/pages/indexing-with-tika.adoc:
##########
@@ -355,6 +410,40 @@ See the section <<Indexing Encrypted Documents>> for more information about usin
 +
 Example: `passwordsFile=/path/to/passwords.txt`
+// TODO: Feature exists but keep undocumented for now
+// `tikaserver.metadata.compatibility`::
+// +
+// [%autowidth,frame=none]
+// |===
+// |Optional |Default: false
+// |===
+// +
+// When enabled, Solr Cell tries to map some common metadata to other common names, e.g. `dc:author` is mapped also to `Author`. This can be useful if switching from `local` to `tikaserver` backend, since `tikaserver` uses more industry standard name-spaced metadata keys.
+// +
+// Only applicable for `tikaserver` backend. Can only be set in `solrconfig.xml`, not per request.
+
+`tikaserver.maxChars`::
++
+[%autowidth,frame=none]
+|===
+|Optional |Default: 100 MBytes
+|===
++
+Sets a hard limit on the number of bytes Solr will accept from the Tika Server response body when using the `tikaserver` backend. If the extracted content exceeds this limit, the request will fail with HTTP 400 (Bad Request).
++
+Only applicable for the `tikaserver` backend. This parameter can only be configured in the request handler configuration (`solrconfig.xml`), not per request.
++
+Example: In `solrconfig.xml`: `<long name="tikaserver.maxChars">1000000</long>`

Review Comment:
   Yeah, I thought about that at some point. It is indeed bytes. But most normal chars take exactly one byte, at least in English and Western languages, so it's not THAT wrong... Do you think it is worth changing?
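
   For reference, a minimal sketch of how far the two counts can drift apart. It assumes the Tika Server response body is UTF-8 encoded (an assumption, not verified here); the class and method names are made up for illustration and are not part of Solr. Plain ASCII text does take one byte per char, but accented Western letters such as `å` or `æ` take two bytes in UTF-8, and CJK characters take three:

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: compares char count with UTF-8 byte count.
// Not Solr code; names are hypothetical.
public class ByteVsCharLength {

    /** Number of bytes the string occupies when encoded as UTF-8. */
    public static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file pure ASCII.
        String ascii = "Solr";                  // 4 ASCII letters
        String nordic = "Bl\u00e5b\u00e6r";     // 6 chars, two of them non-ASCII
        String cjk = "\u65e5\u672c\u8a9e";      // 3 CJK characters

        System.out.println("ascii:  chars=" + ascii.length() + " bytes=" + utf8Bytes(ascii));
        System.out.println("nordic: chars=" + nordic.length() + " bytes=" + utf8Bytes(nordic));
        System.out.println("cjk:    chars=" + cjk.length() + " bytes=" + utf8Bytes(cjk));
    }
}
```

   So a limit expressed in bytes is reached earlier than the same number of chars would suggest for any text that is not pure ASCII, which may be worth keeping in mind when deciding on the parameter name.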
