I'm not seeing language hints in the document.xml within the docx nor
in the metadata.  Do you know where it might be stored inside the
docx?
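
One way to look for such hints is to scan the XML parts of the docx
directly. Below is a minimal Python sketch (not part of Tika) that does
this. It assumes the usual OOXML locations: w:lang and w:themeFontLang
elements in the word/*.xml parts and dc:language in docProps/core.xml.
The helper name find_language_hints is made up for illustration, and the
regex scan is a shortcut rather than a real XML parse.

```python
import re
import sys
import zipfile

# Assumed locations of language information in OOXML (not exhaustive):
#   <w:lang w:val="..." w:eastAsia="..." w:bidi="..."/>  in word/*.xml
#   <w:themeFontLang w:val="..."/>                        in word/settings.xml
#   <dc:language>...</dc:language>                        in docProps/core.xml
LANG_ELEMENT = re.compile(r"<w:(?:lang|themeFontLang)\b[^>]*>")
LANG_VALUE = re.compile(r'\bw:(?:val|eastAsia|bidi)="([^"]+)"')
DC_LANGUAGE = re.compile(r"<dc:language>([^<]+)</dc:language>")


def find_language_hints(docx):
    """Map each XML part of a .docx to the sorted language tags found in it.

    `docx` may be a file path or a binary file-like object.
    """
    hints = {}
    with zipfile.ZipFile(docx) as archive:
        for name in archive.namelist():
            if not name.endswith(".xml"):
                continue  # skip media, relationships, and other parts
            xml = archive.read(name).decode("utf-8", errors="replace")
            tags = set(DC_LANGUAGE.findall(xml))
            for element in LANG_ELEMENT.findall(xml):
                tags.update(LANG_VALUE.findall(element))
            if tags:
                hints[name] = sorted(tags)
    return hints


if __name__ == "__main__":
    # Usage: python find_lang.py some.docx [more.docx ...]
    for path in sys.argv[1:]:
        print(path, find_language_hints(path))
```

Note that these tags record the proofing language set by the authoring
tool, so they may be absent or may not match the language of the text.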

On Wed, Apr 12, 2023 at 1:01 PM Chetan Bikire <chetab...@gmail.com> wrote:
>
> I am calling Tika through the rmeta/text endpoint of Tika server 2.7.
> Yes, by language detection I mean any metadata field that shows the
> language in which the document is written.
> For example, in our case the attached document contains Spanish content,
> so we expect the metadata to include Content-Language: "es".
>
>
>
> On Wed, Apr 12, 2023 at 8:32 PM Tim Allison <talli...@apache.org> wrote:
>>
>> How are you calling Tika?  By "language", do you mean language
>> detection on the extracted text or an internal metadata flag that says
>> "I'm X language"?
>>
>> On Wed, Apr 12, 2023 at 10:48 AM Chetan Bikire <chetab...@gmail.com> wrote:
>> >
>> > Hi,
>> >
>> > After parsing, Tika does not return the language as part of the parsing
>> > result for some documents, such as .docx and .msg files.
>> > An example document is below.
>> > Please assist.
