[ 
https://issues.apache.org/jira/browse/TIKA-492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated TIKA-492:
-----------------------------

    Description: 
We need to add support for Sami languages.

According to the document "Requirements for support for Sami languages in data 
processing" (http://www.samit.no/01-850-51.pdf), Tika will get "Basic Level" 
support by detecting North Sami, Lule Sami and South Sami.

  was:
Currently there is one Norwegian language profile in Tika - "no". We need to 
distinguish between the two official Norwegian languages defined by ISO 639-1 
codes "nb" and "nn". It is recommended to use those codes instead of the common 
"no" tag.

Proposed solution: remove the current language profile no.ngp and replace 
it with two new ones for nb and nn.

We must also add tests for Norwegian.


> Add language identification support for North Sami, Lule Sami and South Sami
> ----------------------------------------------------------------------------
>
>                 Key: TIKA-492
>                 URL: https://issues.apache.org/jira/browse/TIKA-492
>             Project: Tika
>          Issue Type: New Feature
>          Components: languageidentifier
>    Affects Versions: 0.7
>            Reporter: Jan Høydahl
>
> We need to add support for Sami languages.
> According to the document "Requirements for support for Sami languages in data 
> processing" (http://www.samit.no/01-850-51.pdf), Tika will get "Basic Level" 
> support by detecting North Sami, Lule Sami and South Sami.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
