+1 for using BCP-47, which will give you the most flexibility overall.
--
Mark A. Matienzo | http://anarchivi.st/
Director of Technology
Digital Public Library of America
On Wed, Jun 1, 2016 at 7:51 PM, Andrew Cunningham
wrote:
> It is better to refer to BCP-47 instead.
>
> https://tools.ietf.o
It is better to refer to BCP-47 instead.
https://tools.ietf.org/html/bcp47
An RFC can be updated; when it is, it receives a new number. For language
tagging, the relevant information is split across two RFCs. BCP-47 is a
permanent IETF identifier referencing the latest versions of the two RFCs
re
On 2 Jun 2016 9:40 am, "Andrew Cunningham" wrote:
>
>
> Ultimately it is what a library is working on, if you are cataloguing
> then all you have is ISO-639-3/B
>
Oops, meant to input ISO-639-2/B
Andrew
Outside the library sector, the most common approach to language tagging
and matching isn't ISO-639-2 or ISO-639-3, but rather BCP-47.
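
To give a feel for what "matching" means here, a minimal Python sketch of
the basic filtering scheme from RFC 4647 (the matching companion to the
tagging RFC): a language range matches a tag if, ignoring case, it equals
the tag or is a prefix of it ending at a "-" boundary. The function name
basic_filter is my own for illustration, not part of any library.

```python
def basic_filter(language_range: str, tag: str) -> bool:
    """Return True if `tag` matches `language_range` (RFC 4647 basic filtering).

    Hypothetical helper for illustration only; it does not validate
    that either argument is a well-formed BCP-47 tag.
    """
    rng = language_range.lower()
    candidate = tag.lower()
    if rng == "*":
        # The wildcard range matches any tag.
        return True
    # Exact match, or a prefix match that stops at a subtag boundary.
    return candidate == rng or candidate.startswith(rng + "-")


if __name__ == "__main__":
    for rng, tag in [("zh", "zh-Hant-TW"), ("zh-Hant", "zh-Hans-CN"), ("de", "den")]:
        print(rng, tag, basic_filter(rng, tag))
```

The boundary check is the point of the exercise: "de" should match
"de-CH" but not "den", which a plain string prefix test would get wrong.
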
Quite a number of ISO-639-2 language tags represent what ISO-639-3 refers
to as macrolanguages. For instance, 'kar' in ISO-639-2 resolves to 20
language codes in ISO-6
I recommend reading https://tools.ietf.org/html/rfc5646 which seems to do
what you need.
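
For a rough idea of how those tags are put together, here is a small
Python sketch that splits a simple tag into language, script, region and
variant subtags by their shape. It is deliberately incomplete: the name
split_tag is made up, and it assumes no extlang, extension, private-use
or grandfathered tags, all of which the RFC also defines. It also does
not check subtags against the IANA registry.

```python
import re

# Simplified shape-based pattern: language, optional script, optional
# region, then zero or more variants.
# Assumption: extlang, extensions, private use and grandfathered tags
# are NOT handled by this sketch.
_SIMPLE_TAG = re.compile(
    r"^(?P<language>[a-z]{2,3})"
    r"(?:-(?P<script>[a-z]{4}))?"
    r"(?:-(?P<region>[a-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*)$",
    re.IGNORECASE,
)


def split_tag(tag):
    """Split a simple language tag into subtags, or return None if the
    tag does not fit the simplified pattern above."""
    m = _SIMPLE_TAG.match(tag)
    if m is None:
        return None
    script = m.group("script")
    region = m.group("region")
    return {
        "language": m.group("language").lower(),
        # Conventional casing: script in title case, region in upper case.
        "script": script.title() if script else None,
        "region": region.upper() if region else None,
        "variants": [v.lower() for v in m.group("variants").split("-") if v],
    }


if __name__ == "__main__":
    for example in ["zh-Hant-TW", "es-419", "sl-rozaj", "de-1996"]:
        print(example, split_tag(example))
```
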
cheers
stuart
--
...let us be heard from red core to black sky
On Thu, Jun 2, 2016 at 10:59 AM, Greg Lindahl wrote:
> Some of the Internet Archive's library partners are asking us about
> language metadata
We've never had any problems sticking to ISO639-2 codes (in cases where there
isn't a shorter ISO639-1 code available). I'm interested in what sort of
regional languages you might be dealing with where there are significant
gaps in that standard?
You might also look at ISO 639-3, which is quite compreh
Some of the Internet Archive's library partners are asking us about
language metadata for regional languages that don't have standard
codes. Is there a standard way of dealing with this situation?
Overall we use MARC codes https://www.loc.gov/marc/languages/ which
were last updated in 2007. LOC a