Including IDN with Solaris

Stefan Teleman <Stefan.Teleman at Sun.COM>
15 March 2007

1.      Summary and motivation

        The inclusion of PHP5 in Solaris has identified a number of
        missing capabilities. One of these capabilities is a generic
        implementation of the Stringprep, Punycode and IDNA specifications
        as defined by IETF Internationalized Domain Names (IDN) Working
        Group. LibIDN provides such an implementation in a portable and
        platform-independent manner. According to the IDN web page at
        GNU.org, LibIDN is known to run on over 20 UNIX-like platforms.

        This FastTrack case proposes the integration of LibIDN in Solaris.
        LibIDN is GNU Software [http://www.gnu.org/software/libidn/] [1]
        and is developed outside of SMI. As such, the SFW Consolidation is
        the natural choice for LibIDN.

        This case proposes the most recent stable release of LibIDN,
        0.6.8.

        This case seeks Micro/Patch Release Binding.

2.      Technical issues

        2.1.    Key objects.

        /usr/bin/idn

        /usr/lib/libidn.so.11.5.22
        /usr/lib/libidn.so.11 -> libidn.so.11.5.22
        /usr/lib/libidn.so -> libidn.so.11.5.22

        /usr/include/idn/stringprep.h
        /usr/include/idn/idna.h
        /usr/include/idn/punycode.h
        /usr/include/idn/idn-free.h
        /usr/include/idn/pr29.h
        /usr/include/idn/tld.h
        /usr/include/idn/idn-int.h

        /usr/share/lib/java/libidn-0.6.8.jar

        /usr/share/man/man1/idn.1
        /usr/share/man/man3/idna_strerror.3
        /usr/share/man/man3/idna_to_ascii_4i.3
        /usr/share/man/man3/idna_to_ascii_4z.3
        /usr/share/man/man3/idna_to_ascii_8z.3
        /usr/share/man/man3/idna_to_ascii_lz.3
        /usr/share/man/man3/idna_to_unicode_44i.3
        /usr/share/man/man3/idna_to_unicode_4z4z.3
        /usr/share/man/man3/idna_to_unicode_8z4z.3
        /usr/share/man/man3/idna_to_unicode_8z8z.3
        /usr/share/man/man3/idna_to_unicode_8zlz.3
        /usr/share/man/man3/idna_to_unicode_lzlz.3
        /usr/share/man/man3/pr29_4.3
        /usr/share/man/man3/pr29_4z.3
        /usr/share/man/man3/pr29_8z.3
        /usr/share/man/man3/pr29_strerror.3
        /usr/share/man/man3/punycode_decode.3
        /usr/share/man/man3/punycode_encode.3
        /usr/share/man/man3/punycode_strerror.3
        /usr/share/man/man3/stringprep.3
        /usr/share/man/man3/stringprep_4i.3
        /usr/share/man/man3/stringprep_4zi.3
        /usr/share/man/man3/stringprep_check_version.3
        /usr/share/man/man3/stringprep_convert.3
        /usr/share/man/man3/stringprep_locale_charset.3
        /usr/share/man/man3/stringprep_locale_to_utf8.3
        /usr/share/man/man3/stringprep_profile.3
        /usr/share/man/man3/stringprep_strerror.3
        /usr/share/man/man3/stringprep_ucs4_nfkc_normalize.3
        /usr/share/man/man3/stringprep_ucs4_to_utf8.3
        /usr/share/man/man3/stringprep_unichar_to_utf8.3
        /usr/share/man/man3/stringprep_utf8_nfkc_normalize.3
        /usr/share/man/man3/stringprep_utf8_to_locale.3
        /usr/share/man/man3/stringprep_utf8_to_ucs4.3
        /usr/share/man/man3/stringprep_utf8_to_unichar.3
        /usr/share/man/man3/tld_check_4.3
        /usr/share/man/man3/tld_check_4t.3
        /usr/share/man/man3/tld_check_4tz.3
        /usr/share/man/man3/tld_check_4z.3
        /usr/share/man/man3/tld_check_8z.3
        /usr/share/man/man3/tld_check_lz.3
        /usr/share/man/man3/tld_default_table.3
        /usr/share/man/man3/tld_get_4.3
        /usr/share/man/man3/tld_get_4z.3
        /usr/share/man/man3/tld_get_table.3
        /usr/share/man/man3/tld_get_z.3
        /usr/share/man/man3/tld_strerror.3

        LibIDN's functionality is provided by one executable [idn], and
        one shared library [libidn.so.*]. Key aspects of the facilities
        provided by LibIDN are discussed below.

        2.2.    Specifications

        LibIDN implements the Stringprep, Punycode and IDNA specifications.

        The Stringprep specification is defined by RFC 3454
        [http://www.ietf.org/rfc/rfc3454.txt] [5]. According to the
        specification, Stringprep "specifies a framework of processing rules
        for Unicode text. Other protocols can create profiles of these rules;
        these profiles will allow users to enter internationalized text
        strings in applications and have the highest chance of getting the
        content of the strings correct." In other words, in and of itself,
        Stringprep is merely a framework for preparing Unicode strings
        (mapping, normalization, and checks for prohibited characters). It
        does not implement any protocols. Custom "profiles" implementing
        formalized protocols can be constructed on top of, and pursuant to,
        the Stringprep specification.
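
        The processing steps a profile chains together can be sketched as
        follows. This is a minimal illustration only: it uses the RFC 3454
        tables shipped in Python's standard "stringprep" module rather than
        LibIDN, and the function name prepare() and its choice of tables
        are assumptions made for the example, not a profile defined by any
        RFC.

```python
import stringprep
from unicodedata import ucd_3_2_0  # Unicode 3.2 data, as RFC 3454 requires


def prepare(text):
    """Apply a small, hypothetical Stringprep-style profile to text."""
    mapped = []
    for ch in text:
        if stringprep.in_table_b1(ch):
            continue  # Table B.1: characters commonly mapped to nothing
        mapped.append(stringprep.map_table_b2(ch))  # Table B.2: case folding
    normalized = ucd_3_2_0.normalize("NFKC", "".join(mapped))
    for ch in normalized:
        # Tables C.1.2 (non-ASCII spaces) and C.2.1/C.2.2 (control characters)
        if stringprep.in_table_c12(ch) or stringprep.in_table_c21_c22(ch):
            raise ValueError("prohibited code point U+%04X" % ord(ch))
    return normalized


if __name__ == "__main__":
    # U+00AD (soft hyphen) is mapped to nothing; "E" is case-folded.
    print(prepare("Exa\u00admple"))  # prints "example"
```

        A real profile additionally specifies bidirectional checks and the
        handling of unassigned code points, which this sketch omits.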

        The Nameprep Internet Protocol specification is defined by RFC 3491
        [http://www.ietf.org/rfc/rfc3491.txt] [6]. Nameprep specifies the
        processing rules which allow user input of internationalized domain
        names (IDNs) into applications, providing the highest success rate
        of correct string conversion. Nameprep is a Stringprep Profile.
        The Nameprep processing rules are intended solely for internationalized
        domain names; they are not suitable for, nor do they support, arbitrary
        text.

        The Nameprep profile defines the following capabilities (as required
        by Stringprep):

                Intended applicability of the profile: Internationalized
                Domain Names [IDN]

                Character repertoire that is the input and output to
                Stringprep: Unicode 3.2
                [http://www.unicode.org/reports/tr28/tr28-3.html] [4]
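
        As a rough illustration of what Nameprep does to a single domain
        name label, the sketch below calls the nameprep() helper from
        Python's standard "encodings.idna" module. That helper is Python's
        own implementation of RFC 3491, not LibIDN's; the wrapper name
        prepare_label() is hypothetical.

```python
from encodings.idna import nameprep


def prepare_label(label):
    """Return the Nameprep form of one label, or None if it is rejected."""
    try:
        return nameprep(label)
    except UnicodeError:
        # Raised when the label contains a prohibited code point or
        # violates the bidirectional-text rules.
        return None


if __name__ == "__main__":
    print(prepare_label("B\u00dcCHER"))   # case-folded to "b\u00fccher"
    print(prepare_label("Stra\u00dfe"))   # sharp s is mapped to "ss"
    print(prepare_label("a\ue000b"))      # private-use code point: None
```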

        Other profiles exist for Stringprep:

                Internet Small Computer Systems Interface [iSCSI] Names,
                [http://www.ietf.org/rfc/rfc3722.txt] [RFC 3722] [7]

                Extensible Messaging and Presence Protocol [XMPP] Core,
                [http://www.ietf.org/rfc/rfc3920.txt] [RFC 3920] [8]

                Stringprep Profile for User Names and Passwords [SASL],
                [http://www.ietf.org/rfc/rfc4013.txt] [RFC 4013] [9]

        None of the profiles enumerated above pertain directly to LibIDN,
        and, for the purposes of this document, will not be discussed further.

        The Punycode Specification is defined by RFC 3492
        [http://www.ietf.org/rfc/rfc3492.txt] [10]. Punycode specifies an
        Internet Standard for implementing a "simple and efficient
        transfer encoding syntax designed for use with Internationalized
        Domain Names in Applications [IDNA]". Simply put, Punycode
        transforms a Unicode string into an ASCII string, and the
        transformation is reversible. ASCII characters in the input are
        represented literally, and non-ASCII characters are encoded
        using only the ASCII characters allowed in host name labels
        [letters, digits, and hyphens].
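
        The round trip described above can be tried with the "punycode"
        codec built into Python. This is a sketch using Python's RFC 3492
        implementation, not LibIDN's punycode_encode() and
        punycode_decode(); the helper names are hypothetical.

```python
def to_punycode(label):
    """Encode one Unicode label as a Punycode (ASCII-only) string."""
    return label.encode("punycode").decode("ascii")


def from_punycode(encoded):
    """Decode a Punycode string back to the original Unicode label."""
    return encoded.encode("ascii").decode("punycode")


if __name__ == "__main__":
    encoded = to_punycode("b\u00fccher")
    # The ASCII letters are copied literally; the non-ASCII letter is
    # encoded after the last hyphen.
    print(encoded)                                   # prints "bcher-kva"
    print(from_punycode(encoded) == "b\u00fccher")   # prints "True"
```

        Note that raw Punycode output does not carry the "xn--" prefix;
        that prefix is added by IDNA.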

        The IDNA Specification is defined by RFC 3490
        [http://www.ietf.org/rfc/rfc3490.txt] [11]. IDNA formalizes a Standard
        for Internationalized Domain Names [IDNs], and a mechanism named
        Internationalizing Domain Names in Applications [IDNA] for handling
        IDNs in a standard manner. IDNs can use characters available in the
        Unicode Universe. IDNA allows the non-ASCII characters to be
        represented using only the ASCII character set allowed in host
        name labels. This backward-compatible conversion and representation
        is required by existing protocols like DNS. This way, IDNs can be
        introduced with no changes to the existing infrastructure. IDNA
        is meant solely for processing domain names, and is not suitable for,
        nor does it support, free text.
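
        The effect of IDNA on a complete domain name can be illustrated
        with Python's built-in "idna" codec, which implements RFC 3490.
        This is a sketch for orientation only and does not call LibIDN;
        the helper names and the example domain are assumptions.

```python
def domain_to_ascii(domain):
    """Convert a possibly internationalized domain name to its ASCII form."""
    return domain.encode("idna").decode("ascii")


def domain_to_unicode(domain):
    """Convert an ASCII (ACE) domain name back to its Unicode form."""
    return domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    # Only the non-ASCII label changes; it gains the "xn--" prefix.
    print(domain_to_ascii("b\u00fccher.example"))  # xn--bcher-kva.example
    # Plain ASCII names pass through unchanged, so existing DNS
    # infrastructure needs no modification.
    print(domain_to_ascii("www.example.com"))      # www.example.com
```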

        2.3.    Language bindings

        LibIDN is written in C. Language bindings for Java are included.

        2.4.    Documentation

        LibIDN provides an extensive and detailed set of man pages for
        all its interfaces. These manual pages will be installed in the
        default Solaris manual page location. Additionally, detailed
        documentation of LibIDN's APIs is provided in HTML format.

3.      Interfaces

        3.1.    Interface Stability

        LibIDN is an Open Source project, and is controlled by a group of
        developers external to SMI [GNU/FSF/Simon Josefsson]. Although
        LibIDN strives to maintain ABI and API compatibility between releases,
        no explicit guarantees of backwards compatibility between releases
        are offered by LibIDN's developers.
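
        Because compatibility between releases is not guaranteed, a
        consumer may want to check the installed library version at run
        time. The sketch below does so through stringprep_check_version(),
        one of the interfaces listed in section 2.1, loaded via Python's
        ctypes module. The wrapper name libidn_version() is hypothetical,
        and the code assumes the shared library can be located under the
        name "idn"; it returns None when that is not the case.

```python
import ctypes
import ctypes.util


def libidn_version(required=None):
    """Return the LibIDN version string, or None if missing or too old."""
    path = ctypes.util.find_library("idn")
    if path is None:
        return None  # LibIDN is not installed on this system
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None  # the library was found but could not be loaded
    check = lib.stringprep_check_version
    check.argtypes = [ctypes.c_char_p]
    check.restype = ctypes.c_char_p
    # A NULL argument asks for the version without any minimum requirement.
    req = required.encode("ascii") if required is not None else None
    version = check(req)
    return version.decode("ascii") if version is not None else None


if __name__ == "__main__":
    print(libidn_version())         # installed version, or None
    print(libidn_version("0.6.8"))  # None if older than 0.6.8 or missing
```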


        3.2.    Imported interfaces

        LibIDN imports Standard C Library interfaces, Socket [-lsocket],
        and Network Services Library [-lnsl] interfaces.

        NAME                            NOTES

        Java [for Java bindings]        PSARC/2002/727

        3.3.    Exported interfaces

        NAME                                    STABILITY       NOTES

        SUNWgnu-idn                             Uncommitted     Package name

        /usr/bin/idn                            Uncommitted     Executable location
        /usr/lib/libidn.so.11.5.22              Uncommitted     Library location
        /usr/lib/libidn.so.11                   Uncommitted     Symbolic link
        /usr/lib/libidn.so                      Uncommitted     Symbolic link
        /usr/share/lib/java/libidn-0.6.8.jar    Uncommitted     JAR file

        /usr/include/idn/                       Uncommitted     Include files

4.      References

        [1]     http://www.gnu.org/software/libidn/
        [2]     http://www.gnu.org/software/libidn/manual/libidn.html
        [3]     http://www.gnu.org/software/libidn/reference/
        [4]     http://www.unicode.org/reports/tr28/tr28-3.html
        [5]     http://www.ietf.org/rfc/rfc3454.txt
        [6]     http://www.ietf.org/rfc/rfc3491.txt
        [7]     http://www.ietf.org/rfc/rfc3722.txt
        [8]     http://www.ietf.org/rfc/rfc3920.txt
        [9]     http://www.ietf.org/rfc/rfc4013.txt
        [10]    http://www.ietf.org/rfc/rfc3492.txt
        [11]    http://www.ietf.org/rfc/rfc3490.txt


-- 
Stefan Teleman
Sun Microsystems, Inc.
Stefan.Teleman at Sun.COM

