Hi!
Judson Valeski wrote:
>
> We decided on the following proposal. dougt, rpotts, chak, dmose, gagan, valeski,
>and nhotta attended the meeting.
>
> URIs would accept, and store, only UTF8 encoded strings. Protocols not able to
>handle UTF8 (HTTP for example), would access the charset attribute (proposed) off of
>nsIURI to convert back to the original string. The charset would be set by the URI
>creator as they have the best charset context. Is nsIURI the right
> place for the charset attribute?
I think it is. We should also move away from the char representation of
the URI components and use strings instead.
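To make the idea concrete, here is a rough sketch in Python (not necko
code). The names Uri, uri_from_document and original_bytes are made up
for illustration and are not part of nsIURI. It assumes the creator
knows the document charset and hands it over together with the raw
bytes:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Uri:
    """Hypothetical stand-in for a URI object; not the real nsIURI."""
    spec: str      # internal form, held as a Unicode string
    charset: str   # charset supplied by the URI creator

    def utf8_bytes(self) -> bytes:
        # What a UTF8-capable protocol (LDAP, say) would put on the wire.
        return self.spec.encode("utf-8")

    def original_bytes(self) -> bytes:
        # What a protocol that cannot use UTF8 would send: the string
        # converted back to the creator's charset.
        return self.spec.encode(self.charset)


def uri_from_document(raw: bytes, charset: str) -> Uri:
    # The creator has the best charset context, so it decodes the bytes
    # and records which charset they came from.
    return Uri(raw.decode(charset), charset)
```

With a Latin-1 page, original_bytes() gives back exactly the bytes the
page contained, while utf8_bytes() gives the UTF8 form of the same
string.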
> The current ASCII % encoding would be removed from the internal URI representation.
>Again, this encoding would be pushed out to the protocol level.
So we will have two levels of %-encoding? I don't think the %-encoding
can be removed completely. The first level applies to all URIs and
escapes reserved chars, as the current code does. On a second level,
non-ASCII chars can be encoded as the protocol needs them.
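Roughly what I have in mind, again as a Python sketch and not as a
proposal for the actual implementation. The function names and the set
of characters treated as safe are my assumptions; the real set depends
on the URI component:

```python
import string

# Assumed set of ASCII chars that never need escaping in this sketch.
SAFE = set(string.ascii_letters + string.digits + "-._~")


def escape_reserved(segment: str) -> str:
    """Level 1: escape unsafe ASCII chars, leave non-ASCII chars alone."""
    out = []
    for ch in segment:
        if ord(ch) < 128 and ch not in SAFE:
            out.append("%{:02X}".format(ord(ch)))
        else:
            out.append(ch)
    return "".join(out)


def escape_for_protocol(segment: str, charset: str) -> str:
    """Level 2: escape non-ASCII chars as bytes in the protocol's charset."""
    out = []
    for ch in segment:
        if ord(ch) < 128:
            out.append(ch)  # ASCII, including level 1 escapes, passes through
        else:
            out.extend("%{:02X}".format(b) for b in ch.encode(charset))
    return "".join(out)


level1 = escape_reserved("a b/\u00e9")             # 'a%20b%2F\u00e9'
print(escape_for_protocol(level1, "utf-8"))        # a%20b%2F%C3%A9
print(escape_for_protocol(level1, "iso-8859-1"))   # a%20b%2F%E9
```

The first level is the same for every URI. Only the second level
depends on the protocol and on the charset.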
> Currently, necko provides the ability to create both UTF8 encoded URIs, as well as
>ASCII URIs. This is a bug that needs to be fixed so *all* necko URI creation
>facilities would create UTF8 URIs.
>
> This proposal addresses LDAP's immediate need for UTF8 URIs (it is a protocol that
>can handle UTF8 strings), as well as HTTP's need to *not* use UTF8 (the charset
>attribute will allow HTTP to convert back to the original string).
>
> IDNS, and future HTTP servers handling UTF8 are believed to be covered under this
>model.
>
> Migration to this new world would be phased something like the following to minimize
>impact...
>
> First phase:
> - The URI charset attribute would be added first, and URI creators would start
>feeding in the charset.
> - Necko would provide consistency in URI creation facilities (all UTF8), and
>callers/users expecting non-UTF8 URIs would need to deal w/ the new encoding.
> - HTTP would convert out of UTF8 before sending requests (fixes chak's bug).
>
> Second phase:
> - ASCII % encoding would be removed from the url implementation(s), and pushed out
>to the protocols who need it. Callers expecting the encoding would also need to be
>repaired to handle the new UTF8 format.
>
> Jud
Andreas