On Mon, Oct 19, 2015 at 10:52:27AM +0200, walter harms wrote:
>
>
> On 18.10.2015 23:26, Isaac Dunham wrote:
> > On Sun, Oct 18, 2015 at 07:55:38PM +0200, walter harms wrote:
> >>
> >>
> >> On 18.10.2015 07:54, Isaac Dunham wrote:
> >>> RFC952/RFC1123 limit the characters in a hostname for a node to
> >>> [-a-zA-Z0-9], with '-' being legal only in the middle; we were
> >>> accepting everything from ' ' to '~'.
> >>> (As a byproduct of this, the hostname in dumpleases can now be safely
> >>> used from scripts without sanitization.)
> >>>
> >>> function                                    old     new   delta
> >>> add_lease                                   326     363     +37
> >>> ------------------------------------------------------------------------------
> >>> (add/remove: 0/0 grow/shrink: 1/0 up/down: 37/0)       Total: 37 bytes
> >>>    text    data     bss     dec     hex filename
> >>>  892983    6844    7288  907115   dd76b busybox_old
> >>>  893020    6844    7288  907152   dd790 busybox_unstripped
> >>> ---
> >>>  networking/udhcp/leases.c | 13 ++++++++++---
> >>>  1 file changed, 10 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/networking/udhcp/leases.c b/networking/udhcp/leases.c
> >>> index 745340a..1f7af87 100644
> >>> --- a/networking/udhcp/leases.c
> >>> +++ b/networking/udhcp/leases.c
> >>> @@ -65,12 +65,19 @@ struct dyn_lease* FAST_FUNC add_lease(
> >>>          if (hostname_len > sizeof(oldest->hostname))
> >>>              hostname_len = sizeof(oldest->hostname);
> >>>          p = safe_strncpy(oldest->hostname, hostname, hostname_len);
> >>> -        /* sanitization (s/non-ASCII/^/g) */
> >>> +        /* sanitization - per rfcs 952 & 1123 only [-a-zA-Z0-9] are legal
> >>> +         * with '-' being allowed only in the middle
> >>> +         */
> >>>          while (*p) {
> >>> -            if (*p < ' ' || *p > 126)
> >>> -                *p = '^';
> >>> +            if (! (isupper((char)*p) || islower((char)*p) ||
> >>> +                   isdigit((char)*p) || (char)*p == '-') )
> >>> +                *p = '-';
> >>>              p++;
> >>>          }
> >>> +        if (p--, *p == '-')
> >>> +            *p = 'X';
> >>> +        if (p = oldest->hostname, *p == '-')
> >>> +            *p = 'X';
> >>>      }
> >>>      if (chaddr)
> >>>          memcpy(oldest->lease_mac, chaddr, 6);
> >>
> >> since several tools check for hostnames,
> >> maybe it is useful to make this a function ?
> >
> > What this does is not simply 'check for validity'; it *makes* a hostname
> > valid, which is not what most tools need.
> > It also is exclusively for leaf node names, rather than an FQDN (ie,
> > '.' is not valid here).
> >
> > It would be possible to design a function that can check or fix the
> > hostname depending how it's called, though I wonder if that's
> > doing too much in a single call.
> >
> > It would probably have to be something like this:
> >
> > #define HOSTCHECK_LEAF 0x1 //leaf hostname-no '.' allowed
> > #define HOSTCHECK_FIX  0x2 //fix-replace invalid chars with '-'/'X'
> >
> > //return NULL if valid, pointer to first invalid char otherwise
> > char * validate_hostname(char *p, int flags);
> >
> > This does not handle transforming a URL via punycode, of course.
> >
> > Would such an interface be desirable?
>
> note: i did not make an inventory if this is needed by other
> programs but i can imagine that with 'hostname' it would be useful.
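
For reference, the interface proposed above could look roughly like the
sketch below. This is only an illustration, not code from the patch or from
busybox: the function body, the is_alnum_ascii() helper, the handling of
empty names and empty labels, and the choice to return the first replaced
character in FIX mode are all assumptions. It uses explicit ASCII ranges
rather than isupper()/islower() so the result does not depend on locale or
on the signedness of char.

```c
#include <stddef.h>

#define HOSTCHECK_LEAF 0x1 /* leaf hostname: no '.' allowed */
#define HOSTCHECK_FIX  0x2 /* fix: replace invalid chars with '-'/'X' */

/* Hypothetical helper: ASCII letters and digits only. */
static int is_alnum_ascii(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
        || (c >= '0' && c <= '9');
}

/*
 * Return NULL if the name is valid, otherwise a pointer to the first
 * invalid char. With HOSTCHECK_FIX the string is repaired in place and
 * the pointer refers to the first char that was replaced (assumption).
 */
char *validate_hostname(char *p, int flags)
{
    int dots_ok = !(flags & HOSTCHECK_LEAF);
    char *first_bad = NULL;
    int at_start = 1; /* is *p the first char of a label? */

    if (*p == '\0')
        return p; /* empty name: reported as invalid, nothing to fix */

    for (; *p != '\0'; p++) {
        /* last char of a label: end of string, or a separator follows */
        int at_end = (p[1] == '\0') || (dots_ok && p[1] == '.');
        int ok;

        if (is_alnum_ascii(*p))
            ok = 1;
        else if (*p == '-')
            ok = !at_start && !at_end; /* '-' only in the middle */
        else if (*p == '.' && dots_ok)
            ok = !at_start; /* separator needs a non-empty label before it */
        else
            ok = 0;

        if (!ok) {
            if (first_bad == NULL)
                first_bad = p;
            if (!(flags & HOSTCHECK_FIX))
                return first_bad;
            /* same convention as the patch: 'X' at the edges, '-' inside */
            *p = (at_start || at_end) ? 'X' : '-';
        }
        /* after a fix, *p is '.' only when it is a valid separator */
        at_start = (*p == '.');
    }
    return first_bad;
}
```

Called with HOSTCHECK_LEAF | HOSTCHECK_FIX, this should reproduce what the
patch does for a lease hostname; called with flags == 0 it only checks an
FQDN-style name and leaves the string untouched.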

I see no reason hostnames should be represented as punycode anywhere
except DNS query packets, or in other protocols that require encoding
as such. Everywhere else they should just be normal printable text.

Rich