Re: [Linux-uvc-devel] Camera name - what encoding?

Eugen Dedu Sun, 03 May 2009 13:14:27 -0700

Laurent Pinchart wrote:

Hi Eugen,

- Why does the kernel USB subsystem transcode them to Latin-1 and not to UTF-8?


You will have to ask the USB subsystem developers.

- Does the Linux kernel always transcode to Latin-1, or does it depend on how
it was configured?  Is it the same for Linux, Windows and MacOS?  I ask because
I want to re-transcode it to UTF-8, and I need to know its current encoding.

The USB subsystem transcodes the UTF-16 strings to Latin-1 regardless of configuration options. I have no idea how Windows and MacOS handle that.
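
For the re-transcoding step asked about above, a minimal userspace sketch in C might look like the following. It assumes the name read from the kernel really is Latin-1 (ISO-8859-1), as described in this reply. The function name latin1_to_utf8() and its error convention are illustrative choices, not an existing kernel or libc interface.

```c
#include <stddef.h>

/*
 * Convert a NUL-terminated Latin-1 (ISO-8859-1) string to UTF-8.
 *
 * Every Latin-1 byte is the Unicode code point with the same value,
 * so bytes below 0x80 are copied and bytes 0x80..0xFF become a
 * two-byte UTF-8 sequence.
 *
 * Returns the number of bytes written (excluding the terminating NUL),
 * or (size_t)-1 if dst is too small. Hypothetical helper, for
 * illustration only.
 */
size_t latin1_to_utf8(const char *src, char *dst, size_t dst_size)
{
    size_t n = 0;

    if (dst_size == 0)
        return (size_t)-1;

    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;

        if (c < 0x80) {
            /* Need room for this byte plus the final NUL. */
            if (n + 1 >= dst_size)
                return (size_t)-1;
            dst[n++] = (char)c;
        } else {
            /* Need room for two bytes plus the final NUL. */
            if (n + 2 >= dst_size)
                return (size_t)-1;
            dst[n++] = (char)(0xC0 | (c >> 6));
            dst[n++] = (char)(0x80 | (c & 0x3F));
        }
    }

    dst[n] = '\0';
    return n;
}
```

Since each input byte expands to at most two output bytes, a destination buffer of 2 * strlen(src) + 1 bytes is always large enough.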


Hi Laurent,

For info, starting from 2.6.31 (normally) the usb sub-system will return device names in utf-8, see e-mail attached.
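
An application that has to run on kernels both before and after this change cannot know in advance which encoding it will get. One possible approach, sketched below under that assumption, is to test whether the name is well-formed UTF-8 and only treat it as Latin-1 when it is not. The function is_valid_utf8() is a hypothetical helper, not an existing API. The test is a heuristic: a Latin-1 string can happen to be valid UTF-8 as well.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if the NUL-terminated string s is well-formed UTF-8.
 * Rejects stray continuation bytes, truncated sequences, overlong
 * encodings, surrogate code points and values above U+10FFFF.
 * Hypothetical helper, for illustration only.
 */
bool is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p != '\0') {
        unsigned int cp;
        int extra;

        if (*p < 0x80) {                    /* plain ASCII */
            p++;
            continue;
        } else if ((*p & 0xE0) == 0xC0) {   /* 2-byte sequence */
            cp = *p & 0x1F;
            extra = 1;
        } else if ((*p & 0xF0) == 0xE0) {   /* 3-byte sequence */
            cp = *p & 0x0F;
            extra = 2;
        } else if ((*p & 0xF8) == 0xF0) {   /* 4-byte sequence */
            cp = *p & 0x07;
            extra = 3;
        } else {
            return false;                   /* invalid lead byte */
        }
        p++;

        for (int i = 0; i < extra; i++, p++) {
            /* A NUL here fails the test too, so we never run past the end. */
            if ((*p & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (*p & 0x3F);
        }

        /* Overlong encodings use more bytes than the value needs. */
        if ((extra == 1 && cp < 0x80) ||
            (extra == 2 && cp < 0x800) ||
            (extra == 3 && cp < 0x10000))
            return false;
        if (cp >= 0xD800 && cp <= 0xDFFF)   /* UTF-16 surrogates */
            return false;
        if (cp > 0x10FFFF)
            return false;
    }
    return true;
}
```

A caller could pass the name through unchanged when this returns true and convert it from Latin-1 otherwise.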


Thanks!
--
Eugen

--- Begin Message ---

On Fri, Apr 24, 2009 at 03:32:10PM -0700, Greg KH wrote:
> On Fri, Apr 24, 2009 at 10:10:46AM +0200, Clemens Ladisch wrote:
> > Alan Stern wrote:
> > > It is feasible, but there are a couple of things to watch out for:
> > > 
> > >   With latin-1 encoding we know that each character occupies
> > >   only one byte; therefore any descriptor string will fit into a 
> > >   128-byte buffer (since the total descriptor length can't be 
> > >   larger than 255).  But with UTF-8 encoding, a character can 
> > >   occupy more than one byte.  Hence the callers may need to 
> > >   allocate larger buffers than they do now.  For instance, you 
> > >   would definitely want to change usb_cache_string().
> > 
> > That one is the only caller of usb_string() in the kernel that uses a
> > buffer larger than 64 bytes, so I didn't bother about the others.
> > 
> > >   Translation from UTF-16LE to latin-1 is easy.  Translation
> > >   to UTF-8 is harder because it requires you to check for
> > >   invalid code points.  Furthermore, if you write your own code
> > >   to do the translation then you are almost certainly duplicating 
> > >   code that already exists somewhere else in the kernel, which is 
> > >   a bad idea.
> > 
> > The only existing code I've found is utf8_wcstombs(), and it doesn't
> > bother about invalid code points.
> > 
> > I've included the NLS patches here because there doesn't seem to be an
> > NLS maintainer, and you wouldn't want to use the USB patch without those
> > fixes.
> > 
> > Not much tested, because I don't have a USB device with non-ASCII
> > strings.  And I'm not quite sure how applications will handle the
> > encoding change ...
> 
> Hm, I have a device with an extended ascii string:
> 
> $ cat /sys/kernel/debug/usb/devices  | grep Track
> S:  Product=Microsoft Trackball Optical�
> 
> so I'll try them out.

Hey, it fixed it!  With these patches I now get:
$ cat /sys/kernel/debug/usb/devices  | grep Track
S:  Product=Microsoft Trackball Optical®

Very nice, thanks for the patches, I'll queue them up for 2.6.31.

greg k-h

--- End Message ---
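
The two points raised in the attached message, buffer sizing and the handling of invalid code points, can be illustrated with a short C sketch. This is not the code from the patches discussed above. All names are hypothetical, and replacing unpaired surrogates with U+FFFD is only one possible policy. The sizing follows from the descriptor format: the length field is one byte (at most 255), two bytes are header, so a string holds at most 126 UTF-16 code units, and each code unit needs at most 3 bytes in UTF-8 (a surrogate pair is 2 units and 4 bytes).

```c
#include <stddef.h>
#include <stdint.h>

enum {
    USB_DESC_MAX_LEN  = 255,  /* descriptor length is a single byte */
    USB_DESC_HEADER   = 2,    /* length byte + descriptor type byte */
    USB_STR_MAX_UNITS = (USB_DESC_MAX_LEN - USB_DESC_HEADER) / 2,
    USB_STR_UTF8_MAX  = USB_STR_MAX_UNITS * 3 + 1  /* worst case + NUL */
};

/*
 * Convert `units` UTF-16LE code units at src to NUL-terminated UTF-8.
 * Unpaired surrogates are replaced with U+FFFD (an assumed policy).
 * Returns bytes written excluding the NUL, or (size_t)-1 if dst is
 * too small. Hypothetical helper, for illustration only.
 */
size_t utf16le_to_utf8(const uint8_t *src, size_t units,
                       char *dst, size_t dst_size)
{
    size_t n = 0;

    if (dst_size == 0)
        return (size_t)-1;

    for (size_t i = 0; i < units; i++) {
        uint32_t cp = (uint32_t)src[2 * i] | ((uint32_t)src[2 * i + 1] << 8);

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            uint32_t lo = 0;

            if (i + 1 < units)
                lo = (uint32_t)src[2 * i + 2] | ((uint32_t)src[2 * i + 3] << 8);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                i++;                    /* consumed the low surrogate */
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;                /* low surrogate without a high one */
        }

        size_t len = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;

        if (n + len >= dst_size)        /* keep room for the NUL */
            return (size_t)-1;

        if (len == 1) {
            dst[n++] = (char)cp;
        } else if (len == 2) {
            dst[n++] = (char)(0xC0 | (cp >> 6));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        } else if (len == 3) {
            dst[n++] = (char)(0xE0 | (cp >> 12));
            dst[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        } else {
            dst[n++] = (char)(0xF0 | (cp >> 18));
            dst[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            dst[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        }
    }

    dst[n] = '\0';
    return n;
}
```

With these constants a 128-byte buffer is enough for the Latin-1 case (126 characters plus NUL), while the UTF-8 case needs up to 379 bytes, which is why callers with fixed buffers may need to grow them.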

_______________________________________________
Linux-uvc-devel mailing list
[email protected]
https://lists.berlios.de/mailman/listinfo/linux-uvc-devel