My reply wasn't meant to be condescending at all; I was just trying to explain the issue. UTF-16LE and UTF-16BE *are* encodings of the 16-bit code units of UTF-16 onto 8-bit code units (bytes). If the server is sending UTF-16 or UTF-32, you should simply use the *W API (e.g. SQLExecDirectW), period, because in some places the 8-bit API can have problems with the embedded 0x00 bytes.
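To make the 0x00 point concrete, here is a minimal sketch in Python (not the ODBC package's code). The query string and the `c_strlen` helper are made up for illustration, and the helper only mimics what a driver would see if it treated a SQLCHAR * buffer as a NUL-terminated C string, which is an assumption about the driver rather than something every driver does.

```python
# Sketch: what a NUL-terminated byte-string API would "see" in UTF-16 data.
# The query text and c_strlen are hypothetical, for illustration only.
query = "SELECT 1"

utf8 = query.encode("utf-8")
utf16le = query.encode("utf-16-le")  # 'S' becomes 0x53 0x00
utf16be = query.encode("utf-16-be")  # 'S' becomes 0x00 0x53


def c_strlen(buf: bytes) -> int:
    """Number of bytes before the first 0x00, as C's strlen would count."""
    idx = buf.find(b"\x00")
    return len(buf) if idx == -1 else idx


for name, buf in [("UTF-8", utf8), ("UTF-16LE", utf16le), ("UTF-16BE", utf16be)]:
    print(f"{name}: {len(buf)} bytes, strlen-style length {c_strlen(buf)}")
```

For ASCII text every other byte of the UTF-16 forms is 0x00, so anything that scans for a terminator stops almost immediately. A driver that is handed an explicit length, or that reinterprets the pointer as wide characters, would not hit this, which may be why it appears to work with some drivers.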
On Thursday, February 4, 2016 at 1:13:12 PM UTC-5, Stefan Karpinski wrote:
>
> The real issue is this:
>
> SQLCHAR is for encodings with 8-bit code units.
>
>
> Condescending lecture on encodings notwithstanding, UTF-16 is not such an
> encoding, yet UTF-16 is what the ODBC package is currently sending
> to SQLExecDirect for an argument of type SQLCHAR * – and somehow it seems
> to be working for many drivers, which still makes no sense to me. I can
> only conclude that some ODBC drivers are treating this as a void * argument
> and they expect pointers to data in whatever encoding they prefer, not
> specifically in encodings with 8-bit code units.
>
> Querying the database about what encoding it expects is a good idea, but
> how does one do that? The SQLGetInfo
> <https://msdn.microsoft.com/en-us/library/ms711681(v=vs.85).aspx>
> function seems like a good candidate but this page doesn't include
> "encoding" or "utf" anywhere.
>
> On Thu, Feb 4, 2016 at 7:53 AM, Milan Bouchet-Valat <nali...@club.fr> wrote:
>
>> On Wednesday, 3 February 2016 at 11:44 -0800, Terry Seaward wrote:
>> > From R, it seems like the encoding is based on the connection (as
>> > opposed to being hard coded). See `enc <- attr(channel, "encoding")`
>> > below:
>> >
>> > ```
>> > [...]
>> > ```
>> >
>> > Digging down `odbcConnect` is just a wrapper for `odbcDriverConnect`
>> > which has the following parameter `DBMSencoding = ""`. This calls the
>> > `C` function `C_RODBCDriverConnect` (available here: RODBC_1.3-12.tar.gz),
>> > which has no reference to encodings. So `attr(channel,
>> > "encoding")` is simply `DBMSencoding`, i.e. `""`.
>> >
>> > It seems to come down to `iconv(..., to = "")` which, from the R
>> > source code, uses `win_iconv.c` attached. I can't seem to find how
>> > `""` is handled, i.e. is there some default value based on the
>> > system?
>> "" refers to the encoding of the current system locale. This is a
>> reasonable guess, but it will probably be wrong in many cases (else, R
>> wouldn't have provided this option at all).
>>
>>
>> Regards