William Dunlap wrote:
I can reproduce the difference that Stefan saw, depending
on whether or not I start Rgui with the flags
--no-environ --no-Rconsole
I think it boils down to the isBlankString() function.
For the string "\247" it returns 1 when those flags are
not present and 0 when they are. isBlankString does use
some locale-specific functions:
Rboolean isBlankString(const char *s)
{
#ifdef SUPPORT_MBCS
    if(mbcslocale) {
        wchar_t wc; int used; mbstate_t mb_st;
        mbs_init(&mb_st);
        while( (used = Mbrtowc(&wc, s, MB_CUR_MAX, &mb_st)) ) {
            if(!iswspace(wc)) return FALSE;
            s += used;
        }
    } else
#endif
        while (*s)
            if (!isspace((int)*s++)) return FALSE;
    return TRUE;
}
I was using R 2.8.1, downloaded precompiled from CRAN, on Windows
XP SP3. The outputs of sessionInfo() and Sys.getenv() are the same
in both sessions. 'Process Explorer' shows that the 2 sessions
have the same dll's opened.
Thanks for that analysis Bill!
Stefan was in "German_Austria.1252" which I don't think is multibyte, so
only the else-clause should be relevant, pointing the finger rather
squarely at isspace(). Googling indicates that others have been caught
out by signed/unsigned char issues there. Should this possibly rather read
if (!isspace((unsigned int)*s++)) return FALSE;
??
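A note on the cast itself: the C standard requires the argument of isspace() to be representable as an unsigned char (or to equal EOF). Where char is signed, the byte "\247" has the value -89, and converting that straight to unsigned int wraps it to a very large number rather than 167, so it is still out of range. The cast usually recommended is through unsigned char. Below is a minimal sketch of that idea; the helper names are hypothetical and the loop is only an illustration of the single-byte branch, not R's actual source or the fix that was eventually applied.

```c
#include <ctype.h>

/* Hypothetical helper (not part of R): what (unsigned int) does to a
   negative char value such as -89, i.e. the byte 0xA7 as a signed char. */
static unsigned int cast_to_unsigned_int(signed char c)
{
    return (unsigned int)c;   /* -89 wraps to UINT_MAX - 88, not 167 */
}

/* Hypothetical helper: the cast usually recommended for <ctype.h> calls. */
static int cast_to_unsigned_char(signed char c)
{
    return (unsigned char)c;  /* -89 becomes 167 (0xA7) with 8-bit chars */
}

/* Sketch of the single-byte branch using that cast; illustrative only.
   Returns 1 if s contains nothing but whitespace, 0 otherwise. */
static int is_blank_bytes(const char *s)
{
    while (*s)
        if (!isspace((unsigned char)*s++)) return 0;
    return 1;
}
```

In the default "C" locale, is_blank_bytes("\247") returns 0. What isspace() reports for 0xA7 under a particular Windows code-page locale is a separate question that this sketch does not settle.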
sessionInfo()
R version 2.8.1 (2008-12-22)
i386-pc-mingw32
locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
I did the test with a dll compiled from
#include <R.h>
#include <R_ext/Utils.h>
void test_isBlankString(char **s, int *res)
{
    *res = isBlankString(*s);
}
and called by .C("test_isBlankString","\247",-1L)
I don't see the difference while running a version of 2.9.0(devel)
compiled locally on 11 March 2009 (from svn rev 48116).
Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com
-----Original Message-----
From: r-devel-boun...@r-project.org
[mailto:r-devel-boun...@r-project.org] On Behalf Of Peter Dalgaard
Sent: Friday, April 10, 2009 2:03 AM
To: Raberger, Stefan
Cc: r-b...@r-project.org; r-de...@stat.math.ethz.ch
Subject: Re: [Rd] type.convert (PR#13646)
Raberger, Stefan wrote:
Hi Peter,
each of the four PCs actually has the same locale setting:
Sys.setlocale("LC_CTYPE")
[1] "German_Austria.1252"
(All the other settings returned by invoking Sys.getlocale() are
identical as well.)
Just to be sure (because it's displayed incorrectly in my browser on
the bug-tracking page): the character inside the type.convert call
ought to be a section sign (HTML code &sect; or &#167;, in R "\247",
and not a dot ".").
I saw it correctly. It's "\302\247" in UTF-8 locales, which is of
course the reason I suspected locale settings, but I can't seem to
trigger the NA behaviour.
I'm at a loss here, but some ideas:
In the cases where it returns NA, what type is it? (I.e.
storage.mode(type.convert(....)))
What do you get from
> charToRaw("§")
[1] c2 a7
(a7, presumably, but better check).
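For reference, the two byte sequences mentioned here follow from the encodings: in a single-byte code page such as 1252 the section sign (U+00A7) is the one byte 0xA7 (octal 247), while UTF-8 spreads the same code point over two bytes, 0xC2 0xA7 (octal 302 247). A minimal sketch of the UTF-8 encoding rule, with a hypothetical function name and no validation of surrogates:

```c
#include <stddef.h>

/* Hypothetical helper: encode one Unicode code point as UTF-8.
   Writes up to 4 bytes into out and returns how many were written,
   or 0 if cp is above U+10FFFF. Surrogates are not rejected. */
static size_t utf8_encode(unsigned long cp, unsigned char out[4])
{
    if (cp < 0x80) {                      /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                   /* 3 bytes */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp < 0x110000) {                  /* 4 bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Calling utf8_encode(0xA7, buf) fills buf with the bytes c2 a7, matching the charToRaw() output shown for a UTF-8 session; a session in code page 1252 would be expected to show the single byte a7 instead.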
-p
-----Original Message-----
From: Peter Dalgaard [mailto:p.dalga...@biostat.ku.dk]
Sent: Thursday, April 9, 2009 19:26
To: Raberger, Stefan
Cc: r-de...@stat.math.ethz.ch; r-b...@r-project.org
Subject: Re: [Rd] type.convert (PR#13646)
s.raber...@innovest.at wrote:
Full_Name: Stefan Raberger
Version: 2.8.1
OS: Windows XP
Submission from: (NULL) (213.185.163.242)
Hi there,
I recently noticed some strange behaviour of the command "type.convert",
depending on the startup mode used. But there also seems to be different
behaviour on different PCs (all running the same OS and the same version
of R).
On PC1:
When I start R in SDI mode (RGui --no-save --no-restore --no-site-file
--no-init-file --no-environ) and try to convert, the result is
type.convert("§")
[1] NA
If I use MDI mode (RGui --no-save --no-restore --no-site-file
--no-init-file --no-environ --no-Rconsole) instead, the result is
type.convert("§")
[1] §
Levels: §
On PC2 it's exactly the other way round (SDI: §, MDI: NA), on PC3 the
result is always NA, independent of the startup mode used, and on PC4
it's always §.
What's the result I should expect R to return, and why is it different
in so many cases?
Which locale does R think it is in in the four cases?
(Sys.setlocale("LC_CTYPE"), I think).
Might well not be a bug (so please don't file it as one).
Any help is much appreciated!
Regards, Stefan
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907