William Dunlap wrote:
I can reproduce the difference that Stefan saw, depending
on whether or not I start Rgui with the flags
    --no-environ --no-Rconsole
I think it boils down to the isBlankString() function.
For the string "\247" it returns 1 when those flags are
not present and 0 when they are.  isBlankString does use
some locale-specific functions:
Rboolean isBlankString(const char *s)
{
#ifdef SUPPORT_MBCS
    if(mbcslocale) {
        wchar_t wc; int used; mbstate_t mb_st;
        mbs_init(&mb_st);
        while( (used = Mbrtowc(&wc, s, MB_CUR_MAX, &mb_st)) ) {
            if(!iswspace(wc)) return FALSE;
            s += used;
        }
    } else
#endif
        while (*s)
            if (!isspace((int)*s++)) return FALSE;
    return TRUE;
}

I was using R 2.8.1, downloaded precompiled from CRAN, on Windows
XP SP3. The outputs of sessionInfo() and Sys.getenv() are the same
in both sessions.  'Process Explorer' shows that the 2 sessions
have the same dll's opened.

Thanks for that analysis Bill!

Stefan was in "German_Austria.1252" which I don't think is multibyte, so only the else-clause should be relevant, pointing the finger rather squarely at isspace(). Googling indicates that others have been caught out by signed/unsigned char issues there. Should this possibly rather read

if (!isspace((unsigned int)*s++)) return FALSE;

??
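For reference, the C standard only defines isspace() for EOF and for values representable as unsigned char. Where plain char is signed, a cast to unsigned int turns the negative value for "\247" into a very large number, so the idiom usually recommended is a cast to unsigned char. A minimal sketch of the single-byte branch written that way follows; the name is_blank_single_byte is hypothetical and this is not R's actual code:

```c
#include <ctype.h>

/* Hypothetical helper, not taken from the R sources: single-byte
   blank-string check that converts each byte to unsigned char
   before handing it to isspace(). */
int is_blank_single_byte(const char *s)
{
    while (*s) {
        /* isspace() is only defined for EOF and values in the range of
           unsigned char, so convert via unsigned char rather than
           int or unsigned int. */
        if (!isspace((unsigned char)*s++)) return 0;
    }
    return 1;
}
```

With this cast the byte "\247" reaches isspace() as 167 whether or not plain char is signed, so the result no longer depends on how a negative argument happens to be handled.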


sessionInfo()
R version 2.8.1 (2008-12-22) i386-pc-mingw32
locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base
I did the test with a dll compiled from
#include <R.h>
#include <R_ext/Utils.h>

void test_isBlankString(char **s, int *res)
{
   *res = isBlankString(*s) ;
}

and called by .C("test_isBlankString","\247",-1L)

I don't see the difference while running a version of 2.9.0(devel)
compiled locally on 11 March 2009 (from svn rev 48116).

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com
-----Original Message-----
From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of Peter Dalgaard
Sent: Friday, April 10, 2009 2:03 AM
To: Raberger, Stefan
Cc: r-b...@r-project.org; r-de...@stat.math.ethz.ch
Subject: Re: [Rd] type.convert (PR#13646)

Raberger, Stefan wrote:
Hi Peter,

each of the four PCs actually has the same locale setting:
Sys.setlocale("LC_CTYPE")
[1] "German_Austria.1252"

(all the other settings returned by invoking Sys.getlocale() are identical as well).
Just to be sure (because it's displayed incorrectly in my browser on the bugtracking page): the character inside the type.convert function ought to be a "section"-sign (HTML Code &#167; or &sect; , in R "\247", and not a dot ".").

I saw it correctly. It's "\302\247" in UTF8 locales, which is of course the reason I suspected locale settings, but I can't seem to trigger the NA behaviour.
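The two-byte form can be checked by hand: the section sign is U+00A7, and UTF-8 splits a code point in that range into a lead byte and a continuation byte. A small sketch of the arithmetic, assuming a hypothetical helper utf8_encode_small that only handles code points below U+0800:

```c
#include <stddef.h>

/* Hypothetical helper for illustration: encode a code point below
   U+0800 as UTF-8. Returns the number of bytes written (1 or 2),
   or 0 if the code point is outside the range handled here. */
size_t utf8_encode_small(unsigned int cp, unsigned char out[2])
{
    if (cp < 0x80) {                 /* plain ASCII: one byte */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                /* two bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    return 0;
}
```

For cp = 0xA7 this gives the bytes 0xC2 0xA7, which are "\302\247" in octal, while in a single-byte Windows-1252 locale the same character is the lone byte "\247".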

I'm at a loss here, but some ideas:

In the cases where it returns NA, what type is it? (I.e. storage.mode(type.convert(....)))

What do you get from

 > charToRaw("§")
[1] c2 a7

(a7, presumably, but better check).

-p

-----Original Message-----
From: Peter Dalgaard [mailto:p.dalga...@biostat.ku.dk]
Sent: Thursday, April 09, 2009 7:26 PM
To: Raberger, Stefan
Cc: r-de...@stat.math.ethz.ch; r-b...@r-project.org
Subject: Re: [Rd] type.convert (PR#13646)

s.raber...@innovest.at wrote:
Full_Name: Stefan Raberger
Version: 2.8.1
OS: Windows XP
Submission from: (NULL) (213.185.163.242)


Hi there, I recently noticed some strange behaviour of the command "type.convert", depending on the startup mode used. But there also seems to be different behaviour on different PCs (all running the same OS and the same version of R).
On PC1:
When I start R in SDI mode (RGui --no-save --no-restore --no-site-file --no-init-file --no-environ) and try to convert, the result is

type.convert("§")
[1] NA

If I use MDI mode (RGui --no-save --no-restore --no-site-file --no-init-file --no-environ --no-Rconsole) instead, the result is

type.convert("§")
[1] §
Levels: §

On PC2 it's exactly the other way round (SDI: §, MDI: NA), on PC3 the result is always NA, independent of the startup mode used, and on PC4 it's always §. What's the result I should expect R to return, and why is it different in so many cases?

Which locale does R think it is in in the four cases? (Sys.setlocale("LC_CTYPE"), I think).

Might well not be a bug (so please don't file it as one).

Any help is much appreciated!
Regards, Stefan

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalga...@biostat.ku.dk)              FAX: (+45) 35327907




--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalga...@biostat.ku.dk)              FAX: (+45) 35327907
