On 13-10-16 4:59 AM, Gerrit Eichner wrote:
Dear Duncan,
unfortunately, I have to correct myself: I _can_ reproduce the problem
after changing the global width option to, say, 70. Using the data
frame X from before with the 'factory-fresh' setting for width and
executing
str( X, strict.width = "cut")
'data.frame': 11 obs. of 2 variables:
$ A: num 1e+04 2e+04 3e+04 4e+04 5e+04 6e+04 7e+04 8e+04 9e+04 1e+05 ...
$ B: Factor w/ 1 level "zjtvorkmoydsepnxkabmeondrjaanutjmfxlgzmrbjp": 1 1 1 1..
produces the correct output. But
oo <- options( width = 70)
str( X, strict.width = "cut")
'data.frame': 11 obs. of 2 variables:
$ A: num 1e+04 2e+04 3e+04 4e+04 5e+04 6e+04 7e+04 8e+04 9e+04 1e+..
$ A: num 1e+04 2e+04 3e+04 4e+04 5e+04 6e+04 7e+04 8e+04 9e+04 1e+..
is obviously the wrong output I reported previously. Restoring the old
options "solves" the problem:
options( oo)
str( X, strict.width = "cut")
'data.frame': 11 obs. of 2 variables:
$ A: num 1e+04 2e+04 3e+04 4e+04 5e+04 6e+04 7e+04 8e+04 9e+04 1e+05 ...
$ B: Factor w/ 1 level "zjtvorkmoydsepnxkabmeondrjaanutjmfxlgzmrbjp": 1 1 1 1..
Is that reproducible for you?
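The save-and-restore pattern used above (oo <- options( width = 70) followed by options( oo)) can be wrapped in a small helper so that the old width comes back even if str() stops with an error. This is only a sketch: str_at_width is a hypothetical name, not part of R, and the data frame is a simplified stand-in for X that uses a fixed 43-character string and stringsAsFactors = TRUE so that B is a factor as in the thread.

```r
# Hypothetical helper (not part of R): run str() at a given line width and
# restore the previous 'width' option afterwards, even if str() fails.
str_at_width <- function(x, width, ...) {
  old <- options(width = width)       # options() returns the previous values
  on.exit(options(old), add = TRUE)   # restore them when the function exits
  capture.output(str(x, ...))         # printed lines as a character vector
}

# Simplified stand-in for the data frame X from the thread (assumption:
# a fixed string replaces the random one; B is forced to be a factor).
X <- data.frame(A = 1:11 * 10000,
                B = rep(strrep("x", 43), 11),
                stringsAsFactors = TRUE)

out <- str_at_width(X, 70, strict.width = "cut")
cat(out, sep = "\n")
```

Because the option is restored in on.exit(), a later str( X, strict.width = "cut") in the same session sees the width that was in effect before the call.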
It was a simple error in the code for str.default. When both lines
needed cutting, the code mixed them up. Will soon be fixed in R-devel
and R-patched.
Duncan Murdoch
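To illustrate the kind of bookkeeping involved when several output lines need cutting, here is a minimal sketch. It is not the code of str.default; cut_lines is a hypothetical function, and the ".." suffix merely imitates the output shown in this thread. The point is that the same index vector is used to read the long lines and to write the shortened ones back, so no line can end up in another line's position.

```r
# Illustrative only: NOT the code of str.default().
# Shorten every element of 'lines' that is wider than 'width' to exactly
# 'width' characters, ending in "..".
cut_lines <- function(lines, width = getOption("width")) {
  too_long <- which(nchar(lines) > width)
  if (length(too_long)) {
    # Read from and write to the same positions 'too_long', so each
    # shortened line goes back to where it came from.
    lines[too_long] <- paste0(substr(lines[too_long], 1L, width - 2L), "..")
  }
  lines
}

cut_lines(c("short", strrep("a", 30), strrep("b", 30)), width = 10)
```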
Regards -- Gerrit
PS: "New" session info:
sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] fortunes_1.5-0
loaded via a namespace (and not attached):
[1] tools_3.0.2
On Wed, 16 Oct 2013, Gerrit Eichner wrote:
Thanks, Duncan,
for the good (indirect) hint: after a restart of R the problem is --
fortunately :-) -- not reproducible anymore for me either. The R session had
been running for a longer time and I recall doing some (system-related)
things outside of R that may have interfered with it; I just forgot to take
that possibility into consideration. :(
Regards -- Gerrit
On Tue, 15 Oct 2013, Duncan Murdoch wrote:
On 15/10/2013 7:53 AM, Gerrit Eichner wrote:
Dear list subscribers,
here is a small artificial example to demonstrate the problem that I
encountered when looking at the structure of a (larger) data frame that
comprised (among other components)
a numeric component with elements of the order of 10000 or larger, and
a factor or character component with longer levels/strings:
k <- 43 # length of levels or character strings
n <- 11 # number of rows of data frame
M <- 10000 # order of magnitude of numerical values
set.seed( 47) # to reproduce the following artificial character string
longer.char.string <- paste( sample( letters, k, replace = TRUE),
                             collapse = "")
X <- data.frame( A = 1:n * M,
                 B = rep( longer.char.string, n))
The following call to str() apparently gives a wrong result:
str( X, strict.width = "cut")
'data.frame': 11 obs. of 2 variables:
$ A: num 1e+04 2e+04 3e+04 4e+04 5e+04 6e+04 7e+04 8e+04 9e+04 1e+..
$ A: num 1e+04 2e+04 3e+04 4e+04 5e+04 6e+04 7e+04 8e+04 9e+04 1e+..
whereas the correct result appears for str( X) or if you decrease k to 42
(isn't that "the answer"? ;-) ) or n to 10 or M to 1000 (or smaller,
respectively).
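One way to see how k, n and M interact with the line width is to measure how wide each line of str() output would be without any cutting. The following is a rough sketch only: uncut_widths is a hypothetical helper, the limit of 70 is just an example value, and the data frame is a simplified stand-in for X (a fixed 43-character string, with stringsAsFactors = TRUE so that B is a factor).

```r
# Hypothetical helper: number of characters in each line that str() would
# print for 'x' when no cutting or wrapping is applied.
uncut_widths <- function(x) {
  nchar(capture.output(str(x, strict.width = "no")))
}

# Simplified stand-in for X (assumption: fixed string instead of a random one).
X <- data.frame(A = 1:11 * 10000,
                B = rep(strrep("x", 43), 11),
                stringsAsFactors = TRUE)

w <- uncut_widths(X)
# Lines wider than the limit are the ones strict.width = "cut" must shorten.
which(w > 70)
```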
I tried to dig into the entrails of str.default(), where the cause may
lie, but got lost pretty soon. So, I am hoping that someone may already
have a work-around or patch (or dares to dig further)? Thank you for any
feedback!
I can't reproduce this. I don't have a 64 bit copy of 3.0.2 handy, but I
don't see it in 64 bit 3.0.1, or 64 bit 3.0.2-patched, or various 32 bit
versions.
Is it reproducible for you? It looks to me as though (if it isn't just
something weird on your system, e.g. an old copy of str() in your
workspace), it might be a memory protection problem: something needed to
be duplicated but wasn't. But unless I can see it happen, I can't start to
fix it.
Duncan Murdoch
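The possibility of an old copy of str() in the workspace can be checked directly. The sketch below uses only base and utils functions; in_workspace is a hypothetical helper name.

```r
# Hypothetical helper: is there an object called 'name' in the workspace
# (the global environment) itself, ignoring attached packages?
in_workspace <- function(name) {
  exists(name, envir = globalenv(), inherits = FALSE)
}

in_workspace("str")   # TRUE would mean a workspace copy masks utils::str
find("str")           # all places on the search path that define 'str'
# rm(str)             # would remove such a workspace copy, if one exists
```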
Best regards -- Gerrit
PS:
sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252
attached base packages:
[1] splines stats graphics grDevices utils datasets
[7] methods base
other attached packages:
[1] nparcomp_2.0 multcomp_1.2-21 mvtnorm_0.9-9996
[4] car_2.0-19 Hmisc_3.12-2 Formula_1.1-1
[7] survival_2.37-4 fortunes_1.5-0
loaded via a namespace (and not attached):
[1] cluster_1.14.4 grid_3.0.2 lattice_0.20-23 MASS_7.3-29
[5] nnet_7.3-7 rpart_4.1-3 stats4_3.0.2 tools_3.0.2
---------------------------------------------------------------------
Dr. Gerrit Eichner Mathematical Institute, Room 212
gerrit.eich...@math.uni-giessen.de Justus-Liebig-University Giessen
Tel: +49-(0)641-99-32104 Arndtstr. 2, 35392 Giessen, Germany
Fax: +49-(0)641-99-32109 http://www.uni-giessen.de/cms/eichner
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.