Re: [R] Random Forest: OOB performance = test set performance?

2021-04-11 Thread thebudget72

Thanks Peter.

Indeed, after setting a seed the two results are similar.

I am self-studying and wanted to make sure I understood the concept of 
OOB samples and how reliable the performance metrics calculated on 
them are.

It seems I did get it. That's good :)

On 4/11/21 6:34 AM, Peter Langfelder wrote:

I think the only thing you are doing wrong is not setting the random
seed (set.seed()) so your results are not reproducible. Depending on
the random sample used to select the training and test sets, you get
slightly varying accuracy for both, sometimes one is better and
sometimes the other.

HTH,

Peter
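
A minimal base-R sketch of the set.seed() point follows. The row count of 400 (the size of Carseats) and the seed value 42 are illustrative choices, not taken from the original post. It only shows that fixing the seed makes the train/test split repeatable; it does not fit a forest.

```r
# Illustrative only: 400 mirrors nrow(Carseats); the seed value is arbitrary.
n <- 400

set.seed(42)
train_a <- sample(1:n, 200)

set.seed(42)
train_b <- sample(1:n, 200)

# Same seed, same call: the two splits are identical.
same_split <- identical(train_a, train_b)
print(same_split)

# The held-out rows are everything not selected for training.
test_a <- setdiff(1:n, train_a)
print(length(test_a))
```

randomForest() also draws random numbers (bootstrap samples and the variables tried at each split), so the seed would normally be set once at the top of the script, before both sample() and the model fit.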

On Sat, Apr 10, 2021 at 8:49 PM  wrote:

Hi ML,

For random forest, I thought that the out-of-bag performance should be
the same as (or at least very similar to) the performance calculated on a
separate test set.

But this does not seem to be the case.

In the following code, the accuracy computed on the out-of-bag samples is
77.81%, while the one computed on a separate test set is 81%.

Can you please check what I am doing wrong?

Thanks in advance and best regards.

library(randomForest)
library(ISLR)

# Binary response: "Yes" if Sales > 8, otherwise "No"
Carseats$High <- as.factor(ifelse(Carseats$Sales <= 8, "No", "Yes"))

# Random half of the 400 rows for training (no seed set, so this varies per run)
train <- sample(1:nrow(Carseats), 200)

rf <- randomForest(High ~ . - Sales,
                   data = Carseats,
                   subset = train,
                   mtry = 6,
                   importance = TRUE)

# rf$confusion has a third column, class.error; keep only the two count columns
oob <- rf$confusion[, 1:2]
acc <- sum(diag(oob)) / sum(oob)
print(paste0("Accuracy OOB: ", round(acc * 100, 2), "%"))

# Accuracy on the held-out rows
yhat <- predict(rf, newdata = Carseats[-train, ])
y <- Carseats[-train, ]$High
conftest <- table(y, yhat)
acctest <- sum(diag(conftest)) / sum(conftest)
print(paste0("Accuracy test set: ", round(acctest * 100, 2), "%"))
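
A note on the OOB accuracy calculation: the confusion component of a randomForest classification fit holds a class.error column next to the class counts, so the counts need to be separated from that column before computing accuracy. The sketch below builds a matrix in that layout from made-up counts (they are not the poster's results) to show the size of the effect.

```r
# Hypothetical OOB confusion matrix in the layout randomForest uses for
# classification: counts by true class (rows) and predicted class (columns),
# plus a class.error column. The counts are made up for illustration.
counts <- matrix(c(100, 18,
                    26, 56),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("No", "Yes"), c("No", "Yes")))
class.error <- 1 - diag(counts) / rowSums(counts)
confusion <- cbind(counts, class.error)

# Summing the whole matrix adds the error rates to the denominator.
acc_all_cols <- sum(diag(counts)) / sum(confusion)

# Summing only the count columns gives the actual proportion correct.
acc_counts <- sum(diag(confusion[, 1:2])) / sum(confusion[, 1:2])

print(round(100 * c(all_columns = acc_all_cols, counts_only = acc_counts), 2))
```

With 200 training rows the difference is small, well under one percentage point, so it does not by itself explain a gap of a few points between OOB and test accuracy.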

__
R-help@r-project.org  mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] evil attributes

2021-04-11 Thread Bill Dunlap
Terry wrote
   I confess to being puzzled WHY the R core has decided on this
definition [of vector] ...
I believe that "R core" followed S's definition of "vector".  From the
beginning (at least when I first saw it in 1981) an S vector was the basic
unit of an S object - it had a type and a length and no more.  This has
little to do with the mathematician's or physicist's notion of a vector.
It is more like what Techopedia (
https://www.techopedia.com/definition/22817/vector-programming) says is a
programmer's notion of a vector:

What Does Vector Mean?
A vector, in programming, is a type of array that is one dimensional.
Vectors are a logical element in programming languages that are used for
storing data. Vectors are similar to arrays but their actual implementation
and operation differs.
Techopedia Explains Vector
Vectors are primarily used within the programming context of most
programming languages and serve as data structure containers. Being a data
structure, vectors are used for storing objects and collections of objects
in an organized structure.
The major difference between an array and a vector is that, unlike typical
arrays, the container size of a vector can be easily increased and
decreased to complement different data storage types. Vectors have a
dynamic structure and provide the ability to assign container size up front
and enable allocation of memory space quickly. Vectors can be thought of as
dynamic arrays.

-Bill


On Sun, Apr 11, 2021 at 8:04 AM Therneau, Terry M., Ph.D. via R-help <
r-help@r-project.org> wrote:

> I wrote: "I confess to being puzzled WHY the R core has decided on this
> definition..."
> After just a little more thought let me answer my own question.
>
> a. The as.vector() function is designed to strip off everything extraneous
> and leave just the core. (I have a mental image of Jack Webb saying "Just
> the facts ma'am"). I myself use it frequently in the test suite for
> survival, in cases where I'm checking the correct numeric result and don't
> care about any attached names.
>
> b. is.vector(x) essentially answers the question "does x look like a
> result of as.vector?"
>
> Nevertheless I understand Roger's confusion.
>
> --
> Terry M Therneau, PhD
> Department of Quantitative Health Sciences
> Mayo Clinic
> thern...@mayo.edu
>
> "TERR-ree THUR-noh"
>
>


Re: [R] evil attributes

2021-04-11 Thread Duncan Murdoch

On 11/04/2021 2:46 p.m., Viechtbauer, Wolfgang (SP) wrote:

The is.vector() thing has also bitten me in the behind on a few occasions. When 
I want to check if something is a vector, allow for it to possibly have some 
additional attributes (besides names) that would make is.vector() evaluate to 
FALSE, but evaluate to FALSE for lists (since is.vector(list(a=1, b=2)) is TRUE 
-- which also wasn't what I had initially expected before reading the 
documentation), I use:

.is.vector <- function(x)
   is.atomic(x) && !is.matrix(x) && !is.null(x)

This might also work:

.is.vector <- function(x)
   is(x, "vector") && !is.list(x)

I am sure there are all kinds of edge (and probably also not so edge) cases 
where these also fail to work properly. Kinda curious if there are better 
approaches out there.


Sorry, but nobody has said what "properly" would be here.  How can an 
approach be better at something if you don't say what you want it to do?


The base::is.vector() definition looks fairly useless, and I can't 
remember ever using that function.  But at least it's quite well 
documented what it is supposed to do.  What claims are you making about 
your .is.vector() definitions?


Duncan Murdoch


Re: [R] updating OpenMx failed

2021-04-11 Thread Rich Shepard

On Sun, 11 Apr 2021, Martin Møller Skarbiniks Pedersen wrote:


You should contact the maintainers of the package.
According to this page:
https://cran.r-project.org/web/packages/OpenMx/index.html
you can get help from http://openmx.ssri.psu.edu/forums


Martin,

You're correct. I should have looked up the maintainer and directly
contacted them. I'll do so.

Regards,

Rich


Re: [R] updating OpenMx failed

2021-04-11 Thread Martin Møller Skarbiniks Pedersen
On Sun, 11 Apr 2021 at 18:10, Rich Shepard  wrote:
>
> I'm running Slackware-14.2/x86_64 and R-4.0.2-x86_64-1_SBo.
>
> Updating OpenMx failed:
> omxState.cpp:1230:82:   required from here
> omxState.cpp:1229:17: error: cannot call member function ‘void
> ConstraintVec::eval(FitContext*, double*, double*)’ without object
>   eval(fc2, result.data(), 0);
>   ^

[...]

> Please advise,

You should contact the maintainers of the package.
According to this page:
https://cran.r-project.org/web/packages/OpenMx/index.html
you can get help from http://openmx.ssri.psu.edu/forums
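
For reference, the maintainer recorded for an installed package can also be looked up from within R. This is only a sketch: "stats" is used as a stand-in because it ships with every R installation, and for the case in this thread one would pass "OpenMx" instead (assuming a version of it is installed).

```r
# utils::maintainer() reads the Maintainer field of an installed package.
# "stats" is a stand-in here; substitute the package of interest.
m <- utils::maintainer("stats")
print(m)

# packageDescription() exposes the same field along with the rest of the
# DESCRIPTION metadata (some packages also list BugReports or URL entries).
d <- utils::packageDescription("stats")
print(d$Maintainer)
```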

Regards
Martin


Re: [R] evil attributes

2021-04-11 Thread David Winsemius



On 4/11/21 11:46 AM, Viechtbauer, Wolfgang (SP) wrote:

The is.vector() thing has also bitten me in the behind on a few occasions. When 
I want to check if something is a vector, allow for it to possibly have some 
additional attributes (besides names) that would make is.vector() evaluate to 
FALSE, but evaluate to FALSE for lists (since is.vector(list(a=1, b=2)) is TRUE 
-- which also wasn't what I had initially expected before reading the 
documentation), I use:

.is.vector <- function(x)
   is.atomic(x) && !is.matrix(x) && !is.null(x)

This might also work:

.is.vector <- function(x)
   is(x, "vector") && !is.list(x)


That will allow expression vectors to return TRUE, but they are not 
atomic so they would be excluded by your current version.
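
The expression-vector case can be checked directly. In this sketch the two helpers from the thread are given distinct names (is_vector_atomic and is_vector_s4, both hypothetical) only so that they can be compared side by side.

```r
library(methods)  # provides is(); attached by default in most sessions

# The two helpers proposed in the thread, renamed here only to tell them apart.
is_vector_atomic <- function(x) is.atomic(x) && !is.matrix(x) && !is.null(x)
is_vector_s4     <- function(x) is(x, "vector") && !is.list(x)

e <- expression(a + b, sin(x))

print(is.vector(e))         # base definition
print(is_vector_atomic(e))  # first helper: expressions are not atomic
print(is_vector_s4(e))      # second helper: "expression" extends "vector"
```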


--

David.



I am sure there are all kinds of edge (and probably also not so edge) cases 
where these also fail to work properly. Kinda curious if there are better 
approaches out there.



You might want to exclude expression vectors as well.




Best,
Wolfgang


-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Therneau, Terry
M., Ph.D. via R-help
Sent: Saturday, 10 April, 2021 16:12
To: R-help
Subject: Re: [R] evil attributes

I wrote: "I confess to being puzzled WHY the R core has decided on this
definition..."
After just a little more thought let me answer my own question.

a. The as.vector() function is designed to strip off everything extraneous and
leave just the core. (I have a mental image of Jack Webb saying "Just the facts
ma'am"). I myself use it frequently in the test suite for survival, in cases
where I'm checking the correct numeric result and don't care about any attached
names.

b. is.vector(x) essentially answers the question "does x look like a result of
as.vector?"

Nevertheless I understand Roger's confusion.

--
Terry M Therneau, PhD
Department of Quantitative Health Sciences
Mayo Clinic
thern...@mayo.edu

"TERR-ree THUR-noh"


Re: [R] evil attributes

2021-04-11 Thread Viechtbauer, Wolfgang (SP)
The is.vector() thing has also bitten me in the behind on a few occasions. When 
I want to check if something is a vector, allow for it to possibly have some 
additional attributes (besides names) that would make is.vector() evaluate to 
FALSE, but evaluate to FALSE for lists (since is.vector(list(a=1, b=2)) is TRUE 
-- which also wasn't what I had initially expected before reading the 
documentation), I use:

.is.vector <- function(x)
   is.atomic(x) && !is.matrix(x) && !is.null(x)

This might also work:

.is.vector <- function(x)
   is(x, "vector") && !is.list(x)

I am sure there are all kinds of edge (and probably also not so edge) cases 
where these also fail to work properly. Kinda curious if there are better 
approaches out there.
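
As a rough way to see where the definitions agree and disagree, the sketch below runs base is.vector() and the two helpers over a few inputs. The helper names is_vector_atomic and is_vector_s4 and the "units" attribute are hypothetical, chosen for illustration; the inputs cover only a handful of cases and are not an exhaustive test.

```r
library(methods)  # provides is(); attached by default in most sessions

is_vector_atomic <- function(x) is.atomic(x) && !is.matrix(x) && !is.null(x)
is_vector_s4     <- function(x) is(x, "vector") && !is.list(x)

# "units" is an arbitrary extra attribute used to trip base is.vector().
tagged <- structure(1:3, units = "cm")

inputs <- list(
  plain_numeric  = c(1.5, 2.5),
  named_numeric  = c(a = 1, b = 2),
  tagged_integer = tagged,
  named_list     = list(a = 1, b = 2)
)

result <- data.frame(
  base   = vapply(inputs, is.vector, logical(1)),
  atomic = vapply(inputs, is_vector_atomic, logical(1)),
  s4     = vapply(inputs, is_vector_s4, logical(1))
)
print(result)
```

On these inputs the two helpers agree with each other and differ from base is.vector() in exactly the two ways described above: they accept the vector with an extra attribute and reject the list.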

Best,
Wolfgang

>-Original Message-
>From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Therneau, Terry
>M., Ph.D. via R-help
>Sent: Saturday, 10 April, 2021 16:12
>To: R-help
>Subject: Re: [R] evil attributes
>
>I wrote: "I confess to being puzzled WHY the R core has decided on this
>definition..."
>After just a little more thought let me answer my own question.
>
>a. The as.vector() function is designed to strip off everything extraneous and
>leave just the core. (I have a mental image of Jack Webb saying "Just the facts
>ma'am"). I myself use it frequently in the test suite for survival, in cases
>where I'm checking the correct numeric result and don't care about any attached
>names.
>
>b. is.vector(x) essentially answers the question "does x look like a result of
>as.vector?"
>
>Nevertheless I understand Roger's confusion.
>
>--
>Terry M Therneau, PhD
>Department of Quantitative Health Sciences
>Mayo Clinic
>thern...@mayo.edu
>
>"TERR-ree THUR-noh"

[R] updating OpenMx failed

2021-04-11 Thread Rich Shepard

I'm running Slackware-14.2/x86_64 and R-4.0.2-x86_64-1_SBo.

Updating OpenMx failed:
omxState.cpp:1230:82:   required from here
omxState.cpp:1229:17: error: cannot call member function ‘void 
ConstraintVec::eval(FitContext*, double*, double*)’ without object
 eval(fc2, result.data(), 0);
 ^
/usr/lib64/R/etc/Makeconf:174: recipe for target 'omxState.o' failed
make: *** [omxState.o] Error 1
ERROR: compilation failed for package ‘OpenMx’
* removing ‘/usr/lib64/R/library/OpenMx’
* restoring previous ‘/usr/lib64/R/library/OpenMx’

The downloaded source packages are in
‘/tmp/Rtmp9jMe61/downloaded_packages’
Updating HTML index of packages in '.Library'
Making 'packages.html' ... done
Warning message:
In install.packages("OpenMx") :
  installation of package ‘OpenMx’ had non-zero exit status

Please advise,

Rich


Re: [R] evil attributes

2021-04-11 Thread Therneau, Terry M., Ph.D. via R-help
I wrote: "I confess to being puzzled WHY the R core has decided on this 
definition..."
After just a little more thought let me answer my own question.

a. The as.vector() function is designed to strip off everything extraneous and 
leave just the core. (I have a mental image of Jack Webb saying "Just the facts 
ma'am"). I myself use it frequently in the test suite for survival, in cases 
where I'm checking the correct numeric result and don't care about any attached 
names.

b. is.vector(x) essentially answers the question "does x look like a result 
of as.vector?"
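
Points a and b can be illustrated with a short sketch. The "units" attribute is an arbitrary example and not something from the survival test suite.

```r
x <- c(a = 1, b = 2)
print(is.vector(x))        # names alone are allowed

attr(x, "units") <- "cm"   # "units" is an arbitrary example attribute
print(is.vector(x))        # any attribute other than names makes this FALSE

core <- as.vector(x)       # for atomic input, drops all attributes, names included
print(attributes(core))
print(is.vector(core))
```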

Nevertheless I understand Roger's confusion.

-- 
Terry M Therneau, PhD
Department of Quantitative Health Sciences
Mayo Clinic
thern...@mayo.edu

"TERR-ree THUR-noh"



Re: [R] Identifying column type

2021-04-11 Thread Steven Yen
Thanks. Great idea!

Sent from my iPhone
Beware: My autocorrect is crazy

> On Apr 10, 2021, at 1:37 PM, Rui Barradas  wrote:
> 
> Hello,
> 
> Maybe something like
> 
> 
> ok <- sapply(mydata, is.numeric)
> mydata <- mydata[ok]
> 
> 
> to keep the numeric columns only.
> 
> 
> Hope this helps,
> 
> Rui Barradas
> 
> At 04:25 on 10/04/21, Steven Yen wrote:
>> I have data of mixed types in a data frame - date and numeric, as shown
>> in the summary below. How do I identify the column(s) that is/are not
>> numeric, in this case the first? All I want is to identify the
>> column(s) so that I can remove it/them from the data frame. Thanks.
>>> summary(mydata)
>>       Date                          Spot            Futures
>>  Min.   :1997-09-01 00:00:00   Min.   : 735.1   Min.   : 734.2
>>  1st Qu.:2002-10-16 12:00:00   1st Qu.:1120.7   1st Qu.:1122.6
>>  Median :2007-12-01 00:00:00   Median :1301.8   Median :1303.2
>>  Mean   :2007-12-01 06:01:27   Mean   :1423.1   Mean   :1423.6
>>  3rd Qu.:2013-01-16 12:00:00   3rd Qu.:1540.0   3rd Qu.:1546.5
>>  Max.   :2018-03-01 00:00:00   Max.   :2823.8   Max.   :2825.8
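
The sapply(mydata, is.numeric) suggestion also answers the original question of which columns are not numeric. The sketch below uses a made-up three-row stand-in for mydata with the same column names; the values are illustrative, not the poster's data.

```r
# Made-up stand-in for the poster's data: one date-time column, two numeric.
mydata <- data.frame(
  Date    = as.POSIXct(c("1997-09-01", "1997-10-01", "1997-11-01"), tz = "UTC"),
  Spot    = c(735.1, 740.2, 751.9),
  Futures = c(734.2, 741.0, 752.3)
)

ok <- sapply(mydata, is.numeric)

# Names of the columns that are not numeric
non_numeric <- names(mydata)[!ok]
print(non_numeric)

# The class of each column, for a closer look
print(sapply(mydata, function(col) class(col)[1]))

# Keep only the numeric columns
numeric_only <- mydata[ok]
print(names(numeric_only))
```

Note that is.numeric() returns FALSE for date and date-time columns even though they are stored as numbers internally, which is what makes this filter work here.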