Re: [R] running crossvalidation many times MSE for Lasso regression

2023-10-23 Thread Jin Li
Hi Ben, Martin and all,

The function glmnetcv in the spm2 package was developed for the following
main reasons:
1. The training and testing samples are generated using stratified random
sampling instead of simple random sampling. By doing this, we hoped to
decluster the spatial data, as Ben mentioned, and also to reduce the
variation in predictive accuracy among iterations and so produce a more
reliable estimate of predictive accuracy.
2. It can be used to produce various predictive accuracy measures (e.g.,
VEcv), as shown in the reproducible examples.
3. We also wanted all methods compared in Spatial Predictive Modeling
with R to be based on cv functions that use the same sampling method
(i.e., a number of cv functions were developed for this purpose), so that
we could conclude that the differences in the accuracy of predictive
methods resulted from the methods themselves.
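
To illustrate point 1, here is a rough base-R sketch of a stratified
assignment of observations to cross-validation folds. It is not the code
used in spm2; stratifying on three quantile classes of the response, the
simulated y, and the names strata, fold and k are all assumptions made for
illustration only.

```r
# Stratified assignment of observations to k folds (illustrative sketch).
set.seed(2023)
n <- 45
y <- rnorm(n)   # simulated response, stands in for real data
k <- 5          # number of folds

# Form three strata from the tertiles of the response.
strata <- cut(y,
              breaks = quantile(y, probs = seq(0, 1, length.out = 4)),
              include.lowest = TRUE, labels = FALSE)

# Within each stratum, deal out the fold labels 1..k in random order,
# so every fold receives a similar share of low, middle and high values.
fold <- integer(n)
for (s in unique(strata)) {
  idx <- which(strata == s)
  fold[idx] <- sample(rep_len(seq_len(k), length(idx)))
}

table(strata, fold)
```

With simple random sampling the folds could differ noticeably in their
mix of low and high responses; here each fold is balanced across strata
by construction.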

Anyway, people interested can use their own data to test and see.

Best,
Jin


On Tue, Oct 24, 2023 at 4:59 AM Ben Bolker  wrote:

>For what it's worth it looks like spm2 is specifically for *spatial*
> predictive modeling; presumably its version of CV is doing something
> spatially aware.
>
>I agree that glmnet is old and reliable.  One might want to use a
> tidymodels wrapper to create pipelines where you can more easily switch
> among predictive algorithms (see the `parsnip` package), but otherwise
> sticking to glmnet seems wise.
>
> On 2023-10-23 4:38 a.m., Martin Maechler wrote:
> >>>>>> Jin Li
> >>>>>>  on Mon, 23 Oct 2023 15:42:14 +1100 writes:
> >
> >  > If you are interested in other validation methods (e.g., LOO or n-fold)
> >  > with more predictive accuracy measures, the function, glmnetcv, in the spm2
> >  > package can be directly used, and some reproducible examples are
> >  > also available in ?glmnetcv.
> >
> > ... and once you open that can of w..:   the  glmnet package itself
> > contains a function  cv.glmnet()  which we (our students) use when
> > teaching.
> >
> > What's the advantage of the spm2 package ?
> > At least, the glmnet package is authored by the same people who originated
> > and first published (as in "peer reviewed" ..) these algorithms.
> >
> >
> >
> >  > On Mon, Oct 23, 2023 at 10:59 AM Duncan Murdoch <
> murdoch.dun...@gmail.com>
> >  > wrote:
> >
> >  >> On 22/10/2023 7:01 p.m., Bert Gunter wrote:
> >  >> > No error message shown. Please include the error message so that it is
> >  >> > not necessary to rerun your code. This might enable someone to see the
> >  >> > problem without running the code (e.g. downloading packages, etc.)
> >  >>
> >  >> And it's not necessarily true that someone else would see the same error
> >  >> message.
> >  >>
> >  >> Duncan Murdoch
> >  >>
> >  >> >
> >  >> > -- Bert
> >  >> >
> >  >> > On Sun, Oct 22, 2023 at 1:36 PM varin sacha via R-help
> >  >> >  wrote:
> >  >> >>
> >  >> >> Dear R-experts,
> >  >> >>
> >  >> >> Here below my R code with an error message. Can somebody help
> me to fix
> >  >> this error?
> >  >> >> Really appreciate your help.
> >  >> >>
> >  >> >> Best,
> >  >> >>
> >  >> >> 
> >  >> >> # MSE CROSSVALIDATION Lasso regression
> >  >> >>
> >  >> >> library(glmnet)
> >  >> >>
> >  >> >>
> >  >> >> x1=c(34,35,12,13,15,37,65,45,47,67,87,45,46,39,87,98,67,51,10,30,65,34,57,68,98,86,45,65,34,78,98,123,202,231,154,21,34,26,56,78,99,83,46,58,91)
> >  >> >> x2=c(1,3,2,4,5,6,7,3,8,9,10,11,12,1,3,4,2,3,4,5,4,6,8,7,9,4,3,6,7,9,8,4,7,6,1,3,2,5,6,8,7,1,1,2,9)
> >  >> >> y=c(2,6,5,4,6,7,8,10,11,2,3,1,3,5,4,6,5,3.4,5.6,-2.4,-5.4,5,3,6,5,-3,-5,3,2,-1,-8,5,8,6,9,4,5,-3,-7,-9,-9,8,7,1,2)
> >  >> >> T=data.frame(y,x1,x2)
> >  >> >>
> >  >> >> z=matrix(c(x1,x2), ncol=2)
> >  >> >> cv_model=glmnet(z,y,alpha=1)
> >  >> >> best_lam

Re: [R] running crossvalidation many times MSE for Lasso regression

2023-10-22 Thread Jin Li
If you are interested in other validation methods (e.g., LOO or n-fold)
with more predictive accuracy measures, the function, glmnetcv, in the spm2
package can be directly used, and some reproducible examples are
also available in ?glmnetcv.
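
For readers following the thread, a minimal sketch of the repeated hold-out
MSE that the quoted code below seems to be aiming for. It assumes the glmnet
package is installed and uses simulated data rather than the poster's; n_rep
and the simulated coefficients are illustrative choices. The main differences
from the quoted code are that cv.glmnet() (not glmnet()) is what returns
lambda.min, that the model is refitted on each training subset, and that
predict() is called once, on the fitted object.

```r
library(glmnet)

set.seed(1)
n <- 45
x <- matrix(rnorm(n * 2), ncol = 2)                 # two simulated predictors
y <- as.numeric(x %*% c(1, -0.5) + rnorm(n))        # simulated response

n_rep <- 20                                         # number of random splits
mse <- numeric(n_rep)

for (i in seq_len(n_rep)) {
  # Hold out roughly one third of the rows for testing.
  train <- sample(n, floor(0.667 * n))

  # Lasso (alpha = 1) with lambda chosen by cross-validation on the
  # training rows only.
  cvfit <- cv.glmnet(x[train, ], y[train], alpha = 1)

  # Predict the held-out rows at the lambda that minimised the CV error.
  pred <- predict(cvfit, newx = x[-train, ], s = "lambda.min")

  mse[i] <- mean((y[-train] - pred)^2)
}

mean(mse)   # average test MSE over the repetitions
```

Storing the results in a preallocated numeric vector avoids the
list/unlist step in the quoted code.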

On Mon, Oct 23, 2023 at 10:59 AM Duncan Murdoch 
wrote:

> On 22/10/2023 7:01 p.m., Bert Gunter wrote:
> > No error message shown. Please include the error message so that it is
> > not necessary to rerun your code. This might enable someone to see the
> > problem without running the code (e.g. downloading packages, etc.)
>
> And it's not necessarily true that someone else would see the same error
> message.
>
> Duncan Murdoch
>
> >
> > -- Bert
> >
> > On Sun, Oct 22, 2023 at 1:36 PM varin sacha via R-help
> >  wrote:
> >>
> >> Dear R-experts,
> >>
> >> Here below my R code with an error message. Can somebody help me to fix
> >> this error?
> >> Really appreciate your help.
> >>
> >> Best,
> >>
> >> 
> >> # MSE CROSSVALIDATION Lasso regression
> >>
> >> library(glmnet)
> >>
> >>
> >> x1=c(34,35,12,13,15,37,65,45,47,67,87,45,46,39,87,98,67,51,10,30,65,34,57,68,98,86,45,65,34,78,98,123,202,231,154,21,34,26,56,78,99,83,46,58,91)
> >> x2=c(1,3,2,4,5,6,7,3,8,9,10,11,12,1,3,4,2,3,4,5,4,6,8,7,9,4,3,6,7,9,8,4,7,6,1,3,2,5,6,8,7,1,1,2,9)
> >> y=c(2,6,5,4,6,7,8,10,11,2,3,1,3,5,4,6,5,3.4,5.6,-2.4,-5.4,5,3,6,5,-3,-5,3,2,-1,-8,5,8,6,9,4,5,-3,-7,-9,-9,8,7,1,2)
> >> T=data.frame(y,x1,x2)
> >>
> >> z=matrix(c(x1,x2), ncol=2)
> >> cv_model=glmnet(z,y,alpha=1)
> >> best_lambda=cv_model$lambda.min
> >> best_lambda
> >>
> >>
> >> # Create a list to store the results
> >> lst<-list()
> >>
> >> # This statement does the repetitions (looping)
> >> for(i in 1 :1000) {
> >>
> >> n=45
> >>
> >> p=0.667
> >>
> >> sam=sample(1 :n,floor(p*n),replace=FALSE)
> >>
> >> Training =T [sam,]
> >> Testing = T [-sam,]
> >>
> >> test1=matrix(c(Testing$x1,Testing$x2),ncol=2)
> >>
> >> predictLasso=predict(cv_model, newx=test1)
> >>
> >>
> >> ypred=predict(predictLasso,newdata=test1)
> >> y=T[-sam,]$y
> >>
> >> MSE = mean((y-ypred)^2)
> >> MSE
> >> lst[i]<-MSE
> >> }
> >> mean(unlist(lst))
> >> ##
> >>
> >>
> >>
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.


-- 
Jin
--
Jin Li, PhD
Founder, Data2action, Australia
https://www.researchgate.net/profile/Jin_Li32
https://scholar.google.com/citations?user=Jeot53EJ&hl=en



Re: [R] Sensitivity and Specificity

2022-10-24 Thread Jin Li
Hi Greg,

This can be done by:
spm::pred.acc(testing$case,  predict_testing)

It will return both sensitivity and specificity, along with a few other
commonly used measures.
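
For reference, a base-R sketch of the same calculation from a confusion
table, using simulated data (the variables age and case, the 0.5 cut-off
and the 70/30 split are illustrative assumptions). Note that predict() for
a glm takes the new data through the newdata argument; with any other
argument name the predictions are silently returned for the training rows.

```r
set.seed(1)
n <- 200
dat <- data.frame(age = rnorm(n, mean = 50, sd = 10))
dat$case <- rbinom(n, 1, plogis((dat$age - 50) / 5))   # simulated 0/1 outcome

train_idx <- sample(n, 140)
training <- dat[train_idx, ]
testing  <- dat[-train_idx, ]

fit <- glm(case ~ age, data = training, family = binomial(link = "logit"))

# newdata = testing gives one predicted probability per testing row.
prob <- predict(fit, newdata = testing, type = "response")

# Fix the factor levels so the table is always 2 x 2.
pred <- factor(ifelse(prob > 0.5, 1, 0), levels = c(0, 1))
obs  <- factor(testing$case, levels = c(0, 1))
tbl  <- table(predicted = pred, observed = obs)

# Sensitivity: true positives / all observed positives.
sensitivity <- tbl["1", "1"] / sum(tbl[, "1"])
# Specificity: true negatives / all observed negatives.
specificity <- tbl["0", "0"] / sum(tbl[, "0"])

c(sensitivity = sensitivity, specificity = specificity)
```

Indexing the table by level name rather than by position avoids mixing up
which class is treated as the positive one.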

Hope this helps,
Jin

On Tue, Oct 25, 2022 at 6:01 AM Rui Barradas  wrote:

> At 16:50 on 24/10/2022, greg holly wrote:
> > Hi Michael,
> >
> > I appreciate your writing. Here are what I have after;
> >
> >> predict_testing <- ifelse(predict > 0.5,1,0)
> >>
> >> head(predict)
> >   1  2  3  5  7  8
> > 0.29006984 0.28370507 0.10761993 0.02204224 0.12873872 0.08127920
> >>
> >> # Sensitivity and Specificity
> >>
> >>
> >
> sensitivity<-(predict_testing[2,2]/(predict_testing[2,2]+predict_testing[2,1]))*100
> > Error in predict_testing[2, 2] : incorrect number of dimensions
> >> sensitivity
> > function (data, ...)
> > {
> >  UseMethod("sensitivity")
> > }
> > 
> > 
> >>
> >>
> >
> specificity<-(predict_testing[1,1]/(predict_testing[1,1]+predict_testing[1,2]))*100
> > Error in predict_testing[1, 1] : incorrect number of dimensions
> >> specificity
> > function (data, ...)
> > {
> >  UseMethod("specificity")
> > }
> > 
> > 
> >
> > On Mon, Oct 24, 2022 at 10:45 AM Michael Dewey 
> > wrote:
> >
> >> Rather hard to know without seeing what output you expected and what
> >> error message you got if any but did you mean to summarise your variable
> >> predict before doing anything with it?
> >>
> >> Michael
> >>
> >> On 24/10/2022 16:17, greg holly wrote:
> >>> Hi all R-Help ,
> >>>
> >>> After partitioning my data to testing and training (please see below),
> I
> >>> need to estimate the Sensitivity and Specificity. I failed. It would be
> >>> appropriate to get your help.
> >>>
> >>> Best regards,
> >>> Greg
> >>>
> >>>
> >>> inTrain <- createDataPartition(y=data$case,
> >>>  p=0.7,
> >>>  list=FALSE)
> >>> training <- data[ inTrain,]
> >>> testing  <- data[-inTrain,]
> >>>
> >>> attach(training)
> >>> #model training and prediction
> >>> data_training <- glm(case ~ age+BMI+Calcium+Albumin+meno_1, data =
> >>> training, family = binomial(link="logit"))
> >>>
> >>> predict <- predict(data_training, data_predict = testing, type =
> >> "response")
> >>>
> >>> predict_testing <- ifelse(predict > 0.5,1,0)
> >>>
> >>> # Sensitivity and Specificity
> >>>
> >>>
> >>
>  
> sensitivity<-(predict_testing[2,2]/(predict_testing[2,2]+predict_testing[2,1]))*100
> >>>sensitivity
> >>>
> >>>
> >>
>  
> specificity<-(predict_testing[1,1]/(predict_testing[1,1]+predict_testing[1,2]))*100
> >>>specificity
> >>>
> >>
> >> --
> >> Michael
> >> http://www.dewey.myzen.co.uk/home.html
> >>
> >
>
> Hello,
>
> Instead of computing by hand, why not use package caret?
>
>
> tbl <- table(predict_testing, testing$case)
> caret::sensitivity(tbl)
> caret::specificity(tbl)
>
>
> Hope this helps,
>
> Rui Barradas
>


-- 
Jin
--
Jin Li, PhD
Founder, Data2action, Australia
https://www.researchgate.net/profile/Jin_Li32
https://scholar.google.com/citations?user=Jeot53EJ&hl=en



Re: [R] How important is set.seed

2022-03-21 Thread Jin Li
The answer may depend on the model type you are going to develop. For
predictive models, yes you do need it. The dependence of predictive
accuracy measures on random seeds and dependence of stabilized predictive
accuracy measures on random seeds have been demonstrated and discussed in
Spatial Predictive Modeling with R (doi:10.1201/9781003091776), where many
reproducible examples are provided for various predictive methods including
RF, GBM and SVM.
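
As a small illustration of this dependence (a sketch with simulated data and
a linear model, not an example taken from the book; holdout_rmse and its
arguments are names made up here), the hold-out RMSE of a single random split
changes with the seed, while the same seed always reproduces the same value:

```r
set.seed(42)
n <- 100
x <- rnorm(n)
y <- 2 * x + rnorm(n)
dat <- data.frame(x, y)

# RMSE on a random hold-out set; the split is fully determined by `seed`.
holdout_rmse <- function(seed, data, p = 0.7) {
  set.seed(seed)
  idx  <- sample(nrow(data), floor(p * nrow(data)))
  fit  <- lm(y ~ x, data = data[idx, ])
  pred <- predict(fit, newdata = data[-idx, ])
  sqrt(mean((data$y[-idx] - pred)^2))
}

rmse_by_seed <- vapply(1:50, holdout_rmse, numeric(1), data = dat)

range(rmse_by_seed)   # spread of the accuracy measure across seeds
mean(rmse_by_seed)    # averaging over many splits gives a steadier figure

# The same seed reproduces exactly the same split and result.
identical(holdout_rmse(1, dat), holdout_rmse(1, dat))
```

Reporting the average over many splits, rather than the result of one
chosen seed, addresses the concern about results being an artifact of a
particular seed.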
Hope this helps.
Jin

On Tue, Mar 22, 2022 at 11:51 AM Ebert,Timothy Aaron  wrote:

> If you are using the program for data analysis then set.seed() is not
> necessary unless you are developing a reproducible example. In a standard
> analysis it is mostly counter-productive because one should then ask if
> your presented results are an artifact of a specific seed that you selected
> to get a particular result. However, in cases where you need a reproducible
> example, debugging a program, or specific other cases where you might need
> the same result with every run of the program then set.seed() is an
> essential tool.
> Tim
>
> -----Original Message-----
> From: R-help  On Behalf Of Jeff Newmiller
> Sent: Monday, March 21, 2022 8:41 PM
> To: r-help@r-project.org; Neha gupta ; r-help
> mailing list 
> Subject: Re: [R] How important is set.seed
>
> [External Email]
>
> First off, "ML models" do not all use random numbers (for prediction I
> would guess very few of them do). Learn and pay attention to what the
> functions you are using do.
>
> Second, if you use random numbers properly and understand the precision
> that your specific use case offers, then you don't need to use set.seed.
> However, in practice, using set.seed can allow you to temporarily avoid
> chasing precision gremlins, or set up specific test cases for testing code,
> not results. It is your responsibility to not let this become a crutch... a
> randomized simulation that is actually sensitive to the seed is unlikely to
> offer an accurate result.
>
> Where to put set.seed depends a lot on how you are performing your
> simulations. In general each process should set it once uniquely at the
> beginning, and if you use parallel processing then use the features of your
> parallel processing framework to insure that this happens. Beware of
> setting all worker processes to use the same seed.
>
> On March 21, 2022 5:03:30 PM PDT, Neha gupta 
> wrote:
> >Hello everyone
> >
> >I want to know
> >
> >(1) In which cases, we need to use set.seed while building ML models?
> >
> >(2) Which is the exact location we need to put the set.seed function i.e.
> >when we split data into train/test sets, or just before we train a model?
> >
> >Thank you
> >
>
> --
> Sent from my phone. Please excuse my brevity.
>

[R] Release of a new package steprf: Stepwise Predictive Variable Selection for Random Forest

2021-11-14 Thread Jin Li
Hi All,

A new package `steprf` is now available on CRAN. It introduces several
novel predictive variable selection methods for random forest (RF). They
are based on various variable importance methods and on predictive accuracy
in stepwise algorithms. Their performance can be seen in the references
cited in the description of the package.

Cheers,

-- 
Jin
--
Jin Li, PhD
Founder, Data2action, Australia
https://www.researchgate.net/profile/Jin_Li32
https://scholar.google.com/citations?user=Jeot53EJ&hl=en



[R] spm2: A new package for (spatial) predictive modelling

2021-10-17 Thread Jin Li
Dear all,

A new package, spm2_1.1.0, for (spatial) predictive modelling has just been
made available on CRAN. It is an updated and extended version of the 'spm'
package, introducing further novel functions for modern statistical
methods (i.e., generalised linear models, glmnet, generalised least
squares) and support vector machines. For each method, two functions are
provided: one for assessing the predictive errors and accuracy of the
method based on cross-validation, and the other for generating spatial
predictions. It also contains a couple of functions for data preparation
and predictive accuracy assessment.

Any feedback is welcome and appreciated.

-- 
Jin
----------
Jin Li, PhD
Founder, Data2action, Australia
https://www.researchgate.net/profile/Jin_Li32
https://scholar.google.com/citations?user=Jeot53EJ&hl=en



[R] Spm package is updated and available on CRAN again

2021-09-07 Thread Jin Li
Dear spm users and all,

I am glad to inform you that the spm package is available on CRAN again. It
is an updated version with a few bugs fixed. Please note that some
functions in the package are not only for spatial predictive modelling but
also for predictive modeling in general.

Please feel free to contact me if you have any questions regarding the spm
package.

Best regards,

-- 
Jin



[R] spm package is available on CRAN again

2021-09-06 Thread Jin Li
Dear spm users and all,

I am glad to inform you that the spm package is available on CRAN again. It
is an updated version with a few bugs fixed. Please note that some
functions in the package are not only for spatial predictive modelling but
also for general predictive modeling.

Please feel free to contact me if you have any questions regarding the spm
package.

Best regards,
-- 
Jin
--
Jin Li, PhD
Founder, Data2action, Australia
https://www.researchgate.net/profile/Jin_Li32
https://scholar.google.com/citations?user=Jeot53EJ&hl=en



Re: [R] How to make our data normally distributed in R

2020-03-15 Thread Jin Li
Please note that mu and sd are the mean and standard deviation of
validation samples. You may use pred.acc in spm to calculate a number of
error and accuracy measures including RMSE and VEcv from the observed and
predicted values directly.
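
For anyone who wants to see the arithmetic, here is a base-R sketch of the
relationship between RMSE and VEcv, using simulated observed and predicted
values (obs and pred are made up for illustration; this is not the source
code of pred.acc or tovecv). VEcv is computed as one minus the ratio of the
sum of squared prediction errors to the sum of squared deviations of the
observations from their mean, expressed as a percentage.

```r
set.seed(7)
obs  <- rnorm(500, mean = 10000, sd = 6000)   # simulated validation observations
pred <- obs + rnorm(500, sd = 4500)           # simulated predictions
n    <- length(obs)

rmse <- sqrt(mean((obs - pred)^2))

# VEcv directly from observed and predicted values.
vecv_direct <- (1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)) * 100

# The same quantity from n, the variance of the validation samples and RMSE:
# sum of squared errors = n * RMSE^2, total sum of squares = (n - 1) * s^2.
vecv_from_rmse <- (1 - n * rmse^2 / ((n - 1) * var(obs))) * 100

c(rmse = rmse, vecv_direct = vecv_direct, vecv_from_rmse = vecv_from_rmse)
all.equal(vecv_direct, vecv_from_rmse)
```

This is why the conversion needs the mean and standard deviation of the
validation samples themselves, not placeholder values.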

On Sat, Mar 14, 2020 at 2:07 AM Neha gupta  wrote:

> Thanks a lot Jin..
>
> If my total number of observations are 500,
> n will be 500,
> mu will be average (500)
> s will be sd (500)
> and m will be RMSE value i.e. 4500 in this case?
>
> tovecv(n=500, mu=average (500), s=sd, m=4500, measure="rmse")
>
>
> On Fri, Mar 13, 2020 at 12:46 AM Jin Li  wrote:
>
>> Hi,
>> Why do you want to re-scale RMSE to 0-1? You can change ylim=c(0,1) to
>> ylim=c(0, 4600). You may use VEcv (variance explained by predictive models
>> based on cross-validation), which ranges from 0 to 100%, instead. It can be
>> calculated using the vecv function in library(spm), or you can convert RMSE
>> to VEcv using tovecv in spm.
>> Hope this helps,
>> Jin
>>
>> On Fri, Mar 13, 2020 at 8:08 AM Neha gupta 
>> wrote:
>>
>>> Hi
>>>
>>> I have a regression based data where I get the RMSE results as:
>>>
>>> SVM=3500
>>> ANN=4600
>>> R.Forest=2900
>>>
>>> I want to know how can I make it so that its values comes as 0-1
>>>
>>> I plot the boxplot for it to indicate their RMSE values and used,
>>> ylim=(0,1), but the boxplot which works for RMSE values like 3500 etc,
>>> but
>>> when I use ylim=(0,1), all the boxplots suddenly disappears. What should
>>> I
>>> do for it?
>>>
>>> Thanks
>>>
>>
>>
>> --
>> Jin
>>
>

-- 
Jin



Re: [R] How to make our data normally distributed in R

2020-03-12 Thread Jin Li
Hi,
Why do you want to re-scale RMSE to 0-1? You can change ylim=c(0,1) to
ylim=c(0, 4600). You may use VEcv (variance explained by predictive models
based on cross-validation), which ranges from 0 to 100%, instead. It can be
calculated using the vecv function in library(spm), or you can convert RMSE
to VEcv using tovecv in spm.
Hope this helps,
Jin

On Fri, Mar 13, 2020 at 8:08 AM Neha gupta  wrote:

> Hi
>
> I have a regression based data where I get the RMSE results as:
>
> SVM=3500
> ANN=4600
> R.Forest=2900
>
> I want to know how can I make it so that its values comes as 0-1
>
> I plot the boxplot for it to indicate their RMSE values and used,
> ylim=(0,1), but the boxplot which works for RMSE values like 3500 etc, but
> when I use ylim=(0,1), all the boxplots suddenly disappears. What should I
> do for it?
>
> Thanks
>


-- 
Jin



Re: [R] Is it possible to remove this loop? [SEC=UNCLASSIFIED]

2012-07-03 Thread Jin . Li
Thanks for your validation. Yes, Peter's solution is the fastest, taking
about 25% less time than the previous one. I missed it in my previous testing.
Jin

-----Original Message-----
From: Pascal Oettli [mailto:kri...@ymail.com] 
Sent: Wednesday, 4 July 2012 2:07 PM
To: Li Jin
Cc: r-help@r-project.org
Subject: Re: [R] Is it possible to remove this loop? [SEC=UNCLASSIFIED]



On 04/07/2012 12:43, Peter Ehlers wrote:
> On 2012-07-03 17:23, jin...@ga.gov.au wrote:
>> Thank you all for providing various alternatives. They are all pretty
>> fast. Great help! Based on a test of a dataset with 800,000 rows, the
>> time used varies from 0.04 to 11.56 s. The champion is:
>>> a1$h2 <- 0
>>> a1$h2[a1$h1=="H"] <- 1
>
> Interesting. My testing shows that Petr's solution is about
> twice as fast. Not that it matters much - the time is pretty
> small in any case.
>
>   a0 <- data.frame(h1 = sample(c("H","J","K"), 1e7, replace = TRUE),
>                    stringsAsFactors = FALSE)
>   a1 <- a0
>   system.time({a1$h2 <- 0; a1$h2[a1$h1 == "H"] <- 1})
>   #   user  system elapsed
>   #   1.47    0.48    1.96
>   a11 <- a1
>
>   a1 <- a0
>   system.time(a1$h2 <- (a1$h1 == "H") * 1)
>   #   user  system elapsed
>   #   0.37    0.17    0.56
>   a12 <- a1
>   all.equal(a11,a12)
>   #[1] TRUE
>
> Peter Ehlers
>

I got the same result. Petr's solution is the fastest. Good to know it.

Pascal Oettli

>> Regards,
>> Jin
>>
>> Geoscience Australia Disclaimer: This e-mail (and files transmitted
>> with it) is intended only for the person or entity to which it is
>> addressed. If you are not the intended recipient, then you have
>> received this e-mail by mistake and any use, dissemination,
>> forwarding, printing or copying of this e-mail and its file
>> attachments is prohibited. The security of emails transmitted cannot
>> be guaranteed; by forwarding or replying to this email, you
>> acknowledge and accept these risks.
>




Re: [R] Is it possible to remove this loop? [SEC=UNCLASSIFIED]

2012-07-03 Thread Jin . Li
Thank you all for providing various alternatives. They are all pretty fast. 
Great help! Based on a test of a dataset with 800,000 rows, the time used 
varies from 0.04 to 11.56 s. The champion is:
> a1$h2 <- 0
> a1$h2[a1$h1=="H"] <- 1
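
For completeness, a small sketch checking that the loop and several
vectorised alternatives give the same 0/1 column. The data are simulated,
and the names h2_loop, h2_assign, h2_mult, h2_ifelse and h2_num are made up
for this example; timings will vary by machine, so none are quoted here.

```r
set.seed(123)
a1 <- data.frame(h1 = sample(c("H", "J", "K"), 1e4, replace = TRUE),
                 stringsAsFactors = FALSE)

# Original approach: explicit loop over rows.
h2_loop <- numeric(nrow(a1))
for (i in seq_len(nrow(a1))) {
  h2_loop[i] <- if (a1$h1[i] == "H") 1 else 0
}

# Vectorised alternatives.
h2_assign <- rep(0, nrow(a1))          # fill with 0, then overwrite the matches
h2_assign[a1$h1 == "H"] <- 1
h2_mult   <- (a1$h1 == "H") * 1        # logical times 1 gives 0/1
h2_ifelse <- ifelse(a1$h1 == "H", 1, 0)
h2_num    <- as.numeric(a1$h1 == "H")

# All versions agree.
stopifnot(identical(h2_loop, h2_assign),
          identical(h2_loop, h2_mult),
          identical(h2_loop, h2_ifelse),
          identical(h2_loop, h2_num))

a1$h2 <- h2_num
table(a1$h1, a1$h2)
```

Each expression can be wrapped in system.time() to compare speeds on your
own data.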
Regards,
Jin



[R] Is it possible to remove this loop? [SEC=UNCLASSIFIED]

2012-07-03 Thread Jin . Li
Hi all,

I would like to create a new column in a data.frame (a1) to store 0/1 data
converted from a factor, as below.

a1$h2<-NULL
for (i in 1:dim(a1)[1]) {
  if (a1$h1[i]=="H") a1$h2[i]<-1 else a1$h2[i]<-0
  }

My question: is it possible to remove the loop from above code to achieve the 
desired result?

Thanks in advance,
Jin
