Re: [R] Regarding to R
shweta shukla writes:

> Dear,
> I recently started using R for my analysis, but the concepts are still not clear to me.
> Please guide me to some informative material for learning R.
>
> I am also confused about which one I should prefer, R or RStudio, and what
> the differences between them are.

R and RStudio are two different types of thing; it's not a question of "which one to use". RStudio is a front end (interface) for R. You probably *should* use RStudio (rather than the standard "R console" interface that comes with R): it's well supported and has lots of useful features. Most of the time when you're asking questions, though, they will be questions about R (unless they are specific issues about the *interface*, they won't be RStudio-specific).

It's hard to tell you where to start with "informative material". What is most useful will depend on your background, your field and desired type of analyses, your language, your personality, and so on. There are literally thousands of R tutorials and books, many available for free on the internet. I'd suggest you google "learning R programming", inspect a dozen or so of the top hits, and see which ones seem to suit you best.

good luck,
Ben Bolker

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Regarding to R
Please help us help you. Tell us what you have tried and where you have looked (otherwise we may just point you to things you already know about). Also, what is your focus (simple analysis, learning, programming, ...)?

The best place to start is with "An Introduction to R", which installs with R. Then there are a lot of tutorials linked from the main R website.

Comparing R and RStudio is not an either/or question. R is the statistical package; RStudio is an interface to R (RStudio by itself is not very useful, since it passes the commands to R). Think of R as a car with only a few options: it can take you anywhere, and for many people it is good enough. RStudio is a nice options package (power steering, a GPS navigation system, etc.): things that make driving the car easier and more enjoyable, but that would not help much without the car to add them to.

On Tue, May 10, 2016 at 4:10 AM, shweta shukla wrote:
> Dear,
> I recently started using R for my analysis, but the concepts are still not clear to me.
> Please guide me to some informative material for learning R.
>
> I am also confused about which one I should prefer, R or RStudio, and what
> the differences between them are.
>
> Thank you.

--
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com
[R] Regarding to R
Dear,

I recently started using R for my analysis, but the concepts are still not clear to me. Please guide me to some informative material for learning R.

I am also confused about which one I should prefer, R or RStudio, and what the differences between them are.

Thank you.
Re: [R] Regarding accesing R- Repositories at servers
On 13.07.2010 09:42, venkatesh bandaru wrote:
> Dear R-help team,
>
> I am Venkatesh, a student at the University of Hyderabad, India. I am not
> able to access the R repositories at your specified servers. It gives
> an error such as "Couldn't able to access media".

It would be helpful to see what you did (the function call you entered), what your R version and your OS are, and how you are connected. And the actual error message, not one you made up.

Best,
Uwe Ligges

> Can you please help me regarding this?
> I am looking forward to your reply. Thanking you.
>
> Wishes and regards,
> B. Venkatesh,
> University of Hyderabad, India
> 9440186746
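The details requested above (function call, R version, OS, and the verbatim error message) can be gathered in one go. The following is only a minimal sketch: `collect_diagnostics()` is a hypothetical helper name, not part of R, and the failing call is a stand-in for whatever command actually produced the error. In practice, pasting the output of `sessionInfo()` together with the exact call and message serves the same purpose.

```r
# Gather what a helper on the list typically needs: R version, OS,
# configured repositories, and the verbatim error message (if any).
# `collect_diagnostics` is a hypothetical name used for illustration.
collect_diagnostics <- function(expr) {
  # `expr` is evaluated lazily inside tryCatch(), so an error raised by
  # the call is captured as text instead of stopping the script.
  error_text <- tryCatch({
    force(expr)
    NA_character_
  }, error = function(e) conditionMessage(e))

  list(
    r_version = R.version.string,
    os        = Sys.info()[["sysname"]],
    repos     = getOption("repos"),
    error     = error_text
  )
}

# Stand-in for the real failing call (the message text is illustrative only).
info <- collect_diagnostics(stop("cannot open URL (illustrative message only)"))
str(info)
```

Posting the resulting list, along with the literal command that was typed, gives readers the actual message rather than a paraphrase of it.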
[R] Regarding accesing R- Repositories at servers
Dear R-help team,

I am Venkatesh, a student at the University of Hyderabad, India. I am not able to access the R repositories at your specified servers. It gives an error such as "Couldn't able to access media".

Can you please help me regarding this? I am looking forward to your reply. Thanking you.

Wishes and regards,
B. Venkatesh,
University of Hyderabad, India
9440186746
Re: [R] Regarding the 'R' Load Command
Hi Gavin, Steve,

Tons and tons of thanks! This solved my problem. My sample data differed from the original working test data in the factor levels. Once I set the levels using the command given by Gavin, things started working like magic. Hurray! I learned a great deal from you all. Thank you for helping me so promptly!

Best Regards,
Murali Godavarthi
410-585-3746 (w)
ITCD - DPSCS
Data Mining

-----Original Message-----
From: Gavin Simpson [mailto:gavin.simp...@ucl.ac.uk]
Sent: Wednesday, May 19, 2010 3:52 PM
To: Godavarthi, Murali
Cc: Steve Lianoglou; r-help@r-project.org
Subject: RE: [R] Regarding the 'R' Load Command

On Wed, 2010-05-19 at 14:59 -0400, Godavarthi, Murali wrote:
> Hi Gavin, Steve
>
> Sorry, please use the below dput for mytestdata. Thanks!!

No need. I think the issue is due to the number of levels in the factors. IIRC, I've been bitten by this before, where the newdata object contained factors with different numbers of levels and/or a different subset of levels.

Try setting the levels on mytestdata explicitly from the levels of testdata, e.g. something like:

    mytestdata <- within(mytestdata, {
        sex <- factor(sex, levels = levels(testdata$sex))
        race <- factor(race, levels = levels(testdata$race))
        marstat <- factor(marstat, levels = levels(testdata$marstat))
        empac <- factor(empac, levels = levels(testdata$empac))
    })

Then check with str(mytestdata) that it is consistent with str(testdata). If it is, then try to call predict on your RF model with newdata = mytestdata.

HTH

G

> structure(list(imurder = 0, itheft = 0, irobbery = 0, iassault = 1,
>     idrug = 0, iburglary = 0, igun = 0, psych = 0, Freq = 0, priors = 58,
>     firstage = 19, intage = 19,
>     sex = structure(1, .Label = "1", class = "factor"),
>     race = structure(1, .Label = "BLACK", class = "factor"),
>     marstat = structure(1, .Label = "SINGLE", class = "factor"),
>     empac = structure(1, .Label = "UNEMPLD", class = "factor"),
>     educ = 0, zipcode = 21215, suspendmn = 0, drugs = 0, alco = 0,
>     probation = 1, parole = 0),
>     .Names = c("imurder", "itheft", "irobbery", "iassault", "idrug",
>     "iburglary", "igun", "psych", "Freq", "priors", "firstage", "intage",
>     "sex", "race", "marstat", "empac", "educ", "zipcode", "suspendmn",
>     "drugs", "alco", "probation", "parole"),
>     class = "data.frame", row.names = "10")
>
> Best Regards,
> Murali Godavarthi
> 410-585-3746 (w)
> ITCD - DPSCS
> Data Mining
>
> -----Original Message-----
> From: Godavarthi, Murali
> Sent: Wednesday, May 19, 2010 2:43 PM
> To: 'gavin.simp...@ucl.ac.uk'; Steve Lianoglou
> Cc: r-help@r-project.org
> Subject: RE: [R] Regarding the 'R' Load Command
>
> Hi Steve, Gavin
>
> This is being really helpful. I've pasted the working data, and my test data below after running the str command on both of those variables. The working sample actually contains about 300 records, hence I am not able to paste the whole data here. However my sample test data which I am trying to get working, is only 1 record, and I've pasted the dput result below. Datatypes seem to match in both variables for me in terms of being num/factor. Please suggest where it could be wrong. Thank You!
Re: [R] Regarding the 'R' Load Command
Hi Murali,

I'm sorry, but you're making it too difficult to provide any help. Descriptions of what your data structures are and contain are too tedious to follow, and end up being rather ambiguous anyway.

My first guess: judging by your error message, perhaps the columns of the data to predict on are different from those of the training data.

If you want to get help, please provide your data in a form that we can test against. Look at this post by Hadley Wickham to help you do that: http://gist.github.com/270442

In particular, note the use of dput, which you should use so we can paste the result into our workspace and get the data objects you are trying to describe.

So, to be clear: send us a chunk of text we can paste into R that will recreate a workspace that can reproduce your problem. Please trim your data files to be only as large as necessary (e.g. provide 3 observations instead of 300).

Thanks,
-steve

On Wed, May 19, 2010 at 11:40 AM, Godavarthi, Murali mgodavar...@dpscs.state.md.us wrote:
> Hi Steve,
>
> Thanks so much for your inputs! I was actually trying to implement your suggestions, and I get the below error (please see the results of the predict command below). What we are trying to do is to feed in values for about 23 characteristics of an individual, and use the randomForest() function to determine if the individual is a violent offender. Expected output is 0 or 1, indicating yes/no. Am I going wrong again? Here is what I was doing:
>
> 1) Created a text file with the following data:
>
> "imurder" "itheft" "irobbery" "iassault" "idrug" "iburglary" "igun" "psych" "Freq" "priors" "firstage" "intage" "sex" "race" "marstat" "empac" "educ" "zipcode" "suspendmn" "drugs" "alco" "probation" "parole"
> "10" 0 0 0 1 0 0 0 0 0 58 19 19 "1" "BLACK" "SINGLE" "UNEMPLD" 0 21215 0 0 0 1 0
>
> The above text file is in the same format as the one which is already working, but that one has the characteristics of about 290 individuals fed in instead of just one individual as above. Not sure why this doesn't work!
> 2) Executed the below command sequence:
>
>     library(randomForest)
>     randomForest 4.5-34
>     Type rfNews() to see new features/changes/bug fixes.
>     load("C://Program Files//R//R-2.10.1//bin//rfoutput")
>     testmurali <- read.table("ex.data", T)
>     load(testmurali)
>     Error in load(testmurali) : bad 'file' argument
>     load(testmurali)
>     names(testmurali)
>      [1] "imurder" "itheft" "irobbery" "iassault" "idrug" "iburglary" "igun" "psych" "Freq" "priors"
>     [11] "firstage" "intage" "sex" "race" "marstat" "empac" "educ" "zipcode" "suspendmn" "drugs"
>     [21] "alco" "probation" "parole"
>     predict(rfoutput, newdata = testmurali, type = "response")
>     Error in predict.randomForest(rfoutput, newdata = testmurali, type = "response") :
>       Type of predictors in new data do not match that of the training data.
>
> The model rfoutput used in the above predict command is also based on a working example with similar data. Also, does load command accept a data string input directly (without storing it into a file and then providing path of the file as a string)? Please suggest. Thanks in advance!
>
> Best Regards,
> Murali Godavarthi
> 410-585-3746 (w)
> ITCD - DPSCS
> Data Mining
>
> -----Original Message-----
> From: Steve Lianoglou [mailto:mailinglist.honey...@gmail.com]
> Sent: Tuesday, May 18, 2010 4:13 PM
> To: Godavarthi, Murali
> Cc: r-help@r-project.org
> Subject: Re: [R] Regarding the 'R' Load Command
>
> Hi,
>
> On Tue, May 18, 2010 at 2:49 PM, Godavarthi, Murali mgodavar...@dpscs.state.md.us wrote:
> > Hi, I'm new to 'R' and need some help on the Load command. Any responses will be highly appreciated. Thanks in advance! As per manuals, the Load command expects a binary file input that is saved using a save command.
>
> Or a path to the file ...
>
> > However it is required that we need to call the 'R' program from Java web application using RJava, and pass a string to the 'R program instead of a binary file. Is it possible?
Yes, pay closer attention to the description for the file argument in the load function (see ?load): a (readable binary) connection **or a character string** giving the name of the file to load (emphasis mine) I was exploring the options of using TextConnections, file connections and other types of connections in order to read a stream of input (either from a file, stdin etc). I am able to read the string, but the Save and Load commands are not accepting the string input. Here is the sequence of commands I tried running, and the error received. There is no clue on this error, especially when trying to use the eval function in randomForest package, even on the internet. Can anyone help please! library(randomForest) randomForest 4.5-34 Type rfNews() to see new features/changes/bug fixes. load(C://Program Files//R//R-2.10.1//bin//rfoutput) zz - file(ex.data, w) cat(\imurder\ \itheft\ \irobbery\ \iassault\ \idrug\ \iburglary\ \igun\ \psych\ \Freq\ \priors\ \firstage\ \intage\ \sex\ \race\ \marstat\ \empac\ \educ\ \zipcode\ \suspendmn\ \drugs\ \alco\ \probation
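The dput() advice in this message can be sketched as a round trip. The three-row data frame below is a hypothetical stand-in for the poster's data (the column names are borrowed from the thread, the values are made up); the point is only that the text printed by dput() is itself R code that rebuilds the object.

```r
# Hypothetical three-row stand-in for the real data set.
df <- data.frame(
  priors = c(58, 4, 2),
  sex    = factor(c("1", "2", "1")),
  race   = factor(c("BLACK", "BLACK", "WHITE"), levels = c("WHITE", "BLACK"))
)

# dput() prints R code that recreates the object. It is captured as text
# here so the round trip can be checked without copying and pasting.
code <- paste(capture.output(dput(df)), collapse = "\n")
cat(code, "\n")

# A reader who pastes that text into their own session gets the object back.
rebuilt <- eval(parse(text = code))
identical(df, rebuilt)
```

In an email, only the printed `structure(...)` text needs to be pasted; trimming to a few rows first (for instance with `head(df, 3)`) keeps the message short.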
Re: [R] Regarding the 'R' Load Command
I think the answer is clear from the error: R thinks the types of data in the components of 'testmurali' do not match those of the data used to fit the original randomForest.

The OP should go back to his model fitting code and do str(obj), where 'obj' is the name of his original data object used to fit the randomForest, and compare it with str(testmurali) to see why the types of data are different. Look for variables that were factors or characters in one data set and numeric/integer in the other. This smells like a data import issue...

If revelation still doesn't occur, Murali, *please* follow Steve's suggestions and post a message that shows exactly what you did (i.e. the R code executed) alongside a data set *we* can load into R without jumping through hoops or having to divine what your data look like using a crystal ball or ESP.

HTH

G

On Wed, 2010-05-19 at 12:24 -0400, Steve Lianoglou wrote:
> Hi Murali,
>
> I'm sorry, but you're making it too difficult to provide any help. Descriptions of what your data structures are and contain are too tedious to follow, and end up being rather ambiguous anyway.
>
> My first guess: judging by your error message, perhaps the columns of the data to predict on are different from those of the training data.
>
> If you want to get help, please provide your data in a form that we can test against. Look at this post by Hadley Wickham to help you do that: http://gist.github.com/270442
>
> In particular, note the use of dput, which you should use so we can paste the result into our workspace and get the data objects you are trying to describe.
>
> So, to be clear: send us a chunk of text we can paste into R that will recreate a workspace that can reproduce your problem. Please trim your data files to be only as large as necessary (e.g. provide 3 observations instead of 300).
>
> Thanks,
> -steve
>
> On Wed, May 19, 2010 at 11:40 AM, Godavarthi, Murali mgodavar...@dpscs.state.md.us wrote:
> > Hi Steve, Thanks so much for your inputs!
I was actually trying to implement your suggestions, I get the below error (please see the results of predict command below). What we are trying to do is to feed in values for about 23 characteristics of an individual, and use the randomForest() function to determine if the individual is a violent offender. Expected output is 0 or 1, indicating yes/no. Am I going wrong again? Here is what I was doing: 1) Created a text file with following data: imurder itheft irobbery iassault idrug iburglary igun psych Freq priors firstage intage sex race marstat empac educ zipcode suspendmn drugs alco probation parole 10 0 0 0 1 0 0 0 0 0 58 19 19 1 BLACK SINGLE UNEMPLD 0 21215 0 0 0 1 0 The above format in which the text file was created is in the same format as the one which is already working, but has characteristics of about 290 individuals fed-in instead of just one individual as above. Not sure why this doesn't work! 2) Executed the below command sequence: library(randomForest) randomForest 4.5-34 Type rfNews() to see new features/changes/bug fixes. load(C://Program Files//R//R-2.10.1//bin//rfoutput) testmurali-read.table(ex.data,T) load(testmurali) Error in load(testmurali) : bad 'file' argument load(testmurali) names(testmurali) [1] imurder itheftirobbery iassault idrug iburglary igun psych Freq priors [11] firstage intagesex race marstat empac educ zipcode suspendmn drugs [21] alco probation parole predict(rfoutput,newdata=testmurali,type=response) Error in predict.randomForest(rfoutput, newdata = testmurali, type = response) : Type of predictors in new data do not match that of the training data. The model rfoutput used in the above predict command is also based on a working example with similar data. Also, does load command accept a data string input directly (without storing it into a file and then providing path of the file as a string)? Please suggest. Thanks in advance! 
Best Regards, Murali Godavarthi 410-585-3746 (w) ITCD - DPSCS Data Mining -Original Message- From: Steve Lianoglou [mailto:mailinglist.honey...@gmail.com] Sent: Tuesday, May 18, 2010 4:13 PM To: Godavarthi, Murali Cc: r-help@r-project.org Subject: Re: [R] Regarding the 'R' Load Command Hi, On Tue, May 18, 2010 at 2:49 PM, Godavarthi, Murali mgodavar...@dpscs.state.md.us wrote: Hi, I'm new to 'R' and need some help on the Load command. Any responses will be highly appreciated. Thanks in advance! As per manuals, the Load command expects a binary file input that is saved using a save command. Or a path to the file ... However it is required that we need to call the 'R' program from Java web application using RJava, and pass a string to the 'R program instead of a binary file. Is it possible? Yes, pay closer attention to the description for the file argument
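Gavin's suggestion to compare str() output for the training data and the new data can also be done programmatically. This is a rough sketch under assumptions: `compare_columns()` is a hypothetical helper, the two toy data frames merely imitate the thread's objects, and the check on factor levels goes slightly beyond what the message above asks for (it only mentions types).

```r
# Report, for each shared column, whether the class or the factor levels
# differ between the training data and the new data.
# `compare_columns` is a hypothetical helper written for illustration.
compare_columns <- function(train, new) {
  common <- intersect(names(train), names(new))
  class_differs <- vapply(common, function(nm) {
    !identical(class(train[[nm]]), class(new[[nm]]))
  }, logical(1))
  levels_differ <- vapply(common, function(nm) {
    # levels() is NULL for non-factors, so those columns compare as equal.
    !identical(levels(train[[nm]]), levels(new[[nm]]))
  }, logical(1))
  data.frame(column = common, class_differs, levels_differ,
             row.names = NULL, stringsAsFactors = FALSE)
}

# Toy stand-ins: `priors` changes type, `race` keeps its class but loses levels.
train <- data.frame(
  priors = c(58, 4),
  race   = factor(c("BLACK", "WHITE"), levels = c("WHITE", "BLACK"))
)
new <- data.frame(priors = 58L, race = factor("BLACK"))

report <- compare_columns(train, new)
print(report)
```

Any row flagged in either column is a candidate for the "Type of predictors in new data do not match" error and is worth inspecting with str() by hand.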
Re: [R] Regarding the 'R' Load Command
Hi Steve, Gavin

This is being really helpful. I've pasted the working data, and my test data below after running the str command on both of those variables. The working sample actually contains about 300 records, hence I am not able to paste the whole data here. However my sample test data which I am trying to get working, is only 1 record, and I've pasted the dput result below. Datatypes seem to match in both variables for me in terms of being num/factor. Please suggest where it could be wrong. Thank You!

mytestdata

structure(list(imurder = 0, itheft = 0, irobbery = 0, iassault = 1L,
    idrug = 0L, iburglary = 0L, igun = 0L, psych = 0L, Freq = 0L,
    priors = 58L, firstage = 19L, intage = 19L,
    sex = structure(1L, .Label = "1", class = "factor"),
    race = structure(1L, .Label = "BLACK", class = "factor"),
    marstat = structure(1L, .Label = "SINGLE", class = "factor"),
    empac = structure(1L, .Label = "UNEMPLD", class = "factor"),
    educ = 0L, zipcode = 21215L, suspendmn = 0L, drugs = 0L, alco = 0L,
    probation = 1L, parole = 0L),
    .Names = c("imurder", "itheft", "irobbery", "iassault", "idrug",
    "iburglary", "igun", "psych", "Freq", "priors", "firstage", "intage",
    "sex", "race", "marstat", "empac", "educ", "zipcode", "suspendmn",
    "drugs", "alco", "probation", "parole"),
    class = "data.frame", row.names = "10")

str(testdata)
'data.frame': 291 obs. of 23 variables:
 $ imurder  : num 0 0 0 0 0 0 0 0 0 0 ...
 $ itheft   : num 0 0 0 0 0 1 0 0 0 0 ...
 $ irobbery : num 0 0 0 0 0 0 0 0 0 0 ...
 $ iassault : num 1 0 1 0 0 0 0 0 0 0 ...
 $ idrug    : num 0 1 0 1 1 0 0 1 1 1 ...
 $ iburglary: num 0 0 0 0 0 0 0 0 0 0 ...
 $ igun     : num 0 0 0 0 0 0 0 0 0 0 ...
 $ psych    : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Freq     : num 0 0 0 0 0 0 0 0 0 0 ...
 $ priors   : num 58 4 2 0 6 22 0 36 0 0 ...
 $ firstage : num 19 39 28 0 49 32 0 24 0 55 ...
 $ intage   : num 19 39 28 25 49 32 32 24 30 55 ...
 $ sex      : Factor w/ 2 levels "1","2": 1 2 1 2 2 1 1 1 1 1 ...
 $ race     : Factor w/ 5 levels "WHITE","BLACK",..: 2 2 1 1 2 1 1 2 2 2 ...
 $ marstat  : Factor w/ 7 levels "SINGLE","MARRIED",..: 1 2 2 1 2 4 7 1 7 3 ...
 $ empac    : Factor w/ 6 levels "EMPLD FT","EMPLD PT",..: 3 4 3 3 3 3 6 3 6 3 ...
 $ educ     : num 0 0 0 1 0 0 0 0 0 1 ...
 $ zipcode  : num 21215 21217 21223 21223 21217 ...
 $ suspendmn: num 0 600 0 0 60 3 2 479 0 3 ...
 $ drugs    : num 0 1 0 0 0 1 0 0 0 1 ...
 $ alco     : num 0 0 0 0 0 1 0 0 0 1 ...
 $ probation: num 1 1 0 0 1 1 1 1 0 1 ...
 $ parole   : num 0 0 0 0 0 0 0 0 0 0 ...

str(mytestdata)
'data.frame': 1 obs. of 23 variables:
 $ imurder  : num 0
 $ itheft   : num 0
 $ irobbery : num 0
 $ iassault : num 1
 $ idrug    : num 0
 $ iburglary: num 0
 $ igun     : num 0
 $ psych    : num 0
 $ Freq     : num 0
 $ priors   : num 58
 $ firstage : num 19
 $ intage   : num 19
 $ sex      : Factor w/ 1 level "1": 1
 $ race     : Factor w/ 1 level "BLACK": 1
 $ marstat  : Factor w/ 1 level "SINGLE": 1
 $ empac    : Factor w/ 1 level "UNEMPLD": 1
 $ educ     : num 0
 $ zipcode  : num 21215
 $ suspendmn: num 0
 $ drugs    : num 0
 $ alco     : num 0
 $ probation: num 1
 $ parole   : num 0

Best Regards,
Murali Godavarthi
410-585-3746 (w)
ITCD - DPSCS
Data Mining

-----Original Message-----
From: Gavin Simpson [mailto:gavin.simp...@ucl.ac.uk]
Sent: Wednesday, May 19, 2010 12:58 PM
To: Steve Lianoglou
Cc: Godavarthi, Murali; r-help@r-project.org
Subject: Re: [R] Regarding the 'R' Load Command

I think the answer is clear from the error: R thinks the type of data in the components of 'testmurali' do not match those of the data used to fit the original randomForest.

The OP should go back to his model fitting code and do str(obj) where 'obj' is the name of his original data object used to fit the randomForest and compare it with str(testmurali) to see why the types of data are different. Look for variables that were factors or characters in one data set and numeric/integer in the other. This smells like a data import issue...

If revelation still doesn't occur Murali, *please* follow Steve's suggestions and post a message that shows exactly (i.e.
the R code executed) along side a data set *we* can load into R without jumping through hoops or having to divine what your data look like using a crystal ball or ESP. HTH G On Wed, 2010-05-19 at 12:24 -0400, Steve Lianoglou wrote: Hi Murali, I'm sorry, but you're making this too difficult to provide any help. Describing what your data structures are and contain is too tedious to follow, and end up being rather ambiguous anyway. My first guess: by your error message, perhaps the columns of the data to predict on are different than the training. If you want to get help, please provide your data in a form that we can test against. Look at this post by Hadley Wickham to help you do that: http://gist.github.com/270442 In particular, not the use of dput that you should use so we can paste the result in our workspace and get the data objects you try to describe. So, to be clear: send us
Re: [R] Regarding the 'R' Load Command
Hi Steve,

Thanks so much for your inputs! I was actually trying to implement your suggestions, and I get the below error (please see the results of the predict command below). What we are trying to do is to feed in values for about 23 characteristics of an individual, and use the randomForest() function to determine if the individual is a violent offender. Expected output is 0 or 1, indicating yes/no. Am I going wrong again? Here is what I was doing:

1) Created a text file with the following data:

"imurder" "itheft" "irobbery" "iassault" "idrug" "iburglary" "igun" "psych" "Freq" "priors" "firstage" "intage" "sex" "race" "marstat" "empac" "educ" "zipcode" "suspendmn" "drugs" "alco" "probation" "parole"
"10" 0 0 0 1 0 0 0 0 0 58 19 19 "1" "BLACK" "SINGLE" "UNEMPLD" 0 21215 0 0 0 1 0

The above text file is in the same format as the one which is already working, but that one has the characteristics of about 290 individuals fed in instead of just one individual as above. Not sure why this doesn't work!

2) Executed the below command sequence:

    library(randomForest)
    randomForest 4.5-34
    Type rfNews() to see new features/changes/bug fixes.
    load("C://Program Files//R//R-2.10.1//bin//rfoutput")
    testmurali <- read.table("ex.data", T)
    load(testmurali)
    Error in load(testmurali) : bad 'file' argument
    load(testmurali)
    names(testmurali)
     [1] "imurder" "itheft" "irobbery" "iassault" "idrug" "iburglary" "igun" "psych" "Freq" "priors"
    [11] "firstage" "intage" "sex" "race" "marstat" "empac" "educ" "zipcode" "suspendmn" "drugs"
    [21] "alco" "probation" "parole"
    predict(rfoutput, newdata = testmurali, type = "response")
    Error in predict.randomForest(rfoutput, newdata = testmurali, type = "response") :
      Type of predictors in new data do not match that of the training data.

The model rfoutput used in the above predict command is also based on a working example with similar data. Also, does the load command accept a data string input directly (without storing it into a file and then providing the path of the file as a string)? Please suggest. Thanks in advance!
Best Regards, Murali Godavarthi 410-585-3746 (w) ITCD - DPSCS Data Mining -Original Message- From: Steve Lianoglou [mailto:mailinglist.honey...@gmail.com] Sent: Tuesday, May 18, 2010 4:13 PM To: Godavarthi, Murali Cc: r-help@r-project.org Subject: Re: [R] Regarding the 'R' Load Command Hi, On Tue, May 18, 2010 at 2:49 PM, Godavarthi, Murali mgodavar...@dpscs.state.md.us wrote: Hi, I'm new to 'R' and need some help on the Load command. Any responses will be highly appreciated. Thanks in advance! As per manuals, the Load command expects a binary file input that is saved using a save command. Or a path to the file ... However it is required that we need to call the 'R' program from Java web application using RJava, and pass a string to the 'R program instead of a binary file. Is it possible? Yes, pay closer attention to the description for the file argument in the load function (see ?load): a (readable binary) connection **or a character string** giving the name of the file to load (emphasis mine) I was exploring the options of using TextConnections, file connections and other types of connections in order to read a stream of input (either from a file, stdin etc). I am able to read the string, but the Save and Load commands are not accepting the string input. Here is the sequence of commands I tried running, and the error received. There is no clue on this error, especially when trying to use the eval function in randomForest package, even on the internet. Can anyone help please! library(randomForest) randomForest 4.5-34 Type rfNews() to see new features/changes/bug fixes. 
>     load("C://Program Files//R//R-2.10.1//bin//rfoutput")
>     zz <- file("ex.data", "w")
>     cat("\"imurder\" \"itheft\" \"irobbery\" \"iassault\" \"idrug\" \"iburglary\" \"igun\" \"psych\" \"Freq\" \"priors\" \"firstage\" \"intage\" \"sex\" \"race\" \"marstat\" \"empac\" \"educ\" \"zipcode\" \"suspendmn\" \"drugs\" \"alco\" \"probation\" \"parole\"", file = zz, sep = "\n", fill = TRUE)
>     cat("\"10\" 0 0 0 1 0 0 0 0 0 58 19 19 \"1\" \"BLACK\" \"SINGLE\" \"UNEMPLD\" 0 21215 0 0 0 1 0", file = zz, sep = "\n", fill = TRUE)

What are you trying to do here? It looks like you want to save a table of sorts. First put your data into a data.frame, then save that data.frame to a file using write.table (or write.csv, etc).

>     save(zz, file = "testmurali", version = 2)

You're saving a file object here, not the contents of the file. Once you successfully serialize your data into a text file, just load it like normal using read.table (or similar).

Anyway, I'm not sure what we're talking about here, but in short:

1. You need to make sure that you are correctly saving what you think you're saving.
2. You can pass a character string to the `load` function, so you can send it through (over from) java as you wish.
3. I don't think you really want to deal with load/save here, because it looks like you are dealing with some tab delimited file -- in which case use
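The two routes described in this reply can be sketched side by side. This is a minimal illustration, not the poster's actual workflow: the one-row `record` and the temporary file paths are made up, and only base R functions (write.table, read.table, save, load) are used.

```r
# Hypothetical one-row record; column names follow the thread.
record <- data.frame(
  priors = 58, race = "BLACK", marstat = "SINGLE",
  row.names = "10", stringsAsFactors = TRUE
)

# Text route: write.table() serializes the data itself (not a connection
# object), and read.table() restores it from the file path.
txt_path <- tempfile(fileext = ".txt")
write.table(record, file = txt_path, sep = "\t", quote = TRUE)
from_text <- read.table(txt_path, header = TRUE, sep = "\t",
                        stringsAsFactors = TRUE)

# Binary route: save() stores the named object; load() accepts the file
# name as a character string and returns the names of what it restored.
rda_path <- tempfile(fileext = ".RData")
save(record, file = rda_path)
restore_env <- new.env()
restored <- load(rda_path, envir = restore_env)

str(from_text)
print(restored)
```

Note that read.table() re-infers column types from the text, so a column can come back with a different type than it was written with (here `priors` is read back as integer). The binary route preserves the object exactly, which is one reason str() comparisons after import are worthwhile.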
Re: [R] Regarding the 'R' Load Command
Hi Gavin, Steve

Sorry, please use the below dput for mytestdata. Thanks!!

structure(list(imurder = 0, itheft = 0, irobbery = 0, iassault = 1,
    idrug = 0, iburglary = 0, igun = 0, psych = 0, Freq = 0, priors = 58,
    firstage = 19, intage = 19,
    sex = structure(1, .Label = "1", class = "factor"),
    race = structure(1, .Label = "BLACK", class = "factor"),
    marstat = structure(1, .Label = "SINGLE", class = "factor"),
    empac = structure(1, .Label = "UNEMPLD", class = "factor"),
    educ = 0, zipcode = 21215, suspendmn = 0, drugs = 0, alco = 0,
    probation = 1, parole = 0),
    .Names = c("imurder", "itheft", "irobbery", "iassault", "idrug",
    "iburglary", "igun", "psych", "Freq", "priors", "firstage", "intage",
    "sex", "race", "marstat", "empac", "educ", "zipcode", "suspendmn",
    "drugs", "alco", "probation", "parole"),
    class = "data.frame", row.names = "10")

Best Regards,
Murali Godavarthi
410-585-3746 (w)
ITCD - DPSCS
Data Mining

-----Original Message-----
From: Godavarthi, Murali
Sent: Wednesday, May 19, 2010 2:43 PM
To: 'gavin.simp...@ucl.ac.uk'; Steve Lianoglou
Cc: r-help@r-project.org
Subject: RE: [R] Regarding the 'R' Load Command

Hi Steve, Gavin

This is being really helpful. I've pasted the working data, and my test data below after running the str command on both of those variables. The working sample actually contains about 300 records, hence I am not able to paste the whole data here. However my sample test data which I am trying to get working, is only 1 record, and I've pasted the dput result below. Datatypes seem to match in both variables for me in terms of being num/factor. Please suggest where it could be wrong. Thank You!
mytestdata structure(list(imurder = 0, itheft = 0, irobbery = 0, iassault = 1L, idrug = 0L, iburglary = 0L, igun = 0L, psych = 0L, Freq = 0L, priors = 58L, firstage = 19L, intage = 19L, sex = structure(1L, .Label = 1, class = factor), race = structure(1L, .Label = BLACK, class = factor), marstat = structure(1L, .Label = SINGLE, class = factor), empac = structure(1L, .Label = UNEMPLD, class = factor), educ = 0L, zipcode = 21215L, suspendmn = 0L, drugs = 0L, alco = 0L, probation = 1L, parole = 0L), .Names = c(imurder, itheft, irobbery, iassault, idrug, iburglary, igun, psych, Freq, priors, firstage, intage, sex, race, marstat, empac, educ, zipcode, suspendmn, drugs, alco, probation, parole), class = data.frame, row.names = 10) str(testdata) 'data.frame': 291 obs. of 23 variables: $ imurder : num 0 0 0 0 0 0 0 0 0 0 ... $ itheft : num 0 0 0 0 0 1 0 0 0 0 ... $ irobbery : num 0 0 0 0 0 0 0 0 0 0 ... $ iassault : num 1 0 1 0 0 0 0 0 0 0 ... $ idrug: num 0 1 0 1 1 0 0 1 1 1 ... $ iburglary: num 0 0 0 0 0 0 0 0 0 0 ... $ igun : num 0 0 0 0 0 0 0 0 0 0 ... $ psych: num 0 0 0 0 0 0 0 0 0 0 ... $ Freq : num 0 0 0 0 0 0 0 0 0 0 ... $ priors : num 58 4 2 0 6 22 0 36 0 0 ... $ firstage : num 19 39 28 0 49 32 0 24 0 55 ... $ intage : num 19 39 28 25 49 32 32 24 30 55 ... $ sex : Factor w/ 2 levels 1,2: 1 2 1 2 2 1 1 1 1 1 ... $ race : Factor w/ 5 levels WHITE,BLACK,..: 2 2 1 1 2 1 1 2 2 2 ... $ marstat : Factor w/ 7 levels SINGLE,MARRIED,..: 1 2 2 1 2 4 7 1 7 3 ... $ empac: Factor w/ 6 levels EMPLD FT,EMPLD PT,..: 3 4 3 3 3 3 6 3 6 3 ... $ educ : num 0 0 0 1 0 0 0 0 0 1 ... $ zipcode : num 21215 21217 21223 21223 21217 ... $ suspendmn: num 0 600 0 0 60 3 2 479 0 3 ... $ drugs: num 0 1 0 0 0 1 0 0 0 1 ... $ alco : num 0 0 0 0 0 1 0 0 0 1 ... $ probation: num 1 1 0 0 1 1 1 1 0 1 ... $ parole : num 0 0 0 0 0 0 0 0 0 0 ... str(mytestdata) 'data.frame': 1 obs. 
of 23 variables: $ imurder : num 0 $ itheft : num 0 $ irobbery : num 0 $ iassault : num 1 $ idrug: num 0 $ iburglary: num 0 $ igun : num 0 $ psych: num 0 $ Freq : num 0 $ priors : num 58 $ firstage : num 19 $ intage : num 19 $ sex : Factor w/ 1 level 1: 1 $ race : Factor w/ 1 level BLACK: 1 $ marstat : Factor w/ 1 level SINGLE: 1 $ empac: Factor w/ 1 level UNEMPLD: 1 $ educ : num 0 $ zipcode : num 21215 $ suspendmn: num 0 $ drugs: num 0 $ alco : num 0 $ probation: num 1 $ parole : num 0 Best Regards, Murali Godavarthi 410-585-3746 (w) ITCD - DPSCS Data Mining -Original Message- From: Gavin Simpson [mailto:gavin.simp...@ucl.ac.uk] Sent: Wednesday, May 19, 2010 12:58 PM To: Steve Lianoglou Cc: Godavarthi, Murali; r-help@r-project.org Subject: Re: [R] Regarding the 'R' Load Command I think the answer is clear from the error: R thinks the type of data in the components of 'testmurali' do not match those of the data used to fit the original randomForest. The OP should go back to his model fitting code and do str(obj) where 'obj' is the name of his original data object used to fit the randomForest and compare it with str(testmurali) to see why the types of data are different. Look for variables that were factors or characters in one data set and numeric/integer in the other
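The str() comparison suggested in the quoted message can also be done programmatically instead of by eye. What follows is only a sketch under stated assumptions: `compare_columns` is a hypothetical helper written for this illustration (it is not part of randomForest or base R), and `train` and `new` are toy stand-ins for testdata and mytestdata, not the poster's actual data.

```r
# Hypothetical helper: report class and factor levels per column for two
# data frames, and flag the columns where they differ.
compare_columns <- function(a, b) {
  cols <- union(names(a), names(b))
  describe <- function(df, col, what) {
    if (!col %in% names(df)) return(NA_character_)
    x <- df[[col]]
    if (what == "class") return(class(x)[1])
    if (is.factor(x)) paste(levels(x), collapse = ",") else ""
  }
  pick <- function(df, what) {
    vapply(cols, function(col) describe(df, col, what), character(1),
           USE.NAMES = FALSE)
  }
  out <- data.frame(
    column   = cols,
    class_a  = pick(a, "class"),
    class_b  = pick(b, "class"),
    levels_a = pick(a, "levels"),
    levels_b = pick(b, "levels"),
    stringsAsFactors = FALSE
  )
  # %in% TRUE turns NA (column missing on one side) into FALSE
  out$same <- (out$class_a == out$class_b &
               out$levels_a == out$levels_b) %in% TRUE
  out
}

# Toy data (assumed for illustration only)
train <- data.frame(
  priors = c(58, 4, 2),
  race   = factor(c("BLACK", "BLACK", "WHITE"), levels = c("WHITE", "BLACK"))
)
new <- data.frame(priors = 58, race = factor("BLACK"))

cmp <- compare_columns(train, new)
print(cmp[!cmp$same, ])  # rows where class or levels disagree
```

In this toy case both data frames report `race` as a factor, so a quick look at the column types suggests they match; the difference only shows up in the levels, which is why the helper compares those as well.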
Re: [R] Regarding the 'R' Load Command
On Wed, 2010-05-19 at 14:59 -0400, Godavarthi, Murali wrote:
> Hi Gavin, Steve
>
> Sorry, please use the below dput for mytestdata. Thanks!!

No need. I think the issue is due to the number of levels in the factors. IIRC, I've been bitten by this before, where the newdata object contained factors with different numbers of levels and/or a different subset of levels.

Try setting the levels on mytestdata explicitly from the levels of testdata, e.g. something like:

mytestdata <- within(mytestdata, {
    sex     <- factor(sex, levels = levels(testdata$sex))
    race    <- factor(race, levels = levels(testdata$race))
    marstat <- factor(marstat, levels = levels(testdata$marstat))
    empac   <- factor(empac, levels = levels(testdata$empac))
})

Then check with str(mytestdata) that it is consistent with str(testdata).

If it is, then try to call predict on your RF model with newdata = mytestdata.

HTH

G
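The level-setting advice in this reply names four factors by hand; the same idea can be applied to every factor column at once. The following is a minimal sketch, assuming toy data frames `train` and `new` (stand-ins for testdata and mytestdata) and a hypothetical helper `align_factor_levels` written for this illustration. It uses base R only and does not call randomForest itself.

```r
# Hypothetical helper: give every factor column in `new` the full set of
# levels used in `train`, so both data frames describe factors identically.
align_factor_levels <- function(new, train) {
  for (col in intersect(names(new), names(train))) {
    if (is.factor(train[[col]])) {
      # Go through as.character() so the result depends on the labels,
      # not on the integer codes of the old one-level factor.
      new[[col]] <- factor(as.character(new[[col]]),
                           levels = levels(train[[col]]))
    }
  }
  new
}

# Toy data (assumed for illustration only)
train <- data.frame(
  sex  = factor(c("1", "2", "1")),
  race = factor(c("BLACK", "WHITE", "BLACK"), levels = c("WHITE", "BLACK"))
)
new <- data.frame(sex = factor("1"), race = factor("BLACK"))

fixed <- align_factor_levels(new, train)
str(fixed)

# A value that does not occur in the training levels becomes NA,
# so it is worth checking for that before predicting.
anyNA(fixed)
```

After this step the single new row carries the same level sets as the training data, which is the condition the reply above suggests checking with str() before calling predict.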
[R] Regarding the 'R' Load Command
Hi,

I'm new to 'R' and need some help on the Load command. Any responses will be highly appreciated. Thanks in advance!

As per the manuals, the Load command expects a binary file input that is saved using a save command. However, we need to call the 'R' program from a Java web application using rJava, and pass a string to the 'R' program instead of a binary file. Is it possible?

I was exploring the options of using text connections, file connections and other types of connections in order to read a stream of input (either from a file, stdin etc). I am able to read the string, but the Save and Load commands are not accepting the string input.

Here is the sequence of commands I tried running, and the error received. I could not find any clue about this error, even on the internet, especially when trying to use the eval function in the randomForest package. Can anyone help please!

library(randomForest)
randomForest 4.5-34
Type rfNews() to see new features/changes/bug fixes.

load("C://Program Files//R//R-2.10.1//bin//rfoutput")
zz <- file("ex.data", "w")
cat("\"imurder\" \"itheft\" \"irobbery\" \"iassault\" \"idrug\" \"iburglary\" \"igun\" \"psych\" \"Freq\" \"priors\" \"firstage\" \"intage\" \"sex\" \"race\" \"marstat\" \"empac\" \"educ\" \"zipcode\" \"suspendmn\" \"drugs\" \"alco\" \"probation\" \"parole\"", file = zz, sep = "\n", fill = TRUE)
cat("\"10\" 0 0 0 1 0 0 0 0 0 58 19 19 \"1\" \"BLACK\" \"SINGLE\" \"UNEMPLD\" 0 21215 0 0 0 1 0", file = zz, sep = "\n", fill = TRUE)
save(zz, file = "testmurali", version = 2)
predict(rfoutput, newdata=testmurali, type="response")
Error in eval(expr, envir, enclos) : object 'imurder' not found

Best Regards,
Murali Godavarthi
mgodavar...@dpscs.state.md.us
Re: [R] Regarding the 'R' Load Command
Hi,

On Tue, May 18, 2010 at 2:49 PM, Godavarthi, Murali mgodavar...@dpscs.state.md.us wrote:
> Hi,
>
> I'm new to 'R' and need some help on the Load command. Any responses will be highly appreciated. Thanks in advance!
>
> As per manuals, the Load command expects a binary file input that is saved using a save command.

Or a path to the file ...

> However it is required that we need to call the 'R' program from Java web application using RJava, and pass a string to the 'R program instead of a binary file. Is it possible?

Yes, pay closer attention to the description for the file argument in the load function (see ?load):

    a (readable binary) connection **or a character string** giving the name of the file to load

(emphasis mine)

> I was exploring the options of using TextConnections, file connections and other types of connections in order to read a stream of input (either from a file, stdin etc). I am able to read the string, but the Save and Load commands are not accepting the string input.
>
> Here is the sequence of commands I tried running, and the error received. There is no clue on this error, especially when trying to use the eval function in randomForest package, even on the internet. Can anyone help please!
>
> library(randomForest)
> randomForest 4.5-34
> Type rfNews() to see new features/changes/bug fixes.
>
> load("C://Program Files//R//R-2.10.1//bin//rfoutput")
> zz <- file("ex.data", "w")
> cat("\"imurder\" \"itheft\" \"irobbery\" \"iassault\" \"idrug\" \"iburglary\" \"igun\" \"psych\" \"Freq\" \"priors\" \"firstage\" \"intage\" \"sex\" \"race\" \"marstat\" \"empac\" \"educ\" \"zipcode\" \"suspendmn\" \"drugs\" \"alco\" \"probation\" \"parole\"", file = zz, sep = "\n", fill = TRUE)
> cat("\"10\" 0 0 0 1 0 0 0 0 0 58 19 19 \"1\" \"BLACK\" \"SINGLE\" \"UNEMPLD\" 0 21215 0 0 0 1 0", file = zz, sep = "\n", fill = TRUE)

What are you trying to do here? It looks like you want to save a table of sorts. First put your data into a data.frame, then save that data.frame to a file using write.table (or write.csv, etc).

> save(zz, file = "testmurali", version = 2)

You're saving a file object here, not the contents of the file. Once you successfully serialize your data into a text file, just load it as normal using read.table (or similar).

Anyway, I'm not sure what we're talking about here, but in short:

1. You need to make sure that you are correctly saving what you think you're saving.

2. You can pass a character string to the `load` function, so you can send it through (over from) java as you wish.

3. I don't think you really want to deal with load/save here, because it looks like you are dealing with some tab delimited file -- in which case use read.table (or similar) and load it that way. You can, of course, still use save/load, but make sure you save/load the right thing (not a file object like you're doing here).

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
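To make the three points above concrete, here is a small sketch of the alternatives in base R. It is an illustration under assumptions, not the poster's code: the data frame `newrow` holds only a few of the original columns, the file paths come from tempfile(), and the string in `input` stands in for whatever text a Java caller might hand over.

```r
# One-row data frame built explicitly (a small subset of the poster's columns)
newrow <- data.frame(
  imurder = 0, iassault = 1, priors = 58,
  race = factor("BLACK"),
  row.names = "10"
)

# Option 1: plain-text round trip with write.table() / read.table()
txt_file <- tempfile(fileext = ".txt")
write.table(newrow, file = txt_file)
from_text <- read.table(txt_file, header = TRUE, stringsAsFactors = TRUE)

# Option 2: parse a string directly, with no intermediate file.
# The header has one field fewer than the data row, so read.table()
# treats the first field ("10") as the row name.
input <- '"imurder" "iassault" "priors" "race"\n"10" 0 1 58 "BLACK"'
from_string <- read.table(text = input, header = TRUE,
                          stringsAsFactors = TRUE)

# Option 3: binary round trip with save() / load().
# Save the data frame itself, not a connection object such as `zz`.
rda_file <- tempfile(fileext = ".RData")
save(newrow, file = rda_file)
restored <- new.env()
load(rda_file, envir = restored)  # the file name is passed as a string

str(from_text)
str(from_string)
str(restored$newrow)
```

Note that after a text round trip whole-number columns usually come back as integer rather than numeric, and a factor read from a single row has only one level, so it is still worth comparing the result against the training data with str() before predicting.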