[R] extracting splitting rules from GBM

2012-12-12 Thread Andrew Ziem
I am extracting splitting rules from Greg Ridgeway's GBM 1.6-3.2 in R 2.15.2 so that I can run classification in a production system outside of R.  I have it working and verified for a dummy data set with all variable types (numeric, factor, ordered) and missing values, but in the Titanic survivors

Re: [R] extracting splitting rules from GBM

2012-12-12 Thread Andrew Ziem
The mailing list ate the attachments, so here they are again. R code: https://gist.github.com/4270628 Log: http://pastebin.com/0e49CTsL Andrew -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Andrew Ziem Sent: Wednesday

Re: [R] Compare clustering solutions to a correct one

2011-11-09 Thread Andrew Ziem
It sounds like you want to do supervised classification, so maybe a supervised classification algorithm would be more appropriate? Consider logistic regression, rpart, ctree, earth, etc. Andrew -Original Message- From: r-help-boun...@r-project.org

Re: [R] R in batch mode packages loading question

2011-11-07 Thread Andrew Ziem
Try the fork() function in the package multicore (if your system supports it) Andrew -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of PALMIER Patrick (Responsable de groupe) - CETE NP/TM/ST Sent: Monday, November 07, 2011 3:49 AM

Re: [R] Decision tree model using rpart ( classification

2011-11-04 Thread Andrew Ziem
aajit75 <aajit75 at yahoo.co.in> writes: fit <- rpart(decile ~ ., method = "class", control = rpart.control(minsplit = min_obs_split, cp = c_c_factor), data = dtm_ip) In A and B, the target variable 'segment' is from the clustering data using the same set of input variables, while in C target

[R] bug calculating ROC with caret and earth?

2011-11-04 Thread Andrew Ziem
Does caret have a bug calculating ROC with earth?  When using caret and earth on any of my data sets, caret's ROC never varies.  This could mean earth is finding the same model (for example, because of using an nprune parameter that is too high).  However, if that were true, sensitivity and

Re: [R] ROC from R-SVM?

2011-02-22 Thread Andrew Ziem
In addition to Max's suggestion about caret, look at ROCR, which draws ROC charts for any binary classifier. I have an example of e1071::svm and ROCR here: https://heuristically.wordpress.com/2009/12/23/compare-performance-machine-learning-classifiers-r/ -Original Message-

Re: [R] Categorical Variables and Machine Learning

2011-02-17 Thread Andrew Ziem
Try the function ctree() in the package party or earth() in earth. You can use a factor variable as is, or you can transform the factor to binary variables (i.e., is_P is 0 or 1, is_D is 0 or 1). In the second case, you can use any algorithm, and earth() automatically transforms factors to
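The factor-to-binary transformation mentioned above can be sketched in base R. This is only an illustration under assumed names: the data frame `df`, its column `kind`, and the `is_` prefix are made up for the example, and model.matrix() is one of several ways to build the indicator columns.

```r
# Hypothetical data: a factor with levels "D" and "P" plus a response y.
df <- data.frame(kind = factor(c("P", "D", "P", "D")),
                 y = c(1, 0, 1, 1))

# "- 1" drops the intercept so that every level gets its own 0/1 column.
indicators <- model.matrix(~ kind - 1, data = df)
colnames(indicators) <- paste0("is_", levels(df$kind))

# Response plus indicator columns, usable by algorithms that need numeric input.
df_binary <- cbind(df["y"], indicators)
print(df_binary)
```

Keeping one column per level is redundant for models with an intercept, so dropping one of the indicator columns may be appropriate there.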

[R] missing values in party::ctree

2011-02-17 Thread Andrew Ziem
After ctree builds a tree, how would I determine the direction missing values follow by examining the BinaryTree-class object? For instance in the example below Bare.nuclei has 16 missing values and is used for the first split, but the missing values are not listed in either set of factors.

[R] caret::train() and ctree()

2011-02-16 Thread Andrew Ziem
Just as earth can be tuned simultaneously for degree and nprune, is there a way to tune ctree simultaneously for mincriterion and maxdepth? Also, I notice there are separate methods ctree and ctree2, and if I try to tune both options with one method, the summary averages the option it

Re: [R] preparing data for barplot()

2009-03-05 Thread Andrew Ziem
Hello Petr, Thank you. That works beautifully. I searched for a way to transpose a data frame, but you are right: barplot() wants a matrix. Andrew On Wed, Mar 4, 2009 at 1:49 AM, Petr PIKAL petr.pi...@precheza.cz wrote: Read what barplot does and look to your plot. If you want each row to

[R] preparing data for barplot()

2009-03-03 Thread Andrew Ziem
What is the best way to produce a barplot from my data? I would like the barplot to show each person with the values stacked val1+val2+val3, so there is one bar for each person. When I use barplot(data.matrix(realdata)), it shows one bar for each value instead. To post here, I created an
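A minimal sketch of one way to get one stacked bar per person, assuming the data frame has one row per person and columns val1, val2, val3. The values and row names below are invented for illustration and are not the poster's data.

```r
# Hypothetical stand-in for the poster's data: one row per person.
realdata <- data.frame(val1 = c(1, 2, 3),
                       val2 = c(2, 2, 1),
                       val3 = c(4, 1, 2),
                       row.names = c("ann", "bob", "cat"))

# barplot() stacks the rows of a matrix within each column,
# so transpose to get one column (one bar) per person.
m <- t(as.matrix(realdata))

barplot(m, legend.text = rownames(m), xlab = "person", ylab = "total")
```

The transpose is the key step: without it, each value column becomes a bar and the people are stacked inside it.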

[R] How to join many records against SQL database

2009-02-28 Thread Andrew Ziem
This is a working example of how to merge records with a SQL database given the constraints: 1. The database is too large to pull all the records 2. The database permissions don't allow creating a table for temporarily storing identifiers 3. The R database driver doesn't allow creating temporary
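The post's own solution is cut off in this listing. One common technique under constraints like these is to send the identifiers in batches inside IN (...) lists. The sketch below only builds the query strings; the table name `person`, the column `person_id`, and the batch size are assumptions made for illustration.

```r
# Hypothetical identifiers to look up, and an assumed batch size.
ids <- 1:25
batch_size <- 10

# Split the identifiers into batches small enough for one IN (...) list.
batches <- split(ids, ceiling(seq_along(ids) / batch_size))

# Build one SELECT per batch (table and column names are hypothetical).
queries <- vapply(batches, function(b) {
  sprintf("SELECT * FROM person WHERE person_id IN (%s)",
          paste(b, collapse = ", "))
}, character(1))

# With an open DBI connection `con`, the pieces could then be combined with:
# result <- do.call(rbind, lapply(queries, function(q) dbGetQuery(con, q)))
```

This works as written only for numeric identifiers; character identifiers would need to be quoted and escaped before being pasted into the query.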

[R] How to create temporary table in MySQL

2009-02-28 Thread Andrew Ziem
Creating a temp table isn't completely intuitive with MySQL 5 and R 2.8.1, but it can be done. library(RMySQL) Loading required package: DBI con <- dbConnect(dbDriver("MySQL"), dbname = "foo", user = "me", password = "secret") x <- data.frame(1:10) colnames(x) <- c("x") dbWriteTable(con, "#x", x,

[R] Temporary tables with Microsoft SQL?

2009-02-28 Thread Andrew Ziem
I can create a temp table with MySQL and R DBI[1], but I don't see how to do the same with Microsoft SQL Server 2005 and RODBC. R 2.8.1 creates the table, but then it can never see it. I'm looking to avoid replacing the convenience functions like sqlSave(). [1]

[R] Optimize for loop / find last record for each person

2009-02-27 Thread Andrew Ziem
I want to find the last record for each person_id in a data frame (from a SQL database) ordered by date. Is there a better way than this for loop? for (i in 2:length(history[,1])) { if (history[i, "person_id"] == history[i - 1, "person_id"]) history[i, "order"] = history[i - 1, "order"] + 1 #
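A loop-free sketch of the same task in base R, assuming a data frame with person_id and date columns. The sample rows are invented for illustration, and this is one possible approach rather than the answer given in the thread.

```r
# Hypothetical stand-in for the query result.
history <- data.frame(
  person_id = c(1, 1, 2, 2, 2, 3),
  date = as.Date(c("2009-01-05", "2009-02-01", "2009-01-10",
                   "2009-01-20", "2009-02-15", "2009-01-07"))
)

# Sort by person, then date, so that each person's latest row comes last.
history <- history[order(history$person_id, history$date), ]

# duplicated(..., fromLast = TRUE) flags every row except the last per person.
last_records <- history[!duplicated(history$person_id, fromLast = TRUE), ]

# A per-person running counter, similar in spirit to the loop's "order" column.
history$order <- ave(seq_along(history$person_id), history$person_id,
                     FUN = seq_along)
```

Both steps are vectorized, so they avoid the row-by-row indexing that makes the loop slow on large data frames.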

Re: [R] Optimize for loop / find last record for each person

2009-02-27 Thread Andrew Ziem
On Fri, Feb 27, 2009 at 2:10 PM, William Dunlap wdun...@tibco.com wrote: Andrew, it makes it easier to help if you supply a typical input and expected output along with your code.  I tried your code with the following input: I'll be careful to avoid these mistakes. Also, I should not have