Re: [R] Populating then sorting a matrix and/or data.frame

2010-11-11 Thread Noah Silverman
Still doesn't work. When using rbind to build the data.frame, I get a structure mostly full of NA. The data is correct, so something about pushing into the data.frame is breaking. Example code: results <- data.frame() for(i in 1:n){ #do all the work #a is a test label. b,c,d are

Re: [R] Populating then sorting a matrix and/or data.frame

2010-11-11 Thread Noah Silverman
Langfelder Sent: Thursday, November 11, 2010 12:25 PM To: Noah Silverman Cc: r-help@r-project.org Subject: Re: [R] Populating then sorting a matrix and/or data.frame On Thu, Nov 11, 2010 at 11:33 AM, Noah Silverman n...@smartmediacorp.com wrote: Still doesn't work. When using rbind to build

Re: [R] Populating then sorting a matrix and/or data.frame

2010-11-11 Thread Noah Silverman
That makes perfect sense. All of my numbers are being coerced into strings by the c() function. Subsequently, my data.frame contains all strings. I can't know the length of the data.frame ahead of time, so can't predefine it like your example. One thought would be to make it arbitrarily long
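
The coercion described above can be reproduced in a few lines. This is only a sketch under assumptions: the column names a, b, c, d follow the earlier snippet, while the loop body and values are invented for illustration. Collecting one-row data frames in a list avoids needing the final length in advance.

```r
# c() builds an atomic vector, so one string forces every element to character
row <- c("test1", 1.5, 2.5, 3.5)
class(row)

# Collect rows in a list (length need not be known up front), bind once at the end
rows <- list()
for (i in 1:5) {
  rows[[length(rows) + 1]] <- data.frame(
    a = paste0("test", i),            # text label (hypothetical values)
    b = i * 1.5, c = i * 2, d = i / 2,
    stringsAsFactors = FALSE
  )
}
results <- do.call(rbind, rows)

# Numeric columns stay numeric, so sorting behaves as expected
results <- results[order(results$b, decreasing = TRUE), ]
```

Binding once at the end is usually cheaper than calling rbind inside the loop.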

Re: [R] Populating then sorting a matrix and/or data.frame

2010-11-11 Thread Noah Silverman
David, Great solution. While a bit longer to enter, it lets me explicitly define a type for each column. Thanks!!! -N On 11/11/10 4:02 PM, David Winsemius wrote: On Nov 11, 2010, at 6:38 PM, Noah Silverman wrote: That makes perfect sense. All of my numbers are being coerced

[R] Populating then sorting a matrix and/or data.frame

2010-11-10 Thread Noah Silverman
Hi, I have a process in R that produces a lot of output. My plan was to build up a matrix or data.frame row by row, so that I'll have a nice object with all the resulting data. I started with: results <- matrix(ncol=3) names(results) <- c("one", "two", "three") Then, when looping through the data:

Re: [R] Populating then sorting a matrix and/or data.frame

2010-11-10 Thread Noah Silverman
That was a typo. It should have read: results[results$one 100,] It does still fail. There is ONE column that is text. So my guess is that R is seeing that and assuming that the entire data.frame should be factors. -N On 11/10/10 11:16 PM, Michael Bedward wrote: Hello Noah, If you set

[R] Regular Expressions

2010-11-05 Thread Noah Silverman
Hi, I'm trying to figure out how to use capturing parentheses in regular expressions in R. (Doing this in Perl, Java, etc. is fairly trivial, but I can't seem to find the functionality in R.) For example, given the string: 10 Nov 13.00 (PFE1020K13) I want to capture the first two digits
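
One possible way to get capture groups in base R, sketched against the example string from the question. The full pattern is an assumption about the layout of the string (day, month, number, code in parentheses), not something stated in the thread.

```r
s <- "10 Nov 13.00 (PFE1020K13)"

# sub() with a backreference keeps only the first captured group
first_two <- sub("^([0-9]{2}).*$", "\\1", s)

# regexec() + regmatches() return the full match followed by each group
pattern <- "^([0-9]{2}) ([A-Za-z]+) ([0-9.]+) \\(([A-Z0-9]+)\\)$"
m <- regmatches(s, regexec(pattern, s))[[1]]
day  <- m[2]
code <- m[5]
```

Element 1 of m is the whole match, so the groups start at index 2.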

Re: [R] Regular Expressions

2010-11-05 Thread Noah Silverman
: On Thu, 4 Nov 2010, Noah Silverman wrote: Hi, I'm trying to figure out how to use capturing parentheses in regular expressions in R. (Doing this in Perl, Java, etc. is fairly trivial, but I can't seem to find the functionality in R.) For example, given the string: 10 Nov 13.00

Re: [R] accessing return variables from a function

2010-07-10 Thread Noah Silverman
Thanks! -N On 7/9/10 2:20 AM, Noah Silverman wrote: Hi, I am trying to figure out a short way to access two values output from the sort function. x <- c(3,4,3,6,78,3,1,2) sort(x, index.return=T) $x [1] 1 2 3 3 3 4 6 78 $ix [1] 7 8 1 3 6 2 4 5 It would be great

[R] accessing return variables from a function

2010-07-09 Thread Noah Silverman
Hi, I am trying to figure out a short way to access two values output from the sort function. x <- c(3,4,3,6,78,3,1,2) sort(x, index.return=T) $x [1] 1 2 3 3 3 4 6 78 $ix [1] 7 8 1 3 6 2 4 5 It would be great to do something like this (doesn't work.): c(y, indexes) <- sort(x,
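
R has no built-in multiple assignment of the c(y, indexes) form, so a minimal sketch is to keep the returned list and pull out its two components by name. The variable names y and indexes come from the question; res is an illustrative name.

```r
x <- c(3, 4, 3, 6, 78, 3, 1, 2)

# sort() with index.return = TRUE returns a list with components $x and $ix
res <- sort(x, index.return = TRUE)
y       <- res$x    # sorted values
indexes <- res$ix   # positions of those values in the original vector
```

If only the permutation is needed, order(x) gives it directly.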

Re: [R] interpretation of svm models with the e1071 package

2010-07-09 Thread Noah Silverman
Steve, Couldn't he also just use the decision.value property to see the equivalent of t(x) %*% b for each row? -N On 7/9/10 7:11 PM, Steve Lianoglou wrote: Hi, On Fri, Jul 9, 2010 at 12:15 PM, manuel.martin manuel.mar...@orleans.inra.fr wrote: Dear all, after having calibrated a svm

[R] Factoring a variable

2010-06-17 Thread Noah Silverman
Hi, I have a dataset where the results are coded (yes, no). We want to do some machine learning with SVM to predict the yes outcome. My problem is that if I just use the as.factor function to convert, then it reverses the levels. -- x <- c("no", "no", "no", "yes", "yes", "no", "no")
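
A short sketch of controlling level order, assuming the goal is simply to have "yes" come first (the snippet is cut off before the desired result is shown). as.factor sorts levels alphabetically, whereas factor() and relevel() let the order be set explicitly.

```r
x <- c("no", "no", "no", "yes", "yes", "no", "no")

f_default <- as.factor(x)                   # levels sorted alphabetically
f_yes     <- factor(x, levels = c("yes", "no"))   # explicit level order
f_relevel <- relevel(f_default, ref = "yes")      # move "yes" to the front

levels(f_default)
levels(f_yes)
as.integer(f_yes)   # integer codes follow the chosen level order
```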

[R] Sim function

2010-06-12 Thread Noah Silverman
I'm reading Gelman's book Data Analysis Using Regression and Multilevel/Hierarchical Models. In Chapter 7 (and later), he makes frequent reference to a function named sim. I can't find the function anywhere, not in my standard R install, or in any of the packages. Does anyone have a

[R] Decision values from KSVM

2010-06-11 Thread Noah Silverman
Hi, I'm working on a project using the kernlab library. For one phase, I want the decision values from the SVM prediction, not the class label. the e1071 library has this function, but I can't find the equivalent in ksvm. In general, when an SVM is used for classification, the label of an

[R] Plot multiple columns

2010-06-01 Thread Noah Silverman
I'm running a long MCMC chain that is generating samples for 22 variables. I have each run of the chain as a row in a matrix. So: Chain[,1] is the column with all the samples for variable one. Chain[,2] is the column with all the samples for variable 2, etc. I'd like to fit all 22 on a single

Re: [R] Plot multiple columns

2010-06-01 Thread Noah Silverman
Hi, I used the term run, as each iteration of the Gibbs sampler produces 22 variables (coefficients for Beta in a regression model) The example won't work On 6/1/10 5:54 AM, Ben Bolker wrote: Noah Silverman noah at smartmediacorp.com writes: I'm running a long MCMC chain

Re: [R] Fancy Page layout

2010-06-01 Thread Noah Silverman
AM, Noah Silverman wrote: Hi, Working on a report that is going to have a large number of graphs and summaries. We have 80 groups with 20 variables each. Ideally, I'd like to produce ONE page for each group. It would have two columns of 10 graphs and then the 5 number summary

[R] textbox in lattice

2010-06-01 Thread Noah Silverman
Hi, I want to add a box at the bottom of a lattice window (device/page?). Lattice has drawn a nice group of panels with all the plots I need. How do I add my own summary text at the bottom (several lines worth?) __ R-help@r-project.org mailing list

Re: [R] Plot multiple columns

2010-06-01 Thread Noah Silverman
On Tue, Jun 1, 2010 at 9:37 AM, Noah Silverman n...@smartmediacorp.com mailto:n...@smartmediacorp.com wrote: Hi, I used the term run, as each iteration of the Gibbs sampler produces 22 variables (coefficients for Beta in a regression model) The example won't work

Re: [R] Plot multiple columns

2010-06-01 Thread Noah Silverman
, 2010 at 10:51 AM, Noah Silverman n...@smartmediacorp.com mailto:n...@smartmediacorp.com wrote: You are correct, I initially missed the as.mcmc step. Without it, R doesn't want to squeeze so many plots onto a page. I've had that problem before with lattice...which makes me wonder

Re: [R] textbox in lattice

2010-06-01 Thread Noah Silverman
::splitTextGrob function might help. HTH, baptiste On 1 June 2010 19:37, Noah Silverman n...@smartmediacorp.com wrote: Hi, I want to add a box at the bottom of a lattice window (device/page?). Lattice has drawn a nice group of panels with all the plots I need. How do I add my own summary text

Re: [R] textbox in lattice

2010-06-01 Thread Noah Silverman
x <- matrix(runif(2200),ncol=22) m <- as.mcmc(x) p = xyplot(m, layout = c(2, 11)) pdf(,height=15) arrange(p, tableGrob(as.matrix(summary(iris)), theme=theme.white()), heights= unit(c(3,1),null)) dev.off() HTH, baptiste On 1 June 2010 20:52, Noah Silverman n...@smartmediacorp.com wrote

[R] Comparing multiple columns in matrix

2010-05-31 Thread Noah Silverman
We're running Monte Carlo repeated measures for several groups. The goal is to determine the number of times each group has the highest score. A toy example: [,1] [,2] [,3] 0.1 0.2 0.3 0.1 0.2 0.3 0.1 0.2 0.3 0.1 0.3 0.2 0.1 0.3 0.2 0.2 0.3 0.1 For this example:
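
A sketch of one way to count winners per column, assuming the toy example is a 6 x 3 matrix read row by row (the layout is flattened in the snippet). which.max picks the first column on ties, which may or may not be the desired tie rule.

```r
m <- matrix(c(0.1, 0.2, 0.3,
              0.1, 0.2, 0.3,
              0.1, 0.2, 0.3,
              0.1, 0.3, 0.2,
              0.1, 0.3, 0.2,
              0.2, 0.3, 0.1),
            ncol = 3, byrow = TRUE)

# Column index of the highest score in each row
winner <- apply(m, 1, which.max)

# Number of wins per column, including columns that never win
wins <- tabulate(winner, nbins = ncol(m))
```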

[R] Fancy Page layout

2010-05-31 Thread Noah Silverman
Hi, Working on a report that is going to have a large number of graphs and summaries. We have 80 groups with 20 variables each. Ideally, I'd like to produce ONE page for each group. It would have two columns of 10 graphs and then the 5 number summary of the variables at the bottom. So, perhaps

Re: [R] Fancy Page layout

2010-05-31 Thread Noah Silverman
Lattice looks nice, but how can I put some summary text at the bottom? On 5/31/10 11:27 AM, RICHARD M. HEIBERGER wrote: Use lattice. require(lattice) ?lattice ?xyplot __ R-help@r-project.org mailing list

[R] Building a list

2010-05-30 Thread Noah Silverman
Hello, I need to build a list of lists. We have 20 groups we are generating MCMC samples for. There are 10 coefficients, and 1 MCMC iterations. I would like to store each iteration by-group in a list. My problem is with the first iteration. Here is a toy example: Chain <- list() for (j in
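
A minimal sketch of a preallocated list of lists, which sidesteps the special case of the first iteration. The sizes are shrunk for illustration, and runif stands in for whatever the sampler actually produces; only the name Chain comes from the question.

```r
set.seed(1)
n_groups <- 3   # illustrative sizes, not the real ones
n_iter   <- 4
n_coef   <- 2

# One slot per group, each holding one slot per iteration
Chain <- vector("list", n_groups)
for (j in seq_len(n_groups)) {
  Chain[[j]] <- vector("list", n_iter)
}

for (i in seq_len(n_iter)) {
  for (j in seq_len(n_groups)) {
    coef_draw <- runif(n_coef)      # placeholder for one MCMC draw
    Chain[[j]][[i]] <- coef_draw
  }
}
```

Chain[[2]][[3]] then holds the coefficient vector for group 2 at iteration 3.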

Re: [R] Building a list

2010-05-30 Thread Noah Silverman
]], coef) If it does, this has the additional advantage that it tends to be faster to initialize the list at size rather than expanding it as needed. HTH, Josh On Sun, May 30, 2010 at 2:52 PM, Noah Silverman n...@smartmediacorp.com wrote: Hello, I need to build a list of lists We

Re: [R] Building a list

2010-05-30 Thread Noah Silverman
) } } On Sun, May 30, 2010 at 6:05 PM, Noah Silverman n...@smartmediacorp.com wrote: That would be great, except I just realized I made a typo when sending my code. I'm tracking 20 coefficients for 10 groups. So I need a top list of 10 groups. Then each of the 10,000 samples for each of the 20

Re: [R] Discretize factors?

2010-05-16 Thread Noah Silverman
to lm, but not actually doing any regression? Thanks again! -N On 5/15/10 11:17 AM, Thomas Stewart wrote: Maybe this? group <- factor(c("A", "B", "B", "C", "C", "C")) model.matrix(~0+group) -tgs On Sat, May 15, 2010 at 2:02 PM, Noah Silverman n...@smartmediacorp.com mailto:n...@smartmediacorp.com wrote

Re: [R] Discretize factors?

2010-05-16 Thread Noah Silverman
1 0 4 4 3 0 0 1 5 5 4 0 0 1 6 6 5 0 0 1 Any ideas? -N On 5/15/10 11:02 AM, Noah Silverman wrote: Hi, I'm looking for an easy way to discretize factors in R

Re: [R] Discretize factors?

2010-05-16 Thread Noah Silverman
I could, but with close to 100 columns, it's messy. On 5/16/10 11:22 AM, Peter Ehlers wrote: On 2010-05-16 11:06, Noah Silverman wrote: Update, I have it working, but now it's producing really ugly labels. Must be a small adjustment to the code. Any ideas?? ##Create example data.frame

[R] Discretize factors?

2010-05-15 Thread Noah Silverman
Hi, I'm looking for an easy way to discretize factors in R. I've noticed that the lm function does this automatically with a nice result. If I have group <- c("A", "B", "B", "C", "C", "C") and run: lm(result ~ x1 + group) The lm function has split the group into separate binary variables {0,1} before
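
The indicator columns that lm builds internally can be produced directly with model.matrix, as suggested later in the thread. A small sketch using the same group vector:

```r
group <- factor(c("A", "B", "B", "C", "C", "C"))

# ~ 0 + group drops the intercept so every level gets its own 0/1 column
dummies <- model.matrix(~ 0 + group)

colnames(dummies)   # column names are the variable name pasted to the level
colSums(dummies)    # number of observations in each level
```

With the default ~ group formula, the first level is absorbed into the intercept instead.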

[R] Summarizing counts by multiple factors

2010-05-11 Thread Noah Silverman
Hi, An example data set is: group level color A 1 blue A 1 Red B 1 blue B 2 Red A 2 Red B 2 Red B 2 blue B 2 blue A 2 blue A 2 Red I'd like to
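
The question is cut off, so the following is only a guess at the intent: counting rows for each combination of the factors. The data frame is rebuilt from the example on the assumption that the columns are group, level, and color.

```r
df <- data.frame(
  group = c("A", "A", "B", "B", "A", "B", "B", "B", "A", "A"),
  level = c(1, 1, 1, 2, 2, 2, 2, 2, 2, 2),
  color = c("blue", "Red", "blue", "Red", "Red", "Red", "blue", "blue", "blue", "Red")
)

# Three-way contingency table, then the same counts in long format
counts3 <- table(df$group, df$level, df$color)
long    <- as.data.frame(table(df))

# Two-way summary: group by color
by_color <- xtabs(~ group + color, data = df)
```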

[R] Results from clogit out of range?

2010-04-20 Thread Noah Silverman
Hi, I'm calculating a conditional logit on some data stratified by group. My understanding was that a conditional logit by definition returns a value between 0 and 1 as a probability. Can anyone suggest why I'm seeing results outside of the {0,1} range?? The call in R is: m <- clogit(score ~

Re: [R] Results from clogit out of range?

2010-04-20 Thread Noah Silverman
Thanks David, That explains a lot. I appreciate it. -- Noah On 4/20/10 3:48 PM, David Winsemius wrote: On Apr 20, 2010, at 5:59 PM, Noah Silverman wrote: Hi, I'm calculating a conditional logit on some data stratified by group. My understanding was that a conditional logit

Re: [R] Results from clogit out of range?

2010-04-20 Thread Noah Silverman
On 4/20/10 4:22 PM, Noah Silverman wrote: Thanks David, That explains a lot. I appreciate it. -- Noah On 4/20/10 3:48 PM, David Winsemius wrote: On Apr 20, 2010, at 5:59 PM, Noah Silverman wrote: Hi, I'm calculating a conditional logit on some data stratified by group. My

Re: [R] Binning Question

2010-04-12 Thread Noah Silverman
David, That helps me a lot. Thanks!!! -N On 4/12/10 9:06 PM, David Winsemius wrote: dat <- as.data.frame(matrix( rnorm(200), 100 , 2)) # bivariate normal n=100 ab <- matrix( c(-5,-5,5,5), 2, 2) # interval [-5,5) x [-5,5) nbin <- c( 20, 20) # 400 bins bins <- bin2(dat, ab, nbin) # bin

[R] Binning Question

2010-04-12 Thread Noah Silverman
Hi, I'm trying to set up some complicated binning with statistics and could use a little help. I've found the bin2 function from the ash package, but it doesn't do everything I need. My intention is to copy some of their code and then modify as needed. I have a vector of two columns:

[R] Manually calculate SVM

2010-03-25 Thread Noah Silverman
Hi, I'm learning more about SVMs and kernels in general. I've gotten used to using the svm function in the e1071 package. It works great. Now, I want to do/learn some more interesting stuff. (Perhaps my own kernel and/or scoring system). So I want to better understand 1) how calculation of

Re: [R] Manually calculate SVM

2010-03-25 Thread Noah Silverman
Thanks Steve, 1) I get that the kernel is a normal function. But my understanding was that the kernel created a higher dimensional space than the original data, thus allowing the SVM to be a pseudo-linear classifier in that higher dimension. So, if the kernel is the dot_product do I iterate

[R] Factor variables with GAM models

2010-03-19 Thread Noah Silverman
I'm just starting to learn about GAM models. When using the lm function in R, any factors I have in my data set are automatically converted into a series of binomial variables. For example, if I have a data.frame with a column named color and values red, green, blue. The lm function

Re: [R] Factor variables with GAM models

2010-03-19 Thread Noah Silverman
-project.org] On Behalf Of Noah Silverman [n...@smartmediacorp.com] Sent: March 19, 2010 12:54 PM To: r-help@r-project.org Subject: [R] Factor variables with GAM models I'm just starting to learn about GAM models. When using the lm function in R, any factors I have in my data set are automatically

Re: [R] logistic regression by group?

2010-03-04 Thread Noah Silverman
Corey, Thanks for the quick reply. I can't give any sample code as I don't know how to code this in R. That's why I tried to pass along some pseudo code. I'm looking for the best beta that maximizes likelihood over all the groups. So, while your suggestion is close, it isn't quite what I need.

[R] logistic regression by group?

2010-03-03 Thread Noah Silverman
Hi, Looking for a function in R that can help me calculate a parameter that maximizes the likelihood over groups of observations. The general formula is: p = exp(xb) / sum(exp(xb)) So, according to the formulas I've seen published, to do this by group is product(p = exp(x_i * b_i) /
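
The per-group probability p = exp(xb) / sum(exp(xb)) can be computed without loops. This sketch only evaluates the formula for a given linear predictor; it does not fit b, and the numbers and group labels are made up for illustration.

```r
xb  <- c(0.2, 1.0, -0.5, 0.3, 0.3)   # hypothetical linear predictors x %*% b
grp <- c(1, 1, 1, 2, 2)              # hypothetical group labels

# Within each group, exponentiate and divide by the group total
p <- ave(xb, grp, FUN = function(v) exp(v) / sum(exp(v)))

# Probabilities sum to one inside every group
group_totals <- tapply(p, grp, sum)
```

A grouped log-likelihood would then sum log(p) over the observed winner in each group.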

[R] Strange behavior with poisosn and glm

2010-03-02 Thread Noah Silverman
Hi, I'm just learning about Poisson links for the glm function. One of the data sets I'm playing with has several of the variables as factors (i.e. month, group, etc.) When I call the glm function with a formula that has a factor variable, R automatically converts the variable to a series of

Re: [R] Strange behavior with poisosn and glm

2010-03-02 Thread Noah Silverman
should be equal to the fitted value. Here it is not. I don't understand why. Any insight? -N On 3/2/10 12:47 AM, (Ted Harding) wrote: On 02-Mar-10 08:02:27, Noah Silverman wrote: Hi, I'm just learning about Poisson links for the glm function. One of the data sets I'm playing with has

[R] lapply with data frame

2010-02-28 Thread Noah Silverman
I'm a bit confused on how to use lapply with a data.frame. For example: lapply(data, function(x) print(x)) WHAT exactly is passed to the function? Is it each ROW in the data frame, one by one, or each column, or the entire frame in one shot? What I want to do is apply a function to each row
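
A data frame is a list of columns, so lapply hands the function one column at a time. The sketch below contrasts that with two common ways of working row by row; the data frame is invented for illustration.

```r
df <- data.frame(a = 1:3, b = c(10, 20, 30))

# lapply over a data frame: the function is called once per COLUMN
col_sums <- lapply(df, sum)

# apply with MARGIN = 1: called once per ROW (df is converted to a matrix first)
row_sums <- apply(df, 1, sum)

# Keeping each row as a one-row data frame, useful when column types differ
row_list <- lapply(seq_len(nrow(df)), function(i) df[i, ])
```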

[R] lapply with data frame

2010-02-27 Thread Noah Silverman
I'm a bit confused on how to use lapply with a data.frame. For example: lapply(data, function(x) print(x)) WHAT exactly is passed to the function? Is it each ROW in the data frame, one by one, or each column, or the entire frame in one shot? What I want to do is apply a function to each row

[R] Avoiding for loops

2009-11-02 Thread Noah Silverman
Hi, I'm trying to normalize some data. My data is organized by groups. I want to normalize PER GROUP as opposed to over the entire data set. The current double loop that I'm using takes almost an hour to run on about 30,000 rows of data in 2,500 groups. I'm currently doing this:

Re: [R] Avoiding for loops

2009-11-02 Thread Noah Silverman
/ data$sum data .. or even transform(data, norm=ave(y, group, FUN = function(x) x/sum(x))) I hope it helps. Best, Dimitris Noah Silverman wrote: Hi, I'm trying to normalize some data. My data is organized by groups. I want to normalize PER GROUP as opposed to over
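
The ave approach quoted above replaces the double loop with one vectorized call. A small self-contained sketch with invented data, using the column names y and group from the reply:

```r
data <- data.frame(
  group = c("g1", "g1", "g2", "g2", "g2"),
  y     = c(1, 3, 2, 2, 4)
)

# ave applies FUN within each group and returns a vector aligned with the rows
data$norm <- ave(data$y, data$group, FUN = function(x) x / sum(x))

# Check: normalized values add up to one in every group
totals <- tapply(data$norm, data$group, sum)
```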

[R] Stratified Maximum Likelihood

2009-10-30 Thread Noah Silverman
Hi, I've searched rseek.org high and low and can't seem to find an answer to this. I want to maximize likelihood for a set of training data, but the data is grouped. (Think multiple trials.) It would probably be possible to do this with some nested for loops manually, but would be painfully

[R] Data format for KSVM

2009-10-23 Thread Noah Silverman
Hi, I have a process using svm from the e1071 library. It works. I want to try using the KSVM library instead. The same data used with e1071 gives me an error with KSVM. My data is a data.frame. sample code: svm_formula <- formula(y ~ a + B + C) svm_model <- ksvm(formula, data=train_data,

Re: [R] RMySql problem

2009-10-23 Thread Noah Silverman
Hi, It looks like you are potentially dealing with two separate issues. 1) Access - MySQL has very fine-grained permissions as to who can access what and from where. You need to make sure that your username in MySQL is allowed to access the database/tables from your location. 2) Corruption

[R] Best SVM Performance measure?

2009-10-20 Thread Noah Silverman
Hi, This is probably going to be one of those "It depends what you want" kinds of answers, but I'm very curious to see if the group has an opinion or some general suggestions. The actual experiment is too complicated for a quick e-mail, but I'll summarize well enough (hopefully) to get the

[R] Best SVM Performance measure?

2009-10-19 Thread Noah Silverman
Hi, This is probably going to be one of those "It depends what you want" kinds of answers, but I'm very curious to see if the group has an opinion or some general suggestions. The actual experiment is too complicated for a quick e-mail, but I'll summarize well enough (hopefully) to get the

[R] Converting dataframe to matrix

2009-10-16 Thread Noah Silverman
Hi, I'm experimenting with a few learners that require a matrix as their input. (Currently svmpath, vbmp, etc.) I currently have a dataframe with 50 columns and 20,000 rows. I tried using: x <- as.matrix(my_data.frame) If I then ask is.matrix(x), I get TRUE. However everywhere I've tried

Re: [R] Converting dataframe to matrix

2009-10-16 Thread Noah Silverman
: On Fri, Oct 16, 2009 at 01:33:14AM -0700, Noah Silverman wrote: Hi, I'm experimenting with a few learners that require a matrix as their input. (Currently svmpath, vbmp, etc.) I currently have a dataframe with 50 columns and 20,000 rows. I tried using: x <- as.matrix(my_data.frame) If I

[R] Different way of scaling data

2009-10-16 Thread Noah Silverman
Hi, I have a data.frame that I need to scale. I've been using the scale function and it works nicely. Some of the libraries I'm testing won't accept negative values for data, so I need to find a way to scale the data from 0 to 1. Any ideas? Thanks!
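
One common option is min-max rescaling, sketched below. rescale01 is an illustrative helper name, not a base R function, and the data frame is made up. A constant column would divide by zero, so it would need special handling.

```r
# Map a numeric vector linearly onto [0, 1]
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

df <- data.frame(a = c(-2, 0, 2), b = c(10, 15, 30))

# Apply the helper to every column and keep the data.frame structure
scaled <- as.data.frame(lapply(df, rescale01))
```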

[R] Possible command line bug.

2009-10-13 Thread Noah Silverman
I think I've come across a bug in the command line switches. From R --help: --vanilla Combine --no-save, --no-restore, --no-site-file, --no-init-file and --no-environ --slave Make R run as quietly as possible -q, --quiet

[R] Pull Coefficients from MCMCpack models

2009-09-21 Thread Noah Silverman
Hi, I've been testing some models with the MCMCpack library. I can run the process and get a nice model object. I can easily see the summary and even plot it. I can't seem to figure out how to: 1) Access the final coefficients in the model 2) Turn the coefficients into a model so I can then

Re: [R] Pull Coefficients from MCMCpack models

2009-09-21 Thread Noah Silverman
this: apply(foo, 2, mean) or apply(foo, 2, median) Thanks, Deb Noah Silverman n...@smartmediacorp.com 22/09/2009 12:34 pm Hi, I've been testing some models with the MCMCpack library. I can run the process and get a nice model object. I can easily see the summary and even plot it. I can't
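
The apply calls in the reply above summarize each column of posterior draws. In this sketch foo is a simulated stand-in matrix (one column per coefficient, one row per draw), not real MCMCpack output; the column names and values are invented.

```r
set.seed(1)
# Stand-in for sampler output: 1000 draws of two coefficients
foo <- cbind(intercept = rnorm(1000, mean = 2,  sd = 0.1),
             slope     = rnorm(1000, mean = -1, sd = 0.1))

post_mean   <- apply(foo, 2, mean)     # posterior mean per coefficient
post_median <- apply(foo, 2, median)   # posterior median per coefficient

# Point prediction for a new x using the posterior means
new_x <- 3
pred  <- post_mean[["intercept"]] + post_mean[["slope"]] * new_x
```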

[R] Grouped Logistic (Or conditional Logistic.)

2009-09-17 Thread Noah Silverman
Hi, I'm not sure of the correct nomenclature or function for what I'm trying to do. I'm interested in calculating a logistic regression on a binary dependent variable (True,False). There are a few ways to easily do this in R. Both SVM and GLM work easily. The part that I want to add is

Re: [R] Grouped Logistic (Or conditional Logistic.)

2009-09-17 Thread Noah Silverman
have to be adjusted to look per group. I would call this something like grouped maximum likelihood if I got to make up the name. -N On 9/17/09 11:06 AM, (Ted Harding) wrote: On 17-Sep-09 17:28:16, Noah Silverman wrote: Hi, I'm not sure of the correct nomenclature or function for what

[R] Strange question/result about SVM

2009-09-14 Thread Noah Silverman
Hello, I have a very unusual situation with an SVM and wanted to get the group's opinion. We developed an experiment where we train the SVM with one set of data (train data) and then test with a completely independent set of data (test data). The results were VERY good. I found an error

Re: [R] Strange question/result about SVM

2009-09-14 Thread Noah Silverman
...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Noah Silverman Sent: Monday, September 14, 2009 1:00 PM To: r help Subject: [R] Strange question/result about SVM Hello, I have a very unusual situation with an SVM and wanted to get the group's opinion. We developed an experiment

Re: [R] Moving to Mac OS X

2009-09-11 Thread Noah Silverman
Hi, I'm a daily user of both Mac and Linux so wanted to offer some thoughts: 1) R runs great on a Mac. There is a standard install from the CRAN website that has a nice GUI built into it. You can do things like drag files to the console and it will fill in the path name. 2) I like using

Re: [R] Moving to Mac OS X

2009-09-11 Thread Noah Silverman
Steve, You make a good point. I confused 64 bit with a multi-core setup. That said, I don't believe the pretty packaged up GUI has a 64 bit version, just the raw terminal version does. On 9/11/09 12:38 PM, Steve Lianoglou wrote: Hi, On Sep 11, 2009, at 3:08 PM, Noah Silverman wrote: 3) I

Re: [R] Moving to Mac OS X

2009-09-11 Thread Noah Silverman
Thanks Steve, That's a big help. On 9/11/09 12:48 PM, Steve Lianoglou wrote: Hi, On Sep 11, 2009, at 3:40 PM, Noah Silverman wrote: Steve, You make a good point. I confused 64 bit with a multi-core setup. That said, I don't believe the pretty packaged up GUI has a 64 bit version, just

[R] R on Multi Core

2009-09-11 Thread Noah Silverman
Hi, Our discussions about 64 bit R have led me to another thought. I have a nice dual core 3.0 chip inside my Linux Box (Running Fedora 11.) Is there a version of R that would take advantage of BOTH cores?? (Watching my system performance meter now is interesting, Running R will hold a single

[R] Alternative to Scale Function?

2009-09-11 Thread Noah Silverman
Hi, Is there an alternative to the scale function where I can specify my own mean and standard deviation? I've come across an interesting issue where this would help. I'm training and testing on completely different sets of data. The testing set is smaller than the training set. Using

Re: [R] Alternative to Scale Function?

2009-09-11 Thread Noah Silverman
sure that a value is transformed the same regardless of which data set it is in. Do I have this correct, or can anybody contribute any more to the concept? Thanks! -- Noah On 9/11/09 1:10 PM, Noah Silverman wrote: Hi, Is there an alternative to the scale function where I can specify my own

Re: [R] Alternative to Scale Function?

2009-09-11 Thread Noah Silverman
Genius, That certainly is much faster than what I had worked out on my own. I looked at sweep, but couldn't understand the rather thin help page. Your example makes it really clear. Thank You!!! -- Noah On 9/11/09 1:57 PM, Gavin Simpson wrote: On Fri, 2009-09-11 at 13:10 -0700, Noah
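
A sketch of scaling a test set with the training set's mean and standard deviation, so a given value is transformed identically in both sets. The matrices are invented; the sweep form is one plausible reading of the approach mentioned above, since the reply itself is not shown.

```r
train <- matrix(c(1, 2, 3, 4, 10, 20, 30, 40), ncol = 2)
test  <- matrix(c(2, 3, 20, 50), ncol = 2)

# Statistics come from the training data only
mu   <- colMeans(train)
sdev <- apply(train, 2, sd)

# scale() accepts user-supplied centers and scales
test_scaled <- scale(test, center = mu, scale = sdev)

# Equivalent two-step version with sweep(): subtract means, divide by sds
test_swept <- sweep(sweep(test, 2, mu, "-"), 2, sdev, "/")
```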

[R] Confused - better empirical results with error in data

2009-09-07 Thread Noah Silverman
Hi, I have a strange one for the group. We have a system that predicts probabilities using a fairly standard svm (e1071). We are looking at probabilities of a binary outcome. The input data is generated by a Perl script that calculates a bunch of things, fetches data from a database, etc.

Re: [R] Confused - better empirical results with error in data

2009-09-07 Thread Noah Silverman
You both make good points. Ideally, it would be nice to know WHY it works. Without digging into too much verbiage, the system is designed to predict the outcome of certain events. The broken model predicts outcomes correctly much more frequently than one with the broken data withheld. So,

Re: [R] Confused - better empirical results with error in data

2009-09-07 Thread Noah Silverman
Interesting point. Our data is NOT continuous. Sure, some of the test examples are older than others, but there is no relationship between them. (More Markov like in behavior.) When creating a specific record, we actually account for this in our SQL queries which tend to be along the lines

Re: [R] [OT] book on Linux scripting

2009-09-03 Thread Noah Silverman
Erin, Linux supports many scripting languages. Which language are you interested in: Perl, PHP, Bash, Python, etc??? -- Noah On 9/2/09 10:35 PM, Erin Hodgess wrote: Dear R People: I know that this is off topic, but could anyone recommend a good book on Linux scripting please? Any help

Re: [R] [OT] book on Linux scripting

2009-09-03 Thread Noah Silverman
Hi, There are good books by O'Reilly for both sed and awk. That said, neither is what I would call a complete scripting language. Without knowing the details of your requirements, I STRONGLY suggest that you learn Perl. It allows you to do almost anything with data, fetch web pages, access

[R] Easy way to get top 2 items from vector

2009-09-03 Thread Noah Silverman
Hi, I use the max function often to find the top value from a matrix or column of a data.frame. Now I'm looking to find the top 2 (or three) values from my data. I know that I could sort the list and then access the first two items, but that seems like the long way. Is there some way to
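
Sorting is still the simplest route in base R, and it can be written compactly. A sketch with an invented vector, showing both the top values and their positions:

```r
x <- c(5, 12, 3, 9, 12, 1)

# Largest two values
top2 <- head(sort(x, decreasing = TRUE), 2)

# Positions of the largest two values in the original vector
top2_idx <- head(order(x, decreasing = TRUE), 2)
```

Changing the 2 to 3 gives the top three.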

Re: [R] Easy way to get top 2 items from vector

2009-09-03 Thread Noah Silverman
Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Thu, 3 Sep 2009, Noah Silverman wrote: Hi, I use the max function often to find the top value from a matrix or column of a data.frame. Now I'm looking to find the top 2 (or three) values

Re: [R] SVM coefficients

2009-08-31 Thread Noah Silverman
Steve, That doesn't work. I just trained an SVM with 80 variables. svm_model$coefs gives me a list of 10,000 items. My training set is 30,000 examples of 80 variables, so I have no idea what the 10,000 items represent. There should be some attribute that lists the weights for each of the

Re: [R] SVM coefficients

2009-08-31 Thread Noah Silverman
(significance.) the SVM assigned to each variable. On 8/31/09 12:54 AM, Achim Zeileis wrote: On Mon, 31 Aug 2009, Noah Silverman wrote: Steve, That doesn't work. I just trained an SVM with 80 variables. svm_model$coefs gives me a list of 10,000 items. My training set is 30,000 examples

[R] Probit function

2009-08-31 Thread Noah Silverman
Hello, I want to start testing using the MNP probit function instead of the lrm function in my current experiment. I have one dependent label and two independent variables. The lrm is simple: model <- lrm(label ~ val1 + val2) I tried the same thing with the mnp function and got an error

Re: [R] Probit function

2009-08-31 Thread Noah Silverman
was that they were using the same for this application. Any thoughts? -- Noah On 8/31/09 5:07 PM, Achim Zeileis wrote: On Mon, 31 Aug 2009, Noah Silverman wrote: Hello, I want to start testing using the MNP probit function instead of the lrm function in my current experiment. I have one

Re: [R] Probit function

2009-08-31 Thread Noah Silverman
I get that. Still trying to figure out what the multinomial labels they used were. That's why I passed on the reference to the seminar summary. On 8/31/09 5:40 PM, Achim Zeileis wrote: On Mon, 31 Aug 2009, Noah Silverman wrote: Thanks Achim, I discovered the Journal article just after

Re: [R] Probit function

2009-08-31 Thread Noah Silverman
multiple discrete choices, I'm not sure how the probit model would. Hence my inquiry. On 8/31/09 6:23 PM, Achim Zeileis wrote: On Mon, 31 Aug 2009, Noah Silverman wrote: I get that. Still trying to figure out what the multinomial labels they used were. That's why I passed on the reference

Re: [R] Probit function

2009-08-31 Thread Noah Silverman
rank, would you please show me where that is in their paper. Thanks! -N On 8/31/09 7:17 PM, Achim Zeileis wrote: On Mon, 31 Aug 2009, Noah Silverman wrote: Um. I did my research. Have been for years. I assume you're referring to Bolton and Chapman, A multinomial logit model

[R] Sapply

2009-08-30 Thread Noah Silverman
Hi, I need a bit of guidance with the sapply function. I've read the help page, but am still a bit unsure how to use it. I have a large data frame with about 100 columns and 30,000 rows. One of the columns is group of which there are about 2,000 distinct groups. I want to normalize (sum

[R] SVM coefficients

2009-08-30 Thread Noah Silverman
Hello, I'm using the svm function from the e1071 package. It works well and gives me nice results. I'm very curious to see the actual coefficients calculated for each input variable. (Other packages, like RapidMiner, show you this automatically.) I've tried looking at attributes for the

Re: [R] Submit a R job to a server

2009-08-27 Thread Noah Silverman
Deb, I generally run my larger R tasks on a server. Here is my workflow. 1) Write an R script using a text editor. (There are many popular ones.) 2) FTP the R script to your server. 3) SSH into the server 4) Run R 5) Run the script that you uploaded from the R process you just started. On

[R] Select top three values from data frame

2009-08-26 Thread Noah Silverman
Hi, I'm trying to find an easy way to do this. I want to select the top three values of a specific column in a subset of rows in a data.frame. I'll demonstrate. A B C x 2 1 x 4 1 x 3 2 y 1 5 y 2 6 y 3 8 I want the top 3 values of B from the
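
A sketch of subsetting first and then taking the three largest values of B. The data frame is rebuilt from the example on the assumption that the flattened table has columns A, B, C; the choice of the "x" subset is also an assumption, since the question is cut off.

```r
df <- data.frame(
  A = c("x", "x", "x", "y", "y", "y"),
  B = c(2, 4, 3, 1, 2, 3),
  C = c(1, 1, 2, 5, 6, 8)
)

# Keep the rows of interest, order by B descending, take the first three
xs   <- df[df$A == "x", ]
top3 <- head(xs[order(xs$B, decreasing = TRUE), ], 3)
```

head() copes with subsets that have fewer than three rows, and it keeps working when the subset has 20 to 100 rows.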

Re: [R] Select top three values from data frame

2009-08-26 Thread Noah Silverman
I only have a few values in my example, but the real data set might have 20-100 rows with A=X. So how do I pick just the three highest ones? -N On 8/26/09 2:46 AM, Ottorino-Luca Pantani wrote: df.mydata[df.mydata$A==X AND df.mydata$C 2, ] will do the job ? 8rino Noah Silverman ha

Re: [R] Select top three values from data frame

2009-08-26 Thread Noah Silverman
summary function head(df.mydata[df.mydata$A==X df.mydata$C 2, ],3) Colin. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Noah Silverman Sent: 26 August 2009 10:54 To: ottorino-luca.pant...@unifi.it Cc: r help Subject: Re: [R

[R] Managing output

2009-08-26 Thread Noah Silverman
Hi, Is there a way to build up a vector, item by item? In Perl, we can push an item onto an array. How can we do this in R? I have a loop that generates values as it goes. I want to end up with a vector of all the loop results. In Perl it would be: for(item in list){ result -
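
A sketch of three ways to collect loop results in R. The squaring step is a placeholder for whatever the real loop computes.

```r
items <- 1:5

# 1) Append with c(), the closest analogue to a Perl push
result <- c()
for (item in items) {
  result <- c(result, item^2)
}

# 2) Preallocate when the final length is known (avoids repeated copying)
result_pre <- numeric(length(items))
for (i in seq_along(items)) {
  result_pre[i] <- items[i]^2
}

# 3) Let sapply collect the values and skip the explicit loop
result_sapply <- sapply(items, function(item) item^2)
```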

Re: [R] Managing output

2009-08-26 Thread Noah Silverman
? In general, the for loop construct can be avoided so you don't have to think about messy indexing. What exactly are you trying to do? -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Noah Silverman Sent: Wednesday, August 26, 2009 2:20

Re: [R] Managing output

2009-08-26 Thread Noah Silverman
of Statistics UC Berkeley spec...@stat.berkeley.edu On Wed, 26 Aug 2009, Noah Silverman wrote: The actual process is REALLY complicated, I just gave a simple example for the list. I have a lot of steps to process the data before I get a final score

Re: [R] Clogit or LRM?

2009-08-25 Thread Noah Silverman
reference didn't help me much with that so if you know of others, please let me know. Thanks. Mark On Aug 25, 2009, *Noah Silverman* n...@smartmediacorp.com wrote: Hello I believe that I'm getting

Re: [R] Trying something for fun...

2009-08-22 Thread Noah Silverman
is the exact value of the strata saved as part of the model, or is it just used for grouping?) On 8/22/09 10:57 AM, Charles C. Berry wrote: On Fri, 21 Aug 2009, Noah Silverman wrote: Hi, For fun, I'm trying to throw some horse racing data into either an svm or lrm model. Curious to see what

Re: [R] Trying something for fun...

2009-08-22 Thread Noah Silverman
options. (I can see one that is a probability option.) Thanks!! -N On 8/22/09 10:57 AM, Charles C. Berry wrote: On Fri, 21 Aug 2009, Noah Silverman wrote: Hi, For fun, I'm trying to throw some horse racing data into either an svm or lrm model. Curious to see what comes out as there are so
