[R] which() with multiple conditions

2012-10-01 Thread pdb
I hope someone can point me in the right direction please.

I have a data frame with a column containing names.  I want to identify the
rows whose name appears in a list.

namestofind <- c('fred', 'bill')  # ...and so on; a long list

If I only wanted to identify a single name I would use

which(z$name == 'bill')

What syntax would I use to identify all the rows that contain any of the
names in namestofind?

Thanks in advance for the pointer
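
One way this might be done is with %in%, which tests each element of the column for membership in the vector. A minimal sketch, assuming a small made-up z and namestofind in place of the real data:

```r
# Hypothetical stand-ins for the poster's data frame and name list
z <- data.frame(name = c('fred', 'anne', 'bill', 'carl', 'fred'),
                stringsAsFactors = FALSE)
namestofind <- c('fred', 'bill')

# %in% returns one TRUE/FALSE per row; which() turns that into row numbers
rows <- which(z$name %in% namestofind)
rows
```

z[rows, ] would then return the matching rows themselves.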





--
View this message in context: 
http://r.789695.n4.nabble.com/which-with-multiple-conditions-tp4644677.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] POSIXlt and daylight savings time

2012-09-05 Thread pdb
I'll rephrase the question...

If you try...

as.POSIXlt('2004-10-31 02:00:00') 

you get 

[1] 2004-10-31 

What do I need to do to make it return

[1] 2004-10-31 02:00:00 
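
A possible explanation, stated as an assumption because the post does not give the machine's time zone: where clocks jumped forward at 02:00 on that date, the local time 02:00:00 does not exist, so the time part is dropped. A minimal sketch that avoids daylight saving altogether by parsing the string in UTC:

```r
# Parse in a zone without daylight saving so every clock time is valid
x <- as.POSIXlt('2004-10-31 02:00:00', tz = 'UTC')

# Format explicitly so the time part is always shown
stamp <- format(x, '%Y-%m-%d %H:%M:%S')
stamp
```

Whether UTC is appropriate depends on how the timestamps are used later; it only guarantees the clock time is kept as written.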



--
View this message in context: 
http://r.789695.n4.nabble.com/POSIXlt-and-daylight-savings-time-tp4642253p4642272.html


[R] POSIXlt and daylight savings time

2012-09-04 Thread pdb
I have a data frame that contains dates, but when I use as.POSIXlt() I lose
the hours on all records. I traced this down to a particular hour which
causes the issue...

> as.POSIXlt('2004-10-31 02:00:00')
[1] 2004-10-31
> as.POSIXlt('2004-10-31 03:00:00')
[1] 2004-10-31 03:00:00

How do I tell as.POSIXlt() to ignore daylight savings and just convert to a
time as is? I've read about the 'isdst' but it is still unclear what to do.

This is a cleaned up date field that I received so adjusting the date itself
is not possible.

Thanks in advance. 



--
View this message in context: 
http://r.789695.n4.nabble.com/POSIXlt-and-daylight-savings-time-tp4642253.html


[R] revolution foreach oddity

2012-05-07 Thread pdb
I know this is not a Revolution support forum, but has anyone noticed the
following?

I have a foreach loop to generate random samples. If I run the exact code
below in normal R (2.14.1) it works as expected, but if I run it from
Revolution 4.2.0 each loop returns the same numbers.

The only way I can get Revolution to give different numbers is using 1
instead of 8 in
registerDoSNOW(makeCluster(8, type = "SOCK"))

but that seems to defeat the point.




library(foreach)
library(doSNOW)
registerDoSNOW(makeCluster(8, type = "SOCK"))
getDoParWorkers()
getDoParName()
getDoParVersion()


mySamples <- foreach (jj = 1:4, .combine=cbind) %dopar% {
return(sample(1:10,10,replace=TRUE))
}

mySamples

##
r 2.14.1
##

> library(foreach)
> library(doSNOW)
> registerDoSNOW(makeCluster(8, type = "SOCK"))
> getDoParWorkers()
[1] 8
> getDoParName()
[1] "doSNOW"
> getDoParVersion()
[1] "1.0.6"
>
>
> mySamples <- foreach (jj = 1:4, .combine=cbind) %dopar% {
+ return(sample(1:10,10,replace=TRUE))
+ }
>
> mySamples
      result.1 result.2 result.3 result.4
 [1,]        5        3        1        4
 [2,]        1       10       10        3
 [3,]        7        9        4        9
 [4,]        2        5        9        3
 [5,]        2        7       10        1
 [6,]        7        8       10       10
 [7,]        6        9       10        4
 [8,]        8        6        6        2
 [9,]       10        7        9        4
[10,]        2        4        1        9
 



# revolution r


> library(foreach)
Loading required package: iterators
Loading required package: codetools
foreach: simple, scalable parallel programming from REvolution Computing
Use REvolution R for scalability, fault tolerance and more.
http://www.revolution-computing.com
> library(doSNOW)
Loading required package: snow
> registerDoSNOW(makeCluster(8, type = "SOCK"))
> getDoParWorkers()
[1] 8
> getDoParName()
[1] "doSNOW"
> getDoParVersion()
[1] "1.0.3"
>
>
> mySamples <- foreach (jj = 1:4, .combine=cbind) %dopar% {
+   return(sample(1:10,10,replace=TRUE))
+   }
>
> mySamples
      result.1 result.2 result.3 result.4
 [1,]        4        4        4        4
 [2,]       10       10       10       10
 [3,]        4        4        4        4
 [4,]       10       10       10       10
 [5,]        5        5        5        5
 [6,]        5        5        5        5
 [7,]        9        9        9        9
 [8,]        2        2        2        2
 [9,]        6        6        6        6
[10,]        9        9        9        9
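
Identical columns like these would be consistent with every worker starting from the same random number state, though that is only a guess and is not confirmed anywhere in the thread. As a rough sketch of one remedy, here is the same sampling done with the base parallel package (a substitute for doSNOW, chosen so the example is self-contained), giving each worker its own random stream:

```r
library(parallel)

cl <- makeCluster(2)                  # 2 workers keeps the sketch light
clusterSetRNGStream(cl, iseed = 123)  # a separate L'Ecuyer stream per worker

# Same task as the foreach loop: four columns of ten draws from 1:10
mySamples <- parSapply(cl, 1:4, function(jj) sample(1:10, 10, replace = TRUE))
stopCluster(cl)

mySamples
```

The seed value 123 is arbitrary; fixing it just makes the run repeatable.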
 

--
View this message in context: 
http://r.789695.n4.nabble.com/revolution-foreach-oddity-tp4616237.html


Re: [R] directory of current script

2012-04-12 Thread pdb
I found this...

https://stat.ethz.ch/pipermail/r-help/2009-January/184745.html

--
View this message in context: 
http://r.789695.n4.nabble.com/directory-of-current-script-tp4553386p4553409.html


[R] directory of current script

2012-04-12 Thread pdb
I am running a series of scripts sequentially and they all need some global
parameters. These will be kept in a file in a known subdirectory alongside the
scripts themselves.

The scripts need to be run by anyone without ANY editing.

Question is: 

Is there a command to return the directory of the current script, so it then
knows where to find the global parameter file? 

Or is there a simpler way?
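
One approach that may work when the scripts are launched with Rscript is to read the --file= argument from the command line. A minimal sketch; script_dir is a hypothetical helper name, and it falls back to the working directory when no --file= argument is present (for example in an interactive session):

```r
# Hypothetical helper: directory of the script started via Rscript
script_dir <- function() {
  args <- commandArgs(trailingOnly = FALSE)
  file_arg <- grep('^--file=', args, value = TRUE)
  if (length(file_arg) == 0) {
    return(getwd())  # not run through Rscript: fall back to the working directory
  }
  dirname(normalizePath(sub('^--file=', '', file_arg[1]), mustWork = FALSE))
}

here <- script_dir()
here
```

A parameter file could then be located with file.path(here, 'subdir', 'params.R'), where both names are placeholders.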

Cheers.

--
View this message in context: 
http://r.789695.n4.nabble.com/directory-of-current-script-tp4553386p4553386.html


[R] can this sequence be generated easier?

2011-06-17 Thread pdb
I have 'x' variables that I need to find the optimum combination of, with the
constraint that the sum of all x variables needs to be exactly 100. I need
to test all combinations to get the optimal mix.

This is easy if I know how many variables I have - I can hard code as below.
But what if I don't know the number of variables and want this to be a
flexible parameter. Is there a sexy recursive way that this can be done in
R?

#for combinations of 2 variables
vars = 2
for(i in 0:100){
for(j in 0:(100-i)){
# ...do some test on i,j combination
}}

#for combinations of 3 variables
vars = 3
for(i in 0:100){
for(j in 0:(100-i)){
# parentheses needed: 0:100-(i+j) would subtract from the whole sequence
for(k in 0:(100-(i+j))){
# ...do some test on i,j,k combination
}}}
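
A recursive version for any number of variables could look like the sketch below. compositions is a hypothetical name; it returns one row per combination of non-negative integers that sum exactly to the total, so the last variable is determined by the others rather than looped over.

```r
# Hypothetical helper: all ways to split `total` into `n_vars` non-negative integers
compositions <- function(n_vars, total) {
  if (n_vars == 1) {
    return(matrix(total, nrow = 1))   # the last variable takes whatever is left
  }
  do.call(rbind, lapply(0:total, function(first) {
    rest <- compositions(n_vars - 1, total - first)
    cbind(first, rest, deparse.level = 0)
  }))
}

combos <- compositions(3, 4)   # small total so the result is easy to inspect
nrow(combos)                   # choose(4 + 2, 2) = 15 rows
```

The row count grows quickly: with 100 and many variables the full matrix may not fit in memory, in which case testing each combination inside the recursion would be the safer design.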



--
View this message in context: 
http://r.789695.n4.nabble.com/can-this-sequence-be-generated-easier-tp3607240p3607240.html


[R] computer name

2011-06-12 Thread pdb
Is there an R function that can identify the computer the code is
running on?

I have some common code that I run on several computers and each has a
database with a different server name - although the content is identical.

I need to set thisServer depending on which machine the code is running
on...

something like...

if (pcname == 'pc1') thisServer = 'SERVER1'
if (pcname == 'pc2') thisServer = 'SERVER2'


conn <- odbcDriverConnect(paste0('driver=SQL Server;database=x;server=', thisServer, ';'))

...rest of code will now run OK.

I know I could set the DSN names the same and use...

conn <- odbcConnect('commonDSNname')

 but I was wondering if there was another way
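
Sys.info() reports the machine name in its "nodename" entry, which may be enough here. A minimal sketch, assuming a made-up lookup table from machine name to server name (pc1, pc2, SERVER1 and SERVER2 are the placeholders from the post):

```r
# Name of the machine the code is running on
pcname <- Sys.info()[['nodename']]

# Hypothetical mapping from machine name to database server
server_for <- c(pc1 = 'SERVER1', pc2 = 'SERVER2')

thisServer <- if (pcname %in% names(server_for)) server_for[[pcname]] else NA_character_

# Connection string that would be passed to odbcDriverConnect()
conn_string <- paste0('driver=SQL Server;database=x;server=', thisServer, ';')
```

On an unlisted machine thisServer is NA, so the script can stop with a clear message instead of connecting to the wrong server.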


--
View this message in context: 
http://r.789695.n4.nabble.com/computer-name-tp3593120p3593120.html


[R] caret - prevent resampling when no parameters to find

2011-05-01 Thread pdb
I want to use caret to build a model with an algorithm that actually has no
parameters to find. 

How do I stop it from repeatedly building the same model 25 times?


library(caret)
data(mdrr)
LOGISTIC_model <- train(mdrrDescr,mdrrClass
,method='glm'
,family=binomial(link=logit)
)
LOGISTIC_model

528 samples
342 predictors
  2 classes: 'Active', 'Inactive' 

Pre-processing: None 
Resampling: Bootstrap (25 reps) 

Summary of sample sizes: 528, 528, 528, 528, 528, 528, ... 

Resampling results

  Accuracy  Kappa   Accuracy SD  Kappa SD
  0.552     0.0999  0.0388       0.0776

--
View this message in context: 
http://r.789695.n4.nabble.com/caret-prevent-resampling-when-no-parameters-to-find-tp3488761p3488761.html


Re: [R] caret - prevent resampling when no parameters to find

2011-05-01 Thread pdb
Hi Max,

But in this example, it says the sample size is the same as the total number
of samples, so unless the sampling is done by columns, wouldn't you get
exactly the same model each time for logistic regression?

PS - great package btw. I'm just beginning to explore its potential now.

--
View this message in context: 
http://r.789695.n4.nabble.com/caret-prevent-resampling-when-no-parameters-to-find-tp3488761p341.html


Re: [R] caret - prevent resampling when no parameters to find

2011-05-01 Thread pdb
Thanks for the clarification Max - I should have realised that.

One final question, I like caret because it lets me pass in data to all
functions in the same way. For glm I have only ever used the formula
notation and did not see a way to pass in predictors and a target
individually. How do I do this? How do I get the 2nd example below to work?

Many thanks.

LOGISTIC_model <- train(mdrrDescr,mdrrClass
,method='glm'
,family=binomial(link=logit)
)

  
LOGISTIC_model1 <- glm(mdrrDescr,mdrrClass, family=binomial(link=logit))

--
View this message in context: 
http://r.789695.n4.nabble.com/caret-prevent-resampling-when-no-parameters-to-find-tp3488761p3488911.html


Re: [R] caret - prevent resampling when no parameters to find

2011-05-01 Thread pdb
glm.fit - answered my own question by reading the manual!

--
View this message in context: 
http://r.789695.n4.nabble.com/caret-prevent-resampling-when-no-parameters-to-find-tp3488761p3488923.html


Re: [R] caret - prevent resampling when no parameters to find

2011-05-01 Thread pdb
Thanks again Max - a great time saver this is.

Now just for my sanity, if I use glm.fit to build a model where I have the
matrices, how do I then use the predict function without getting an error
message?

> LOGISTIC_model1 <- glm.fit(mdrrDescr,mdrrClass,
+ family=binomial(link=logit))
Warning messages:
1: glm.fit: algorithm did not converge 
2: glm.fit: fitted probabilities numerically 0 or 1 occurred 
> predict(LOGISTIC_model1)
Error in UseMethod("predict") : 
  no applicable method for 'predict' applied to an object of class
c('double', 'numeric')
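
glm.fit returns a plain list rather than a "glm" object, so predict() has no method for it. One workaround, sketched here on small simulated data rather than mdrrDescr (all names below are made up), is to put the predictor matrix and the outcome into one data frame and call glm() with the y ~ . formula:

```r
set.seed(1)
# Simulated stand-ins for a predictor matrix and a two-level outcome
X <- matrix(rnorm(200), ncol = 2, dimnames = list(NULL, c('x1', 'x2')))
y <- factor(ifelse(X[, 'x1'] + rnorm(100) > 0, 'Active', 'Inactive'))

dat <- data.frame(X, y = y)
fit <- glm(y ~ ., data = dat, family = binomial(link = 'logit'))

# predict() works because fit has class "glm";
# probabilities refer to the second factor level ('Inactive')
p <- predict(fit, newdata = dat, type = 'response')
head(p)
```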

Secondly, caret acts as a nice wrapper to protect me from all this, and it
does the resampling to give me an idea of the expected model fit. If I was
doing a parameter search, would it do all this resampling for each
combination of parameters? Now if I just want to build a model and not worry
about all the resampling (in my case I just want a set of baseline
predictions to compare various variable selections methods against) it would
be nice if there was a simple option to turn off the resampling.

--
View this message in context: 
http://r.789695.n4.nabble.com/caret-prevent-resampling-when-no-parameters-to-find-tp3488761p3489020.html


Re: [R] caret - prevent resampling when no parameters to find

2011-05-01 Thread pdb
Hi Max,
I tried your suggestion but came up with errors:

fitControl <- trainControl(number=1)
LOGISTIC_model <- train(mdrrDescr,mdrrClass
,method='glm'
,trControl = fitControl
)

Fitting: parameter=none 
Error in if (all.equal(sort(x$index[[1]]), seq(along = x$data$.outcome)))
x$data else x$data[-x$index[[i]],  : 
  argument is not interpretable as logical


fitControl <- trainControl(seq(along = mdrrClass))
LOGISTIC_model <- train(mdrrDescr,mdrrClass
,method='glm'
,trControl = fitControl
) 


Error in switch(tolower(trControl$method), oob = NULL, cv = createFolds(y, 
: 
  EXPR must be a length 1 vector
In addition: Warning message:
In if (trControl$method == "oob" & !(method %in% c("rf", "treebag",  :
  the condition has length > 1 and only the first element will be used

--
View this message in context: 
http://r.789695.n4.nabble.com/caret-prevent-resampling-when-no-parameters-to-find-tp3488761p3489091.html


[R] changing a specific column name

2011-04-28 Thread pdb
Hi,

Can someone please tell me how to change the column name of a specific
column. How do I change the name of the column 'Species'?

Thanks in advance


d <- iris

colnames(d)
[1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"

ind <- which(names(d)=='Species')

ind
[1] 5

colnames(d[ind])
[1] "Species"

colnames(d[ind]) <- 'new name'

colnames(d[ind])
[1] "Species"
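
The assignment above targets the names of the one-column extract d[ind], and the name never makes it back into d. Indexing the names vector of d itself should work instead. A minimal sketch, with 'Kind' as an arbitrary replacement name:

```r
d <- iris

# Index into the names of d, not into a one-column copy of d
names(d)[names(d) == 'Species'] <- 'Kind'

names(d)
```

colnames(d)[ind] <- 'Kind' is the equivalent form using the ind computed in the post.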

--
View this message in context: 
http://r.789695.n4.nabble.com/changing-a-specific-column-name-tp3480739p3480739.html


[R] boxplot - how to supress groups with low counts

2011-01-28 Thread pdb

In a boxplot - how can I prevent groups where the number of cases is less
than a set threshold from being plotted.

set.seed(42)
DF <- data.frame(type=sample(LETTERS[1:5], 100, replace=TRUE),
cost=rnorm(100))
count <- boxplot(cost ~ type, data=DF, plot = FALSE)
count$n

## how to only include plots where count$n > 18
boxplot(cost ~ type, data=DF)
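
One possibility is to count the cases per group first, keep only the groups above the threshold, and drop the unused factor levels before plotting. A minimal sketch on the same simulated data; threshold, keep and DF_big are names introduced here for illustration:

```r
set.seed(42)
DF <- data.frame(type = factor(sample(LETTERS[1:5], 100, replace = TRUE)),
                 cost = rnorm(100))

threshold <- 18
counts <- table(DF$type)                    # cases per group
keep <- names(counts)[counts > threshold]   # groups large enough to plot

# droplevels() removes the small groups from the factor so no empty boxes are drawn
DF_big <- droplevels(DF[DF$type %in% keep, ])

if (nrow(DF_big) > 0) boxplot(cost ~ type, data = DF_big)
```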

Thanks in advance for any solutions.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/boxplot-how-to-supress-groups-with-low-counts-tp3244424p3244424.html


[R] 2 functions with same name - what to do to get the one I want

2011-01-26 Thread pdb

There seem to be 2 functions called ecdf...

http://lib.stat.cmu.edu/S/Harrell/help/Hmisc/html/ecdf.html

http://127.0.0.1:11885/library/stats/html/ecdf.html

How do I get the one ecdf {Hmisc} to run instead of the ecdf {stats} 

A pointer in the right direction would be greatly appreciated.


Tried to install Hmisc but got this message, so I assume I have it

> utils:::menuInstallPkgs()
Warning: package 'Hmisc' is in use and will not be installed



ran the demo from Hmisc with no luck...

> set.seed(1)
> ch <- rnorm(1000, 200, 40)
> ecdf(ch, xlab="Serum Cholesterol")
Error in ecdf(ch, xlab = "Serum Cholesterol") : 
  unused argument(s) (xlab = "Serum Cholesterol")


ran the sample code from stats and it worked... 
 
> x <- rnorm(12)
> Fn <- ecdf(x)
> Fn # a *function*
Empirical CDF 
Call: ecdf(x)
 x[1:12] = -1.9123, -1.6626, -1.2468,  ..., 1.1119,  1.135
> Fn(x)  # returns the percentiles for x
 [1] 1.0000 0.9167 0.3333 0.6667 0.5833 0.1667
0.7500 0.0833 0.2500 0.8333 0.4167 0.5000
> tt <- seq(-2,2, by = 0.1)
> 12 * Fn(tt) # Fn is a 'simple' function {with values k/12}
 [1]  0  1  1  1  2  2  2  2  3  3  3  3  4  4  4  5  5  5  6  6  6  7  7  8 
8  8  8  8  8  9 10 10 12 12 12 12 12 12 12 12 12
 

-- 
View this message in context: 
http://r.789695.n4.nabble.com/2-functions-with-same-name-what-to-do-to-get-the-one-I-want-tp3237788p3237788.html


Re: [R] 2 functions with same name - what to do to get the one I want

2011-01-26 Thread pdb

Thanks for the quick response, but that doesn't seem to help

What do I need to do to get it to work?

 Hmisc:::ecdf(...) 
Error in get(name, envir = asNamespace(pkg), inherits = FALSE) : 
  object 'ecdf' not found
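
The "object 'ecdf' not found" error fits with the Hmisc version being spelled with a capital E (Ecdf) in current releases, as far as I know, so there is no lower-case ecdf in that namespace to find. A minimal sketch of calling each function explicitly with the pkg::name form; the Hmisc call is guarded so the code still runs where Hmisc is not installed:

```r
set.seed(1)
ch <- rnorm(1000, 200, 40)

if (requireNamespace('Hmisc', quietly = TRUE)) {
  # Hmisc's plotting version, addressed by package name
  Hmisc::Ecdf(ch, xlab = 'Serum Cholesterol')
} else {
  # Fallback: plot the stats version
  plot(stats::ecdf(ch), xlab = 'Serum Cholesterol')
}

# stats::ecdf returns a function rather than drawing a plot
Fn <- stats::ecdf(ch)
Fn(200)
```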


-- 
View this message in context: 
http://r.789695.n4.nabble.com/2-functions-with-same-name-what-to-do-to-get-the-one-I-want-tp3237788p3237820.html


[R] removed data is still there!

2010-09-21 Thread pdb

I'm confused, hope someone can point out what is not obvious to me.

I thought I was creating a new data frame by 'deleting' rows from an
existing dataframe - I've tried 2 methods.

But this new data frame seems to remember values from its parent - even
though there are no occurrences.  

Where does it get the values versicolor and virginica from and give them a
count of 0?

What am I missing?

Thanks in advance. 

> summary(iris$Species)
    setosa versicolor  virginica 
        50         50         50 

> nrow(iris)
[1] 150

> iris1 <- iris[iris$Species == 'setosa',]

> nrow(iris1)
[1] 50

> summary(iris1$Species)
    setosa versicolor  virginica 
        50          0          0 

> boxplot(Petal.Width ~ Species, data = iris1, plot=1)

> iris2 <- subset(iris, Species == 'setosa')

> nrow(iris2)
[1] 50

> summary(iris2$Species)
    setosa versicolor  virginica 
        50          0          0 

> boxplot(Petal.Width ~ Species, data = iris2, plot=1)




-- 
View this message in context: 
http://r.789695.n4.nabble.com/removed-data-is-still-there-tp2548440p2548440.html


Re: [R] removed data is still there!

2010-09-21 Thread pdb

Thanks, but that was what I just discovered myself the hard way.

What I really wanted to know was how to solve this issue.
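
For what it is worth, the usual cure is to rebuild the factor so that only the levels still present remain. A minimal sketch showing two ways, factor() on one column and droplevels() on the whole data frame (droplevels needs a reasonably recent version of R):

```r
iris1 <- iris[iris$Species == 'setosa', ]
levels(iris1$Species)                   # still lists all three species

# Rebuild the factor from the values that are actually present
iris1$Species <- factor(iris1$Species)
levels(iris1$Species)

# Same idea applied to every factor column at once
iris2 <- droplevels(subset(iris, Species == 'setosa'))
summary(iris2$Species)
```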
-- 
View this message in context: 
http://r.789695.n4.nabble.com/removed-data-is-still-there-tp2548440p2548527.html


[R] getting a function to do something

2010-09-18 Thread pdb

Hi,

I want to repeatedly do a task, so thought I could put it in a function and
then just call the function.
The task is just clearing all the graphics devices and then opening a new
one of a specified size.

Now, when I call the function below, nothing appears to happen. But when I
run the 2 lines in the function on their own, I get what I want.

Please can someone explain to me what is the obvious thing I am missing?

 clearG <- function() {
 graphics.off()
 windows(13,8)
}


#nothing happens (as far as I can tell)
clearG

#but this works, but I want to just type 1 line rather than several
 graphics.off()
 windows(13,8)




-- 
View this message in context: 
http://r.789695.n4.nabble.com/getting-a-function-to-do-something-tp2545594p2545594.html


Re: [R] getting a function to do something

2010-09-18 Thread pdb

Ah, silly me.

clearG() 

this now works!
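
For anyone hitting the same thing: a bare function name only refers to the function object (typing it prints the definition), while the parentheses are what run it. A tiny sketch with a made-up function, say_hi, that avoids opening any graphics device:

```r
# Hypothetical function used only to show the difference
say_hi <- function() {
  'called'
}

is.function(say_hi)   # the bare name is the function object itself
result <- say_hi()    # the parentheses run the body and return its value
result
```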


-- 
View this message in context: 
http://r.789695.n4.nabble.com/getting-a-function-to-do-something-tp2545594p2545596.html


[R] transaction object - how to coerce this data

2010-08-31 Thread pdb

Hi,

I want to look at frequent item sets using the arules package. I need
to transform my data into a transactions object. The data I read in from a
file has 2 columns, an ID and an item. How do I convert data like this into
a transactions object?

I've tried 
class? transactions
but it only confuses me.

My data is like this
 
basketID   item
1   bread
1   cheese
1   milk
2   bread
2   cheese 
2   eggs
3   bread
3   cheese
3   beer

and from what I gather it should be like this?

 data <- list(
  c("bread","cheese","milk"),
  c("bread","cheese","eggs"),
  c("bread","cheese","beer")
)

so I can use:

t <- as(data, "transactions")
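
Getting from the two-column layout to that list can probably be done with split(), which groups one vector by the values of another. A minimal sketch in base R, with baskets_df as a made-up name for the data read from the file; the final arules step is left as a comment because it needs that package attached:

```r
# Hypothetical data frame holding the two columns from the file
baskets_df <- data.frame(
  basketID = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  item = c('bread', 'cheese', 'milk', 'bread', 'cheese', 'eggs',
           'bread', 'cheese', 'beer'),
  stringsAsFactors = FALSE)

# One character vector of items per basket, named by basket ID
baskets <- split(baskets_df$item, baskets_df$basketID)
baskets

# With arules attached, the coercion from the post would then be:
# trans <- as(baskets, "transactions")
```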

Thanks in advance.

Phil


-- 
View this message in context: 
http://r.789695.n4.nabble.com/transaction-object-how-to-coerce-this-data-tp2402613p2402613.html


[R] checking if a package is installed

2010-08-26 Thread pdb

Hi,

I am writing a function that requires a specific package to be installed.

Is there a way of checking if the package is installed and returning a TRUE
/ FALSE result so my function can return an appropriate error message and
exit the function gracefully rather than just bombing out? 

I'm thinking along the following lines (but want code that works),

f_checkpackage <- function()
{

# require() returns TRUE or FALSE instead of stopping with an error
if (suppressWarnings(require(madeupname))) {
cat("package loaded OK\n")
}
else
{
 cat("ERROR: package not loaded\n")
}
 
}

f_checkpackage()
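
A variant that takes the package name as an argument and hands back TRUE or FALSE might look like the sketch below. check_package is a hypothetical name; it relies on requireNamespace(), which reports availability without attaching the package or raising an error.

```r
# Hypothetical helper: TRUE if the package can be loaded, FALSE otherwise
check_package <- function(pkg) {
  ok <- requireNamespace(pkg, quietly = TRUE)
  if (ok) {
    cat('package', pkg, 'is available\n')
  } else {
    cat('ERROR: package', pkg, 'is not installed\n')
  }
  invisible(ok)
}

has_stats <- check_package('stats')       # ships with R
has_fake  <- check_package('madeupname')  # assumed not to exist
```

A calling function can then do if (!check_package('somepkg')) return(invisible(NULL)) to exit gracefully.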
-- 
View this message in context: 
http://r.789695.n4.nabble.com/checking-if-a-package-is-installed-tp2340534p2340534.html


Re: [R] Tinn R - the preferred R term was not defined

2010-08-23 Thread pdb

Ok - I found the correct forum and that this seems to be a common problem.

http://sourceforge.net/projects/tinn-r/forums/forum/481900/topic/3741784


-- 
View this message in context: 
http://r.789695.n4.nabble.com/Tinn-R-the-preferred-R-term-was-not-defined-tp2334642p2334649.html


[R] Tinn R - the preferred R term was not defined

2010-08-22 Thread pdb

I have Windows 7 64 bit and 64 bit version of R.

I have installed Tinn R.

Every time I start R from within Tinn R it gives me the message

The preferred R term was not defined. Do you desire to do this now

I then tell Tinn R where the Rterm.exe and Rgui.exe are.

Rterm works OK - I can open r code files and submit them.
Rgui does not work: R opens, but the Tinn R toolbar for submitting code is
disabled.

I then go to R > configure > permanent and Tinn R writes to my R
etc/Rprofile.site file

When I restart Tinn R and try to start an Rterm or Rgui, I again get
prompted...

The preferred R term was not defined. Do you desire to do this now

This seems to be a repetitive loop.

Can anybody please point me in the right direction.

Cheers.






-- 
View this message in context: 
http://r.789695.n4.nabble.com/Tinn-R-the-preferred-R-term-was-not-defined-tp2334642p2334642.html


[R] finding max value in a row and reporting colum name

2010-08-01 Thread pdb

Hi,

Hopefully someone can point me in the right direction on how I would go
about solving the following.

I have some data and need to find the column name of the maximum value in
each row.

This could be the data...

> a <- data.frame(x = rnorm(4), y = rnorm(4), z = rnorm(4))
> a
           x           y          z
1  1.6534561  0.11523404  0.2261730
2 -1.2274320 -0.24096054  1.5096028
3 -1.4503096  0.07227427  1.6740867
4  0.1867416  1.25318913 -0.7350560

Here is what I need to generate...

1 x
2 z
3 z
4 y

Any pointers would be appreciated.
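
One way that might do it is to find the position of the largest value in each row with which.max() and use those positions to index the column names. A minimal sketch using the values printed above:

```r
# The data from the post, typed in so the result is reproducible
a <- data.frame(x = c( 1.6534561, -1.2274320, -1.4503096,  0.1867416),
                y = c( 0.11523404, -0.24096054, 0.07227427, 1.25318913),
                z = c( 0.2261730,  1.5096028,  1.6740867, -0.7350560))

# Position of the maximum within each row, then the matching column name
max_pos <- apply(a, 1, which.max)
max_name <- names(a)[max_pos]
max_name
```

For these four rows the result is x, z, z, y, matching the desired output. max.col() is a faster alternative for large data, but by default it breaks ties at random.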

Regards,



-- 
View this message in context: 
http://r.789695.n4.nabble.com/finding-max-value-in-a-row-and-reporting-colum-name-tp2309358p2309358.html


[R] Lags and Differences of zoo Objects

2010-07-30 Thread pdb

Hi,

I'm struggling to understand the documentation.

 ?lag.zoo


x - a zoo object. 
k, lag - the number of lags (in units of observations). Note the sign of k
behaves as in lag. 
differences - an integer indicating the order of the difference. 

What does the above line actually mean? I've tried a few settings on sample
data but can't figure out what it is doing.


library(zoo)

x <- iris
x$Species = NULL
x$Petal.Width = NULL
x$Sepal.Width = NULL
x$Sepal.Length = NULL

x <- zoo(x)

x <-
merge(orig = x
,lag1diff2 = diff(x, lag = 1, differences = 2, arithmetic = TRUE, na.pad = TRUE)
,lag2diff1 = diff(x, lag = 2, differences = 1, arithmetic = TRUE, na.pad = TRUE)
,lag2diff2 = diff(x, lag = 2, differences = 2, arithmetic = TRUE, na.pad = TRUE)
)

head(x)

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Lags-and-Differences-of-zoo-Objects-tp2308666p2308666.html


Re: [R] Lags and Differences of zoo Objects

2010-07-30 Thread pdb

Thanks for the response.

I can figure out the 'lag' parameter to the function, but I don't understand
the 'differences' parameter.

differences - an integer indicating the order of the difference

What does the 'order of the difference' mean in English?

How are these numbers calculated?

> x <- iris
> x$Species = NULL
> x$Petal.Width = NULL
> x$Sepal.Width = NULL
> x$Sepal.Length = NULL
>
> x <- zoo(x)
>
> x <-
+ merge(orig = x
+ ,l1d1 = diff(x, lag = 1, differences = 1, arithmetic = TRUE, na.pad = TRUE)
+ ,l1d2 = diff(x, lag = 1, differences = 2, arithmetic = TRUE, na.pad = TRUE)
+ ,l2d1 = diff(x, lag = 2, differences = 1, arithmetic = TRUE, na.pad = TRUE)
+ ,l2d2 = diff(x, lag = 2, differences = 2, arithmetic = TRUE, na.pad = TRUE)
+ )
>
> x
   Petal.Length.orig Petal.Length.l1d1 Petal.Length.l1d2 Petal.Length.l2d1 Petal.Length.l2d2
1                1.4                NA                NA                NA                NA
2                1.4               0.0                NA                NA                NA
3                1.3              -0.1         -1.00e-01              -0.1                NA
4                1.5               0.2          3.00e-01               0.1                NA
5                1.4              -0.1         -3.00e-01               0.1          2.00e-01
6                1.7               0.3          4.00e-01               0.2          1.00e-01
7                1.4              -0.3         -6.00e-01               0.0         -1.00e-01
8                1.5               0.1          4.00e-01              -0.2         -4.00e-01
9                1.4              -0.1         -2.00e-01               0.0          0.00e+00
10               1.5               0.1          2.00e-01               0.0          2.00e-01
11               1.5               0.0         -1.00e-01               0.1          1.00e-01
12               1.6               0.1          1.00e-01               0.1          1.00e-01
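
Reading the numbers above, the "order of the difference" looks like the number of times the differencing is applied: differences = 2 takes the lagged difference of the lagged differences. A minimal sketch that checks this with base R's diff() on the first six Petal.Length values (a plain numeric vector, used here instead of a zoo object):

```r
x <- c(1.4, 1.4, 1.3, 1.5, 1.4, 1.7)   # first six Petal.Length values

d1   <- diff(x, lag = 1)                   # x[t] - x[t-1]
l1d2 <- diff(x, lag = 1, differences = 2)  # second-order difference, lag 1
l2d2 <- diff(x, lag = 2, differences = 2)  # second-order difference, lag 2

# Second order = apply the same lagged difference twice
same1 <- isTRUE(all.equal(l1d2, diff(d1, lag = 1)))
same2 <- isTRUE(all.equal(l2d2, diff(diff(x, lag = 2), lag = 2)))
c(same1, same2)
```

For example, the l2d2 entry in row 5 is (1.4 - 1.3) - (1.3 - 1.4) = 0.2, which agrees with the table.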
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Lags-and-Differences-of-zoo-Objects-tp2308666p2308681.html


[R] where did the column names go to?

2010-07-29 Thread pdb

I've just tried to merge 2 data sets thinking they would only keep the common
columns, but noticed the column count was not adding up. I've then
replicated a simple example and got the same thing happening.

q1. why doesn't 'b' have a column name?

q2. when I merge, why does the new column 'y' have all values as 5.1?

Thanks in advance,

Mr. confused
 

> a <- iris[,]
> b <- iris[,1]
>
> head(a)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
> head(b)
[1] 5.1 4.9 4.7 4.6 5.0 5.4
>
> c <- merge(a,b)
> head(c)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species   y
1          5.1         3.5          1.4         0.2  setosa 5.1
2          4.9         3.0          1.4         0.2  setosa 5.1
3          4.7         3.2          1.3         0.2  setosa 5.1
4          4.6         3.1          1.5         0.2  setosa 5.1
5          5.0         3.6          1.4         0.2  setosa 5.1
6          5.4         3.9          1.7         0.4  setosa 5.1
>
> NCOL(a)
[1] 5
> NCOL(b)
[1] 1
> NCOL(c)
[1] 6
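
A likely reading of both questions: selecting a single column with iris[,1] drops the result to a plain vector, which has no column name, and merge() then treats that vector as a one-column data frame called y with nothing in common with a, so every row of a is paired with every value of y. A minimal sketch of the difference that drop = FALSE makes:

```r
b_vec <- iris[, 1]                 # simplified to a numeric vector
b_df  <- iris[, 1, drop = FALSE]   # stays a one-column data frame

names(b_vec)   # NULL: a vector carries no column name
names(b_df)    # the column name is kept

# No shared column names, so merge() builds the full cross product
crossed <- merge(iris, b_vec)
dim(crossed)   # 150 * 150 rows, 5 + 1 columns
```

The first 150 rows of the cross product all carry the first value of y, which is why head(c) shows 5.1 throughout.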
 

-- 
View this message in context: 
http://r.789695.n4.nabble.com/where-did-the-column-names-go-to-tp2306267p2306267.html


[R] how to 'stack' data frames?

2010-07-29 Thread pdb

I have 2 data frames (A & B) with some common column names.

A has 10 rows.
B has 20 rows.

How do I combine them so I end up with a data frame with 30 rows that
contains only the common columns?

I was trying 'merge' ("Merge two data frames by common columns", etc.)
but that is not giving me what I expect...

> a <- iris
> b <- iris
>
> c <- merge(a,b)
>
> NROW(a)
[1] 150
> NROW(c)
[1] 152

Why are there only 152 rows and not 300?
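
One possible way to stack two data frames on their shared columns, sketched in base R. This is only an illustration under assumed data: the frames A and B and their column names (id, x, only_a, only_b) are invented here, and the approach (intersect() on the names, then rbind()) is not necessarily the one suggested elsewhere in the thread. merge() performs a database-style join on matching values, which is why it does not simply append rows.

```r
set.seed(1)

# Two made-up data frames with some columns in common ("id" and "x").
A <- data.frame(id = 1:10, x = rnorm(10), only_a = letters[1:10])
B <- data.frame(x = rnorm(20), id = 11:30, only_b = LETTERS[1:20])

# Column names present in both data frames.
common <- intersect(names(A), names(B))

# Keep only the shared columns (in the same order) and append the rows.
stacked <- rbind(A[, common, drop = FALSE], B[, common, drop = FALSE])

print(dim(stacked))
print(names(stacked))
```

The drop = FALSE argument keeps the selections as data frames even if only one column happens to be shared.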



-- 
View this message in context: 
http://r.789695.n4.nabble.com/how-to-stack-data-frames-tp2306284p2306284.html


Re: [R] how to 'stack' data frames?

2010-07-29 Thread pdb

Thanks Dennis - easy when you know how!

-- 
View this message in context: 
http://r.789695.n4.nabble.com/how-to-stack-data-frames-tp2306284p2306309.html


[R] eliminating constant variables

2010-07-10 Thread pdb

Hi all,

I have a large data set and want to immediately build a 'blind' model
without first examining the data. It turns out that the data contain a lot
of fields that are constant or consist entirely of missing values, which
prevents the model from being built.

Can someone point me in the right direction as to how I can automatically
purge my data file of these useless fields?

Thanks in advance,

pdb

train <- read.csv("TrainingData.csv")
library(gbm)
i.gbm <- gbm(TargetVariable ~ ., data=train, distribution="bernoulli")

1: In gbm.fit(x, y, offset = offset, distribution = distribution,  ... :
  variable 5: var1 has no variation.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/eliminating-constant-variables-tp2284831p2284831.html


Re: [R] eliminating constant variables

2010-07-10 Thread pdb

Hi Jim, 

Thanks for your response. I was probably not clear about exactly what I
want to achieve, so please let me try to explain a little better...

There are certain (unknown) columns in my data that contain either NULL in
every row, or the same value in every row (eg '1'). These columns are
useless for modelling as there is no variation in the data.

I need a way to automatically find and delete all these columns (it is not
rows I want to delete, but the whole column), as in

train$Variablexxx = NULL

where Variablexxx needs to be automatically found.
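
A minimal base R sketch of one way to find and drop such columns automatically. The helper names is_constant and drop_constant_columns, and the small train data frame, are hypothetical and exist only for this illustration; missing values are assumed to be stored as NA. A column counts as useless here if it has at most one distinct non-missing value, which covers both the all-missing and the same-value-everywhere cases.

```r
# TRUE if a column has at most one distinct non-missing value
# (this also covers columns that are entirely NA).
is_constant <- function(x) {
  length(unique(x[!is.na(x)])) <= 1
}

# Return the data frame without its constant / all-missing columns.
drop_constant_columns <- function(df) {
  constant <- vapply(df, is_constant, logical(1))
  df[, !constant, drop = FALSE]
}

# Made-up stand-in for the training data.
train <- data.frame(
  target = c(0, 1, 0, 1),
  var1   = c(1, 1, 1, 1),        # same value in every row
  var2   = c(NA, NA, NA, NA),    # missing in every row
  var3   = c(2.5, 3.1, 2.5, 4.0)
)

cleaned <- drop_constant_columns(train)
print(names(cleaned))
```

Counting distinct values rather than computing a standard deviation means the same check also works for character and factor columns.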

Thanks in advance,

pdb
-- 
View this message in context: 
http://r.789695.n4.nabble.com/eliminating-constant-variables-tp2284831p2284853.html


Re: [R] eliminating constant variables

2010-07-10 Thread pdb

Yep - that is what I want.

Cheers Jim you Legend.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/eliminating-constant-variables-tp2284831p2284861.html


Re: [R] eliminating constant variables

2010-07-10 Thread pdb

Awesome!

It made sense once I realised SD = standard deviation!

pdb
-- 
View this message in context: 
http://r.789695.n4.nabble.com/eliminating-constant-variables-tp2284831p2284915.html


[R] r code exchange site?

2010-07-05 Thread pdb

Does there exist a site where snippets of R code examples can be deposited,
such as the one that exists for MATLAB?

http://www.mathworks.com/matlabcentral/fileexchange/

ps
I also noted from the main r site

http://www.r-project.org/

when you click on the nabble link under the search link, I end up here

http://e-nvf.vvvay.net/-td13672.html#a13819

which I don't think is anything to do with R as far as I can tell (but my
Russian is not that hot)

Yours Hopefully,

pb

-- 
View this message in context: 
http://r.789695.n4.nabble.com/r-code-exchange-site-tp2278205p2278205.html


[R] plot focus

2010-06-30 Thread pdb

I am doing calculations in a loop and then plotting the results by adding a
point to each of 2 charts at the end of each iteration. It's very informative
as you can see the progression through time.

My problem is, if I have 2 plots, I don't know how to get the focus back to
the first plot.

layout(matrix(c(1,2)))

plot(iris[,1],col="red") #plot1
plot(iris[,3],col="blue") #plot2

#goes on plot2
lines(iris[,2],col="pink")

#how do I put this line on plot 1
lines(iris[,4],col="black")


I tried the method below, but when I switch the focus back to screen 1 the
line is not drawn where I expect:

split.screen(c(2,1))
screen(1) # prepare screen 1 for output
plot(iris[,1],col="red") #plot1
screen(2) # prepare screen 2 for output
plot(iris[,3],col="blue") #plot2

screen(1)
lines(iris[,2],col="pink",lwd=8)

screen(2)
lines(iris[,4],col="green",lwd=8)

Any pointers please as to what I need to do?

-- 
View this message in context: 
http://r.789695.n4.nabble.com/plot-focus-tp2272699p2272699.html


Re: [R] plot focus - another issue (ylim)

2010-06-30 Thread pdb

Thanks Henrique, that appeared to work, but now I have another issue.

If I add a ylim to the plot, then when I add another line it gets plotted on
the wrong scale.

#this works as expected
plot(iris[,1],col="red",ylim=c(-10,10)) #plot1
lines(iris[,4],col="black")


#this does not
par(mfrow=c(2,1))

plot(iris[,1],col="red",ylim=c(-10,10)) #plot1
plot(iris[,3],col="blue") #plot2

#goes on plot2
par(mfg = c(2, 1))
lines(iris[,2],col="pink")

#goes on plot 1
par(mfg = c(1, 1))
lines(iris[,4],col="black")
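
A possible explanation, offered as a sketch rather than a definitive answer: par(mfg = ...) appears to move the drawing position to the chosen panel but leaves the user coordinate system (par("usr")) as set by the most recent plot() call, so a line added to plot 1 is scaled with plot 2's axes. Saving par("usr") after each plot() and restoring it when returning to that panel is one commonly suggested workaround. The variable names usr1, usr2 and restored are made up here, and pdf(NULL) is used only so the example runs without opening a window.

```r
pdf(NULL)                       # off-screen device; use your usual device instead
par(mfrow = c(2, 1))

plot(iris[, 1], col = "red", ylim = c(-10, 10))   # plot 1
usr1 <- par("usr")              # remember plot 1's coordinate system

plot(iris[, 3], col = "blue")                     # plot 2
usr2 <- par("usr")              # remember plot 2's coordinate system

# Go back to plot 1 and restore its coordinates before adding to it.
par(mfg = c(1, 1))
par(usr = usr1)
restored <- par("usr")
lines(iris[, 4], col = "black")

# Same idea for plot 2.
par(mfg = c(2, 1))
par(usr = usr2)
lines(iris[, 2], col = "pink")

dev.off()
```

If the panels use log axes, the xlog and ylog settings would presumably need to be saved and restored in the same way.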
-- 
View this message in context: 
http://r.789695.n4.nabble.com/plot-focus-tp2272699p2274541.html


[R] randomforests - how to classify

2010-05-04 Thread pdb

Hi,

I'm experimenting with random forests and want to perform a binary
classification task. 
I've tried some of the sample codes in the help files and things run, but I
get a message to the effect 'you don't have very many unique values in the
target - are you sure you want to do regression?' (sorry, don't know exact
message but r is busy now so can't check).


In reading the help files I see 2 examples, one for classification and one
for regression. To the uninformed - these don't seem much different to each
other. How does rf know to do regression or classification?

## Classification:
## data(iris)
set.seed(71)
iris.rf <- randomForest(Species ~ ., data=iris, importance=TRUE,
                        proximity=TRUE)


## Regression:
## data(airquality)
set.seed(131)
ozone.rf <- randomForest(Ozone ~ ., data=airquality, mtry=3,
                         importance=TRUE, na.action=na.omit)


My target variable only has 2 values - why does it want to do regression?
I've entered code just like that in the classification example above. Also
when it asks me 'are you sure you want to do regression' - how do I say 'NO,
do classification please'?
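
As far as the randomForest documentation describes it, the choice depends on the type of the response: a factor response is treated as classification, anything else as regression. In the help examples, Species is a factor and Ozone is numeric, which is the only real difference. A 0/1 target stored as a number would therefore be treated as regression until it is converted to a factor. The sketch below uses made-up names (target_numeric, target_factor, d, is_virginica) and only calls randomForest if the package happens to be installed.

```r
# A two-valued target stored as numbers is not a factor.
target_numeric <- c(0, 1, 1, 0, 1, 0)
print(is.factor(target_numeric))

# Converting it to a factor is what signals "classification".
target_factor <- factor(target_numeric, levels = c(0, 1), labels = c("no", "yes"))
print(is.factor(target_factor))
print(levels(target_factor))

# Illustration with a binary target derived from iris (package is optional).
if (requireNamespace("randomForest", quietly = TRUE)) {
  d <- iris[iris$Species != "setosa", ]
  d$is_virginica <- factor(as.integer(d$Species == "virginica"))
  d$Species <- NULL

  set.seed(71)
  fit <- randomForest::randomForest(is_virginica ~ ., data = d)
  print(fit$type)   # reports which kind of forest was grown
}
```

For a data frame, the same conversion would be something like train$target <- as.factor(train$target) before fitting.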




-- 
View this message in context: 
http://r.789695.n4.nabble.com/randomforests-how-to-classify-tp2126166p2126166.html


[R] timing a function

2010-05-04 Thread pdb

Hi,
I want to time how long a function takes to execute. Any clues on what to
search for to achieve this?

Thanks in advance.
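
For reference, base R's system.time() is one place to start: it evaluates an expression and reports user, system and elapsed time in seconds. A minimal sketch follows, where slow_sum is a made-up function used only to have something to time. proc.time() can be used to take manual before/after readings.

```r
# A deliberately slow function, used only as something to time.
slow_sum <- function(n) {
  total <- 0
  for (i in seq_len(n)) total <- total + i
  total
}

# system.time() evaluates the expression and returns the timings.
timing <- system.time(result <- slow_sum(1e6))
print(timing)
elapsed <- timing[["elapsed"]]   # wall-clock seconds

# Manual alternative: difference of two proc.time() readings.
start <- proc.time()
invisible(slow_sum(1e6))
print(proc.time() - start)
```

Timings vary from run to run, so repeating the measurement a few times gives a more reliable picture.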
-- 
View this message in context: 
http://r.789695.n4.nabble.com/timing-a-function-tp2126319p2126319.html