Re: [R] Entropy based feature selection in R
Hello everyone, Any thoughts on this one, please? The only thing I found was the FSelector package (http://en.wikibooks.org/wiki/Data_Mining_Algorithms_In_R/Dimensionality_Reduction/Feature_Selection#Aviable_Feature_Ranking_Techniques_in_FSelector_Package). Unfortunately, it seems to be far from scalable on my data (~300k features, ~10k instances). I would appreciate some advice on this. Thanks in advance. Andy

-- View this message in context: http://r.789695.n4.nabble.com/Entropy-based-feature-selection-in-R-tp3708056p3740878.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
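[A follow-up thought: since FSelector materializes large intermediate structures, a hand-rolled entropy/information-gain ranking that visits one feature at a time may scale better. This is only a minimal sketch, not the FSelector API; `entropy`, `info.gain`, and the toy data below are all made up for illustration, and it assumes discrete (or pre-binned) features.]

```r
# Rank features by information gain, one feature at a time so memory stays bounded.
entropy <- function(p) {
  p <- p[p > 0]
  -sum(p * log2(p))
}

info.gain <- function(x, y) {
  H.y <- entropy(table(y) / length(y))          # entropy of the class
  tab <- table(x, y)                            # feature-by-class counts
  px  <- rowSums(tab) / sum(tab)                # P(x = each level)
  H.y.given.x <- sum(px * apply(tab, 1, function(r) entropy(r / sum(r))))
  H.y - H.y.given.x                             # mutual information I(X; Y)
}

# Toy check: the first feature is perfectly predictive, the second is noise.
set.seed(1)
y  <- factor(rep(c("a", "b"), each = 50))
x1 <- ifelse(y == "a", "low", "high")
x2 <- sample(c("low", "high"), 100, replace = TRUE)
gains <- c(f1 = info.gain(x1, y), f2 = info.gain(x2, y))
```

With ~300k features one would loop (or mclapply) over columns, keeping only the running top-k gains.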
Re: [R] Automating an R function call
Thanks for your help everyone! I'm happy enough with an asynchronous solution here. Thanks! Robert

-- View this message in context: http://r.789695.n4.nabble.com/Automating-an-R-function-call-tp3740070p3740333.html
[R] Adjacency Matrix help
I have created an adjacency matrix but have not been able to figure something out. I need to put zeros on the diagonal of the adjacency matrix; for instance, I need location (i,i) to equal 0. Please help. Thanks

-- View this message in context: http://r.789695.n4.nabble.com/Adjacency-Matrix-help-tp3740946p3740946.html
[R] Finding an average time spent
Hello R help! I am extremely new to R (as in just 3 days) and I've been using it to do some pretty basic things. I am frustratingly stuck on one point, and am so, so, so close to figuring it out, but just far enough away to ask for some (perhaps embarrassingly easy) help. I have a dataset, visitors, that has a variable called Time.Spent. Time.Spent consists of times in the format hh:mm:ss, and it is a measurement, kind of like a timer, of the amount of time someone spent in a museum exhibit. I need to find the average time spent. I figured the easiest way to do this would be to convert it into seconds. I found a function that someone wrote on how to do this here: http://stackoverflow.com/questions/1389428/dealing-with-time-periods-such-as-5-minutes-and-30-seconds-in-r I thought this would be the answer! However, when I run the code, it works perfectly for the first value in the first observation, but then repeats the same answer all the way down the rows. Sorry for the wordiness; here's the code I have:

# The function to convert hh:mm:ss into just seconds:
time.to.seconds <- function(time) {
  time <- strsplit(time, ":")[[1]]
  return((as.numeric(time[1]) * 60 * 60) + (as.numeric(time[2]) * 60) + (as.numeric(time[3])))
}

# I've tried many things to then create a new variable in the dataset visitors:
visitors$TimeInSeconds <- time.to.seconds(time=c(visitors$Time.Spent))
# Or
visitors$TimeInSeconds <- time.to.seconds(visitors$Time.Spent)

I figure it has something to do with the fact that strsplit() makes a list? Do I need a loop to go through each value? I know this is a huge question but any hints at all would be very much appreciated.

-- View this message in context: http://r.789695.n4.nabble.com/Finding-an-average-time-spent-tp3740391p3740391.html
Re: [R] sapply to bind columns, with repeat?
Hi Weidong Gu, This works! For my clarity, and so I can repeat this process if need be: 'mat' generates a matrix from whatever is supplied as x (i.e. coop.dat), using the columns from position 9:length(x), with 6 columns (filled by row). 'rem.col' generates a matrix of the first columns 1:8, with 8 columns. The 'return' statement cbinds rem.col and mat together. Then 'apply' runs this over coop.dat, by rows, using function reorg. Is this correct? Thank you very much, Katrina

On Fri, Aug 12, 2011 at 10:28 AM, Weidong Gu anopheles...@gmail.com wrote: Katrina, try this.

reorg <- function(x){
  mat <- matrix(x[9:length(x)], ncol=6, byrow=T)
  rem.col <- matrix(rep(x[1:8], nrow(mat)), byrow=T, ncol=8)
  return(data.frame(cbind(rem.col, mat)))
}
co <- do.call('rbind', apply(coop.dat, 1, function(x) reorg(x)))

You may need to tweak a bit to fit exactly what you want. Weidong Gu

On Fri, Aug 12, 2011 at 2:35 AM, Katrina Bennett kebenn...@alaska.edu wrote: Hi R-help, I am working with US COOP network station data and the files are concatenated in single rows for all years, but I need to pull these apart into rows for each day. To do this, I need to extract part of each row such as station id, year, mo, and repeat this against other variables in the row (days). My problem is that there are repeated values for each day, and the files are fixed width field without order. Here is an example of just one line of data.
coop.raw <- c("DLY09752806TMAX F2010010620107 00049 20107 00062 B0207 00041 20207 00049 B0307 00040 20307 00041 B0407 00042 20407 00040 B0507 00041 20507 00042 B0607 00043 20607 00041 B0707 00055 20707 00043 B0807 00039 20807 00055 B0907 00037 20907 00039 B1007 00038 21007 00037 B1107 00048 21107 00038 B1207 00050 21207 00048 B1307 00051 21307 00050 B1407 00058 21407 00051 B1507 00068 21507 00058 B1607 00065 21607 00068 B1707 00068 21707 00065 B1807 00067 21807 00068 B1907 00068 21907 00067 B2007 00069 22007 00068 B2107 00057 22107 00069 B2207 00048 22207 00057 B2307 00051 22307 00048 B2407 00073 22407 00051 B2507 00062 22507 00073 B2607 00056 22607 00062 B2707 00053 22707 00056 B2807 00064 22807 00053 B2907 00057 22907 00064 B3007 00047 23007 00057 B3107 00046 23107 00047 B")
write.csv(coop.raw, "coop.tmp", row.names=F, quote=F)
coop.dat <- read.fwf("coop.tmp", widths = c(c(3,8,4,2,4,2,4,3), rep(c(2,2,1,5,1,1), 62)), na.strings=c(""), skip=1, as.is=T)
rep.name <- rep(c("day","hr","met","dat","fl1","fl2"), 62)
rep.count <- rep(c(1:62), each=6, 1)
names(coop.dat) <- c("rect","id","elem","unt","year","mo","fill","numval", paste(rep.name, rep.count, sep="_"))

I would like to generate output that contains in one row the columns id, elem, unt, year, mo, and numval. Bound to these initial columns, I would like only day_1, hr_1, met_1, dat_1, fl1_1, and fl2_1. Then, in the next row I would like the initial columns id, elem, unt, year, mo, and numval repeated, and then day_2, hr_2, met_2, dat_2, fl1_2, and fl2_2 bound on, and so on until all the data for the row has been allocated. Then, move on to the next row and repeat. I think I should be able to do this with some sort of sapply or lapply function, but I'm struggling with the format for repeating the initial columns, and then skipping through the next columns.
Thank you, Katrina
[R] post
Hello, I was trying to plot multiple graphs using par(mfrow=c(3,2)). But this is giving me the following error:

Error in axis(side = side, at = at, labels = labels, ...) :
  X11 font -adobe-helvetica-%s-%s-*-*-%d-*-*-*-*-*-*-*, face 1 at size 8 could not be loaded

Could someone decode this error, please? Thank you
Re: [R] Adjacency Matrix help
diag(adjMatrix) <- 0

On Aug 13, 2011, at 7:34 AM, collegegurl69 wrote: I have created an adjacency matrix but have not been able to figure something out. I need to put zeros on the diagonal of the adjacency matrix. For instance, location (i,i) to equal 0. Please help. Thanks
Re: [R] Finding an average time spent
On Fri, Aug 12, 2011 at 4:23 PM, erinbspace erin.brasw...@gmail.com wrote: Hello R help! I am extremely new to R (as in 3 just days) and I've been using it to do some pretty basic things. I am frustratingly stuck on one point, and am so so so close to figuring it out, but just far enough away to ask for some (perhaps embarrassingly easy) help. I have a dataset, visitors, that has a variable called Time.Spent. Time.Spent consists of times in the format hh:mm:ss, and it is a measurement, kind of like a timer, of the amount of time someone spent in a museum exhibit. I need to find the average time spent. I've figured the easiest way to do

library(chron)
Time.Spent <- c("12:12:10", "13:12:10")
times(mean(as.numeric(times(Time.Spent))))
[1] 12:42:10

-- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
Re: [R] Adjacency Matrix help
Thanks so much for your quick reply. It seems to work. The problem is that it now places actual zeros on the diagonal, whereas the rest of the adjacency matrix has dots to represent zeroes. Do you have any ideas on how to change these zeros to dots like in the rest of the adjacency matrix? Or is it the same thing? Thanks.

-- View this message in context: http://r.789695.n4.nabble.com/Adjacency-Matrix-help-tp3740946p3740996.html
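[If the dots come from a sparse matrix built with the Matrix package, they mark structural (unstored) zeros, while an assigned 0 is stored explicitly and prints as "0". Numerically the two are identical. A sketch, assuming a Matrix-package sparse matrix; the tiny example matrix is made up:]

```r
library(Matrix)

# A 2x2 sparse adjacency matrix: "." marks an unstored (structural) zero.
adj <- Matrix(c(0, 1, 1, 0), 2, 2, sparse = TRUE)
diag(adj) <- 0     # diagonal zeros are now stored explicitly, printing as 0
adj <- drop0(adj)  # drop the explicitly stored zeros so they print as "." again
```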
[R] optimization problems
Dear R users, I am trying to use OPTIMX (OPTIM) for nonlinear optimization. There is no error in my code but the results are weird (see below). The initial values are theta0 = c(0.6, 1.6, 0.6, 1.6, 0.7). (In fact the true values are 0.5, 1.0, 0.8, 1.2, 0.6.) When I ran it via OPTIM:

optim(par=theta0, fn=obj.fy, method="BFGS", control=list(trace=1, maxit=1), hessian=T)
initial value -0.027644
final value -0.027644
converged
$par
[1] 0.6 1.6 0.6 1.6 0.7
$value
[1] -0.02764405
$counts
function gradient
       1        1
$convergence
[1] 0
$message
NULL
$hessian
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0    0    0    0
[2,]    0    0    0    0    0
[3,]    0    0    0    0    0
[4,]    0    0    0    0    0
[5,]    0    0    0    0    0

When I ran it via OPTIMX:

optimx(par=theta0, fn=obj.fy, method="BFGS", control=list(maxit=1), hessian=T)
                      par     fvalues method fns grs itns conv KKT1 KKT2 xtimes
1 0.6, 1.6, 0.6, 1.6, 0.7 -0.02764405   BFGS   1   1 NULL    0 TRUE   NA   8.71

Whenever I use different initial values, the initial ones come back as the answer from OPTIMX (OPTIM). Would you please explain why this happens? Any suggestion will be greatly appreciated. Regards, Kathryn Lord

-- View this message in context: http://r.789695.n4.nabble.com/optimization-problems-tp3741005p3741005.html
Re: [R] optimization problems
To be honest, the first derivative of my objective function is very complicated, so I ignored it. Could that lead to this sort of problem? Kathie

-- View this message in context: http://r.789695.n4.nabble.com/optimization-problems-tp3741005p3741010.html
Re: [R] Any alternatives to draw.colorkey from lattice package?
On 08/13/2011 04:34 AM, Mikhail Titov wrote: Hello! I'd like to have a continuous color bar on my lattice xyplot, with colors let's say from topo.colors, such that it has tick labels at a few specific points only. Right now I use do.breaks and level.colors with a somewhat large number of steps. The problem is that a color change point doesn't necessarily correspond to the value I'd like to label. Since I have many color steps and I don't need high precision, I generate labels like this:

labels <- ifelse(sapply(at, function(x) any(abs(att-x) < .03)), sprintf("depth= %s ft", at), "")

where `att` has my points of interest on the color scale bar and `at` corresponds to the color change points used with level.colors. It is a bit inconvenient, as I have to adjust the threshold `.03` and the number of color steps so that only the color change point adjacent to each point of interest gets labeled. Q: Are there any ready to use functions that would generate some kind of GRaphical OBject with a continuous color scale bar/key with custom at/labels, such that it would work with the `legend` argument of xyplot from lattice?

Hi Mikhail, I think that color.legend in the plotrix package will do what you are asking, but it is in base graphics, and may not work with lattice. Jim
Re: [R] Finding an average time spent
On 08/13/2011 06:23 AM, erinbspace wrote: Hello R help! I am extremely new to R (as in 3 just days) and I've been using it to do some pretty basic things. I am frustratingly stuck on one point, and am so so so close to figuring it out, but just far enough away to ask for some (perhaps embarrassingly easy) help. I have a dataset, visitors, that has a variable called Time.Spent. Time.Spent consists of times in the format hh:mm:ss, and it is a measurement, kind of like a timer, of the amount of time someone spent in a museum exhibit. I need to find the average time spent. I've figured the easiest way to do this would be to convert it into seconds. I found a function that someone wrote on how to do this here: http://stackoverflow.com/questions/1389428/dealing-with-time-periods-such-as-5-minutes-and-30-seconds-in-r I thought this would be the answer! However, when I run the code, it works perfectly for the first variable in the first observation, but then repeats the same answer all the way down the rows. Sorry for the wordiness, here's the code I have:

# The function to convert hh:mm:ss into just seconds:
time.to.seconds <- function(time) {
  time <- strsplit(time, ":")[[1]]
  return((as.numeric(time[1]) * 60 * 60) + (as.numeric(time[2]) * 60) + (as.numeric(time[3])))
}

# I've tried many things to then create a new variable in the dataset visitors:
visitors$TimeInSeconds <- time.to.seconds(time=c(visitors$Time.Spent))
# Or
visitors$TimeInSeconds <- time.to.seconds(visitors$Time.Spent)

I figure it has something to do with the fact that strsplit() makes a list? Do I need a loop to go through each variable? I know this is a huge question but any hints at all would be very much appreciated.

Hi erinbspace, By hard-coding the [[1]] in your function, you are automatically taking the first element of any list.
If you want to convert a vector of times, try this:

time.to.seconds <- function(time) {
  time <- strsplit(time, ":")[[1]]
  return(as.numeric(time[1]) * 3600 + as.numeric(time[2]) * 60 + as.numeric(time[3]))
}
watch.times <- c("0:2:31", "0:4:12", "0:0:47")
# use sapply to step through the vector of times
sapply(watch.times, time.to.seconds)
0:2:31 0:4:12 0:0:47
   151    252     47

Jim
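[The sapply() loop can also be avoided entirely if the times are clean; a vectorized sketch, where the function name `to.seconds` is mine:]

```r
# Split every time string at once, then combine the pieces with matrix indexing.
to.seconds <- function(times) {
  parts <- do.call(rbind, strsplit(times, ":"))
  as.numeric(parts[, 1]) * 3600 +
    as.numeric(parts[, 2]) * 60 +
    as.numeric(parts[, 3])
}

secs <- to.seconds(c("0:2:31", "0:4:12", "0:0:47"))
mean(secs)  # the average time in seconds: 150
```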
Re: [R] Any alternatives to draw.colorkey from lattice package?
You can just specify the label positions; you don't need to give labels for every color change point (there is an 'at' for the color changes and a 'labels$at' for the labels):

levelplot(rnorm(100) ~ x * y, expand.grid(x = 1:10, y = 1:10),
          colorkey = list(at = seq(-3, 3, length = 100),
                          labels = list(labels = paste(-3:3, "units"), at = -3:3)))

On 13 August 2011 19:59, Jim Lemon j...@bitwrit.com.au wrote: On 08/13/2011 04:34 AM, Mikhail Titov wrote: Hello! I'd like to have a continuous color bar on my lattice xyplot with colors, let's say from topo.colors, such that it has tick labels at a few specific points only. Right now I use do.breaks and level.colors with a somewhat large number of steps. The problem is that a color change point doesn't necessarily correspond to the value I'd like to label. Since I have many color steps and I don't need high precision, I generate labels like this: labels <- ifelse(sapply(at, function(x) any(abs(att-x) < .03)), sprintf("depth= %s ft", at), ""), where `att` has my points of interest on the color scale bar and `at` corresponds to the color change points used with level.colors. It is a bit inconvenient as I have to adjust the threshold `.03` and the number of color steps so that it labels only the adjacent color change point with my labels. Q: Are there any ready to use functions that would generate some kind of GRaphical OBject with a continuous color scale bar/key with custom at/labels such that it would work with the `legend` argument of xyplot from lattice? Hi Mikhail, I think that color.legend in the plotrix package will do what you are asking, but it is in base graphics, and may not work with lattice. Jim
-- Felix Andrews / 安福立 http://www.neurofractal.org/felix/
[R] linear regression
dear R users, my data looks like this:

   PM10       Ref        UZ   JZ     WT         RH   FT   WR
1  10.973195   4.338874  nein Winter Dienstag   ja   nein West
2   6.381684   2.250446  nein Sommer Sonntag    nein ja   Süd
3  62.586512  66.304869  ja   Sommer Sonntag    nein nein Ost
4   5.590101   8.526152  ja   Sommer Donnerstag nein nein Nord
5  30.925054  16.073091  nein Winter Sonntag    nein nein Ost
6  10.750567   2.285075  nein Winter Mittwoch   nein nein Süd
7  39.118316  17.128691  ja   Sommer Sonntag    nein nein Ost
8   9.327564   7.038572  ja   Sommer Montag     nein nein Nord
9  52.271744  15.021977  nein Winter Montag     nein nein Ost
10 27.388416  22.449102  ja   Sommer Montag     nein nein Ost
. . . . (until row 200)

I'm trying to make a linear regression between PM10 and Ref for each of the four WR values. I've tried this:

plot(Nord$PM10 ~ Nord$Ref, main="Nord", xlab="Ref", ylab="PM10")

but it does not work, because object 'Nord' cannot be found. What went wrong? How can I do it? Please help me.
Re: [R] Passing on groups argument to xyplot within a plotting function
The problem is that xyplot tries to evaluate 'groups' in 'data' or in the formula environment. Your local function environment (where the variable named groups is defined) is neither of these. There are a couple of ways to get the evaluation to work out; here is one:

pb <- list(F1 = 1:8, F2 = 1:8, Type = c('a','a','a','a','b','b','b','b'))
foo <- function(x, data, groups, ...){
  ccall <- quote(xyplot(x, data=data, ...))
  ccall$groups <- substitute(groups)
  eval(ccall)
}
foo(F1 ~ F2, pb, groups = Type)

Hope that helps -Felix

On 11 August 2011 19:42, Fredrik Karlsson dargo...@gmail.com wrote: Hi, I am constructing a plotting function that I would like to behave like plotting functions within the lattice package. It takes a groups argument, which I process, and then I would like to pass that argument on to the xyplot function for the actual plotting. However, whatever I do, I get an error that the variable is missing. A short illustration: Given the data set

names(pb)
[1] "Type"    "Sex"     "Speaker" "Vowel"   "IPA"     "F0"      "F1"
[8] "F2"      "F3"

and these test functions:

TESTFUN <- function(x, data, groups){
  xyplot(x, data=data, groups=groups)
}
TESTFUN2 <- function(x, data, groups){
  xyplot(x, data=data, groups=substitute(groups))
}
TESTFUN3 <- function(x, data, groups){
  groups <- eval(substitute(groups), data, environment(x))
  xyplot(x, data=data, groups=groups)
}

I fail to get groups passed on to xyplot correctly:

TESTFUN(F1 ~ F2, data=pb, groups=Type)
Error in eval(expr, envir, enclos) : object 'groups' not found
TESTFUN2(F1 ~ F2, data=pb, groups=Type)
Error in prepanel.default.function(groups = groups, x = c(2280L, 2400L, : object 'groups' not found
TESTFUN3(F1 ~ F2, data=pb, groups=Type)
Error in eval(expr, envir, enclos) : object 'groups' not found

Please help me understand what I am doing wrong. /Fredrik -- Life is like a trumpet - if you don't put anything into it, you don't get anything out of it.
-- Felix Andrews / 安福立 http://www.neurofractal.org/felix/
[R] Excluding NAs from round correlation
Hello, I am quite new to R and I am trying to get a round(cor()) correlation matrix from a table with dozens of columns. However, all the columns contain several blank places, which show up to me as NAs. Then, when I type round(cor(data),2), I get no results: everything (except the correlation of each column with itself, of course) is NA. I do not want to replace NA with zero, because it would ruin the results. I just want R not to look at the NAs and correlate just the places with numbers. Is that possible? Thank you very much for your help!

-- View this message in context: http://r.789695.n4.nabble.com/Excluding-NAs-from-round-correlation-tp3741296p3741296.html
Re: [R] linear regression
Your data frame needs to be called Nord. If it is not, then replace Nord with the actual name of your data frame.

On Sat, Aug 13, 2011 at 10:43 PM, maggy yan kiot...@googlemail.com wrote: dear R users, my data looks like this PM10 Ref UZ JZ WT RH FT WR 1 10.973195 4.338874 nein Winter Dienstag ja nein West 2 6.381684 2.250446 nein Sommer Sonntag nein ja Süd 3 62.586512 66.304869 ja Sommer Sonntag nein nein Ost 4 5.590101 8.526152 ja Sommer Donnerstag nein nein Nord 5 30.925054 16.073091 nein Winter Sonntag nein nein Ost 6 10.750567 2.285075 nein Winter Mittwoch nein nein Süd 7 39.118316 17.128691 ja Sommer Sonntag nein nein Ost 8 9.327564 7.038572 ja Sommer Montag nein nein Nord 9 52.271744 15.021977 nein Winter Montag nein nein Ost 10 27.388416 22.449102 ja Sommer Montag nein nein Ost . . . . til 200 I'm trying to make a linear regression between PM10 and Ref for each of the four WR, I've tried this: plot(Nord$PM10 ~ Nord$Ref, main="Nord", xlab="Ref", ylab="PM10") but it does not work, because Nord cannot be found what was wrong? how can I do it? please help me
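[To create the Nord object in the first place, the data frame can be split by the WR column with subset(); a minimal sketch, with a few rows from the posted data standing in for the full data frame (assumed to be called dat):]

```r
# Toy stand-in for the full 200-row data frame.
dat <- data.frame(PM10 = c(5.590101, 9.327564, 62.586512, 39.118316),
                  Ref  = c(8.526152, 7.038572, 66.304869, 17.128691),
                  WR   = c("Nord", "Nord", "Ost", "Ost"))

Nord <- subset(dat, WR == "Nord")   # rows whose wind direction is Nord
plot(PM10 ~ Ref, data = Nord, main = "Nord", xlab = "Ref", ylab = "PM10")
abline(lm(PM10 ~ Ref, data = Nord))  # add the fitted regression line
```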
Re: [R] Excluding NAs from round correlation
Check ?cor, and note in particular the 'use' argument. Weidong Gu

On Sat, Aug 13, 2011 at 9:06 AM, Julie julie.novak...@gmail.com wrote: Hello, I am quite new to R and I am trying to get a round correlation from a table with dozens of columns. However, all the columns contain several blank places which show to me as NAs. Then, when I type round(cor(data),2), I get no results - everything (except correlation of one column with the same one, of course) is NA. I do not want to replace NA with zero, because it would ruin the results. I just want R not to look at NA and correlate just places with numbers. Is it possible? Thank you very much for help! -- View this message in context: http://r.789695.n4.nabble.com/Excluding-NAs-from-round-correlation-tp3741296p3741296.html
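[A minimal sketch of what that looks like: with use = "pairwise.complete.obs", each pairwise correlation is computed from only the rows where both columns are non-missing. The toy data frame here is made up.]

```r
data <- data.frame(a = c(1, 2, 3, 4, NA),
                   b = c(2, 4, 6, NA, 10),
                   c = c(5, 3, NA, 2, 1))

# cor(data) alone would be almost all NA; pairwise deletion keeps the rest:
cm <- cor(data, use = "pairwise.complete.obs")
round(cm, 2)
```

use = "complete.obs" is the stricter alternative: it drops any row containing an NA anywhere before computing all the correlations.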
Re: [R] post
On 13.08.2011 06:52, bdeep...@ibab.ac.in wrote: Hello, I was trying to plot multiple graph using par(mfrow=c(3,2)). But this is giving me the following error: Error in axis(side = side, at = at, labels = labels, ...) : X11 font -adobe-helvetica-%s-%s-*-*-%d-*-*-*-*-*-*-*, face 1 at size 8 could not be loaded

The font is missing. You may want to install some more fonts on your machine. Uwe Ligges

Could someone decode this error, please. Thank you
Re: [R] optimization problems
optimx with BFGS uses optim, so you actually incur some overhead unnecessarily. And BFGS really needs good gradients (as do Rvmmin and Rcgmin, which are updated BFGS and CG, but all in R and with bounds or box constraints). From the Hessian, your function is (one of the many!) that have pretty bad numerical properties. With all 0s, Newton is spinning in his grave. Probably the gradient is small also, so the optimizers decide they are at a minimum. As a first step, I'd suggest:
- checking that the function is computed correctly. That is, does your function give the correct value?
- trying a few other points nearby. Are any lower than your first point?
- using numDeriv to get the gradient (and possibly Hessian) at each of these nearby points.
These steps may reveal either that you have a bug in the function, or that it is pretty nasty numerically. In the latter case, you really need to try to find an equivalent function, e.g. log(f), that can be minimized more easily. For information, I'm rather slowly working on a function test suite to do this. Also a lot of changes are going on in optimx to try to catch some of the various nasties. These appear first in the R-forge development versions. Use and comments welcome. If you DO find a lower point, then I'd give Nelder-Mead a try. Ravi Varadhan has a variant of this that may do a little better in dfoptim. You could also be a bit lazy and try optimx with the control all.methods=TRUE. Not recommended for production use, but often helpful in seeing if any method can get some traction. Cheers, JN

On 08/13/2011 06:00 AM, r-help-requ...@r-project.org wrote:

Message: 47
Date: Sat, 13 Aug 2011 01:12:09 -0700 (PDT)
From: Kathie kathryn.lord2...@gmail.com
To: r-help@r-project.org
Subject: [R] optimization problems
Message-ID: 1313223129383-3741005.p...@n4.nabble.com
Content-Type: text/plain; charset=us-ascii

Dear R users I am trying to use OPTIMX(OPTIM) for nonlinear optimization.
There is no error in my code but the results are so weird (see below). When I ran via OPTIM, the results are that Initial values are that theta0 = 0.6 1.6 0.6 1.6 0.7. (In fact true values are 0.5, 1.0, 0.8, 1.2, 0.6.)

optim(par=theta0, fn=obj.fy, method="BFGS", control=list(trace=1, maxit=1), hessian=T)
initial value -0.027644
final value -0.027644
converged
$par
[1] 0.6 1.6 0.6 1.6 0.7
$value
[1] -0.02764405
$counts
function gradient
       1        1
$convergence
[1] 0
$message
NULL
$hessian
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0    0    0    0
[2,]    0    0    0    0    0
[3,]    0    0    0    0    0
[4,]    0    0    0    0    0
[5,]    0    0    0    0    0

When I ran via OPTIMX, the results are that

optimx(par=theta0, fn=obj.fy, method="BFGS", control=list(maxit=1), hessian=T)
                      par     fvalues method fns grs itns conv KKT1 KKT2 xtimes
1 0.6, 1.6, 0.6, 1.6, 0.7 -0.02764405   BFGS   1   1 NULL    0 TRUE   NA   8.71

Whenever I used different initial values, the initial ones are the answer of OPTIMX(OPTIM). Would you plz explain why it happened? or any suggestion will be greatly appreciated. Regards, Kathryn Lord -- View this message in context: http://r.789695.n4.nabble.com/optimization-problems-tp3741005p3741005.html
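[A sketch of the gradient check suggested above. numDeriv::grad is the robust tool for this; a hand-rolled central difference stands in here so the example is self-contained. obj.fy below is a made-up stand-in with a known minimum; substitute the real objective.]

```r
# Central-difference numerical gradient (numDeriv::grad is more careful).
num.grad <- function(f, x, h = 1e-6) {
  sapply(seq_along(x), function(i) {
    xp <- x; xm <- x
    xp[i] <- xp[i] + h
    xm[i] <- xm[i] - h
    (f(xp) - f(xm)) / (2 * h)
  })
}

# Stand-in objective with minimum at c(0.5, 1.0, 0.8, 1.2, 0.6):
obj.fy <- function(theta) sum((theta - c(0.5, 1.0, 0.8, 1.2, 0.6))^2)
theta0 <- c(0.6, 1.6, 0.6, 1.6, 0.7)

f0 <- obj.fy(theta0)
g0 <- num.grad(obj.fy, theta0)
# If g0 were ~0 here and nearby values equaled f0, the optimizer would
# (correctly, from its point of view) declare the start values a minimum.
```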
[R] optimization problems
Kathie, It is very difficult to help without adequate information. What does your objective function look like? Are you maximizing (in which case you have to make sure that the sign of the objective function is correct) or minimizing? Can you try optimx with the control option all.methods=TRUE? Hope this is helpful, Ravi.
Re: [R] define variables from a matrix
There may well be more efficient ways to do this, but here's one attempt:

foo <- function(x, val) if (any(x == val, na.rm = TRUE)) which(x == val) else NA
u <- apply(A, 1, function(x) foo(x, 20L))
v <- apply(A, 1, function(x) foo(x, 100L))
ifelse(u < v, v, NA)
[1]  3  5 NA NA NA

HTH, Dennis

On Fri, Aug 12, 2011 at 7:18 PM, gallon li gallon...@gmail.com wrote: I have the following matrix and wish to define two variables based on it:

A=matrix(0,5,5)
A[1,]=c(30,20,100,120,90)
A[2,]=c(40,30,20,50,100)
A[3,]=c(50,50,40,30,30)
A[4,]=c(30,20,40,50,50)
A[5,]=c(30,50,NA,NA,100)
A
     [,1] [,2] [,3] [,4] [,5]
[1,]   30   20  100  120   90
[2,]   40   30   20   50  100
[3,]   50   50   40   30   30
[4,]   30   20   40   50   50
[5,]   30   50   NA   NA  100

X is the first column in each row that is equal to 20. For example, for the first row, I need X=2; 2nd row, X=3; 3rd row, X=NA; 4th row, X=2; 5th row, X=NA. Y is then the first column in each row that is equal to 100, provided a 20 has been reached before it. For example, for the first row, Y=3; 2nd row, Y=5; 3rd row, Y=NA; 4th row, Y=NA; 5th row, Y=NA. The matrix may involve NA as well. How can I define these two variables quickly?
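[An alternative sketch using match(), which returns the first matching position and already yields NA when the value is absent, so no wrapper function is needed (same toy matrix as in the question):]

```r
A <- matrix(0, 5, 5)
A[1, ] <- c(30, 20, 100, 120, 90)
A[2, ] <- c(40, 30, 20, 50, 100)
A[3, ] <- c(50, 50, 40, 30, 30)
A[4, ] <- c(30, 20, 40, 50, 50)
A[5, ] <- c(30, 50, NA, NA, 100)

X    <- apply(A, 1, function(r) match(20, r))   # first column equal to 20, NA if none
Y100 <- apply(A, 1, function(r) match(100, r))  # first column equal to 100, NA if none
# Y is the first 100 only when a 20 occurs before it:
Y <- ifelse(!is.na(X) & !is.na(Y100) & X < Y100, Y100, NA)
```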
Re: [R] linear regression
Hi: Try something like this, using dat as the name of your data frame:

xyplot(PM10 ~ Ref | WR, data = dat, type = c('p', 'r'))

The plot looks silly with the data snippet you provided, but should hopefully look more sensible with the complete data. The code creates a four-panel plot, one per direction, with points and a least-squares regression line fit in each panel. The regression line is specific to a data subset, not the entire data frame. HTH, Dennis

On Sat, Aug 13, 2011 at 5:43 AM, maggy yan kiot...@googlemail.com wrote: dear R users, my data looks like this

        PM10       Ref   UZ     JZ         WT   RH   FT   WR
1  10.973195  4.338874 nein Winter   Dienstag   ja nein West
2   6.381684  2.250446 nein Sommer    Sonntag nein   ja  Süd
3  62.586512 66.304869   ja Sommer    Sonntag nein nein  Ost
4   5.590101  8.526152   ja Sommer Donnerstag nein nein Nord
5  30.925054 16.073091 nein Winter    Sonntag nein nein  Ost
6  10.750567  2.285075 nein Winter   Mittwoch nein nein  Süd
7  39.118316 17.128691   ja Sommer    Sonntag nein nein  Ost
8   9.327564  7.038572   ja Sommer     Montag nein nein Nord
9  52.271744 15.021977 nein Winter     Montag nein nein  Ost
10 27.388416 22.449102   ja Sommer     Montag nein nein  Ost
. . . . til 200

I'm trying to make a linear regression between PM10 and Ref for each of the four WR. I've tried this:

plot(Nord$PM10 ~ Nord$Ref, main="Nord", xlab="Ref", ylab="PM10")

but it does not work, because Nord cannot be found. What was wrong? How can I do it? Please help me.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] define variables from a matrix
On Aug 12, 2011, at 7:18 PM, gallon li wrote: I have the following matrix and wish to define a variable based on it:

A <- matrix(0,5,5)
A[1,] <- c(30,20,100,120,90)
A[2,] <- c(40,30,20,50,100)
A[3,] <- c(50,50,40,30,30)
A[4,] <- c(30,20,40,50,50)
A[5,] <- c(30,50,NA,NA,100)

I want to define two variables: X is the first column in each row that is equal to 20, for example, for the first row, I need X=2; 2nd row, X=3; 3rd row, X=NA; 4th row, X=2; 5th row, X=NA.

X <- apply(A, 1, function(x) which(x == 20))
is.na(X) <- !unlist(lapply(X, length))
X

The first command seems obvious, but the second might be a bit obscure. It says: assign NA to any element of X whose length is zero (lapply(X, length) returns 0 where there was no match, and !0 is TRUE).

Y is then the first column in each row that is equal to 100 if before this a 20 has been reached, for example, for the first row, Y=3; 2nd row, Y=5; 3rd row, Y=NA; 4th row, Y=NA; 5th row, Y=NA.

Y <- apply(A, 1, function(x) which(x == 100) * (which(x == 20) < which(x == 100)))
is.na(Y) <- !unlist(lapply(Y, length))
Y

-- David.

The matrix may involve NA as well. How can I define these two variables quickly?

David Winsemius, MD Heritage Laboratories West Hartford, CT

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Any alternatives to draw.colorkey from lattice package?
Felix: Thank you! Perhaps I should read the documentation more carefully, as I missed that other `at`. lattice and latticeExtra are so marvelous that I hardly want to use anything else. Mikhail

On 08/13/2011 07:31 AM, Felix Andrews wrote: You can just specify the label positions; you don't need to give labels for every color change point (there is an 'at' for the color changes and a 'labels$at' for the labels):

levelplot(rnorm(100) ~ x * y, expand.grid(x = 1:10, y = 1:10),
          colorkey = list(at = seq(-3, 3, length = 100),
                          labels = list(labels = paste(-3:3, "units"), at = -3:3)))

On 13 August 2011 19:59, Jim Lemon j...@bitwrit.com.au wrote: On 08/13/2011 04:34 AM, Mikhail Titov wrote: Hello! I’d like to have a continuous color bar on my lattice xyplot, with colors, let's say, from topo.colors, such that it has tick labels at a few specific points only. Right now I use do.breaks and level.colors with a somewhat large number of steps. The problem is that a color change point doesn’t necessarily correspond to the value I’d like to label. Since I have many color steps and I don’t need high precision, I generate labels like this:

labels <- ifelse(sapply(at, function(x) any(abs(att - x) < .03)),
                 sprintf("depth= %s ft", at), "")

where `att` has my points of interest on the color scale bar and `at` corresponds to the color change points used with level.colors. It is a bit inconvenient, as I have to adjust the threshold `.03` and the number of color steps so that it labels only the adjacent color change point with my labels. Q: Are there any ready-to-use functions that would generate some kind of GRaphical OBject (grob) with a continuous color scale bar/key with custom at/labels, such that it would work with the `legend` argument of xyplot from lattice?

Hi Mikhail, I think that color.legend in the plotrix package will do what you are asking, but it is in base graphics, and may not work with lattice.
Jim

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R's handling of high dimensional data
Hello all, I am looking at doing text classification on very high dimensional data (about 300,000 or more features) and up to 2000 documents. I am quite new to R, though, and was just wondering whether R and its libraries would scale to such high dimensions. Any thoughts will be much appreciated. Thanks. Andy -- View this message in context: http://r.789695.n4.nabble.com/R-s-handling-of-high-dimensional-data-tp3741758p3741758.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
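One note for readers weighing scalability: base R matrices are dense, so 300,000 features by 2,000 documents of doubles is roughly 4.8 GB before any copying. Document-term data is overwhelmingly zeros, though, and the Matrix package (a recommended package shipped with standard R distributions) stores only the non-zero entries. A minimal sketch with an invented three-document triplet; the indices here are made up, not from any real corpus:

```r
library(Matrix)   # recommended package, ships with R

# Toy document-term counts in triplet form: (doc i, term j, count x).
# In practice i/j/x would come from a tokenizer; these are invented.
i <- c(1, 1, 2, 3)
j <- c(1, 5, 2, 5)
x <- c(2, 1, 4, 3)
dtm <- sparseMatrix(i = i, j = j, x = x, dims = c(3, 300000))

dim(dtm)        # 3 x 300000, yet only 4 values are actually stored
length(dtm@x)   # 4 -- memory grows with the non-zeros, not the dimensions
```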
[R] degrees of freedom does not appear in the summary lmer :(
Hi, could someone please help me with this topic? I don't know how I can extract the degrees of freedom from my model! Thanks, Sophie -- View this message in context: http://r.789695.n4.nabble.com/degrees-of-freedom-does-not-appear-in-the-summary-lmer-tp3741327p3741327.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Plotting and quantiles
Dear R users, This is most likely a very basic question, but I am new to R and would really appreciate some tips on these two problems. 1) I need to plot variables from a data frame. Because of a few high numbers my graph is really strange looking. How could I plot a fraction of the samples (like 0.1 (10%), 0.2, up to for example 0.6) on the x axis and value 'boundaries' (like any value '< 100', '101-200' and '> 201') on the y axis? This needs to be a simple line plot like the one I attached as an example. The values would come from one column. 2) I have a data frame with values and need to subset the rows based on the values. I wanted to order them (with increasing values) and divide them into 3-4 groups. I thought about using quantile, but I want the groups to be something like '1-25', '26-50', '51-75', '75-100' (ordered, and for example 25th percentile, 26-50th etc). I could just look for a median, divide into two and then again (or use quantiles 0.25, 0.5, 0.7 and 1 and then get rid of all rows in 0.25 that are in 0.5 etc), but surely there must be a faster and simpler way to do that (I need to do this a lot on different columns)? Thanks for your help, Mark __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Individual p-values for correlation matrices
Dear all, I am calculating each-against-each correlations for a number of variables, in order to use the correlations as distances. This is easy enough using just cor(), but not if I want to have a p-value for each calculated correlation, and especially if I want to correct them for multiple testing (but see below). I currently do that on foot, looping over the variables to apply cor.test to each combination of two variables. Is there a function or a package that would do that for me? Specifically, what I do is:

# a is the data matrix
for( i in 1:(ncol(a) - 1) ) {
  for( j in (i+1):ncol(a) ) {
    result <- cor.test( a[,i], a[,j], method = "spearman" )
    # store the result somehow
  }
}

This is slow, and I seek a better solution. As I mentioned before, I correct the p-values using the Bonferroni correction, which does not assume independence of the hypotheses being tested (obviously they are not independent here). However, is there a better method? Bonferroni results in a large number of false negatives. Kind regards, j. -- Dr. January Weiner 3 -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
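A base-R way to flatten the double loop is to enumerate the column pairs with combn(); and p.adjust() offers corrections that are less conservative than Bonferroni, such as Benjamini-Hochberg ("BH") false-discovery-rate control. A sketch on toy data standing in for the real matrix `a`:

```r
set.seed(1)
a <- matrix(rnorm(50 * 6), ncol = 6)    # toy stand-in for the real data matrix

pairs <- combn(ncol(a), 2)              # all i < j column index pairs
pvals <- apply(pairs, 2, function(idx)
  cor.test(a[, idx[1]], a[, idx[2]], method = "spearman")$p.value)

# Benjamini-Hochberg FDR control instead of Bonferroni:
padj <- p.adjust(pvals, method = "BH")

length(pvals)   # choose(6, 2) = 15, one p-value per pair
```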
[R] Own R function doubt
Hi to all the people again, I was writing a simple function in R, and wish to collect the results in an Excel file. The work goes as follows:

Ciervos <- function(K1, K0, A, R, M, Pi, Hembras) {
  B  <- (K1-K0)/A
  T1 <- (R*Pi*Hembras-M*Pi+B)/(Pi-M*Pi+R*Pi*Hembras); P1 <- Pi-B;  R1 <- P1*Hembras*R;  M1 <- P1*M
  T2 <- (R1-M1+B)/(P1-M1+R1);   P2 <- P1-B;  R2 <- P2*Hembras*R;  M2 <- P2*M
  T3 <- (R2-M2+B)/(P2-M2+R2);   P3 <- P2-B;  R3 <- P3*Hembras*R;  M3 <- P3*M
  T4 <- (R3-M3+B)/(P3-M3+R3);   P4 <- P3-B;  R4 <- P4*Hembras*R;  M4 <- P4*M
  T5 <- (R4-M4+B)/(P4-M4+R4);   P5 <- P4-B;  R5 <- P5*Hembras*R;  M5 <- P5*M
  T6 <- (R5-M5+B)/(P5-M5+R5);   P6 <- P5-B;  R6 <- P6*Hembras*R;  M6 <- P6*M
  T7 <- (R6-M6+B)/(P6-M6+R6);   P7 <- P6-B;  R7 <- P7*Hembras*R;  M7 <- P7*M
  T8 <- (R7-M7+B)/(P7-M7+R7);   P8 <- P7-B;  R8 <- P8*Hembras*R;  M8 <- P8*M
  T9 <- (R8-M8+B)/(P8-M8+R8);   P9 <- P8-B;  R9 <- P9*Hembras*R;  M9 <- P9*M
  T10 <- (R9-M9+B)/(P9-M9+R9);  P10 <- P9-B; R10 <- P10*Hembras*R; M10 <- P10*M
  result <- list(B,T1,P1,R1,M1,T2,P2,R2,M2,T3,P4,R4,M4,T5,P5,R5,M5,T6,P6,R6,T6,P7,R7,M7,T8,P8,R8,M8,T9,P9,R9,M9,T10,P10,R10,M10)
  return(result)
}

library(memisc)
Gestion <- as.data.frame(Simulate(Ciervos(K1, K0, A, R, M, Pi, Hembras),
  expand.grid(K1=c(420,580), K0=c(300,600), A=3, R=0.4, M=0.1, Pi=420, Hembras=0.5),
  nsim=1, seed=1))
xls.getshlib()
write.xls(Gestion, "PoblacionCiervos.xls")

All is fine with the function, but the results (the parameters from B to M10) are collected in Excel under the column names result 1, result 2, etc., and I wish to collect the results with their proper names (B instead of result 1; T1 instead of result 2, etc.). I will acknowledge any help, many thanks, pablo -- View this message in context: http://r.789695.n4.nabble.com/Own-R-function-doubt-tp3741463p3741463.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
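As an aside on the repetitive structure: the ten near-identical blocks can be folded into a loop that builds a named vector, so the names (B, T1, P1, ...) survive into the data frame and hence into the Excel column headers. The original result list also appears to contain slips (for example, it jumps from T3 to P4 and repeats T6), which a loop avoids. `Ciervos2` below is an illustrative rewrite under those assumptions, not the poster's code:

```r
# Same recursion as the Ciervos() blocks above, written as a loop.
Ciervos2 <- function(K1, K0, A, R, M, Pi, Hembras, years = 10) {
  B   <- (K1 - K0) / A
  out <- c(B = B)
  P  <- Pi                 # population at the start of the current year
  Rv <- Pi * Hembras * R   # recruitment
  Mv <- Pi * M             # mortality
  for (t in seq_len(years)) {
    Tt <- (Rv - Mv + B) / (P - Mv + Rv)   # same update equation as T1..T10
    P  <- P - B
    Rv <- P * Hembras * R
    Mv <- P * M
    out[paste0(c("T", "P", "R", "M"), t)] <- c(Tt, P, Rv, Mv)
  }
  out   # named vector: the names carry through as.data.frame() column names
}
```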
Re: [R] fit.mult.impute() in Hmisc
On Thu, Mar 31, 2011 at 2:56 PM, Yuelin Li li...@mskcc.org wrote: I tried multiple imputation with aregImpute() and fit.mult.impute() in Hmisc 3.8-3 (June 2010) and R-2.12.1. The warning message below suggests that summary(f) of fit.mult.impute() would only use the last imputed data set. Thus, the whole imputation process is ignored. "Not using a Design fitting function; summary(fit) will use standard errors, t, P from last imputation only. Use vcov(fit) to get the correct covariance matrix, sqrt(diag(vcov(fit))) to get s.e."

Hello. I fiddled around with rms multiple imputation when I was preparing these notes from our R summer course. I ran into the same thing you did, and my conclusion is slightly different from yours. http://pj.freefaculty.org/guides/Rcourse/multipleImputation/multipleImputation-1-lecture.pdf Look down to slide 80 or so, where I launch off into that question. It appears to me that aregImpute will give the right answer for fitters from rms, but if you want to feel confident about the results for other fitters, you should use mitools or some other parameter-combining approach. My conclusion (slide 105) is: "Please note: the standard errors in the output based on lrm match the std. errors estimated by MItools. Thus I conclude sqrt(diag(cov(fit.mult.impute.object))) did not give correct results."

But the standard errors in summary(f) agree with the values from sqrt(diag(vcov(f))) to the 4th decimal point. It would seem that summary(f) actually adjusts for multiple imputation? Does summary(f) in Hmisc 3.8-3 actually adjust for MI? If it does not adjust for MI, then how do I get the MI-adjusted coefficients and standard errors? I can't seem to find answers in the documentation, including rereading section 8.10 of the Harrell (2001) book. Googling located a thread in R-help back in 2003, which seemed dated. Many thanks in advance for the help, Yuelin.
http://idecide.mskcc.org
---
library(Hmisc)
Loading required package: survival
Loading required package: splines
data(kyphosis, package = "rpart")
kp <- lapply(kyphosis, function(x) { is.na(x) <- sample(1:length(x), size = 10); x })
kp <- data.frame(kp)
kp$kyp <- kp$Kyphosis == "present"
set.seed(7)
imp <- aregImpute(~ kyp + Age + Start + Number, dat = kp, n.impute = 10,
                  type = "pmm", match = "closest")
Iteration 13
f <- fit.mult.impute(kyp ~ Age + Start + Number, fitter = glm, xtrans = imp,
                     family = binomial, data = kp)

Variance Inflation Factors Due to Imputation:
(Intercept)    Age  Start  Number
       1.06   1.28   1.17    1.12
Rate of Missing Information:
(Intercept)    Age  Start  Number
       0.06   0.22   0.14    0.10
d.f. for t-distribution for Tests of Single Coefficients:
(Intercept)     Age   Start  Number
    2533.47  193.45  435.79  830.08
The following fit components were averaged over the 10 model fits: fitted.values linear.predictors
Warning message:
In fit.mult.impute(kyp ~ Age + Start + Number, fitter = glm, xtrans = imp, :
  Not using a Design fitting function; summary(fit) will use standard errors, t, P from last imputation only. Use vcov(fit) to get the correct covariance matrix, sqrt(diag(vcov(fit))) to get s.e.

f
Call: fitter(formula = formula, family = binomial, data = completed.data)
Coefficients:
(Intercept)      Age    Start   Number
    -3.6971   0.0118  -0.1979   0.6937
Degrees of Freedom: 80 Total (i.e. Null); 77 Residual
Null Deviance: 80.5
Residual Deviance: 58   AIC: 66

sqrt(diag(vcov(f)))
(Intercept)       Age     Start    Number
  1.5444782 0.0063984 0.0652068 0.2454408

-0.1979/0.0652068
[1] -3.0350

summary(f)
Call: fitter(formula = formula, family = binomial, data = completed.data)
Deviance Residuals:
   Min     1Q Median     3Q    Max
-1.240 -0.618 -0.288 -0.109  2.409
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -3.6971     1.5445   -2.39   0.0167
Age           0.0118     0.0064    1.85   0.0649
Start        -0.1979     0.0652   -3.03   0.0024
Number        0.6937     0.2454    2.83   0.0047
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 80.508 on 80 degrees of freedom
Residual deviance: 57.965 on 77 degrees of freedom
AIC: 65.97
Number of Fisher Scoring iterations: 5
Re: [R] fit.mult.impute() in Hmisc
For your approach, how do you know that either summary or vcov used multiple imputation? You are using a non-rms fitting function, so be careful. Compare with using the lrm fitting function. Also, replace Design with the rms package. Please omit confidentiality notices from your e-mails. Frank

[Yuelin's original question quoted in full; see above.]
[Yuelin's R session transcript quoted in full; see the original message above.]

- Frank Harrell, Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/fit-mult-impute-in-Hmisc-tp3419037p3741881.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
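For fitters where one cannot rely on the package to do the pooling, the combining step itself is just Rubin's rules; a minimal base-R sketch, where `pool_mi` is an illustrative name and the inputs are assumed to be one row per imputation:

```r
# Rubin's rules for combining m multiply-imputed fits.
# est: m x p matrix of coefficient estimates (one row per imputation)
# se2: m x p matrix of squared standard errors
pool_mi <- function(est, se2) {
  m    <- nrow(est)
  qbar <- colMeans(est)               # pooled point estimates
  W    <- colMeans(se2)               # within-imputation variance
  B    <- apply(est, 2, stats::var)   # between-imputation variance
  Tvar <- W + (1 + 1/m) * B           # total variance
  list(coef = qbar, se = sqrt(Tvar))
}
```

This is the calculation mitools and similar tools perform internally, shown here only with made-up inputs, not the thread's kyphosis fit.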
[R] seeking advice about rounding error and %%
A client came into our consulting center with some data that had been damaged by somebody who opened it in MS Excel. The columns were supposed to be integer valued, 0 through 5, but some of the values were mysteriously damaged. There were scores like 1.18329322 and such in there. Until he tracks down the original data and finds out what went wrong, he wants to take all fractional valued scores and convert them to NA. As a quick hack, I suggest an approach using %%:

x <- c(1, 2, 3, 1.1, 2.12131, 2.001)
x %% 1
[1] 0.00000 0.00000 0.00000 0.10000 0.12131 0.00100
which(x %% 1 > 0)
[1] 4 5 6
xbad <- which(x %% 1 > 0)
x[xbad] <- NA
x
[1]  1  2  3 NA NA NA

I worry about whether x %% 1 may ever return a non-zero result for an integer because of rounding error. Is there a recommended approach? What about zapsmall on the left, but what on the right of >?

which( zapsmall(x %% 1) > 0 )

Thanks in advance -- Paul E. Johnson Professor, Political Science 1541 Lilac Lane, Room 504 University of Kansas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
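One robust pattern compares against round(x) with an explicit tolerance, in the spirit of the is.wholenumber() example on the ?integer help page:

```r
x <- c(1, 2, 3, 1.1, 2.12131, 2.001, 3 - 1e-14)

# TRUE when x is within tol of an integer; tol absorbs rounding error.
is_whole <- function(x, tol = sqrt(.Machine$double.eps)) {
  abs(x - round(x)) < tol
}

x[!is_whole(x)] <- NA
x   # the 3 - 1e-14 entry survives as "whole"; 1.1 etc. become NA
```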
Re: [R] Own R function doubt
It sounds like the data frame produced by Simulate() doesn't set the names you want. You can probably fix this by including

colnames(Gestion) <- c("B", "T1", ...)  # etc.

immediately after the simulation. I can't confirm this without knowing which of the Excel/R interface packages you're using, but I'd be willing to bet that if you asked R for colnames(Gestion) you'd see the result 1, result 2, etc. that show up in Excel later. Hope this helps -- feel free to let me know if this doesn't work, Michael Weylandt

On Sat, Aug 13, 2011 at 10:50 AM, garciap garc...@usal.es wrote: [original message with the Ciervos() function quoted in full; see above.] -- View this message in context: http://r.789695.n4.nabble.com/Own-R-function-doubt-tp3741463p3741463.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] degrees of freedom does not appear in the summary lmer :(
Hi Sophie, It is not clear what the degrees of freedom should be in an lmer model, so their not appearing is intentional. There is fairly extensive discussion of this topic in the archives for the R-sig-mixed list. See, for example: http://rwiki.sciviews.org/doku.php?id=guides:lmer-tests Cheers, Josh On Sat, Aug 13, 2011 at 6:31 AM, xy wtemptat...@hotmail.co.uk wrote: Hi , Could someone pls help me about this topic, I dont know how can i extract them from my model!! Thanks, Sophie -- View this message in context: http://r.789695.n4.nabble.com/degrees-of-freedom-does-not-appear-in-the-summary-lmer-tp3741327p3741327.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] linear regression
Don't forget to load the `lattice` package. `latticeExtra` with `panel.ablineq` can also be helpful. That was, however, for plotting. For a subset regression by each WR without plotting, you'd use something like `lapply` or `sapply`:

ans <- sapply(unique(data$WR), function(dir) {
  out <- list(lm(PM10 ~ Ref, subset(data, WR == dir)))
  names(out) <- dir
  out
})

`ans$West` will return one of the results. There are many ways to skin a cat; perhaps this was not the best one. Mikhail

On 08/13/2011 11:30 AM, Dennis Murphy wrote: [Dennis's reply and the original question quoted in full; see above.]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
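An equivalent base-R idiom for per-direction fits is split() plus lapply(), sketched here on an invented stand-in for the thread's data frame (columns PM10, Ref, WR as in the question; Süd is spelled Sued to keep the sketch ASCII):

```r
set.seed(42)
dat <- data.frame(PM10 = rnorm(40, 20, 5),
                  Ref  = rnorm(40, 15, 5),
                  WR   = rep(c("Nord", "Ost", "Sued", "West"), each = 10))

# One lm() per wind direction; the resulting list is named by direction.
fits <- lapply(split(dat, dat$WR), function(d) lm(PM10 ~ Ref, data = d))

names(fits)       # one entry per direction
coef(fits$West)   # intercept and slope for the West subset
```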
Re: [R] Plotting and quantiles
I believe you received an informative answer to both these questions from Daniel Maiter an hour and twenty-five minutes after sending your question; I repeat it here just in case you didn't get it.

--

Q1 is very opaque, because you are not even saying what kind of plot you want. For a regular scatterplot, you have multiple options: a) select only the data in the given intervals and plot those data; b) plot the entire data, but restrict the graph region to the intervals you are interested in; or c) winsorize the data (i.e., set values below the lower cutoff and above the upper cutoff to the cutoff values). Which one you want depends on which makes the most sense given the purpose of your analysis. Say:

x <- rnorm(100)
y <- x + rnorm(100)

Then a)

plot(y ~ x, data = data.frame(x, y)[ x < 2 & x > -2 , ])  # plots y against x only for xs between -2 and 2

b)

plot(y ~ x, xlim = c(-2, 2))  # plots all y against x, but restricts the plotting region to -2..2 on the x-axis

c)

x <- replace(x, x > 2, 2)
x <- replace(x, x < -2, -2)
plot(y ~ x)  # sets all x-values below -2 and above 2 to these cutoffs

Q2: look at the cut() function. ?cut

HTH, Daniel

-

If you need more information, a different solution, or further clarification, please ask new questions. Michael Weylandt

On Sat, Aug 13, 2011 at 10:10 AM, Mark D. d.mar...@ymail.com wrote: [original question quoted in full; see above.]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
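Picking up the pointer to cut() for Q2: combining it with quantile() gives the ordered quartile groups in one step. A sketch on skewed toy data:

```r
set.seed(1)
vals <- rexp(100, rate = 0.01)   # skewed toy data with a few large values

# Quartile membership for every value; the labels are illustrative.
grp <- cut(vals,
           breaks = quantile(vals, probs = seq(0, 1, by = 0.25)),
           labels = c("1-25", "26-50", "51-75", "76-100"),
           include.lowest = TRUE)

table(grp)                            # 25 values per quartile group
rows <- split(seq_along(vals), grp)   # row indices per group, ready for subsetting
```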
Re: [R] efficient use of lm over a matrix vs. using apply over rows
God bless you, Duncan. Your explanation is crisp and to the point. Thank you. -- View this message in context: http://r.789695.n4.nabble.com/efficient-use-of-lm-over-a-matrix-vs-using-apply-over-rows-tp870810p3742043.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] NA in lm model summary, no NA in data table!
Dear users of R! I have problems with a linear model summary. I do not have any NA values in my data table, but in the summary of the linear model there are some NA instead of results. I don't know why :( I am interested in ecological factors influencing temperature in ant nests. I have data concerning ant nest temperature and air temperature, and some nest parameters as explanatory variables. Nest and air temperatures are different every day; nest parameters such as size, moisture, and shading do not change with the date. It looks like this:

date nest nest.temp nest.t.fluctuation air.temperature nest.size nest.GPS rain sun
1.3. A1 25.3 5.02 12.3 1.06856 225 1247
2.3. A1 23.1 4.5 11.9 1.06856 225 1247
...

In the results I can see:

summary(model3)

Call:
lm(formula = t.change ~ nest + dat.1 + GPS + moist + volume + T.prum + a.flukt + sun.year)

Residuals:
    Min      1Q  Median      3Q     Max
-5.0853 -0.1879  0.0104  0.1874  4.0023

Coefficients: (3 not defined because of singularities)
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -5.035e+00  3.805e+00  -1.323    0.186
nestA2      -1.742e+01  2.672e+00  -6.522 1.02e-10 ***
nestA3      -2.371e+00  2.880e-01  -8.232 4.75e-16 ***
nestA4      -7.886e+00  1.140e+00  -6.920 7.35e-12 ***
nestB1      -5.000e+00  7.298e-01  -6.852 1.16e-11 ***
nestB2       7.874e-01  1.897e-01   4.151 3.54e-05 ***
nestB3       2.435e+00  3.852e-01   6.321 3.66e-10 ***
nestB4      -4.804e+00  7.522e-01  -6.387 2.41e-10 ***
nestC1      -1.985e+01  3.002e+00  -6.613 5.67e-11 ***
nestC2      -8.721e+00  1.291e+00  -6.753 2.25e-11 ***
nestC3      -2.143e+01  3.254e+00  -6.585 6.80e-11 ***
nestC4      -6.586e+00  9.884e-01  -6.663 4.08e-11 ***
dat.1       -1.610e-04  1.026e-04  -1.568    0.117
GPS                 NA         NA      NA       NA
moist        5.675e-01  8.669e-02   6.546 8.75e-11 ***
volume              NA         NA      NA       NA
T.prum      -1.138e-02  1.608e-03  -7.078 2.48e-12 ***
a.flukt     -1.584e-02  2.415e-03  -6.558 8.10e-11 ***
sun.year            NA         NA      NA       NA
---
Signif.
codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.4568 on 1199 degrees of freedom (92 observations deleted due to missingness)
Multiple R-squared: 0.2305, Adjusted R-squared: 0.2209
F-statistic: 23.95 on 15 and 1199 DF, p-value: < 2.2e-16

help.search("singularities")
No help files found with alias or concept or title matching ‘singularities’ using fuzzy matching.

I count data separately for each season of the year (spring, summer...); together I have more than 1000 rows in the table (91 days for each of 12 nests). There are no NA values in my data; most of the variables are numeric vectors, there are only 2 factors. I have checked whether the factors are saved as factors, I have searched for NA values... I have tried to load the data many times... but nothing. When I build the model forward, when I start with GPS it is ok, the summary shows DF, sum of squares, p; but when I fit the whole model and update it by taking away non-significant variables the model shows NA in the summary. It writes something about singularities but I can't find it in the help. The strangest thing is that this problem occurs only in some data sheets; for example it occurs in spring but not in summer. But the data arrangement and the process of computing in R I have used are identical. Please, could you help me? Thank you very much Stefy -- View this message in context: http://r.789695.n4.nabble.com/NA-in-lm-model-summary-no-NA-in-data-table-tp3741822p3741822.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
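The NA rows in the summary are what lm() reports for aliased (perfectly collinear) terms: if GPS, volume and sun.year do not vary within a nest, they are exact linear combinations of the nest factor and cannot be estimated once nest is in the model. A minimal sketch with invented variable names reproduces the symptom:

```r
# x2 is an exact linear function of x1, so lm() cannot estimate both;
# it drops x2 and reports NA, with the "not defined because of
# singularities" note in the summary
set.seed(1)
x1 <- rnorm(50)
x2 <- 2 * x1 + 1          # perfectly collinear with x1
y  <- x1 + rnorm(50)
fit <- lm(y ~ x1 + x2)
coef(fit)                  # coefficient for x2 is NA
alias(fit)                 # shows which terms are linearly dependent
```

See ?alias: it names exactly which predictors are redundant given the others, which is the question Stefy's data needs answered.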
Re: [R] Excluding NAs from round correlation
Thank you, I found this in the help page: use: an optional character string giving a method for computing covariances in the presence of missing values. This must be (an abbreviation of) one of the strings "everything", "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs". I should probably use "na.or.complete" when I want to get results instead of NA, is that right? -- View this message in context: http://r.789695.n4.nabble.com/Excluding-NAs-from-round-correlation-tp3741296p3741924.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] what is Inverse link functions in linear modelling of location
Hello I have a problem with the following function (http://www.oga-lab.net/RGM2/func.php?rd_id=ismev:gev.fit): gev.fit(xdat, ydat = NULL, mul = NULL, sigl = NULL, shl = NULL, mulink = identity, siglink = identity, shlink = identity, muinit = NULL, siginit = NULL, shinit = NULL, show = TRUE, method = "Nelder-Mead", maxit = 1, ...) For the parameter mulink I need to pass the inverse link function for generalized linear modelling of the location. Is it possible to define a linear trend, where the slope is fitted by this function? Thank you! Best regards -- View this message in context: http://r.789695.n4.nabble.com/what-is-Inverse-link-functions-in-linear-modelling-of-location-tp3742010p3742010.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Excluding NAs from round correlation
The help page says: use: an optional character string giving a method for computing covariances in the presence of missing values. This must be (an abbreviation of) one of the strings "everything", "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs". If I used "everything", the results would be NAs again. "all.obs" would result in an error. "complete.obs" gives me an error too. "na.or.complete" gives me all NAs... But "pairwise.complete.obs" finally got the right results. -- View this message in context: http://r.789695.n4.nabble.com/Excluding-NAs-from-round-correlation-tp3741296p3742039.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
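For reference, a small made-up matrix shows why "pairwise.complete.obs" succeeded where "everything" returned NAs: each correlation is computed from the rows that are complete for that particular pair of columns, so no single column of NAs poisons the whole result.

```r
# Toy matrix with an NA in each column (invented data, not the poster's)
m <- cbind(a = c(1, 2, 3, 4, NA),
           b = c(2, 4, 6, NA, 10),
           c = c(5, 3, NA, 1, 0))

cor(m, use = "everything")              # any NA propagates -> NA entries
cor(m, use = "pairwise.complete.obs")   # each pair uses its own complete rows
```

The trade-off is that with "pairwise.complete.obs" different cells of the correlation matrix can be based on different subsets of rows, which is worth keeping in mind when comparing entries.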
Re: [R] degrees of freedom does not appear in the summary lmer :(
Hi: This is worth reading and bookmarking: http://glmm.wikidot.com/faq HTH, Dennis On Sat, Aug 13, 2011 at 6:31 AM, xy wtemptat...@hotmail.co.uk wrote: Hi, Could someone please help me with this topic? I don't know how I can extract them from my model!! Thanks, Sophie -- View this message in context: http://r.789695.n4.nabble.com/degrees-of-freedom-does-not-appear-in-the-summary-lmer-tp3741327p3741327.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] efficient use of lm over a matrix vs. using apply over rows
God Bless you Duncan. Your explanation is crisp and to the point. Thank you. -- View this message in context: http://r.789695.n4.nabble.com/efficient-use-of-lm-over-a-matrix-vs-using-apply-over-rows-tp870810p3742058.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Casualty Actuarial Society request for proposals for R Workshop
I'm a property-casualty actuary, use R at my job, and lurk on the list. In conjunction with one of its meetings, the Casualty Actuarial Society (I'm a member) is looking for proposals from people to teach a workshop in R and I thought members of the list might be interested. I've pasted the information below. My apologies if this posting violates list rules. Thanks. Kevin http://www.casact.org/cms/index.cfm?fa=viewArticle&articleID=1613 2012 RPM Seminar Committee Welcomes Proposals for New R Workshop 08/08/2011 — I. Casualty Actuarial Society The Casualty Actuarial Society (CAS) was organized in 1914 as a professional society with the purpose of advancing the body of knowledge of actuarial science applied to property, casualty and similar risk exposures. This is accomplished through communication with the publics affected by insurance, the presentation and discussion of papers, attendance at seminars and workshops, collection of a library, funded research activities, and other means. The membership of the CAS includes over 4,000 actuaries employed by insurance companies, consulting firms, brokers, and the government. Additional information about the CAS can be found on the CAS website. II. CAS Ratemaking/Product Management Seminar The CAS Ratemaking/Product Management (“RPM”) Seminar is scheduled to take place in Philadelphia, PA on March 19-21, 2012. As with previous RPM Seminars, full day workshops on 4 different subject areas of particular interest are scheduled to be offered on the first day, or Monday, March 19, 2012. Examples of schedules, workshop descriptions and presentations from previous such workshops can be found on the CAS website. In response to feedback received from previous RPM Seminar attendees, the RPM Seminar Planning Committee (“Committee”) intends to include Introduction to R as one of the workshop topics at the 2012 seminar, to provide hands-on R training for beginners. III.
Project Specifications The Committee wishes to enlist subject matter experts to develop and conduct the above described 1-day workshop on R. The workshop should be customized to focus on the critical issues surrounding creation of an R program. The Committee is most interested in providing training in the following areas, but is open to considering additional steps that are offered by respondents:

- R interface
- Programming in R
- R datasets
- R packages
- Actuarial models in R

The workshop will need to include a dataset for analysis during the session(s), to be provided to attendees with sufficient lead time that they can become knowledgeable about the dataset before the workshop begins. An assignment could accompany the dataset so that attendees can review the data and perform necessary data analysis. It is expected that the presenters will use a computer and an LCD projector, and that attendees will be able to use their own computers to conduct analysis on the dataset prior to and during the workshop. The participants should have access to the R software during the workshop and should be provided instructions on how to download the required version of R prior to the workshop along with any required packages. Expected workshop attendance would be 50 persons. The presenters must adhere to the same requirements and deadlines imposed on all workshop/RPM Seminar presenters, in terms of working with Committee session coordinators and making materials available to attendees prior to the seminar. IV. Proposal Requirements Proposals are due by September 12, 2011 and should include the following items:

- A clear description of seminar education content
- Demonstrated experience within the field
- Three professional references

All submitted proposals will be evaluated equally. The Committee will, by October 10, 2011, select the respondent who, in the judgment of the Committee, is best able to perform the work as specified herein.
The Committee reserves the right not to accept any proposal if an acceptable proposal is not received. Interested parties should submit their proposals and any questions in writing via e-mail to Vincent Edwards, CAS Manager, Professional Education, at vedwa...@casact.org. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Adjacency Matrix help
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of collegegurl69 Sent: Saturday, August 13, 2011 1:01 AM To: r-help@r-project.org Subject: Re: [R] Adjacency Matrix help Thanks so much for your quick reply. It seems to work. The problem is that it now places actual zeros on the diagonal whereas the rest of the adjacency matrix has dots to represent zeros. Do you have any ideas on how to change these zeros to dots like in the rest of the adj matrix? Or is it the same thing? Thanks. This is one of the reasons why it is useful/important to provide a reproducible example. When I think of an adjacency matrix, my default mental representation is a numeric matrix, which was reinforced by the request for zeros on the diagonal. So, how did you create this matrix? Could you post a self-contained, reproducible example as the posting guide requests? At a minimum, can you apply str() to your matrix and post the output? Dan Daniel Nordlund Bothell, WA USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
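A guess at what is happening, pending the str() output Dan asked for: dots in the printout are how sparse matrices from the Matrix package display structural zeros, while an assigned 0 is stored explicitly and printed as 0. drop0() converts explicit zeros back to structural ones. A sketch, assuming the adjacency matrix really is a sparse Matrix:

```r
library(Matrix)  # recommended package, ships with R

# Invented 5x5 0/1 adjacency matrix in sparse form
set.seed(1)
adj <- Matrix(rbinom(25, 1, 0.4), nrow = 5, sparse = TRUE)

diag(adj) <- 0     # put zeros on the diagonal: entry (i, i) becomes 0
adj <- drop0(adj)  # drop explicit zeros so they print as dots again
print(adj)
```

Numerically the explicit 0 and the dot are the same thing; drop0() only changes the storage (and hence the printed representation).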
[R] compiling r from source on Windows 7 (64 bit)
Dear R People: Hope you're having a nice Saturday. I'm trying to compile R-2.13.1 from source on Windows 7 (64 bit). I've been able to compile on a 32 bit without any problems. I changed my BINPREF64, WIN, DEFS_W64 in MkRules.local and did the usual stuff with the jpeg, etc. But things are jogging along and I get the following: Makefile.win:28: ../../../../etc/x64/Makeconf: No such file or directory Has anyone run across this, please? Should I possibly just switch back to 32 bit, do you think, please? I need to compile from source because I'm building packages. Thanks for any help. Sincerely, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] compiling r from source on Windows 7 (64 bit)
Erin, You can build packages without compiling from source (what made you think you couldn't?). Did you make sure when you installed the Rtools (I am assuming you are using those rather than going out and getting everything you need on your own) that you included everything for 64 bit builds? When switching between 32 and 64, I typically only switch between WIN = 32 and WIN = 64. Josh On Sat, Aug 13, 2011 at 6:18 PM, Erin Hodgess erinm.hodg...@gmail.com wrote: Dear R People: Hope you're having a nice Saturday. I'm trying to compile R-2.13.1 from source on Windows 7 (64 bit). I've been able to compile on a 32 bit without any problems. I changed my BINPREF64, WIN, DEFS_W64 in MkRules.local and did the usual stuff with the jpeg, etc. But things are jogging along and I get the following: Makefile.win:28: ../../../../etc/x64/Makeconf: No such file or directory Has anyone run across this, please? Should I possibly just switch back to 32 bit, do you think, please? I need to compile from source because I'm building packages. Thanks for any help. Sincerely, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] seeking advice about rounding error and %%
Hi Paul, What about using:

x[x != as.integer(x)] <- NA

I cannot think of a situation off hand where this would fail to turn every non-integer to missing. I wonder if there is really a point to this? Can the client proceed with data analysis with any degree of confidence when an unknown mechanism has altered data in unknown ways? Could Excel have sometimes changed one integer to another (e.g., 4s became 1.18whatever, but 3s became 1s or)? Cheers, Josh On Sat, Aug 13, 2011 at 12:42 PM, Paul Johnson pauljoh...@gmail.com wrote: A client came into our consulting center with some data that had been damaged by somebody who opened it in MS Excel. The columns were supposed to be integer valued, 0 through 5, but some of the values were mysteriously damaged. There were scores like 1.18329322 and such in there. Until he tracks down the original data and finds out what went wrong, he wants to take all fractional valued scores and convert to NA. As a quick hack, I suggest an approach using %%:

x <- c(1, 2, 3, 1.1, 2.12131, 2.001)
x %% 1
[1] 0.00000 0.00000 0.00000 0.10000 0.12131 0.00100
which(x %% 1 > 0)
[1] 4 5 6
xbad <- which(x %% 1 > 0)
x[xbad] <- NA
x
[1] 1 2 3 NA NA NA

I worry about whether x %% 1 may ever return a non-zero result for an integer because of rounding error. Is there a recommended approach? What about zapsmall on the left, but what on the right of >?

which( zapsmall(x %% 1) > 0 )

Thanks in advance -- Paul E. Johnson Professor, Political Science 1541 Lilac Lane, Room 504 University of Kansas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D.
Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
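One tolerance-based variant of Paul's check, as a sketch (the tolerance choice is an assumption, not a recommendation from the thread): compare each value against round() within a small tolerance, so floating-point fuzz such as 3 + 1e-15 is not flagged while genuinely fractional scores become NA.

```r
# Invented sample mimicking the damaged column
x <- c(1, 2, 3, 1.18329322, 2.12131, 3 + 1e-15)

tol <- sqrt(.Machine$double.eps)   # conventional tolerance, about 1.5e-8
is_whole <- abs(x - round(x)) < tol
x[!is_whole] <- NA
x   # the 1e-15 fuzz survives as (effectively) 3; real fractions become NA
```

This sidesteps the %% 1 worry entirely: abs(x - round(x)) is the distance to the nearest integer, which is what the check really wants to measure.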
[R] Compiling R from source on Windows 7 (64 bit) solved
Hello again. Due to the excellent help from Josh Wiley, I ran back in the C:/R directory with only changing WIN = 64 in the MkRules.local file (other than the JPEG, etc). All was well. Thanks, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Using get() or similar function to access more than one element in a vector
Dear R-users, I've written a script that produces a frequency table for a group of texts. The table has a total frequency for each word type and individual frequency counts for each of the files. (I have not included the code for creating the column headers.) Below is a sample:

Word Total 01.txt 02.txt 03.txt 04.txt 05.txt
the 22442 2667 3651 1579 2132 3097
I 18377 3407 454 824 449 3746
and 15521 2377 2174 891 1006 2450
to 13598 1716 1395 905 1021 1983
of 12834 1647 1557 941 1127 1887
it 12440 2160 916 497 493 2449
you 12036 2283 356 293 106 2435

I've encountered two problems when I try to construct and save the file. The combined.sorted.freq.list is a named integer vector in which the integers are the total frequency counts for each word. The names are the words. For each of the individual lists I've created frequency lists that are sorted in the order of the combined list. (NAs have been replaced with 0.) These are called combined. plus the number of the file. If I were to write the line to save the file manually, it would look like this:

combined.table <- paste(names(combined.sorted.freq.list), combined.sorted.freq.list, combined.01, combined.02, combined.03, combined.04, combined.05, combined.06, combined.07, combined.08, combined.09, combined.10, combined.11, combined.12, sep="\t") # creates a table with columns for the combined and all of the component lists

However, each time I run the script, there may be a differing number of text files. I created a list of the names of the individual frequency counts called combined.file.list:

combined.file.count <- 1:length(selected.files) # counts number of files originally selected
combined.file.list <- paste("combined", combined.file.count, sep=".") # creates the names for the combined lists by concatenating "combined" with each file number separated by a period, recycling the string "combined" for each number

I then tried to include it as one of the elements to be pasted by using get().
combined.table <- paste(names(combined.sorted.freq.list), combined.sorted.freq.list, get(combined.file.list[]), sep="\t") # intended to create a table with columns for the combined and all of the component lists

Unfortunately, the get() function only gets the first component list since get() can apparently only access one object. This results in a table with only the total frequency and the counts for the first text:

Word Total 01.txt
the 22442 2667
I 18377 3407
and 15521 2377
to 13598 1716
of 12834 1647
it 12440 2160
you 12036 2283

If I try to construct the file piece by piece as the lists are created, I get an error message that a vector of more than 1.3 Gb cannot be created. Does anyone know how I could use get() or some other method to access all of the files named in a vector? Many thanks for any help you can offer! Joseph __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] seeking advice about rounding error and %%
How about something like:

if (round(x) != x) { zap }

not exactly working code but might help Ken On Aug 13, 2554 BE, at 3:42 PM, Paul Johnson pauljoh...@gmail.com wrote: A client came into our consulting center with some data that had been damaged by somebody who opened it in MS Excel. The columns were supposed to be integer valued, 0 through 5, but some of the values were mysteriously damaged. There were scores like 1.18329322 and such in there. Until he tracks down the original data and finds out what went wrong, he wants to take all fractional valued scores and convert to NA. As a quick hack, I suggest an approach using %%:

x <- c(1, 2, 3, 1.1, 2.12131, 2.001)
x %% 1
[1] 0.00000 0.00000 0.00000 0.10000 0.12131 0.00100
which(x %% 1 > 0)
[1] 4 5 6
xbad <- which(x %% 1 > 0)
x[xbad] <- NA
x
[1] 1 2 3 NA NA NA

I worry about whether x %% 1 may ever return a non-zero result for an integer because of rounding error. Is there a recommended approach? What about zapsmall on the left, but what on the right of >?

which( zapsmall(x %% 1) > 0 )

Thanks in advance -- Paul E. Johnson Professor, Political Science 1541 Lilac Lane, Room 504 University of Kansas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using get() or similar function to access more than one element in a vector
Hi Joseph, Without a reproducible example, you probably will not get the precise code for a solution but look at ?list Rather than doing what you are doing now, put everything into a list, and then you will not need to use get() at all. You will just work with the whole list. It can take a bit to get used to working that way, but it is worth it. Cheers, Josh On Sat, Aug 13, 2011 at 9:34 PM, Joseph Sorell josephsor...@gmail.com wrote: Dear R-users, I've written a script that produces a frequency table for a group of texts. The table has a total frequency for each word type and individual frequency counts for each of the files. (I have not included the code for creating the column headers.) Below is a sample:

Word Total 01.txt 02.txt 03.txt 04.txt 05.txt
the 22442 2667 3651 1579 2132 3097
I 18377 3407 454 824 449 3746
and 15521 2377 2174 891 1006 2450
to 13598 1716 1395 905 1021 1983
of 12834 1647 1557 941 1127 1887
it 12440 2160 916 497 493 2449
you 12036 2283 356 293 106 2435

I've encountered two problems when I try to construct and save the file. The combined.sorted.freq.list is a named integer vector in which the integers are the total frequency counts for each word. The names are the words. For each of the individual lists I've created frequency lists that are sorted in the order of the combined list. (NAs have been replaced with 0.) These are called combined. plus the number of the file. If I were to write the line to save the file manually, it would look like this:

combined.table <- paste(names(combined.sorted.freq.list), combined.sorted.freq.list, combined.01, combined.02, combined.03, combined.04, combined.05, combined.06, combined.07, combined.08, combined.09, combined.10, combined.11, combined.12, sep="\t") # creates a table with columns for the combined and all of the component lists

However, each time I run the script, there may be a differing number of text files.
I created a list of the names of the individual frequency counts called combined.file.list:

combined.file.count <- 1:length(selected.files) # counts number of files originally selected
combined.file.list <- paste("combined", combined.file.count, sep=".") # creates the names for the combined lists by concatenating "combined" with each file number separated by a period, recycling the string "combined" for each number

I then tried to include it as one of the elements to be pasted by using get().

combined.table <- paste(names(combined.sorted.freq.list), combined.sorted.freq.list, get(combined.file.list[]), sep="\t") # intended to create a table with columns for the combined and all of the component lists

Unfortunately, the get() function only gets the first component list since get() can apparently only access one object. This results in a table with only the total frequency and the counts for the first text:

Word Total 01.txt
the 22442 2667
I 18377 3407
and 15521 2377
to 13598 1716
of 12834 1647
it 12440 2160
you 12036 2283

If I try to construct the file piece by piece as the lists are created, I get an error message that a vector of more than 1.3 Gb cannot be created. Does anyone know how I could use get() or some other method to access all of the files named in a vector? Many thanks for any help you can offer! Joseph __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
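To make Josh's suggestion concrete, here is a sketch with invented toy data (not Joseph's real frequency lists): the per-file count vectors live in one named list, and do.call() hands every list element to paste() in a single call, so no get() is needed and the number of files can vary freely.

```r
# Invented stand-ins for the real objects: a named total-frequency vector
# and a named list holding one count vector per file
totals <- c(the = 22442, I = 18377, and = 15521)
per_file <- list(
  "01.txt" = c(2667, 3407, 2377),
  "02.txt" = c(3651, 454, 2174)
)

# do.call() passes every element of the argument list to paste(),
# however many files there happen to be this run
rows <- do.call(paste,
                c(list(names(totals), totals), per_file, list(sep = "\t")))
rows
writeLines(rows)  # one tab-separated row per word, ready to save
```

If the per-file vectors really must start life as separate variables, mget(combined.file.list) would collect them into such a list in one step, but building them in a list from the beginning (e.g. with lapply over the file names) avoids the problem entirely.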