Re: [R] Comparing two groups

2013-10-17 Thread Andrej
glsnow wrote
> From your question it is not clear what your question/concerns really are,
> and from what we can see it could very well be that you do not understand
> the statistics that you are computing (not just the R implementation).  We
> ask for a reproducible example because that helps us to help you, just a
> couple of boxplots let us make some guesses, but we do not know the data
> values or even the means and standard deviations, even the actual sample
> sizes could help.
> 
> From the graph it is not surprising that the wilcox test says that the 2
> groups are different and that the t test says that they are not (but
> knowing data values would help even more).  The 2 tests are testing very
> different hypotheses.  The wilcox test is testing that the 2 distributions
> are identical, and the more specific way it tests that is by looking at all
> possible pairs between the 2 groups and seeing what proportion of them have
> each group higher.  If the null were true, then half the time the data point
> from mixed would be higher than the data point from monoculture and half
> the time the other way.  From the boxplot we can see that the median of
> monoculture is below the 1st quartile of mixed, so it is not surprising at
> all that the wilcox test rejects the null hypothesis.
> 
> The t-test (which version you used you do not say) is testing if the means
> are equal.  Since monoculture is clearly skewed to the right with potential
> outliers, it would not be surprising if the sample means were close enough
> to each other that the t-test does not see a significant difference.  The 2
> tests give different answers because they are answering very different
> questions.
> 
> You state that "I am not allowed to perform it" referring to the t-test.
> This indicates that you don't have a full understanding or appreciation of
> the Central Limit Theorem (an important enough theorem that I have a
> cross-stitch based on it hanging on my wall (along with 2 other
> cross-stitches of Bayes theorem and the mean value theorem of
> integration)).  The plot shows 18 outliers in the monoculture group which
> implies a sample size of at least 72, which means the other group has a
> sample size of at least 14 if I interpret "five times as big" correctly.
> This is a large enough sample size for the CLT to tell us the t-test will
> give a reasonable approximation (provided the other assumptions hold
> reasonably well and you are interested in the question being answered).
> 
> So, I believe that the advice to read a textbook, or otherwise get some
> help in basic understanding of the statistical tools is reasonable.  Once
> you have that, then if you still need help then give us a reproducible
> example and make it clear what your question really is and you will be
> much more likely to receive an answer.

 
Thank you for your answer. It was actually really helpful. I apologize for
the inadequate information, but I can see that I do really need to gather
more statistical knowledge. 
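
A minimal sketch of the point made in the quoted reply, using simulated data.
The group sizes and distributions below are assumptions chosen only to mimic
a symmetric group and a right-skewed group; they are not the poster's data.
It shows that the Wilcoxon statistic, divided by the number of pairs, is the
proportion of pairs in which one group is higher, whereas the t-test compares
means.

```r
# Simulated stand-ins for the two groups (assumed shapes and sizes)
set.seed(42)
mixed       <- rnorm(15, mean = 3, sd = 0.5)         # roughly symmetric
monoculture <- rlnorm(75, meanlog = 0.5, sdlog = 1)  # right-skewed, with outliers

w  <- wilcox.test(mixed, monoculture)  # based on the ordering of all pairs
tt <- t.test(mixed, monoculture)       # Welch t-test: compares the two means

# Proportion of (mixed, monoculture) pairs in which the mixed value is higher;
# under the null hypothesis of identical distributions this is about 0.5
prop_pairs  <- mean(outer(mixed, monoculture, ">"))
prop_from_W <- unname(w$statistic) / (length(mixed) * length(monoculture))

print(c(prop_pairs = prop_pairs, prop_from_W = prop_from_W))
print(c(wilcoxon_p = w$p.value, t_test_p = tt$p.value))
```

With skewed data of this kind the two p-values can differ considerably,
because the two tests answer different questions.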




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] dmvnorm returns NaN

2013-10-17 Thread David Winsemius

On Oct 17, 2013, at 9:11 PM, Steven LeBlanc wrote:

> Greets,
> 
> I'm using nlminb() to estimate the parameters of a multivariate normal random 
> sample with missing values and ran into an unexpected result from my call to 
> dmvnorm()

There are at least 5 different versions of dmvnorm. None of them are in the 
default packages.

> within the likelihood function. Particular details are provided below.

Complete? Except for the name of the package that has `dmvnorm`.

> It appears that dmvnorm() makes a call to log(eigen(sigma)). Whereas 
> eigen(sigma) is returning a negative number, I understand log()'s complaint. 
> However, it is a mystery to me why this data set should produce such a result.
> 
> Any suggestions?
> 
> Best Regards,
> Steven
> 
>> complete
> [,1]  [,2]
> [1,]  0.84761637  3.994261
> [2,]  0.91487059  4.952595
> [3,]  0.84527267  4.521837
> [4,]  2.53821358  8.374880
> [5,]  1.16646209  6.255022
> [6,]  0.94706527  4.169510
> [7,]  0.48813564  3.349230
> [8,]  3.71828469  9.441518
> [9,]  0.08953357  1.651497
> [10,]  0.68530515  5.498403
> [11,]  1.52771645  8.484671
> [12,]  1.55710697  5.231272
> [13,]  1.89091603  4.152658
> [14,]  1.08483541  5.401544
> [15,]  0.58125385  5.340141
> [16,]  0.24473250  2.965046
> [17,]  1.59954401  8.095561
> [18,]  1.57656436  5.335744
> [19,]  2.73976992  8.572871
> [20,]  0.87720252  6.067468
> [21,]  1.18403087  3.526790
> [22,] -1.03145244  1.776478
> [23,]  2.88197343  7.720838
> [24,]  0.60705218  4.406073
> [25,]  0.58083464  3.374075
> [26,]  0.87913427  5.247637
> [27,]  1.10832692  3.534508
> [28,]  2.92698371  8.682130
> [29,]  4.04115277 11.827360
> [30,] -0.57913297  1.476586
> [31,]  0.84804365  7.009075
> [32,]  0.79497940  3.671164
> [33,]  1.58837762  5.535409
> [34,]  0.63412821  3.932767
> [35,]  3.14032433  9.271014
> [36,] -0.18183869  1.47
> [37,]  0.57535770  6.881830
> [38,]  3.21417723 10.901636
> [39,]  0.29207932  4.120408
> [40,]  0.65938218  5.209301
>> u
> [1] 1.267198 5.475045
>> sigma
>  [,1] [,2]
> [1,] 0.6461647 2.228951
> [2,] 2.2289513 5.697834
>> dmvnorm(x=complete,mean=u,sigma=sigma)

> [1] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 
> NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
> [30] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
> Warning message:
> In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
>  NaNs produced

David Winsemius
Alameda, CA, USA
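
A short check of the sigma printed in the quoted post, using base R only.
It suggests why log() complains: the matrix is not positive definite, so it
is not a valid covariance matrix. The second half is only a sketch of one
common workaround when optimising with nlminb(), namely searching over a
Cholesky factor; sigma_from_par is a hypothetical helper, not something from
the original post or from any dmvnorm package.

```r
# sigma as printed in the post (off-diagonal rounded to the displayed digits)
sigma <- matrix(c(0.6461647, 2.228951,
                  2.228951,  5.697834), nrow = 2, byrow = TRUE)

ev <- eigen(sigma, symmetric = TRUE, only.values = TRUE)$values
print(ev)          # the smaller eigenvalue is negative, hence NaN from log()
print(det(sigma))  # negative determinant

# Implied "correlation": a valid covariance matrix needs |rho| <= 1
rho <- sigma[1, 2] / sqrt(sigma[1, 1] * sigma[2, 2])
print(rho)

# Hypothetical helper: build sigma from unconstrained parameters via a
# lower-triangular factor L with positive diagonal, so L %*% t(L) is
# always positive definite
sigma_from_par <- function(p) {
  L <- matrix(c(exp(p[1]), p[2], 0, exp(p[3])), nrow = 2)
  L %*% t(L)
}
s <- sigma_from_par(c(0, 2, 0))
print(eigen(s, symmetric = TRUE, only.values = TRUE)$values)
```

If the optimiser is allowed to vary the entries of sigma freely, it can step
outside the set of valid covariance matrices, which would be consistent with
the NaN values shown.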



[R] (no subject)

2013-10-17 Thread Manuel Figueroa
I have a simple table, 2 columns and 1994 rows. The first column, "Crime", is
how many crimes happen every month per 10 inhabitants, and the second column,
"Income", contains the average income recorded in a city.

here's the head(dataset):

Crime  Income
1 356.5152 4285.720
2 734.5625 4114.291
3 541.5171 3542.861
4 292.1667 4057.148
5 219.7747 4457.149
6 308.2538 6114.296

I want to stratify the crime based on income and then box plot each stratum
to compare. Also I need to get the variance of each stratum in a table.

this is the summary of the Income column:

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   2000    3257    3714    4001    4457    7714


The closest I've been able to get is this:
strata=table(cut(dataset$Income, breaks, right= FALSE))

where breaks is
> breaks
 [1] 2000 3500 5000 6500 8000

this gives me as result:
> cbind(strata)
  strata
[2e+03,3.5e+03)  805
[3.5e+03,5e+03)  894
[5e+03,6.5e+03)  206
[6.5e+03,8e+03)   89


I'm not even sure if that's the right way to get the strata.

The important thing here is I need to find a way to get a boxplot of the
Crime values in each stratum and the variance too.
Thanks so much in advance.
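
One possible way to do what is asked above, sketched on simulated data
because the real dataset is not available. The column names and breaks
follow the post; the simulated values and the names stratum, strata_var and
strata_tab are assumptions for illustration. The idea is to keep the result
of cut() as a column rather than tabulating it straight away, so it can be
used as a grouping factor.

```r
# Simulated stand-in for the poster's data (values are made up)
set.seed(1)
dataset <- data.frame(Crime  = runif(200, 100, 800),
                      Income = runif(200, 2000, 7999))

breaks <- c(2000, 3500, 5000, 6500, 8000)

# Keep the stratum as a factor column; dig.lab = 4 avoids labels like 2e+03
dataset$stratum <- cut(dataset$Income, breaks, right = FALSE, dig.lab = 4)

# One box per income stratum
boxplot(Crime ~ stratum, data = dataset,
        xlab = "Income stratum", ylab = "Crime")

# Variance of Crime within each stratum, collected in a small table
strata_var <- tapply(dataset$Crime, dataset$stratum, var)
strata_tab <- data.frame(n        = as.vector(table(dataset$stratum)),
                         variance = as.vector(strata_var),
                         row.names = names(strata_var))
print(strata_tab)
```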



[R] dmvnorm returns NaN

2013-10-17 Thread Steven LeBlanc
Greets,

I'm using nlminb() to estimate the parameters of a multivariate normal random 
sample with missing values and ran into an unexpected result from my call to 
dmvnorm() within the likelihood function. Particular details are provided 
below. It appears that dmvnorm() makes a call to log(eigen(sigma)). Whereas 
eigen(sigma) is returning a negative number, I understand log()'s complaint. 
However, it is a mystery to me why this data set should produce such a result.

Any suggestions?

Best Regards,
Steven

> complete
 [,1]  [,2]
 [1,]  0.84761637  3.994261
 [2,]  0.91487059  4.952595
 [3,]  0.84527267  4.521837
 [4,]  2.53821358  8.374880
 [5,]  1.16646209  6.255022
 [6,]  0.94706527  4.169510
 [7,]  0.48813564  3.349230
 [8,]  3.71828469  9.441518
 [9,]  0.08953357  1.651497
[10,]  0.68530515  5.498403
[11,]  1.52771645  8.484671
[12,]  1.55710697  5.231272
[13,]  1.89091603  4.152658
[14,]  1.08483541  5.401544
[15,]  0.58125385  5.340141
[16,]  0.24473250  2.965046
[17,]  1.59954401  8.095561
[18,]  1.57656436  5.335744
[19,]  2.73976992  8.572871
[20,]  0.87720252  6.067468
[21,]  1.18403087  3.526790
[22,] -1.03145244  1.776478
[23,]  2.88197343  7.720838
[24,]  0.60705218  4.406073
[25,]  0.58083464  3.374075
[26,]  0.87913427  5.247637
[27,]  1.10832692  3.534508
[28,]  2.92698371  8.682130
[29,]  4.04115277 11.827360
[30,] -0.57913297  1.476586
[31,]  0.84804365  7.009075
[32,]  0.79497940  3.671164
[33,]  1.58837762  5.535409
[34,]  0.63412821  3.932767
[35,]  3.14032433  9.271014
[36,] -0.18183869  1.47
[37,]  0.57535770  6.881830
[38,]  3.21417723 10.901636
[39,]  0.29207932  4.120408
[40,]  0.65938218  5.209301
> u
[1] 1.267198 5.475045
> sigma
  [,1] [,2]
[1,] 0.6461647 2.228951
[2,] 2.2289513 5.697834
> dmvnorm(x=complete,mean=u,sigma=sigma)
 [1] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
[30] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
Warning message:
In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
  NaNs produced


Re: [R] speeding up a loop

2013-10-17 Thread David Winsemius

On Oct 17, 2013, at 2:56 PM, Ye Lin wrote:

> Hey R professionals,
> 
> I have a large dataset and I want to run a loop on it basically creating a
> new column which gathers information from another reference table.
> 
> When I run the code, R just freezes and does not even respond after 30min,
> which is really unusual. I tried sapply as well but it does not improve at
> all.
> 
> I am running R 3.0.2 on Windows 7.  I checked the system, when I run the
> code, my CPU usage is about 25%-30% that is taxing my desktop.

A guess: It's not your CPU use ... it's your RAM use. You've probably exhausted 
your RAM and your system has paged out to virtual memory.
> 
> Here is my code:
> 
> #df1 is the data set I want to add a new column#
> #b is the reference table#
> 
> for (i in (1:nrow(df1))) {
>  begin=which(b$Time2==df1$start[i] & b$Date==df1$Date[i])
>  date=unlist(strsplit(as.character(dff$end[i])," "))[1]
>   end=ifelse(date=="2013-10-17",
>   which(b$Time2==df1$end[i] & b$Date==df1$Date[i]),
>   which(b$Time2==df1$end[i]-3600*24 & b$Date==as.Date(df1$Date[i])+1))
>df1$new[i] <- sum(b[begin:end,]$Power)
> }
> 

I get: 
Error in strsplit(as.character(dff$end[i]), " ") : object 'dff' not found

If I change the dff to df1, I get: 
Error in begin:end : argument of length 0

-- 
David.
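
A hedged sketch related to the "argument of length 0" error above. In the
sample data the start times do not coincide exactly with any Time2 value, so
which() with == returns integer(0). One possible vectorised alternative,
assuming the reference table is sorted by time and that "snap to the last
reading at or before each time" is the intended rule (an assumption, not
something stated in the post), uses findInterval() and a cumulative sum. The
toy tables below are made up and ignore the Date column.

```r
# Toy reference table: one reading per minute (made-up values)
b <- data.frame(
  Time2 = as.POSIXct("2013-10-17 06:00:34", tz = "UTC") + 60 * (0:49),
  Power = 1:50
)

# Toy intervals whose endpoints fall 5 seconds after a reading
df1 <- data.frame(start = b$Time2[c(3, 10)] + 5,
                  end   = b$Time2[c(8, 20)] + 5)

# Exact matching finds nothing, which is what breaks begin:end in the loop
no_match <- which(b$Time2 == df1$start[1])
print(length(no_match))

# Index of the last reading at or before each time (requires sorted Time2);
# pmax() guards against times earlier than the first reading
t_ref   <- as.numeric(b$Time2)
i_start <- pmax(findInterval(as.numeric(df1$start), t_ref), 1L)
i_end   <- findInterval(as.numeric(df1$end), t_ref)

# Sum of Power over rows i_start..i_end for all intervals at once
cs <- c(0, cumsum(b$Power))
df1$new <- cs[i_end + 1] - cs[i_start]
print(df1)
```

Because nothing is recomputed per row, this avoids scanning the whole
reference table once for every row of df1.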
> And here is a mimic sample of df1 & b:
> 
> df1 <- structure(list(Date = structure(c(1369699200, 1369699200,
> 1369699200,
> 1369699200, 1369699200), tzone = "UTC", class = c("POSIXct",
> "POSIXt")), start = structure(c(1381991205, 1381990247, 1382010454,
> 1382007281, 1381992288), tzone = "UTC", class = c("POSIXct",
> "POSIXt")), end = structure(c(1381992405, 1381993727, 1382010694,
> 1382007461, 1381992468), tzone = "UTC", class = c("POSIXct",
> "POSIXt"))), .Names = c("Date", "start", "end"), row.names = c(NA,
> -5L), class = "data.frame")
> 
> 
> b <- structure(list(Date = structure(c(1369699200, 1369699200, 1369699200,
> 1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
> 1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
> 1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
> 1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
> 1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
> 1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
> 1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
> 1369699200, 1369699200, 1369699200, 1369699200, 1369699200), tzone = "UTC",
> class = c("POSIXct",
> "POSIXt")), Time2 = structure(c(1381989634, 1381989694, 1381989754,
> 1381989814, 1381989874, 1381989934, 1381989994, 1381990054, 1381990114,
> 1381990174, 1381990234, 1381990294, 1381990354, 1381990414, 1381990474,
> 1381990534, 1381990594, 1381990654, 1381990714, 1381990774, 1381990834,
> 1381990894, 1381990954, 1381991014, 1381991074, 1381991134, 1381991194,
> 1381991254, 1381991314, 1381991374, 1381991434, 1381991494, 1381991554,
> 1381991614, 1381991674, 1381991734, 1381991794, 1381991854, 1381991914,
> 1381991974, 1381992034, 1381992094, 1381992154, 1381992214, 1381992274,
> 1381992334, 1381992394, 1381992454, 1381992514, 1381992574), tzone = "UTC",
> class = c("POSIXct",
> "POSIXt")), Power = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
> 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
> 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44,
> 45, 46, 47, 48, 49, 50)), .Names = c("Date", "Time2", "Power"
> ), row.names = c(NA, -50L), class = "data.frame")
> 
> Thanks for your help!
> 

David Winsemius
Alameda, CA, USA



[R] crr question in library(cmprsk)

2013-10-17 Thread Elan InP
Hi all

I do not understand why I am getting the following error message. Can 
anybody help me with this? Thanks in advance.

install.packages("cmprsk")
library(cmprsk)
result1 <-crr(ftime, fstatus, cov1, failcode=1, cencode=0 )
one.pout1 = predict(result1,cov1,X=cbind(1,one.z1,one.z2))

predict.crr(result1,cov1,X=cbind(1,one.z1,one.z2))
Error: could not find function "predict.crr"
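
A possible explanation, offered as an assumption rather than a confirmed
diagnosis: predict.crr is probably an S3 method that the package registers
but does not export, so it cannot be called by its full name, while the
generic predict() still dispatches to it. The runnable part below shows the
same mechanism with lm from base R; the cmprsk lines are left as comments
because they need the poster's objects, and the exact arguments accepted by
the crr method should be checked in its help page.

```r
# S3 dispatch illustrated with a base R model
fit <- lm(dist ~ speed, data = cars)

p1 <- predict(fit)                  # generic; dispatches on class(fit)
m  <- getS3method("predict", "lm")  # retrieve the method itself
p2 <- m(fit)
print(identical(p1, p2))

# With cmprsk loaded, the analogous calls would be (not run here; assumes
# result1 and cov1 from the post):
#   pred <- predict(result1, cov1)
#   getS3method("predict", "crr")
```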





Re: [R] Selecting maximums between different variables

2013-10-17 Thread Law, Jason
See ?pmax for getting the max for each year.

do.call('pmax', oil[-1])

Or equivalently:

pmax(oil$TX, oil$CA, oil$AL, oil$ND)

apply and which.max will give you the index:

i <- apply(oil[-1], 1, which.max)

which you can use to extract the state:

names(oil[-1])[i]

Jason
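
The calls above can be put together as one runnable sketch, using the oil
data frame from the original question below. The names best, i, state and
result are introduced here only for illustration.

```r
oil <- data.frame(YEAR = c(2011, 2012),
                  TX = c(2, 3),
                  CA = c(4, 25000),
                  AL = c(2, 21000),
                  ND = c(21000, 6))

# Row-wise maximum across the state columns (everything except YEAR)
best <- do.call(pmax, oil[-1])

# Column index of the maximum in each row, then the matching state name
i     <- apply(oil[-1], 1, which.max)
state <- names(oil[-1])[i]

result <- data.frame(YEAR = oil$YEAR, state = state, production = best)
print(result)
```

For these data the top state is ND in 2011 and CA in 2012.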

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Tim Umbach
Sent: Thursday, October 17, 2013 9:49 AM
To: r-help@r-project.org
Subject: [R] Selecting maximums between different variables

Hi there,

another beginner's question, I'm afraid. Basically I want to select the maximum 
of values that correspond to different variables. I have a table of oil 
production that looks somewhat like this:

oil <- data.frame( YEAR = c(2011, 2012),
                   TX = c(2, 3),
                   CA = c(4, 25000),
                   AL = c(2, 21000),
                   ND = c(21000, 6))

Now I want to find out, which state produced most oil in a given year. I tried 
this:

attach(oil)
last_year = oil[ c(YEAR == 2012), ]
max(last_year)

Which works, but it doesn't give me the corresponding values (i.e. it just 
gives me the maximum output, not what state it's from).
So I tried this:

oil[c(oil == max(last_year)),]
and this:
oil[c(last_year == max(last_year)),]
and this:
oil[which.max(last_year),]
and this:
last_year[max(last_year),]

None of them work, but they don't give error messages either; the output is 
just "NA". The problem is, in my eyes, that I'm comparing the values of 
different variables with each other. Because if I change the structure of the 
data frame (which I can't do with the real data, at least not without doing it 
by hand with a huge dataset), it looks like this and works perfectly:

oil2 <- data.frame (
  names = c('YEAR', 'TX', 'CA', 'AL', 'ND'),
  oil_2011 = c(2011, 2, 4, 2, 21000),
  oil_2012 = c(2012, 3, 25000, 21000, 6)
  )
attach(oil2)
oil2[c(oil_2012 == max(oil_2012)),]

Any help is much appreciated.

Thanks, Tim Umbach



Re: [R] Incorporate Julia into R

2013-10-17 Thread Suzen, Mehmet
On 17 October 2013 15:38, Timo Schmid  wrote:
> I have some code in R with a lot of matrix multiplication and inverting. R 
> can be very slow for larger matrices like 5000x5000.
> I have seen the new programming language Julia (www.julialang.org) which is 
> quite fast in doing matrix algebra.

It's not Julia, but LAPACK.
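
To illustrate the remark above: R also hands dense matrix work to BLAS and
LAPACK routines, so speed depends largely on which BLAS R is linked against
and on how the problem is posed. The sketch below is an assumption about the
kind of computation involved (a symmetric positive definite system, kept
small so it runs quickly); it shows two ways of solving without forming a
general inverse.

```r
set.seed(1)
n <- 200
A <- crossprod(matrix(rnorm(n * n), n)) + diag(n)  # symmetric positive definite
b <- rnorm(n)

# Solve A x = b directly (LAPACK LU-based), without computing solve(A)
x1 <- solve(A, b)

# For symmetric positive definite A, a Cholesky factor is usually cheaper
R  <- chol(A)                                # A = t(R) %*% R
x2 <- backsolve(R, forwardsolve(t(R), b))    # two triangular solves

print(all.equal(x1, x2))
```

Whether this helps for 5000x5000 matrices depends on the structure of the
actual matrices and on the BLAS in use.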



[R] speeding up a loop

2013-10-17 Thread Ye Lin
Hey R professionals,

I have a large dataset and I want to run a loop on it basically creating a
new column which gathers information from another reference table.

When I run the code, R just freezes and does not even respond after 30min,
which is really unusual. I tried sapply as well but it does not improve at
all.

I am running R 3.0.2 on Windows 7.  I checked the system, when I run the
code, my CPU usage is about 25%-30% that is taxing my desktop.

Here is my code:

#df1 is the data set I want to add a new column#
#b is the reference table#

for (i in (1:nrow(df1))) {
  begin=which(b$Time2==df1$start[i] & b$Date==df1$Date[i])
  date=unlist(strsplit(as.character(dff$end[i])," "))[1]
   end=ifelse(date=="2013-10-17",
   which(b$Time2==df1$end[i] & b$Date==df1$Date[i]),
   which(b$Time2==df1$end[i]-3600*24 & b$Date==as.Date(df1$Date[i])+1))
df1$new[i] <- sum(b[begin:end,]$Power)
}

And here is a mimic sample of df1 & b:

df1 <- structure(list(Date = structure(c(1369699200, 1369699200,
1369699200,
1369699200, 1369699200), tzone = "UTC", class = c("POSIXct",
"POSIXt")), start = structure(c(1381991205, 1381990247, 1382010454,
1382007281, 1381992288), tzone = "UTC", class = c("POSIXct",
"POSIXt")), end = structure(c(1381992405, 1381993727, 1382010694,
1382007461, 1381992468), tzone = "UTC", class = c("POSIXct",
"POSIXt"))), .Names = c("Date", "start", "end"), row.names = c(NA,
-5L), class = "data.frame")


b <- structure(list(Date = structure(c(1369699200, 1369699200, 1369699200,
1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
1369699200, 1369699200, 1369699200, 1369699200, 1369699200, 1369699200,
1369699200, 1369699200, 1369699200, 1369699200, 1369699200), tzone = "UTC",
class = c("POSIXct",
"POSIXt")), Time2 = structure(c(1381989634, 1381989694, 1381989754,
1381989814, 1381989874, 1381989934, 1381989994, 1381990054, 1381990114,
1381990174, 1381990234, 1381990294, 1381990354, 1381990414, 1381990474,
1381990534, 1381990594, 1381990654, 1381990714, 1381990774, 1381990834,
1381990894, 1381990954, 1381991014, 1381991074, 1381991134, 1381991194,
1381991254, 1381991314, 1381991374, 1381991434, 1381991494, 1381991554,
1381991614, 1381991674, 1381991734, 1381991794, 1381991854, 1381991914,
1381991974, 1381992034, 1381992094, 1381992154, 1381992214, 1381992274,
1381992334, 1381992394, 1381992454, 1381992514, 1381992574), tzone = "UTC",
class = c("POSIXct",
"POSIXt")), Power = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44,
45, 46, 47, 48, 49, 50)), .Names = c("Date", "Time2", "Power"
), row.names = c(NA, -50L), class = "data.frame")

Thanks for your help!



Re: [R] Subseting a data.frame

2013-10-17 Thread William Dunlap
seq_along(x), integer(length(x)), is.na(x), or anything that produces an integer
(or numeric or logical) vector the length of x would work.  I use integer() or
numeric() to indicate I'm not using its value: it is just a vector in which to
place the return values of FUN().

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
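
A compact, runnable version of the idiom discussed in this thread, using the
example data from the earlier message quoted below. It shows that an integer
placeholder as the first argument of ave() gives per-group counts whatever
the type of the grouping variable, and that the counts can then be used to
subset the data frame.

```r
dat <- data.frame(basel_asset_class = c(4, 8, 8, 8, 74, 3, 74),
                  defa_frequency    = (1:7) / 8)

# The first argument is only a container for FUN's results
count_by_class <- with(dat, ave(integer(length(basel_asset_class)),
                                basel_asset_class, FUN = length))

# Keep only the classes that occur more than once
kept <- dat[count_by_class > 1, ]
print(kept)

# The same idiom with a character grouping variable
char <- c("Two", "Three", "Two", "Two")
counts_char <- ave(seq_along(char), char, FUN = length)
print(counts_char)
```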


> -----Original Message-----
> From: arun [mailto:smartpink...@yahoo.com]
> Sent: Thursday, October 17, 2013 2:33 PM
> To: R help
> Cc: William Dunlap; Bert Gunter
> Subject: Re: [R] Subseting a data.frame
> 
> Hi Bill,
> 
> #seq_along() worked in the cases you showed.
> 
>  ave(seq_along(fac),fac,FUN=length)
> #[1] 3 1 3 3
>   ave(seq_along(num), num, FUN=length)
> #[1] 3 1 3 3
>   ave(seq_along(char), char, FUN=length)
> #[1] 3 1 3 3
> 
> 
> 
> I thought, there might be some advantages in speed, but they were similar in 
> speed.
> set.seed(195)
>  num1 <- sample(1e3,1e7,replace=TRUE)
>  system.time(res1 <- ave(integer(length(num1)),num1,FUN=length))
>   # user  system elapsed
>   #4.148   0.228   4.382
> system.time(res2 <- ave(seq_along(num1),num1,FUN=length))
> #   user  system elapsed
>  # 3.944   0.228   4.181
> system.time(res3 <- ave(num1,num1,FUN=length))
> #   user  system elapsed
>  # 3.740   0.264   4.012
> identical(res1,res2)
> #[1] TRUE
>  identical(res2,res3)
> #[1] TRUE
> 
> 
> A.K.
> 
> 
> 
> 
> On Thursday, October 17, 2013 4:34 PM, William Dunlap  
> wrote:
>   May I ask why:
>     count_by_class <- with(dat, ave(numeric(length(basel_asset_class)), 
> basel_asset_class,
> FUN=length))
>   should not be more simply done as:
>     count_by_class <- with(dat, ave(basel_asset_class, basel_asset_class, 
> FUN=length))
> 
> The way I did it would work if basel_asset_class were non-numeric.
> In ave(x, group, FUN=FUN), FUN's return value should be the same type as x (or
> you can get some odd type conversions).  E.g.,
> 
>    > num <- c(2,3,2,2) ;  char <- c("Two","Three","Two","Two")
>    > ave(num, num, FUN=length) # good
>    [1] 3 1 3 3
>    > ave(char, char, FUN=length) # bad
>    [1] "3" "1" "3" "3"
>    > fac <- factor(char, levels=c("One","Two","Three"))
>    > ave(fac, fac, FUN=length)
>    [1] <NA> <NA> <NA> <NA>
>    Levels: One Two Three
>    Warning messages:
>    1: In `[<-.factor`(`*tmp*`, i, value = 0L) :
>      invalid factor level, NA generated
>    2: In `[<-.factor`(`*tmp*`, i, value = 3L) :
>      invalid factor level, NA generated
>    3: In `[<-.factor`(`*tmp*`, i, value = 1L) :
>      invalid factor level, NA generated
> but x=integer(length(group)) works in all cases:
>    > ave(integer(length(fac)), fac, FUN=length)
>    [1] 3 1 3 3
>    > ave(integer(length(char)), char, FUN=length)
>       [1] 3 1 3 3
> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
> 
> From: Bert Gunter [mailto:gunter.ber...@gene.com]
> Sent: Thursday, October 17, 2013 1:06 PM
> To: William Dunlap
> Cc: Katherine Gobin; r-help@r-project.org
> Subject: Re: [R] Subseting a data.frame
> 
> May I ask why:
> 
> count_by_class <- with(dat, ave(numeric(length(basel_
> asset_class)), basel_asset_class, FUN=length))
> should not be more simply done as:
> 
> count_by_class <- with(dat, ave(basel_asset_class, basel_asset_class, 
> FUN=length))
> 
> ?
> -- Bert
> 
> On Thu, Oct 17, 2013 at 12:36 PM, William Dunlap <wdun...@tibco.com> wrote:
> > What I need is to select only those records for which there are more than 
> > two default
> > frequencies (defa_frequency),
> 
> Here is one way.  There are many others:
>    > dat <- data.frame( # slightly less trivial example
>         basel_asset_class=c(4,8,8,8,74,3,74),
>         defa_frequency=(1:7)/8)
>    > count_by_class <- with(dat, ave(numeric(length(basel_asset_class)),
> basel_asset_class, FUN=length))
>    > cbind(dat, count_by_class) # see what we just computed
>      basel_asset_class defa_frequency count_by_class
>    1                 4          0.125              1
>    2                 8          0.250              3
>    3                 8          0.375              3
>    4                 8          0.500              3
>    5                74          0.625              2
>    6                 3          0.750              1
>    7                74          0.875              2
>    > dat[count_by_class>1, ] # I think this is what you are asking for
>      basel_asset_class defa_frequency
>    2                 8          0.250
>    3                 8          0.375
>    4                 8          0.500
>    5                74          0.625
>    7                74          0.875
> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
> 
> 
> > -----Original Message-----
> > From: r-help-boun...@r-project.org 
> > [mailto:r-
> help-boun...@r-project.org] On Behalf
> > Of Katherine Gobin
> > Sent: Thursday, October 17, 2013 11:05 AM
> > To: Bert Gunter
> > Cc: r-help@r-project.org

Re: [R] Subseting a data.frame

2013-10-17 Thread arun
Hi Bill,

#seq_along() worked in the cases you showed.

 ave(seq_along(fac),fac,FUN=length)
#[1] 3 1 3 3
  ave(seq_along(num), num, FUN=length) 
#[1] 3 1 3 3
  ave(seq_along(char), char, FUN=length) 
#[1] 3 1 3 3



I thought, there might be some advantages in speed, but they were similar in 
speed.
set.seed(195)
 num1 <- sample(1e3,1e7,replace=TRUE)
 system.time(res1 <- ave(integer(length(num1)),num1,FUN=length))
  # user  system elapsed 
  #4.148   0.228   4.382 
system.time(res2 <- ave(seq_along(num1),num1,FUN=length))
#   user  system elapsed 
 # 3.944   0.228   4.181 
system.time(res3 <- ave(num1,num1,FUN=length))
#   user  system elapsed 
 # 3.740   0.264   4.012 
identical(res1,res2)
#[1] TRUE
 identical(res2,res3)
#[1] TRUE


A.K. 




On Thursday, October 17, 2013 4:34 PM, William Dunlap  wrote:
  May I ask why:
    count_by_class <- with(dat, ave(numeric(length(basel_asset_class)), 
basel_asset_class, FUN=length))
  should not be more simply done as:
    count_by_class <- with(dat, ave(basel_asset_class, basel_asset_class, 
FUN=length))

The way I did it would work if basel_asset_class were non-numeric.
In ave(x, group, FUN=FUN), FUN's return value should be the same type as x (or
you can get some odd type conversions).  E.g.,

   > num <- c(2,3,2,2) ;  char <- c("Two","Three","Two","Two")
   > ave(num, num, FUN=length) # good
   [1] 3 1 3 3
   > ave(char, char, FUN=length) # bad
   [1] "3" "1" "3" "3"
   > fac <- factor(char, levels=c("One","Two","Three"))
   > ave(fac, fac, FUN=length)
   [1] <NA> <NA> <NA> <NA>
   Levels: One Two Three
   Warning messages:
   1: In `[<-.factor`(`*tmp*`, i, value = 0L) :
     invalid factor level, NA generated
   2: In `[<-.factor`(`*tmp*`, i, value = 3L) :
     invalid factor level, NA generated
   3: In `[<-.factor`(`*tmp*`, i, value = 1L) :
     invalid factor level, NA generated
but x=integer(length(group)) works in all cases:
   > ave(integer(length(fac)), fac, FUN=length)
   [1] 3 1 3 3
   > ave(integer(length(char)), char, FUN=length)
      [1] 3 1 3 3

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

From: Bert Gunter [mailto:gunter.ber...@gene.com]
Sent: Thursday, October 17, 2013 1:06 PM
To: William Dunlap
Cc: Katherine Gobin; r-help@r-project.org
Subject: Re: [R] Subseting a data.frame

May I ask why:

count_by_class <- with(dat, ave(numeric(length(basel_
asset_class)), basel_asset_class, FUN=length))
should not be more simply done as:

count_by_class <- with(dat, ave(basel_asset_class, basel_asset_class, 
FUN=length))

?
-- Bert

On Thu, Oct 17, 2013 at 12:36 PM, William Dunlap <wdun...@tibco.com> wrote:
> What I need is to select only those records for which there are more than two 
> default
> frequencies (defa_frequency),

Here is one way.  There are many others:
   > dat <- data.frame( # slightly less trivial example
        basel_asset_class=c(4,8,8,8,74,3,74),
        defa_frequency=(1:7)/8)
   > count_by_class <- with(dat, ave(numeric(length(basel_asset_class)), 
basel_asset_class, FUN=length))
   > cbind(dat, count_by_class) # see what we just computed
     basel_asset_class defa_frequency count_by_class
   1                 4          0.125              1
   2                 8          0.250              3
   3                 8          0.375              3
   4                 8          0.500              3
   5                74          0.625              2
   6                 3          0.750              1
   7                74          0.875              2
   > dat[count_by_class>1, ] # I think this is what you are asking for
     basel_asset_class defa_frequency
   2                 8          0.250
   3                 8          0.375
   4                 8          0.500
   5                74          0.625
   7                74          0.875

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On 
> Behalf
> Of Katherine Gobin
> Sent: Thursday, October 17, 2013 11:05 AM
> To: Bert Gunter
> Cc: r-help@r-project.org
> Subject: Re: [R] Subseting a data.frame
>
> Correction. (2nd para first three lines)
>
> Pl read following line
>
> What I need is to select only those records for which there are more than two 
> default
> frequencies (defa_frequency), Thus, there is only one default frequency = 
> 0.150 w.r.t
> basel_asset_class = 4 whereas there are default frequencies w.r.t. basel 
> aseet class 4,
>
>
> as
>
> What I need is to select only those records for which there are more than two 
> default
> frequencies (defa_frequency), Thus, there is only one default frequency = 
> 0.150 w.r.t
> basel_asset_class = 4 whereas there are THREE default frequencies w.r.t. 
> basel aseet
> class 8,
>
>
>
> I apologize for the inconvenience.
>
> Regards
>
> KAtherine
>
>
>
>
>
>
>
>
> On , Katherine Gobin 

Re: [R] Subseting a data.frame

2013-10-17 Thread Bert Gunter
Thanks, Bill.

But ?ave specifically says:

ave(x, ..., FUN = mean)

Arguments:
x

A numeric.

So that it should not be expected to work properly if the argument is
not (coercible to) numeric. Nevertheless, defensive programming is
always wise.

Cheers,
Bert


On Thu, Oct 17, 2013 at 1:34 PM, William Dunlap  wrote:
>   May I ask why:
> count_by_class <- with(dat, ave(numeric(length(basel_asset_class)),
> basel_asset_class, FUN=length))
>
>   should not be more simply done as:
> count_by_class <- with(dat, ave(basel_asset_class, basel_asset_class,
> FUN=length))
>
> The way I did it would work if basel_asset_class were non-numeric.
>
> In ave(x, group, FUN=FUN), FUN's return value should be the same type as x
> (or you can get some odd type conversions).  E.g.,
>
>    > num <- c(2,3,2,2) ;  char <- c("Two","Three","Two","Two")
>    > ave(num, num, FUN=length) # good
>    [1] 3 1 3 3
>    > ave(char, char, FUN=length) # bad
>    [1] "3" "1" "3" "3"
>    > fac <- factor(char, levels=c("One","Two","Three"))
>    > ave(fac, fac, FUN=length)
>    [1] <NA> <NA> <NA> <NA>
>    Levels: One Two Three
>    Warning messages:
>    1: In `[<-.factor`(`*tmp*`, i, value = 0L) :
>      invalid factor level, NA generated
>    2: In `[<-.factor`(`*tmp*`, i, value = 3L) :
>      invalid factor level, NA generated
>    3: In `[<-.factor`(`*tmp*`, i, value = 1L) :
>      invalid factor level, NA generated
>
> but x=integer(length(group)) works in all cases:
>
>    > ave(integer(length(fac)), fac, FUN=length)
>    [1] 3 1 3 3
>    > ave(integer(length(char)), char, FUN=length)
>    [1] 3 1 3 3
>
>
>
> Bill Dunlap
>
> Spotfire, TIBCO Software
>
> wdunlap tibco.com
>
>
>
> From: Bert Gunter [mailto:gunter.ber...@gene.com]
> Sent: Thursday, October 17, 2013 1:06 PM
> To: William Dunlap
> Cc: Katherine Gobin; r-help@r-project.org
> Subject: Re: [R] Subseting a data.frame
>
>
>
> May I ask why:
>
> count_by_class <- with(dat, ave(numeric(length(basel_
>
> asset_class)), basel_asset_class, FUN=length))
>
> should not be more simply done as:
>
> count_by_class <- with(dat, ave(basel_asset_class, basel_asset_class,
> FUN=length))
>
> ?
>
> -- Bert
>
>
>
> On Thu, Oct 17, 2013 at 12:36 PM, William Dunlap  wrote:
>
>> What I need is to select only those records for which there are more than
>> two default
>> frequencies (defa_frequency),
>
> Here is one way.  There are many others:
>    > dat <- data.frame( # slightly less trivial example
>          basel_asset_class=c(4,8,8,8,74,3,74),
>          defa_frequency=(1:7)/8)
>    > count_by_class <- with(dat, ave(numeric(length(basel_asset_class)),
>          basel_asset_class, FUN=length))
>    > cbind(dat, count_by_class) # see what we just computed
>      basel_asset_class defa_frequency count_by_class
>    1                 4          0.125              1
>    2                 8          0.250              3
>    3                 8          0.375              3
>    4                 8          0.500              3
>    5                74          0.625              2
>    6                 3          0.750              1
>    7                74          0.875              2
>    > dat[count_by_class>1, ] # I think this is what you are asking for
>      basel_asset_class defa_frequency
>    2                 8          0.250
>    3                 8          0.375
>    4                 8          0.500
>    5                74          0.625
>    7                74          0.875
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>
>> -Original Message-
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
>> On Behalf
>> Of Katherine Gobin
>> Sent: Thursday, October 17, 2013 11:05 AM
>> To: Bert Gunter
>> Cc: r-help@r-project.org
>> Subject: Re: [R] Subseting a data.frame
>>
>> Correction (2nd paragraph, first three lines).
>>
>> Please read the following lines
>>
>> What I need is to select only those records for which there are more than two
>> default frequencies (defa_frequency). Thus, there is only one default frequency =
>> 0.150 w.r.t. basel_asset_class = 4 whereas there are default frequencies w.r.t.
>> basel asset class 4,
>>
>> as
>>
>> What I need is to select only those records for which there are more than two
>> default frequencies (defa_frequency). Thus, there is only one default frequency =
>> 0.150 w.r.t. basel_asset_class = 4 whereas there are THREE default frequencies
>> w.r.t. basel asset class 8,
>>
>> I apologize for the inconvenience.
>>
>> Regards
>>
>> Katherine
>>
>> On , Katherine Gobin  wrote:
>>
>>  I am sorry perhaps  was not able to put the question properly. I am not
>> looking for the
>> subset of the data.frame where the basel_asset_class is > 2. I do agree
>> that would have
>> been a basic requirement. Let me try to put the question again.
>>
>> I have a data frame as
>>
>> mydat = data.frame(basel_asset_class = c(4, 8, 8 ,8), defa_frequency =
>>

Re: [R] Subseting a data.frame

2013-10-17 Thread William Dunlap
  May I ask why:
count_by_class <- with(dat, ave(numeric(length(basel_asset_class)), 
basel_asset_class, FUN=length))
  should not be more simply done as:
count_by_class <- with(dat, ave(basel_asset_class, basel_asset_class, 
FUN=length))

The way I did it would work if basel_asset_class were non-numeric.
In ave(x, group, FUN=FUN), FUN's return value should be the same type as x (or
you can get some odd type conversions).  E.g.,

   > num <- c(2,3,2,2) ;  char <- c("Two","Three","Two","Two")
   > ave(num, num, FUN=length) # good
   [1] 3 1 3 3
   > ave(char, char, FUN=length) # bad
   [1] "3" "1" "3" "3"
   > fac <- factor(char, levels=c("One","Two","Three"))
   > ave(fac, fac, FUN=length)
   [1] <NA> <NA> <NA> <NA>
   Levels: One Two Three
   Warning messages:
   1: In `[<-.factor`(`*tmp*`, i, value = 0L) :
     invalid factor level, NA generated
   2: In `[<-.factor`(`*tmp*`, i, value = 3L) :
     invalid factor level, NA generated
   3: In `[<-.factor`(`*tmp*`, i, value = 1L) :
     invalid factor level, NA generated
but x=integer(length(group)) works in all cases:
   > ave(integer(length(fac)), fac, FUN=length)
   [1] 3 1 3 3
   > ave(integer(length(char)), char, FUN=length)
   [1] 3 1 3 3
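
For what it is worth, here is a sketch of another way to get per-row group counts that sidesteps the question of x's type in ave(): count the groups with table() and look each row's group up by name. It uses base R only. The data frame is the seven-row example from this thread and count_by_class is just the name used there; rows with an NA group would get an NA count, which this sketch does not handle.

```r
# Seven-row example data, as used elsewhere in this thread.
dat <- data.frame(
  basel_asset_class = c(4, 8, 8, 8, 74, 3, 74),
  defa_frequency    = (1:7) / 8
)

# table() counts rows per group whatever the type of the grouping column.
counts <- table(dat$basel_asset_class)

# Look up each row's group size by name, then drop the table attributes.
count_by_class <- as.vector(counts[as.character(dat$basel_asset_class)])

count_by_class
dat[count_by_class > 1, ]
```

The same two lines should work unchanged if basel_asset_class is a character vector or a factor, since the lookup is always done on the character form of the group label.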

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

From: Bert Gunter [mailto:gunter.ber...@gene.com]
Sent: Thursday, October 17, 2013 1:06 PM
To: William Dunlap
Cc: Katherine Gobin; r-help@r-project.org
Subject: Re: [R] Subseting a data.frame

May I ask why:

count_by_class <- with(dat, ave(numeric(length(basel_
asset_class)), basel_asset_class, FUN=length))
should not be more simply done as:

count_by_class <- with(dat, ave(basel_asset_class, basel_asset_class, 
FUN=length))

?
-- Bert

On Thu, Oct 17, 2013 at 12:36 PM, William Dunlap <wdun...@tibco.com> wrote:
> What I need is to select only those records for which there are more than two 
> default
> frequencies (defa_frequency),

Here is one way.  There are many others:
   > dat <- data.frame( # slightly less trivial example
         basel_asset_class=c(4,8,8,8,74,3,74),
         defa_frequency=(1:7)/8)
   > count_by_class <- with(dat, ave(numeric(length(basel_asset_class)),
         basel_asset_class, FUN=length))
   > cbind(dat, count_by_class) # see what we just computed
     basel_asset_class defa_frequency count_by_class
   1                 4          0.125              1
   2                 8          0.250              3
   3                 8          0.375              3
   4                 8          0.500              3
   5                74          0.625              2
   6                 3          0.750              1
   7                74          0.875              2
   > dat[count_by_class>1, ] # I think this is what you are asking for
     basel_asset_class defa_frequency
   2                 8          0.250
   3                 8          0.375
   4                 8          0.500
   5                74          0.625
   7                74          0.875

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On 
> Behalf
> Of Katherine Gobin
> Sent: Thursday, October 17, 2013 11:05 AM
> To: Bert Gunter
> Cc: r-help@r-project.org
> Subject: Re: [R] Subseting a data.frame
>
> Correction (2nd paragraph, first three lines).
>
> Please read the following lines
>
> What I need is to select only those records for which there are more than two
> default frequencies (defa_frequency). Thus, there is only one default frequency =
> 0.150 w.r.t. basel_asset_class = 4 whereas there are default frequencies w.r.t.
> basel asset class 4,
>
> as
>
> What I need is to select only those records for which there are more than two
> default frequencies (defa_frequency). Thus, there is only one default frequency =
> 0.150 w.r.t. basel_asset_class = 4 whereas there are THREE default frequencies
> w.r.t. basel asset class 8,
>
> I apologize for the inconvenience.
>
> Regards
>
> Katherine
>
> On , Katherine Gobin <katherine_go...@yahoo.com> wrote:
>
>  I am sorry perhaps  was not able to put the question properly. I am not 
> looking for the
> subset of the data.frame where the basel_asset_class is > 2. I do agree that 
> would have
> been a basic requirement. Let me try to put the question again.
>
> I have a data frame as
>
> mydat = data.frame(basel_asset_class = c(4, 8, 8 ,8), defa_frequency = 
> c(0.15, 0.07, 0.03,
> 0.001))
>
> # Please note I have changed the basel_asset_class to 4 from 2, to avoid 
> confusion.
>
> > mydat
>   basel_asset_class defa_frequency
> 1                 4          0.150
> 2                 8          0.070
> 3                 8          0.030
> 4                 8          0.001
>
>
>
> This is just an representative example. In reality, I may have no of basel 
> asset c

Re: [R] Subseting a data.frame

2013-10-17 Thread Bert Gunter
May I ask why:

count_by_class <- with(dat, ave(numeric(length(basel_
asset_class)), basel_asset_class, FUN=length))

should not be more simply done as:

count_by_class <- with(dat, ave(basel_asset_class, basel_asset_class,
FUN=length))

?

-- Bert


On Thu, Oct 17, 2013 at 12:36 PM, William Dunlap  wrote:

> > What I need is to select only those records for which there are more
> than two default
> > frequencies (defa_frequency),
>
> Here is one way.  There are many others:
>    > dat <- data.frame( # slightly less trivial example
>          basel_asset_class=c(4,8,8,8,74,3,74),
>          defa_frequency=(1:7)/8)
>    > count_by_class <- with(dat, ave(numeric(length(basel_asset_class)),
>          basel_asset_class, FUN=length))
>    > cbind(dat, count_by_class) # see what we just computed
>      basel_asset_class defa_frequency count_by_class
>    1                 4          0.125              1
>    2                 8          0.250              3
>    3                 8          0.375              3
>    4                 8          0.500              3
>    5                74          0.625              2
>    6                 3          0.750              1
>    7                74          0.875              2
>    > dat[count_by_class>1, ] # I think this is what you are asking for
>      basel_asset_class defa_frequency
>    2                 8          0.250
>    3                 8          0.375
>    4                 8          0.500
>    5                74          0.625
>    7                74          0.875
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>
> > -Original Message-
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf
> > Of Katherine Gobin
> > Sent: Thursday, October 17, 2013 11:05 AM
> > To: Bert Gunter
> > Cc: r-help@r-project.org
> > Subject: Re: [R] Subseting a data.frame
> >
> > Correction (2nd paragraph, first three lines).
> >
> > Please read the following lines
> >
> > What I need is to select only those records for which there are more than two
> > default frequencies (defa_frequency). Thus, there is only one default frequency =
> > 0.150 w.r.t. basel_asset_class = 4 whereas there are default frequencies w.r.t.
> > basel asset class 4,
> >
> > as
> >
> > What I need is to select only those records for which there are more than two
> > default frequencies (defa_frequency). Thus, there is only one default frequency =
> > 0.150 w.r.t. basel_asset_class = 4 whereas there are THREE default frequencies
> > w.r.t. basel asset class 8,
> >
> > I apologize for the inconvenience.
> >
> > Regards
> >
> > Katherine
> >
> > On , Katherine Gobin  wrote:
> >
> >  I am sorry perhaps  was not able to put the question properly. I am not
> looking for the
> > subset of the data.frame where the basel_asset_class is > 2. I do agree
> that would have
> > been a basic requirement. Let me try to put the question again.
> >
> > I have a data frame as
> >
> > mydat = data.frame(basel_asset_class = c(4, 8, 8 ,8), defa_frequency =
> c(0.15, 0.07, 0.03,
> > 0.001))
> >
> > # Please note I have changed the basel_asset_class to 4 from 2, to avoid
> confusion.
> >
> > > mydat
> >   basel_asset_class defa_frequency
> > 1                 4          0.150
> > 2                 8          0.070
> > 3                 8          0.030
> > 4                 8          0.001
> >
> >
> >
> > This is just an representative example. In reality, I may have no of
> basel asset classes. 4, 8
> > etc are the IDs can be anything thus I cant hard code it as subset(mydat,
> > mydat$basel_asset_class > 2).
> >
> >
> > What I need is to select only those records for which there are more
> than two default
> > frequencies (defa_frequency), Thus, there is only one default frequency
> = 0.150 w.r.t
> > basel_asset_class = 4 whereas there are default frequencies w.r.t. basel
> aseet class 4,
> > similarly there could be another basel asset class having say 5 default
> frequncies. Thus, I
> > need to take subset of the data.frame s.t. the no of corresponding
> defa_frequencies is
> > greater than 2.
> >
> > The idea is we try to fit exponential curve Y = A exp( BX ) for each of
> the basel asset
> > classes and to estimate values of A and B, mathematically one needs to
> have at least two
> > values of X.
> >
> > I hope I may be able to express my requirement. Its not that I need the
> subset of mydat
> > s.t. basel asset class is > 2 (now 4 in revised example), but sbuset
> s.t. no of default
> > frequencies is greater than or equal to 2. This 2 is not same as basel
> asset class 2.
> >
> > Kindly guide
> >
> > With warm regards
> >
> > Katherine Gobin
> >
> >
> >
> >
> > On Thursday, 17 October 2013 9:33 PM, Bert Gunter <
> gunter.ber...@gene.com> wrote:
> >
> > "Kindly guide" ...
> >
> > This is a very basic question, so the kindest guide I can give is to
> read an Introduction to R
> > (ships with R) or a R web tutorial of your choi

Re: [R] Weighted regression markers on scatter plots

2013-10-17 Thread Jim Lemon

On 10/17/2013 04:04 AM, Msugarman wrote:

Hi all,

I'm trying to graph the results of a weighted regression analysis. Is anyone
aware of a way to make my markers appear at different sizes, consistent
with their respective weights?


Hi Mike,
Have a look at the "size_n_color" function in the plotrix package.

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Subseting a data.frame

2013-10-17 Thread William Dunlap
> What I need is to select only those records for which there are more than two 
> default
> frequencies (defa_frequency),

Here is one way.  There are many others:
   > dat <- data.frame( # slightly less trivial example
         basel_asset_class=c(4,8,8,8,74,3,74),
         defa_frequency=(1:7)/8)
   > count_by_class <- with(dat, ave(numeric(length(basel_asset_class)),
         basel_asset_class, FUN=length))
   > cbind(dat, count_by_class) # see what we just computed
     basel_asset_class defa_frequency count_by_class
   1                 4          0.125              1
   2                 8          0.250              3
   3                 8          0.375              3
   4                 8          0.500              3
   5                74          0.625              2
   6                 3          0.750              1
   7                74          0.875              2
   > dat[count_by_class>1, ] # I think this is what you are asking for
     basel_asset_class defa_frequency
   2                 8          0.250
   3                 8          0.375
   4                 8          0.500
   5                74          0.625
   7                74          0.875
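
If this is needed more than once, the ave() idiom above could be wrapped up roughly as follows. This is only a sketch: keep_groups() is a made-up name, not a base R function, and since the thread asks for both "more than two" and "at least two" frequencies per class, the threshold is left as an argument rather than hard-coded.

```r
# Hypothetical helper (not part of base R): keep only the rows whose group
# has at least `min_n` rows.
keep_groups <- function(df, group, min_n = 2L) {
  g <- df[[group]]
  # x is an integer vector so that FUN = length returns the same type as x.
  n_in_group <- ave(integer(length(g)), g, FUN = length)
  df[n_in_group >= min_n, , drop = FALSE]
}

dat <- data.frame(
  basel_asset_class = c(4, 8, 8, 8, 74, 3, 74),
  defa_frequency    = (1:7) / 8
)

kept2 <- keep_groups(dat, "basel_asset_class", min_n = 2L)  # classes 8 and 74
kept3 <- keep_groups(dat, "basel_asset_class", min_n = 3L)  # class 8 only
kept2
kept3
```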

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf
> Of Katherine Gobin
> Sent: Thursday, October 17, 2013 11:05 AM
> To: Bert Gunter
> Cc: r-help@r-project.org
> Subject: Re: [R] Subseting a data.frame
> 
> Correction (2nd paragraph, first three lines).
>
> Please read the following lines
>
> What I need is to select only those records for which there are more than two
> default frequencies (defa_frequency). Thus, there is only one default frequency =
> 0.150 w.r.t. basel_asset_class = 4 whereas there are default frequencies w.r.t.
> basel asset class 4,
>
> as
>
> What I need is to select only those records for which there are more than two
> default frequencies (defa_frequency). Thus, there is only one default frequency =
> 0.150 w.r.t. basel_asset_class = 4 whereas there are THREE default frequencies
> w.r.t. basel asset class 8,
>
> I apologize for the inconvenience.
>
> Regards
>
> Katherine
>
> On , Katherine Gobin  wrote:
> 
> I am sorry, perhaps I was not able to put the question properly. I am not
> looking for the subset of the data.frame where the basel_asset_class is > 2.
> I do agree that would have been a basic requirement. Let me try to put the
> question again.
>
> I have a data frame as
>
> mydat = data.frame(basel_asset_class = c(4, 8, 8 ,8),
>                    defa_frequency = c(0.15, 0.07, 0.03, 0.001))
>
> # Please note I have changed the basel_asset_class to 4 from 2, to avoid
> # confusion.
>
> > mydat
>   basel_asset_class defa_frequency
> 1                 4          0.150
> 2                 8          0.070
> 3                 8          0.030
> 4                 8          0.001
>
> This is just a representative example. In reality, I may have a number of
> basel asset classes. 4, 8 etc. are IDs and can be anything, so I can't hard
> code it as subset(mydat, mydat$basel_asset_class > 2).
>
> What I need is to select only those records for which there are more than two
> default frequencies (defa_frequency). Thus, there is only one default
> frequency = 0.150 w.r.t. basel_asset_class = 4 whereas there are default
> frequencies w.r.t. basel asset class 4; similarly there could be another basel
> asset class having, say, 5 default frequencies. Thus, I need to take the subset
> of the data.frame s.t. the number of corresponding defa_frequencies is greater
> than 2.
>
> The idea is that we try to fit an exponential curve Y = A exp( BX ) for each of
> the basel asset classes, and to estimate values of A and B, mathematically one
> needs to have at least two values of X.
>
> I hope I have been able to express my requirement. It's not that I need the
> subset of mydat s.t. basel asset class is > 2 (now 4 in the revised example),
> but the subset s.t. the number of default frequencies is greater than or equal
> to 2. This 2 is not the same as basel asset class 2.
> Kindly guide
> 
> With warm regards
> 
> Katherine Gobin
> 
> 
> 
> 
> On Thursday, 17 October 2013 9:33 PM, Bert Gunter  
> wrote:
> 
> "Kindly guide" ...
> 
> This is a very basic question, so the kindest guide I can give is to read an 
> Introduction to R
> (ships with R) or a R web tutorial of your choice so that you can learn how R 
> works
> instead of posting to this list.
> 
> Cheers,
> Bert
> 
> 
> 
> 
> On Wed, Oct 16, 2013 at 11:55 PM, Katherine Gobin 
> wrote:
> 
> Dear Forum,
> >
> >I have a data frame as
> >
> >mydat = data.frame(basel_asset_class = c(2, 8, 8 ,8), defa_frequency = 
> >c(0.15, 0.07,
> 0.03, 0.001))
> >
> >> mydat
> >  basel_asset_class defa_frequency
> >1                 2          0.150
> >2                 8          0.070
> >3                 8          0.030
> >4                 8 

Re: [R] Selecting maximums between different variables

2013-10-17 Thread arun
Hi,

You could also check ?data.table() as it could be faster.

#Speed comparison


set.seed(498) 
oilT <- data.frame(YEAR=rep(rep(1800:2012,50),100),
                   state=rep(rep(state.abb,each=213),100),
                   value=sample(2000:8,1065000,replace=TRUE),
                   stringsAsFactors=FALSE)
system.time(res1 <-
  oilT[as.logical(with(oilT,ave(value,list(YEAR),FUN=function(x) x%in% max(x)))),])
# user  system elapsed 
#  0.532   0.008   0.540 
 dim(res1) #as some years have duplicated maximums
#[1] 220   3

 res1[duplicated(res1[,1])|duplicated(res1[,1],fromLast=TRUE),]

library(data.table)
dt1 <- data.table(oilT,key='YEAR')
system.time( res2 <- dt1[dt1[,value %in% max(value),'YEAR']$V1])
#   user  system elapsed 
#  0.060   0.000   0.062 
 res1 <- res1[order(res1$YEAR),]
 row.names(res1) <- 1:nrow(res1)
 identical(res1,as.data.frame(res2))
#[1] TRUE


A.K.



On Thursday, October 17, 2013 1:35 PM, arun  wrote:
Hi,
You may try:


unlist(lapply(seq_len(nrow(oil)),function(i) oil[i,-1][which.max(oil[i,-1])])) 
 #  CA    ND 
#4 6 
#or
library(reshape2)

datM <- melt(oil,id.var="YEAR")


datM[as.logical(with(datM,ave(value,list(YEAR),FUN= function(x) x%in% max(x)))),]
#  YEAR variable value
#3 2011   CA 4
#8 2012   ND 6
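
A base-R sketch of the same idea that keeps the wide layout and needs no reshaping: max.col() gives, for each row of a matrix, the column holding the largest value. The production figures below are illustrative values assumed for this example (the numbers in the archived post appear to have been mangled), and `result` is just a name chosen here.

```r
# Illustrative values only; not the poster's real data.
oil <- data.frame(
  YEAR = c(2011, 2012),
  TX = c(20000, 30000),
  CA = c(40000, 25000),
  AL = c(20000, 21000),
  ND = c(21000, 60000)
)

states <- setdiff(names(oil), "YEAR")   # every column except YEAR
vals   <- as.matrix(oil[states])

# Column index of the row-wise maximum; ties go to the first column.
top <- max.col(vals, ties.method = "first")

result <- data.frame(
  YEAR  = oil$YEAR,
  state = states[top],
  value = vals[cbind(seq_len(nrow(vals)), top)]  # pick one cell per row
)
result
```

Unlike the `%in% max(x)` approach, this returns exactly one state per year, so tied maximums are silently reduced to the first one.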

A.K.




On Thursday, October 17, 2013 12:50 PM, Tim Umbach  wrote:
Hi there,

another beginner's question, I'm afraid. Basically I want to select the
maximum of values that correspond to different variables. I have a table
of oil production that looks somewhat like this:

oil <- data.frame( YEAR = c(2011, 2012),
                   TX = c(2, 3),
                   CA = c(4, 25000),
                   AL = c(2, 21000),
                   ND = c(21000,6))

Now I want to find out, which state produced most oil in a given year. I
tried this:

attach(oil)
last_year = oil[ c(YEAR == 2012), ]
max(last_year)

Which works, but it doesn't give me the corresponding values (i.e. it just
gives me the maximum output, not what state it's from).
So I tried this:

oil[c(oil == max(last_year)),]
and this:
oil[c(last_year == max(last_year)),]
and this:
oil[which.max(last_year),]
and this:
last_year[max(last_year),]

None of them work, but they don't give error messages either, the output is
just "NA". The problem is, in my eyes, that I'm comparing the values of
different variables with each other. Because if I change the structure of
the dataframe (which I can't do with the real data, at least not without
doing it by hand with a huge dataset), it looks like this and works
perfectly:

oil2 <- data.frame (
  names = c('YEAR', 'TX', 'CA', 'AL', 'ND'),
  oil_2011 = c(2011, 2, 4, 2, 21000),
  oil_2012 = c(2012, 3, 25000, 21000, 6)
  )
attach(oil2)
oil2[c(oil_2012 == max(oil_2012)),]

Any help is much appreciated.

Thanks, Tim Umbach



Re: [R] Newb: How I find random vector index?

2013-10-17 Thread Brian Diggs
On 10/17/2013 11:54 AM, Stock Beaver wrote:
> # Suppose I have a vector:
>
> myvec = c(1,0,3,0,77,9,0,1,2,0)
>
> # I want to randomly pick an element from myvec
> # where element == 0
> # and print the value of the corresponding index.
>
> # So, for example I might randomly pick the 3rd 0
> # and I would print the corresponding index
> # which is 7,
>
> # My initial approach is to use a for-loop.
> # Also I take a short-cut which assumes myvec is short:
>
> elm = 1
> while (elm != 0) {
>   # Pick a random index, (it might be a 0):
>   rndidx = round(runif(1, min=1, max=length(myvec)))
>   elm = myvec[rndidx]
>   if(elm == 0)
>     print("I am done")
>   else
>     print("I am not done")
> }
> print(rndidx)

It's a little easier if you re-arrange your problem statement. This is 
equivalent: return randomly one index of myvec for which the element of 
myvec equals 0. A direct implementation of this is

sample(which(myvec==0), 1)

which(myvec==0) returns a vector of indexes of myvec for which the value 
of the vector is 0. sample(..., 1) randomly selects one of those.

> # If myvec is large and/or contains no zeros,
> # The above loop is sub-optimal/faulty.

This approach also fails if there are no 0s in the vector. What do you
want the result to be when that is the case? If we go with the simple
answer of NA, then you can special-case that (and wrap it up into a
function):

OneZeroIndex <- function(myvec) {
   zeros <- which(myvec==0)
   if (length(zeros) > 0) {
      sample(zeros, 1)
   } else {
      NA
   }
}
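
One caveat worth guarding against: per ?sample, when the first argument is a single number n >= 1, sample() draws from 1:n rather than from that one value. So sample(which(myvec==0), 1) can return the wrong index when myvec contains exactly one zero. A minimal sketch that avoids this by sampling a position and then indexing (one_zero_index is a name made up here):

```r
# Hypothetical variant of the function above that is safe when there is
# exactly one zero in the vector.
one_zero_index <- function(myvec) {
  zeros <- which(myvec == 0)
  if (length(zeros) == 0) {
    return(NA_integer_)                 # no zeros at all
  }
  # Sample a position within `zeros`, then index, instead of sample(zeros, 1).
  zeros[sample.int(length(zeros), 1)]
}

one_zero_index(c(1, 2, 0))                          # the only zero is at index 3
one_zero_index(c(1, 0, 3, 0, 77, 9, 0, 1, 2, 0))    # one of 2, 4, 7, 10
one_zero_index(c(1, 2, 3))                          # NA
```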

> # I suspect that skilled R-people would approach this task differently.
> # Perhaps they would use features baked into R rather than use a loop?
>   [[alternative HTML version deleted]]
Please post plain text only.

-- 
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University



Re: [R] Weighted regression markers on scatter plots

2013-10-17 Thread David Winsemius

On Oct 16, 2013, at 10:04 AM, Msugarman wrote:

> Hi all,
> 
> I'm trying to graph the results of a weighted regression analysis. Is anyone
> aware of a way to make my markers appear at different sizes, consistent
> with their respective weights?

You have not provided any data or code. If using base graphics, then
`plot.default` accepts a vector for cex.
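
A minimal sketch of that suggestion, using simulated data since none was posted. The variables x, y and w (the weights) and the scaling chosen for cex_w are all assumptions made for illustration.

```r
set.seed(42)                       # simulated data, for illustration only
n <- 30
x <- runif(n, 0, 10)
w <- runif(n, 0.5, 5)              # hypothetical regression weights
y <- 1 + 2 * x + rnorm(n, sd = 3 / sqrt(w))

fit <- lm(y ~ x, weights = w)      # weighted least squares

# cex scales the marker's linear size, so take the square root to make the
# marker area proportional to the weight; the largest weight gets cex = 2.
cex_w <- 2 * sqrt(w / max(w))

plot(x, y, cex = cex_w, pch = 19, col = "grey40",
     main = "Weighted regression", xlab = "x", ylab = "y")
abline(fit)                        # fitted weighted regression line
```

Whether marker area or marker diameter should track the weight is a presentation choice; drop the sqrt() for the latter.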


> 
> Thanks,
> -Mike Sugarman
> Wayne State University
> 

David Winsemius
Alameda, CA, USA



Re: [R] Newb: How I find random vector index?

2013-10-17 Thread Sarah Goslee
Typo fix below:

On Thu, Oct 17, 2013 at 3:05 PM, Sarah Goslee  wrote:
> Not only does it not require a loop, this is a one-liner:
>
>> myvec <- c(1,0,3,0,77,9,0,1,2,0)
>> sample(which(myvec == 0), 1)
> [1] 4
>> sample(which(myvec == 0), 1)
> [1] 7
>> sample(which(myvec == 0), 1)
> [1] 2
>
> If there's a possibility of not having zeros then you'll need to check
> that separately, otherwise sample() will throw an error. For instance:
>
> if(any(myvec == 0)) {
>   sample(which(myvec == 0), 1)
> }
>
> which() will
  ^  just delete this.

>
> Sarah
>
>
> On Thu, Oct 17, 2013 at 2:54 PM, Stock Beaver  wrote:
>> # Suppose I have a vector:
>>
>> myvec = c(1,0,3,0,77,9,0,1,2,0)
>>
>> # I want to randomly pick an element from myvec
>> # where element == 0
>> # and print the value of the corresponding index.
>>
>> # So, for example I might randomly pick the 3rd 0
>> # and I would print the corresponding index
>> # which is 7,
>>
>> # My initial approach is to use a for-loop.
>> # Also I take a short-cut which assumes myvec is short:
>>
>> elm = 1
>> while (elm != 0) {
>>   # Pick a random index, (it might be a 0):
>>   rndidx = round(runif(1, min=1, max=length(myvec)))
>>   elm = myvec[rndidx]
>>   if(elm == 0)
>>     print("I am done")
>>   else
>>     print("I am not done")
>> }
>> print(rndidx)
>>
>> # If myvec is large and/or contains no zeros,
>> # The above loop is sub-optimal/faulty.
>>
>> # I suspect that skilled R-people would approach this task differently.
>> # Perhaps they would use features baked into R rather than use a loop?
>> [[alternative HTML version deleted]]
>>
>

-- 
Sarah Goslee



Re: [R] Newb: How I find random vector index?

2013-10-17 Thread Sarah Goslee
Not only does it not require a loop, this is a one-liner:

> myvec <- c(1,0,3,0,77,9,0,1,2,0)
> sample(which(myvec == 0), 1)
[1] 4
> sample(which(myvec == 0), 1)
[1] 7
> sample(which(myvec == 0), 1)
[1] 2

If there's a possibility of not having zeros then you'll need to check
that separately, otherwise sample() will throw an error. For instance:

if(any(myvec == 0)) {
  sample(which(myvec == 0), 1)
}

which() will

Sarah


On Thu, Oct 17, 2013 at 2:54 PM, Stock Beaver  wrote:
> # Suppose I have a vector:
>
> myvec = c(1,0,3,0,77,9,0,1,2,0)
>
> # I want to randomly pick an element from myvec
> # where element == 0
> # and print the value of the corresponding index.
>
> # So, for example I might randomly pick the 3rd 0
> # and I would print the corresponding index
> # which is 7,
>
> # My initial approach is to use a for-loop.
> # Also I take a short-cut which assumes myvec is short:
>
> elm = 1
> while (elm != 0) {
>   # Pick a random index, (it might be a 0):
>   rndidx = round(runif(1, min=1, max=length(myvec)))
>   elm = myvec[rndidx]
>   if(elm == 0)
>     print("I am done")
>   else
>     print("I am not done")
> }
> print(rndidx)
>
> # If myvec is large and/or contains no zeros,
> # The above loop is sub-optimal/faulty.
>
> # I suspect that skilled R-people would approach this task differently.
> # Perhaps they would use features baked into R rather than use a loop?
> [[alternative HTML version deleted]]
>

-- 
Sarah Goslee
http://www.functionaldiversity.org



[R] Goodness of fit for a Rietveld Refinement

2013-10-17 Thread rflood13
Hi folks,

Wondering if anyone might be able to help me on this one. I have just done
some geochemistry with X-ray diffraction and Rietveld refinement in order to
quantify the data. I have an observed spectrum from my sample and a calculated
spectrum from the Rietveld refinement (in the attached image, along with the
background). Is there a package in R that I might be able to use that would
show me how well (or how poorly) the Rietveld calculated spectrum fits my
observed spectrum? It's essentially a goodness of fit or R-squared value, but
I've been having some difficulty finding the right way to assess the model fit.
I'd appreciate any information or tips anyone might have.
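
For illustration only, here is a rough sketch of the kind of summary being asked about, computed in base R with no extra package. It uses a simulated pattern (not real diffraction data) and the usual textbook definitions of the weighted profile R-factor, the expected R-factor and their ratio. The weights 1/y_obs assume plain counting statistics and n_par is an assumed number of refined parameters; some refinement programs report the square of the ratio instead.

```r
set.seed(1)                                   # simulated pattern, illustration only
two_theta <- seq(10, 80, by = 0.05)
peak <- function(center, height, width) {
  height * exp(-((two_theta - center) / width)^2)
}
y_calc <- 50 + peak(26.6, 900, 0.15) + peak(36.5, 300, 0.2) + peak(50.1, 450, 0.2)
y_obs  <- rpois(length(y_calc), lambda = y_calc)   # add counting noise

# Plain R-squared between observed and calculated intensities.
r_squared <- 1 - sum((y_obs - y_calc)^2) / sum((y_obs - mean(y_obs))^2)

# Weighted profile R-factor with weights 1/y_obs (assumes Poisson counts).
w    <- 1 / pmax(y_obs, 1)                    # guard against zero counts
r_wp <- sqrt(sum(w * (y_obs - y_calc)^2) / sum(w * y_obs^2))

# Expected R-factor and goodness of fit; n_par is an assumed parameter count.
n_par <- 10
r_exp <- sqrt((length(y_obs) - n_par) / sum(w * y_obs^2))
gof   <- r_wp / r_exp

c(r_squared = r_squared, r_wp = r_wp, r_exp = r_exp, gof = gof)
```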

Kind regards,

Rory Flood.

--
Rory Flood
Postgraduate Research Student
Room 02 044, Elmwood Building
School of Geography, Archaeology and Palaeoecology
Queen's University Belfast
Belfast BT7 1NN
Co. Antrim
Northern Ireland

Tel: +44 (0) 28 9097 3929
Email: rfloo...@qub.ac.uk


[R] Newb: How I find random vector index?

2013-10-17 Thread Stock Beaver
# Suppose I have a vector:

myvec = c(1,0,3,0,77,9,0,1,2,0)

# I want to randomly pick an element from myvec
# where element == 0
# and print the value of the corresponding index.

# So, for example I might randomly pick the 3rd 0
# and I would print the corresponding index
# which is 7,

# My initial approach is to use a for-loop.
# Also I take a short-cut which assumes myvec is short:

elm = 1
while (elm != 0) {
  # Pick a random index, (it might be a 0):
  rndidx = round(runif(1, min=1, max=length(myvec)))
  elm = myvec[rndidx]
  if(elm == 0)
    print("I am done")
  else
    print("I am not done")
}
print(rndidx)

# If myvec is large and/or contains no zeros,
# The above loop is sub-optimal/faulty.

# I suspect that skilled R-people would approach this task differently.
# Perhaps they would use features baked into R rather than use a loop?


Re: [R] Lattice xyplot: Fill Legend Points

2013-10-17 Thread Rich Shepard

On Thu, 17 Oct 2013, Richard M. Heiberger wrote:


I always get lost in simpleKey.


  As this is my first use of it I take what's offered by those more
experienced than I.


The approach of directly modifying the trellis object usually works.
tmp <- xyplot(pct.quant ~ sampdate, data = ffg.st, groups = func_feed_grp,
              type = 'p', pch = 19,
              key = simpleKey(text = levels(ffg.st$func_feed_grp), space = 'right',
                              points = T, lines = F),
              par.settings = list(superpose.points = list(col = rainbow(7)),
                                  superpose.lines = list(col = rainbow(7))),
              main = 'Functional Feeding Groups (Individuals)', xlab = 'Year',
              ylab = 'Proportion of Individuals')

tmp
str(tmp)
tmp$legend$right$args$key$points$pch

[1] 1 1 1 1 1 1 1

tmp$legend$right$args$key$points$pch[] <- 19
tmp$legend$right$args$key$points$pch

[1] 19 19 19 19 19 19 19


  OK. More steps but it will get the plots where they need to be.

Many thanks,

Rich

--
Richard B. Shepard, Ph.D.  |  Have knowledge, will travel.
Applied Ecosystem Services, Inc.   |
 Voice: 503-667-4517  Fax: 503-667-8863



Re: [R] Subseting a data.frame

2013-10-17 Thread arun
You may try:
mydat[with(mydat,ave(seq_along(basel_asset_class),basel_asset_class,FUN=length)>2),]
#  basel_asset_class defa_frequency
#2                 8          0.070
#3                 8          0.030
#4                 8          0.001


#or
library(plyr)
mydat[ddply(mydat,.(basel_asset_class),mutate,L=length(defa_frequency))[,3]>2,] #assuming it is sorted.
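
The same filter can also be sketched in base R with table(), which counts the records per class and does not depend on the row order. The threshold of "more than 2" follows the original request; the names counts, keep and mydat_a are only illustrative.

```r
mydat <- data.frame(basel_asset_class = c(4, 8, 8, 8),
                    defa_frequency = c(0.15, 0.07, 0.03, 0.001))

# Number of records per asset class
counts <- table(mydat$basel_asset_class)

# Class IDs (as character labels) that occur more than twice
keep <- names(counts)[counts > 2]

# Keep only the rows whose class is in that set
mydat_a <- mydat[as.character(mydat$basel_asset_class) %in% keep, ]
mydat_a
```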

A.K.




On Thursday, October 17, 2013 1:59 PM, Katherine Gobin wrote:
I am sorry, perhaps I was not able to put the question properly. I am not
looking for the subset of the data.frame where the basel_asset_class is > 2. I
do agree that would have been a basic requirement. Let me try to put the
question again.

I have a data frame as

mydat = data.frame(basel_asset_class = c(4, 8, 8, 8),
                   defa_frequency = c(0.15, 0.07, 0.03, 0.001))

# Please note I have changed the basel_asset_class to 4 from 2, to avoid confusion.

> mydat
  basel_asset_class defa_frequency
1                 4          0.150
2                 8          0.070
3                 8          0.030
4                 8          0.001



This is just a representative example. In reality, I may have any number of
basel asset classes. 4, 8, etc. are IDs and can be anything, so I can't hard
code it as subset(mydat, mydat$basel_asset_class > 2).


What I need is to select only those records for which there are more than two
default frequencies (defa_frequency). Thus, there is only one default frequency
= 0.150 w.r.t. basel_asset_class = 4, whereas there are default frequencies
w.r.t. basel asset class 4; similarly, there could be another basel asset class
having, say, 5 default frequencies. Thus, I need to take the subset of the
data.frame s.t. the number of corresponding defa_frequencies is greater than 2.

The idea is that we try to fit an exponential curve Y = A exp( BX ) for each of
the basel asset classes and estimate the values of A and B; mathematically, one
needs at least two values of X.

I hope I have been able to express my requirement. It's not that I need the
subset of mydat s.t. the basel asset class is > 2 (now 4 in the revised
example), but the subset s.t. the number of default frequencies is greater than
or equal to 2. This 2 is not the same as basel asset class 2.

Kindly guide

With warm regards

Katherine Gobin





On Thursday, 17 October 2013 9:33 PM, Bert Gunter wrote:

"Kindly guide" ...

This is a very basic question, so the kindest guide I can give is to read an 
Introduction to R (ships with R) or a R web tutorial of your choice so that you 
can learn how R works instead of posting to this list.

Cheers,
Bert




On Wed, Oct 16, 2013 at 11:55 PM, Katherine Gobin wrote:

Dear Forum,
>
>I have a data frame as 
>
>mydat = data.frame(basel_asset_class = c(2, 8, 8 ,8), defa_frequency = c(0.15, 
>0.07, 0.03, 0.001))
>
>> mydat
>  basel_asset_class defa_frequency
>1                 2          0.150
>2                 8          0.070
>3                 8          0.030
>4                 8          0.001
>
>
>I need to get the subset of this data.frame where no of records for the given 
>basel_asset_class is > 2, i.e. I need to obtain subset of above data.frame as 
>(since there is only 1 record, against basel_asset_class = 2, I want to filter 
>it)
>
>> mydat_a
>  basel_asset_class defa_frequency
>1                 8          0.070
>2                 8          0.030
>3                 8          0.001
>
>Kindly guide
>
>Katherine


-- 

Bert Gunter
Genentech Nonclinical Biostatistics

(650) 467-7374


[R] S4 base class

2013-10-17 Thread Michael Meyer
@Martin Morgan, Duncan Murdoch:

OK Thanks.
I did not understand the callNextMethod.
I will investigate this in detail.
This is great!

Thanks again,

 
Michael Meyer



Re: [R] Lattice xyplot: Fill Legend Points

2013-10-17 Thread Richard M. Heiberger
I always get lost in simpleKey.  The approach of directly modifying
the trellis object usually works.

> tmp <- xyplot(pct.quant ~ sampdate, data = ffg.st, groups = func_feed_grp, type =
+ 'p', pch = 19, key = simpleKey(text = levels(ffg.st$func_feed_grp), space =
+ 'right', points = T, lines = F),par.settings = list(superpose.points =
+ list(col = rainbow(7)), superpose.lines = list(col = rainbow(7))), main =
+ 'Functional Feeding Groups (Individuals)', xlab = 'Year', ylab = 'Proportion
+ of Individuals')
> tmp
> str(tmp)
> tmp$legend$right$args$key$points$pch
[1] 1 1 1 1 1 1 1
> tmp$legend$right$args$key$points$pch[] <- 19
> tmp$legend$right$args$key$points$pch
[1] 19 19 19 19 19 19 19
> tmp
>
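
An alternative to patching the trellis object afterwards can be sketched with auto.key. As far as I understand lattice, simpleKey() reads the settings in effect at the moment it is called, so it does not see the par.settings passed in the same xyplot() call, whereas auto.key builds the legend at plotting time. The relevant setting names are superpose.symbol and superpose.line. The data below are a hypothetical stand-in for ffg.st, not the poster's data.

```r
library(lattice)

# Hypothetical stand-in data with the same column names as in the thread
set.seed(1)
grp_names <- c("Filterer", "Gatherer", "Grazer", "Omnivore",
               "Parasite", "Predator", "Shredder")
ffg <- data.frame(
  sampdate      = rep(as.Date(c("2006-06-27", "2007-06-27", "2008-06-27")),
                      each = 7),
  func_feed_grp = factor(rep(grp_names, times = 3), levels = grp_names),
  pct.quant     = runif(21)
)

p <- xyplot(pct.quant ~ sampdate, data = ffg, groups = func_feed_grp,
            type = "p",
            # superpose.symbol controls the plotted points and the key points
            par.settings = list(superpose.symbol = list(pch = 19,
                                                        col = rainbow(7))),
            auto.key = list(space = "right", points = TRUE, lines = FALSE),
            main = "Functional Feeding Groups (Individuals)",
            xlab = "Year", ylab = "Proportion of Individuals")
print(p)
```

With this form the filled symbol is specified once and the legend follows it, so no manual edit of tmp$legend is needed.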

Rich

On Thu, Oct 17, 2013 at 12:57 PM, Rich Shepard  wrote:
> On Thu, 17 Oct 2013, Richard M. Heiberger wrote:
>
>> That should have worked.
>
>
>   That's what I thought when I first tried it.
>
>
>> I think something else is interfering. Did you redefine either T or F?
>
>
>   Not intentionally.
>
>
>> Please send the output from dput(head(ffg.st)) so we can experiment in
>> your setting.
>
>
> structure(list(sampdate = structure(c(13326, 13326, 13326, 13326, 13326,
> 13326), class = "Date"), func_feed_grp = structure(c(1L, 2L, 3L, 4L, 6L,
> 7L), .Label = c("Filterer", "Gatherer", "Grazer", "Omnivore", "Parasite",
> "Predator", "Shredder"), class = "factor"),
> quant = c(812L, 1880L, 624L, 11L, 948L, 1540L), pct.quant = c(0.14,
> 0.323, 0.107, 0.002, 0.163, 0.265), num.taxa = c(11L, 28L,
> 4L, 1L, 12L, 3L), pct.num.taxa = c(0.186, 0.475, 0.068, 0.017,
> 0.203, 0.051)), .Names = c("sampdate", "func_feed_grp", "quant",
> "pct.quant", "num.taxa", "pct.num.taxa"), row.names = 102:107, class =
> "data.frame")
>
>
> Rich
>
> --
> Richard B. Shepard, Ph.D.  |  Have knowledge, will travel.
> Applied Ecosystem Services, Inc.   |
>  Voice: 503-667-4517  Fax: 503-667-8863
>


Re: [R] Subseting a data.frame

2013-10-17 Thread Katherine Gobin
Correction (2nd paragraph, first three lines).

Please read the following lines

What I need is to select only those records for which there are more than two
default frequencies (defa_frequency). Thus, there is only one default frequency
= 0.150 w.r.t. basel_asset_class = 4, whereas there are default frequencies
w.r.t. basel asset class 4,


as

What I need is to select only those records for which there are more than two
default frequencies (defa_frequency). Thus, there is only one default frequency
= 0.150 w.r.t. basel_asset_class = 4, whereas there are THREE default frequencies
w.r.t. basel asset class 8,



I apologize for the inconvenience.

Regards

Katherine










Re: [R] Subseting a data.frame

2013-10-17 Thread Katherine Gobin
I am sorry, perhaps I was not able to put the question properly. I am not
looking for the subset of the data.frame where the basel_asset_class is > 2. I
do agree that would have been a basic requirement. Let me try to put the
question again.

I have a data frame as

mydat = data.frame(basel_asset_class = c(4, 8, 8, 8),
                   defa_frequency = c(0.15, 0.07, 0.03, 0.001))

# Please note I have changed the basel_asset_class to 4 from 2, to avoid confusion.

> mydat
  basel_asset_class defa_frequency
1                 4          0.150
2                 8          0.070
3                 8          0.030
4                 8          0.001



This is just a representative example. In reality, I may have any number of
basel asset classes. 4, 8, etc. are IDs and can be anything, so I can't hard
code it as subset(mydat, mydat$basel_asset_class > 2).


What I need is to select only those records for which there are more than two
default frequencies (defa_frequency). Thus, there is only one default frequency
= 0.150 w.r.t. basel_asset_class = 4, whereas there are default frequencies
w.r.t. basel asset class 4; similarly, there could be another basel asset class
having, say, 5 default frequencies. Thus, I need to take the subset of the
data.frame s.t. the number of corresponding defa_frequencies is greater than 2.

The idea is that we try to fit an exponential curve Y = A exp( BX ) for each of
the basel asset classes and estimate the values of A and B; mathematically, one
needs at least two values of X.

I hope I have been able to express my requirement. It's not that I need the
subset of mydat s.t. the basel asset class is > 2 (now 4 in the revised
example), but the subset s.t. the number of default frequencies is greater than
or equal to 2. This 2 is not the same as basel asset class 2.

Kindly guide

With warm regards

Katherine Gobin






Re: [R] representing points in 3D space with trajectories over time

2013-10-17 Thread Greg Snow
If all your data is numeric then you can use an array instead of a data
frame and arrays can easily be 3, 4, or higher dimensional.  Or you can use
a data frame with a column each for x, y, z, and time; with possible other
columns representing groups or other attributes, essentially a 3
dimensional data frame with the 3rd dimension being stacked rather than
projecting out.
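
Both layouts can be sketched as follows. The data are simulated, and the names pos, long and yz_dist as well as the [point, coordinate, time] index order are assumptions made for illustration only.

```r
set.seed(1)
n_points <- 3
n_times  <- 5

# Option 1: a 3-dimensional array indexed as [point, coordinate, time]
pos <- array(rnorm(n_points * 3 * n_times),
             dim = c(n_points, 3, n_times),
             dimnames = list(point = paste0("p", seq_len(n_points)),
                             coord = c("x", "y", "z"),
                             time  = paste0("t", seq_len(n_times))))

# YZ-projection distance to the origin: an n_points x n_times matrix
yz_dist <- sqrt(pos[, "y", ]^2 + pos[, "z", ]^2)

# Option 2: the same data stacked in a long data frame,
# one row per point and time step (points vary fastest)
long <- data.frame(
  point = rep(dimnames(pos)$point, times = n_times),
  time  = rep(seq_len(n_times), each = n_points),
  x     = as.vector(pos[, "x", ]),
  y     = as.vector(pos[, "y", ]),
  z     = as.vector(pos[, "z", ])
)
long$yz_dist <- sqrt(long$y^2 + long$z^2)
head(long)
```

The array form keeps whole-slice arithmetic short, while the long data frame leaves room for grouping columns and works directly with modelling and plotting functions.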


On Thu, Oct 17, 2013 at 6:59 AM, Umut Toprak  wrote:

> Dear all,
>
> I have a problem where I must represent points with XYZ coordinates
> changing over time. I will do a number of operations on this data such as
> calculating the YZ-projection distance of the points to the origin over
> time, the frequency spectrum of the X-T data etc. I am trying to find a
> good way of representing this data with an appropriate data structure.
>
> It appears like higher-dimensional data frames are not allowed and I do not
> know if I should use a list of data frames or if there is a better
> solution, possibly as part of an external package.
>
> Thank you for your time
> Umut Toprak
>



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Selecting maximums between different variables

2013-10-17 Thread Berend Hasselman

On 17-10-2013, at 18:48, Tim Umbach  wrote:

> Hi there,
> 
> another beginner's question, I'm afraid. Basically I want to select the
> maximum of values that correspond to different variables. I have a table
> of oil production that looks somewhat like this:
> 
> oil <- data.frame( YEAR = c(2011, 2012),
>   TX = c(2, 3),
>   CA = c(4, 25000),
>   AL = c(2, 21000),
>   ND = c(21000,6))
> 
> Now I want to find out, which state produced most oil in a given year. I
> tried this:
> 
> attach(oil)
> last_year = oil[ c(YEAR == 2012), ]
> max(last_year)
> 

For a single year do

year <- which(oil[,"YEAR"]==2011)
oil[year,which.max(oil[year,]),drop=FALSE]

In the help look at base::[.data.frame  
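
For all years at once, a sketch with max.col() on the state columns only (so that YEAR can never win the comparison) might look like this. The production figures below are hypothetical placeholders, since the numbers in the original post appear truncated; states, prod, top and result are illustrative names.

```r
# Hypothetical production figures, not the poster's data
oil <- data.frame(YEAR = c(2011, 2012),
                  TX = c(20000, 30000),
                  CA = c(40000, 25000),
                  AL = c(20000, 21000),
                  ND = c(21000, 60000))

states <- setdiff(names(oil), "YEAR")   # every column except YEAR
prod   <- as.matrix(oil[states])        # numeric matrix, one row per year

# Column index of the row-wise maximum (first one in case of ties)
top <- max.col(prod, ties.method = "first")

result <- data.frame(YEAR   = oil$YEAR,
                     state  = states[top],
                     output = prod[cbind(seq_len(nrow(prod)), top)])
result
```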

Berend


> Which works, but it doesn't give me the corresponding values (i.e. it just
> gives me the maximum output, not what state it's from).
> So I tried this:
> 
> oil[c(oil == max(last_year)),]
> and this:
> oil[c(last_year == max(last_year)),]
> and this:
> oil[which.max(last_year),]
> and this:
> last_year[max(last_year),]
> 
> None of them work, but they don't give error messages either; the output is
> just "NA". The problem is, in my eyes, that I'm comparing the values of
> different variables with each other. Because if I change the structure of
> the dataframe (which I can't do with the real data, at least not without
> doing it by hand with a huge dataset), it looks like this and works
> perfectly:
> 
> oil2 <- data.frame (
>  names = c('YEAR', 'TX', 'CA', 'AL', 'ND'),
>  oil_2011 = c(2011, 2, 4, 2, 21000),
>  oil_2012 = c(2012, 3, 25000, 21000, 6)
>  )
> attach(oil2)
> oil2[c(oil_2012 == max(oil_2012)),]
> 
> Any help is much appreciated.
> 
> Thanks, Tim Umbach
> 


Re: [R] Weighted regression markers on scatter plots

2013-10-17 Thread Greg Snow
The simplest approach is to specify the cex parameter in the call to plot.
 plot(1:3, 1:3, cex=3:1) for example will plot the 1st point 3 times as
big, the 2nd 2 times as big, and the 3rd at the standard size.

You can get more control by using the symbols function instead of the plot
function and set the diameter of circles directly.  In either case you
probably want to scale by the square root of the weight.

The my.symbols function in the TeachingDemos package is another option if
the symbols function does not include the symbol you want or if you want a
little different level of control.
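
A minimal sketch of both suggestions, using simulated data and weights (x, y and w are hypothetical), with marker size scaled by the square root of the weight so that marker area tracks the weight:

```r
set.seed(42)
x <- 1:20
w <- runif(20, min = 0.5, max = 5)            # hypothetical regression weights
y <- 2 + 0.5 * x + rnorm(20, sd = 1 / sqrt(w))

fit <- lm(y ~ x, weights = w)

# cex scaled by sqrt(weight), relative to the average weight
cex_w <- sqrt(w / mean(w))
plot(x, y, cex = cex_w, pch = 19, main = "Marker size via cex")
abline(fit)

# symbols() sets the circle radii directly; inches caps the largest circle
symbols(x, y, circles = sqrt(w), inches = 0.15,
        main = "Marker size via symbols()")
abline(fit)
```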


On Wed, Oct 16, 2013 at 11:04 AM, Msugarman  wrote:

> Hi all,
>
> I'm trying to graph the results of a weighted regression analysis. Is anyone
> aware of a way to make my markers appear at different sizes to be consistent
> with their respective weights?
>
> Thanks,
> -Mike Sugarman
> Wayne State University
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Weighted-regression-markers-on-scatter-plots-tp4678370.html
> Sent from the R help mailing list archive at Nabble.com.
>



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Selecting maximums between different variables

2013-10-17 Thread arun
Hi,
You may try:


unlist(lapply(seq_len(nrow(oil)),function(i) oil[i,-1][which.max(oil[i,-1])])) 
 #  CA    ND 
#4 6 
#or
library(reshape2)

datM <- melt(oil,id.var="YEAR")


datM[as.logical(with(datM,ave(value,list(YEAR),FUN= function(x) x%in% max(x)))),]
#  YEAR variable value
#3 2011   CA 4
#8 2012   ND 6

A.K.






Re: [R] RWeka and multicore package

2013-10-17 Thread CEO'Riley
I received the following error message with the multicore package:

install.packages("multicore")
Warning in install.packages :
  package ‘multicore’ is not available (for R version 3.0.2)
Warning in install.packages :
  package ‘multicore’ is not available (for R version 3.0.2)
Warning message:
package ‘multicore’ is not available (for R version 3.0.2)
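
The functionality of multicore, including mclapply(), is available in the parallel package that ships with R (since 2.14.0), so no separate installation should be needed. The sketch below uses a socket cluster rather than forking, on the assumption that forking a process with a running JVM may be what triggers the errors quoted below; that is a guess, not a confirmed diagnosis. The lm() call is only a stand-in so the example runs without RWeka; whether RWeka::J48 behaves well inside the workers is untested here.

```r
library(parallel)  # ships with R; provides mclapply() and socket clusters

# Stand-in for the model fit. With RWeka installed one might call
# RWeka::J48(Species ~ ., data) here instead (untested assumption).
fit_one <- function(i) {
  idx <- sample(nrow(iris), 100)
  fit <- lm(Sepal.Length ~ Petal.Length, data = iris[idx, ])
  unname(coef(fit)[2])  # return a plain number, not a model object
}

cl <- makeCluster(2)           # two separate worker R processes (no forking)
clusterSetRNGStream(cl, 123)   # reproducible random numbers on the workers
res <- parLapply(cl, 1:4, fit_one)
stopCluster(cl)

str(res)
```

Returning plain R values (numbers, predictions, text summaries) rather than the fitted object avoids sending references to Java objects between processes.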


With gratitude,
CEO'Riley Jr.
Charles Ellis O'Riley Jr.

Ambition is a state of permanent dissatisfaction with the present


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Luís Paulo F. Garcia
Sent: Thursday, October 17, 2013 12:22 PM
To: r-help@r-project.org
Subject: [R] RWeka and multicore package

I work a lot with the packages RWeka and multicore. If you try to run
J48 or any tree of RWeka with multicore, you get some errors.

Example I:

library(RWeka);
library(multicore);

mclapply(1:100, function(i) {
J48(Species ~., iris);
});


Output:  "Error in .jcall(o, \"Ljava/lang/Class;\", \"getClass\") : \n
java.lang.ClassFormatError: Incompatible magic value 1347093252 in class
file java/lang/ProcessEnvironment$StringEnvironment\n"


Example II:

library(multicore);

mclapply(1:100, function(i) {
RWeka::J48(Species ~., iris);
});

Output: Erro em .jcall(x$classifier, "S", "toString") :
  RcallMethod: attempt to call a method of a NULL object.


Do you know some way to work with parallel processing and RWeka? I tried MPI
and SNOW without success.

R version 3.0.2 (2013-09-25) -- "Frisbee Sailing"
Ubuntu 12.04 x64


--
Luís Paulo Faina Garcia
Engenheiro de Computação - Universidade de São Paulo
São Carlos - SP - Brasil



Re: [R] S4 base class

2013-10-17 Thread Martin Morgan

On 10/17/2013 08:54 AM, Michael Meyer wrote:


> Suppose you have a base class "Base" which implements a function "Base::F"
> which works in most contexts but not in the context of the "ComplicatedDerived"
> class, where some preparation has to happen before this very same function
> can be called.
>
> You would then define
>
> void ComplicatedDerived::F(...){
>
>  preparation();
>  Base::F();
> }
>
> You can nearly duplicate this in R via
>
> setMethod("F",
> signature(this="ComplicatedDerived"),
> definition=function(this){
>
>  preparation(this)
>  F(as(this,"Base"))
> })
>
> but it will fail whenever F uses virtual functions (i.e. generics) which are
> only defined for derived classes of Base


With

  .A <- setClass("A", representation(a="numeric"))
  .B <- setClass("B", representation(b="numeric"), contains="A")

  setGeneric("f", function(x, ...) standardGeneric("f"))

  setMethod("f", "A", function(x, ...) {
      message("f,A-method")
      g(x, ...)   # generic with methods only for derived classes
  })

  setMethod("f", "B", function(x, ...) {
      message("f,B-method")
      callNextMethod(x, ...)  # earlier response from Duncan Murdoch
  })

  setGeneric("g", function(x, ...) standardGeneric("g"))

  setMethod("g", "B", function(x, ...) {
      message("g,B-method")
      x
  })

one has

> f(.B())
f,B-method
f,A-method
g,B-method

An object of class "B"
Slot "b":
numeric(0)

Slot "a":
numeric(0)

?


--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



[R] RWeka and multicore package

2013-10-17 Thread Luís Paulo F . Garcia
I work a lot with the packages RWeka and multicore. If you try to run
J48 or any tree of RWeka with multicore, you get some errors.

Example I:

library(RWeka);
library(multicore);

mclapply(1:100, function(i) {
J48(Species ~., iris);
});


Output:  "Error in .jcall(o, \"Ljava/lang/Class;\", \"getClass\") : \n
java.lang.ClassFormatError: Incompatible magic value 1347093252 in class
file java/lang/ProcessEnvironment$StringEnvironment\n"


Example II:

library(multicore);

mclapply(1:100, function(i) {
RWeka::J48(Species ~., iris);
});

Output: Erro em .jcall(x$classifier, "S", "toString") :
  RcallMethod: attempt to call a method of a NULL object.


Do you know some way to work with parallel processing and RWeka? I tried
MPI and SNOW without success.

R version 3.0.2 (2013-09-25) -- "Frisbee Sailing"
Ubuntu 12.04 x64


-- 
Luís Paulo Faina Garcia
Engenheiro de Computação - Universidade de São Paulo
São Carlos - SP - Brasil



Re: [R] Comparing two groups

2013-10-17 Thread Greg Snow
From your question it is not clear what your question/concerns really are,
and from what we can see it could very well be that you do not understand
the statistics that you are computing (not just the R implementation).  We
ask for a reproducible example because that helps us to help you, just a
couple of boxplots let us make some guesses, but we do not know the data
values or even the means and standard deviations, even the actual sample
sizes could help.

From the graph it is not surprising that the wilcox test says that the 2
groups are different and that the t test says that they are not (but
knowing data values would help even more).  The 2 tests are testing very
different hypotheses.  The wilcox test is testing that the 2 distributions
are identical and the more specific way it tests that is by looking at all
possible pairs between the 2 groups and seeing what proportion of them have
each group higher, if the null were true then half the time the data point
from mixed would be higher than the data point from monoculture and half
the time the other way.  From the boxplot we can see that the median of
monoculture is below the 1st quartile of mixed, so it is not surprising at
all that the wilcox test rejects the null hypothesis.

The t-test (which version you used you do not say) is testing if the means
are equal. Since monoculture is clearly skewed to the right with potential
outliers, it would not be surprising if the sample means were close enough
to each other that the t-test does not see a significant difference.  The 2
tests give different answers because they are answering very different
questions.
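
The difference between the two questions can be illustrated with simulated data. The groups below are hypothetical (a large right-skewed group and a small, tighter one) and are not the poster's values; no particular p-values are claimed.

```r
set.seed(123)
# Hypothetical data: a large right-skewed group and a small, tighter group
monoculture <- rlnorm(75, meanlog = 0, sdlog = 1.2)
mixed       <- rnorm(15, mean = 2, sd = 0.4)

w  <- wilcox.test(mixed, monoculture)
tt <- t.test(mixed, monoculture)   # Welch version by default

# Proportion of (mixed, monoculture) pairs in which the mixed value is larger;
# this would be about 0.5 if the two distributions were identical
pairs_higher      <- outer(mixed, monoculture, ">")
prop_mixed_higher <- mean(pairs_higher)

# Without ties, the W statistic reported by R is the count of such pairs
c(W = unname(w$statistic), pairs = sum(pairs_higher))
c(wilcox_p = w$p.value, t_p = tt$p.value)
c(mean_mixed = mean(mixed), mean_mono = mean(monoculture))
```

Comparing prop_mixed_higher with the two sample means shows how the pairwise ordering and the difference in means can point in different directions when one group is strongly skewed.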

You state that "I am not allowed to perform it" referring to the t-test.
 This indicates that you don't have a full understanding or appreciation of
the Central Limit Theorem (an important enough theorem that I have a
cross-stitch based on it hanging on my wall (along with 2 other
cross-stitches of Bayes theorem and the mean value theorem of
integration)).  The plot shows 18 outliers in the monoculture group which
implies a sample size of at least 72, which means the other group has a
sample size of at least 14 if I interpret "five times as big" correctly.
 This is a large enough sample size for the CLT to tell us the t-test will
give a reasonable approximation (provided the other assumptions hold
reasonably well and you are interested in the question being answered).

So, I believe that the advice to read a textbook, or otherwise get some
help in basic understanding of the statistical tools is reasonable.  Once
you have that, then if you still need help then give us a reproducible
example and make it clear what your question really is and you will be much
more likely to receive an answer.


On Tue, Oct 15, 2013 at 6:01 AM, Andrej  wrote:

> >So why not start with some statistical textbook? There are plenty of them
> available in CRAN.
>
> I wasn't implying, that I haven't read any textbook, or didn't do any
> research. I read some textbooks/Papers/etc. during the research about what
> to do and came across the wilcox test. I meant to imply that I could have
> problems understanding some of the answers, and that maybe additional
> explaining would be necessary.
>
> My doubts stem from the fact, that the wilcox test is a - as far as I know -
> ranking test, that states if two groups are different. My assumption is, due
> to the fact that the second group has a much higher sample size, it is clear
> that it differs from the first group. I performed a t-test (just to see; I
> am aware that I am not allowed to perform it, because my samples aren't
> normally distributed) and it gave me a p-value of 0.3.
> Actually I am not even entirely sure, if wilcox is the right test. I just
> want to know if the means of the two groups are significantly different.
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Comparing-two-groups-tp4678190p4678277.html
> Sent from the R help mailing list archive at Nabble.com.
>



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Lattice xyplot: Fill Legend Points

2013-10-17 Thread Rich Shepard

On Thu, 17 Oct 2013, Richard M. Heiberger wrote:


> That should have worked.

  That's what I thought when I first tried it.

> I think something else is interfering. Did you redefine either T or F?

  Not intentionally.

> Please send the output from dput(head(ffg.st)) so we can experiment in
> your setting.


structure(list(sampdate = structure(c(13326, 13326, 13326, 13326,
13326, 13326), class = "Date"), func_feed_grp = structure(c(1L,
2L, 3L, 4L, 6L, 7L), .Label = c("Filterer", "Gatherer", "Grazer",
"Omnivore", "Parasite", "Predator", "Shredder"), class = "factor"),
quant = c(812L, 1880L, 624L, 11L, 948L, 1540L), pct.quant = c(0.14,
0.323, 0.107, 0.002, 0.163, 0.265), num.taxa = c(11L, 28L,
4L, 1L, 12L, 3L), pct.num.taxa = c(0.186, 0.475, 0.068, 0.017,
0.203, 0.051)), .Names = c("sampdate", "func_feed_grp", "quant",
"pct.quant", "num.taxa", "pct.num.taxa"), row.names = 102:107, class =
"data.frame")

Rich

--
Richard B. Shepard, Ph.D.  |  Have knowledge, will travel.
Applied Ecosystem Services, Inc.   |
 Voice: 503-667-4517  Fax: 503-667-8863
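
One possible explanation for the unfilled legend symbols, offered as a guess
rather than something confirmed in this thread: as far as I know, lattice
names these settings superpose.symbol and superpose.line, and entries under
other names (such as superpose.points) are simply not consulted when the key
is drawn. The sketch below rebuilds the posted rows by hand; the xyplot()
call itself is an assumption, since the original call is not shown here.

```r
library(lattice)

# The six rows posted via dput(head(ffg.st)), rebuilt with only the
# columns needed for this illustration.
ffg.st <- data.frame(
  sampdate = as.Date(rep(13326, 6), origin = "1970-01-01"),
  func_feed_grp = factor(
    c("Filterer", "Gatherer", "Grazer", "Omnivore", "Predator", "Shredder"),
    levels = c("Filterer", "Gatherer", "Grazer", "Omnivore",
               "Parasite", "Predator", "Shredder")),
  pct.quant = c(0.14, 0.323, 0.107, 0.002, 0.163, 0.265)
)

cols <- rainbow(7)

# Hypothetical plot call: the point is only the names used in par.settings.
p <- xyplot(pct.quant ~ sampdate, data = ffg.st, groups = func_feed_grp,
            auto.key = list(space = "right"),
            par.settings = list(
              superpose.symbol = list(col = cols, pch = 19),
              superpose.line   = list(col = cols)))
print(p)
```

Because auto.key takes its symbols from superpose.symbol, setting pch = 19
there should affect both the plotted points and the legend.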



[R] Selecting maximums between different variables

2013-10-17 Thread Tim Umbach
Hi there,

another beginner's question, I'm afraid. Basically I want to select the
maximum of values that correspond to different variables. I have a table
of oil production that looks somewhat like this:

oil <- data.frame( YEAR = c(2011, 2012),
   TX = c(2, 3),
   CA = c(4, 25000),
   AL = c(2, 21000),
   ND = c(21000, 6))

Now I want to find out, which state produced most oil in a given year. I
tried this:

attach(oil)
last_year = oil[ c(YEAR == 2012), ]
max(last_year)

Which works, but it doesn't give me the corresponding values (i.e. it just
gives me the maximum output, not what state it's from).
So I tried this:

oil[c(oil == max(last_year)),]
and this:
oil[c(last_year == max(last_year)),]
and this:
oil[which.max(last_year),]
and this:
last_year[max(last_year),]

None of them work, but they don't give error messages either; the output is
just "NA". The problem is, in my eyes, that I'm comparing the values of
different variables with each other. Because if I change the structure of
the data frame (which I can't do with the real data, at least not without
doing it by hand with a huge dataset), it looks like this and works
perfectly:

oil2 <- data.frame (
  names = c('YEAR', 'TX', 'CA', 'AL', 'ND'),
  oil_2011 = c(2011, 2, 4, 2, 21000),
  oil_2012 = c(2012, 3, 25000, 21000, 6)
  )
attach(oil2)
oil2[c(oil_2012 == max(oil_2012)),]

Any help is much appreciated.

Thanks, Tim Umbach
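
One way to get the state name without restructuring the data frame is to ask
for the position of the maximum within a row and map it back to the column
names. This is only a sketch on the toy data above; the helper names states,
top_2012 and top_state are made up for the illustration.

```r
oil <- data.frame(YEAR = c(2011, 2012),
                  TX = c(2, 3),
                  CA = c(4, 25000),
                  AL = c(2, 21000),
                  ND = c(21000, 6))

# Columns that hold production figures (everything except YEAR).
states <- setdiff(names(oil), "YEAR")

# Single year: turn the one-row data frame into a named vector,
# then take the name of the largest element.
last_year <- oil[oil$YEAR == 2012, states]
top_2012  <- names(which.max(unlist(last_year)))

# All years at once: max.col() returns the column index of each row maximum.
oil$top_state <- states[max.col(as.matrix(oil[states]), ties.method = "first")]

top_2012
oil
```

Leaving YEAR out of the comparison matters: otherwise the year itself could
be picked as the "maximum" whenever production figures are small.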



Re: [R] Constraint on regression parameters

2013-10-17 Thread Greg Snow
You want the offset function in the formula:

lm( A ~ B + I(B^2) + offset(C), data=Dataset)

This will force the coefficient on C to be 1, if you wanted a coefficient
of another value then just do the multiplication yourself, e.g. offset( 2 *
C ) for a slope of 2.

You can also use poly(B, 2) to fit the linear and quadratic terms in B (add
raw = TRUE if you want coefficients on the same scale as B + I(B^2)).
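
A quick sketch of what offset() does, on simulated data (the data frame
below is made up; only the variable names A, B, C and Dataset come from the
question). Fixing the coefficient of C at 1 is the same as regressing A - C
on the remaining terms, so the two fits should give identical coefficients.

```r
set.seed(1)

# Hypothetical data in which the true coefficient on C is 1.
Dataset <- data.frame(B = runif(50, 0, 10), C = rnorm(50))
Dataset$A <- 2 + 0.5 * Dataset$B - 0.1 * Dataset$B^2 + Dataset$C +
  rnorm(50, sd = 0.2)

# C enters with its coefficient fixed at 1; no coefficient is estimated for it.
fit_offset <- lm(A ~ B + I(B^2) + offset(C), data = Dataset)

# Equivalent formulation: move C to the left-hand side by hand.
fit_manual <- lm(I(A - C) ~ B + I(B^2), data = Dataset)

coef(fit_offset)
all.equal(coef(fit_offset), coef(fit_manual))
```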


On Thu, Oct 17, 2013 at 3:45 AM, Robert U  wrote:

> Dear all,
>
> I have been trying to find a simple solution to my problem without
> success, though I have a feeling a simple syntax detail could do the job.
>
> I am doing a polynomial linear regression with 2 independent variables
> such as:
>
> lm(A ~ B + I(B^2) + I(lB^3) + C, data=Dataset)
>
> R returns a coefficient for each independent variable, and I need the
> coefficient of the C parameter to equal 1.
>
>
> I've been looking at "parameter constraints" on the internet but it's
> always much more complicated than just "removing" the fit of a coefficient
> (or setting it to 1).
>
>
> I know many packages allow you to "not fit" an intercept with a "-1" term
> in the syntax; does that exist for independent variables?
> Regards,


-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Lattice xyplot: Fill Legend Points

2013-10-17 Thread Richard M. Heiberger
That should have worked.  I think something else is interfering.
Did you redefine either T or F?

Please send the output from dput(head(ffg.st))
so we can experiment in your setting.

Rich

On Thu, Oct 17, 2013 at 12:12 PM, Rich Shepard  wrote:
> On Thu, 17 Oct 2013, Richard M. Heiberger wrote:
>
>> par.settings = list(
>>   superpose.points = list(col = rainbow(7), pch = 19),
>>   superpose.lines = list(col = rainbow(7))
>> )
>
>
>   I had tried that, too. Legend symbols stubbornly remain unfilled.
>
> Thanks, Richard,
>
> Rich
>
>


Re: [R] plot - how to vary the distances of the x axis?

2013-10-17 Thread Bretschneider (R)

On 17 Oct 2013, at 13:44 , Hermann Norpois wrote:

> Hello,
> 
> 
> my dots at 0 and 2 are quite close to the margin. So I would like to move
> the 0 and the 2 both towards the 1. I would like my dots to be more centered.
> And: I don't need so much space between 0, 1 and 2.
> 
> How does it work?
> I tried:
> 
> plot (data, axes=FALSE, main=i, ylab= expression (z^2))
>  plot.window (xlim=c (0,2), ylim=c(0,80))
>  box (lwd=2)
>  axis (side=1, at = c (0,1,2))
>  axis (side =2)
> 
> dput (data)
> structure(list(Genotype = c(0, 0, 0, 1, 1, 1, 1, 1, 2), z =
> c(0.66429502114682,
> 0.258444359570075, 0.0702937908415368, 0.694376498254858,
> 0.0967863570760579,
> 0.213966209301163, 0.671497050546114, 0.60318070802847, 75.6011068681301
> )), .Names = c("Genotype", "z"), row.names = c(NA, 9L), class =
> "data.frame")
>> 
> 
> Thanks




If I understand what you want, set xlim a bit wider within the
plot statement: xlim=c (-0.4,2.4), ylim=c(0,80)
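
Spelled out on the posted data, that suggestion might look like the sketch
below. The data frame is the one from the dput() output, stored here under
the made-up name geno; the main = i argument from the question is left out
because i is not defined in the post.

```r
# Data as posted via dput(data), stored under a different name.
geno <- data.frame(
  Genotype = c(0, 0, 0, 1, 1, 1, 1, 1, 2),
  z = c(0.66429502114682, 0.258444359570075, 0.0702937908415368,
        0.694376498254858, 0.0967863570760579, 0.213966209301163,
        0.671497050546114, 0.60318070802847, 75.6011068681301)
)

# Pass the limits to plot() itself so the points are drawn in the
# widened coordinate system; the axes are then added by hand.
plot(z ~ Genotype, data = geno, axes = FALSE,
     xlim = c(-0.4, 2.4), ylim = c(0, 80),
     ylab = expression(z^2))
box(lwd = 2)
axis(side = 1, at = c(0, 1, 2))
axis(side = 2)
```

Calling plot.window() after plot(), as in the question, changes the
coordinate system only for what is drawn afterwards (the axes), which is
probably why the points and the axis did not line up as intended.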

Hope this helps, 
Best wishes,


Franklin
-




Dr. Franklin Bretschneider
Dept of Biology
Utrecht University
Padualaan 8
3584 CH  Utrecht
The Netherlands
f.bretschnei...@uu.nl





Re: [R] Subseting a data.frame

2013-10-17 Thread Bert Gunter
"Kindly guide" ...

This is a very basic question, so the kindest guide I can give is to read
an Introduction to R (ships with R) or a R web tutorial of your choice so
that you can learn how R works instead of posting to this list.

Cheers,
Bert


On Wed, Oct 16, 2013 at 11:55 PM, Katherine Gobin  wrote:

> Dear Forum,
>
> I have a data frame as
>
> mydat = data.frame(basel_asset_class = c(2, 8, 8 ,8), defa_frequency =
> c(0.15, 0.07, 0.03, 0.001))
>
> > mydat
>   basel_asset_class defa_frequency
> 1                 2          0.150
> 2                 8          0.070
> 3                 8          0.030
> 4                 8          0.001
>
>
> I need to get the subset of this data.frame where no of records for the
> given basel_asset_class is > 2, i.e. I need to obtain subset of above
> data.frame as (since there is only 1 record, against basel_asset_class = 2,
> I want to filter it)
>
> > mydat_a
>   basel_asset_class defa_frequency
> 1                 8          0.070
> 2                 8          0.030
> 3                 8          0.001
>
> Kindly guide
>
> Katherine


-- 

Bert Gunter
Genentech Nonclinical Biostatistics

(650) 467-7374



[R] plot - how to vary the distances of the x axis?

2013-10-17 Thread Hermann Norpois
Hello,


my dots at 0 and 2 are quite close to the margin. So I would like to move
the 0 and the 2 both towards the 1. I would like my dots to be more centered.
And: I don't need so much space between 0, 1 and 2.

How does it work?
I tried:

plot (data, axes=FALSE, main=i, ylab= expression (z^2))
  plot.window (xlim=c (0,2), ylim=c(0,80))
  box (lwd=2)
  axis (side=1, at = c (0,1,2))
  axis (side =2)

dput (data)
structure(list(Genotype = c(0, 0, 0, 1, 1, 1, 1, 1, 2), z =
c(0.66429502114682,
0.258444359570075, 0.0702937908415368, 0.694376498254858,
0.0967863570760579,
0.213966209301163, 0.671497050546114, 0.60318070802847, 75.6011068681301
)), .Names = c("Genotype", "z"), row.names = c(NA, 9L), class =
"data.frame")
>

Thanks


Re: [R] match values in dependence of ID and Date

2013-10-17 Thread arun
Hi,
Try:
dat <- read.table(text="
ID    Name
1    Andy
2    John
3    Amy",sep="",header=TRUE,stringsAsFactors=FALSE)

dat2 <- read.table(text="
ID  Date    Value
1    2013-10-01    10
1    2013-10-02    15
2    2013-10-01    7
2    2013-10-03    10
2    2013-10-04    15
3    2013-10-01    10",sep="",header=TRUE,colClasses=c("numeric","Date","numeric"))

library(plyr)

res <- reshape(
  ddply(merge(dat, dat2, by="ID"), .(ID), mutate,
        id=((seq_along(ID)-1)%%3+1))[,-3],
  idvar=c("ID","Name"), timevar="id", direction="wide")
rownames(res) <- 1:nrow(res)
colnames(res)[3:5] <- c("First", "Second", "Third")

res
#  ID Name First Second Third
#1  1 Andy    10     15    NA
#2  2 John     7     10    15
#3  3  Amy    10     NA    NA
A.K.
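
For readers who would rather not load plyr, roughly the same result can be
sketched in base R: number each person's records in date order with ave(),
then reshape on that counter. The column name visit is made up for this
illustration, and the output keeps reshape()'s default names (Value.1,
Value.2, Value.3) instead of First, Second, Third.

```r
dat <- data.frame(ID = 1:3, Name = c("Andy", "John", "Amy"),
                  stringsAsFactors = FALSE)
dat2 <- data.frame(
  ID = c(1, 1, 2, 2, 2, 3),
  Date = as.Date(c("2013-10-01", "2013-10-02", "2013-10-01",
                   "2013-10-03", "2013-10-04", "2013-10-01")),
  Value = c(10, 15, 7, 10, 15, 10)
)

m <- merge(dat, dat2, by = "ID")
m <- m[order(m$ID, m$Date), ]

# Running record number within each ID: 1, 2, ... in date order.
m$visit <- ave(seq_along(m$ID), m$ID, FUN = seq_along)

res_base <- reshape(m[, c("ID", "Name", "Value", "visit")],
                    idvar = c("ID", "Name"), timevar = "visit",
                    direction = "wide")
rownames(res_base) <- NULL
res_base
```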






On Thursday, October 17, 2013 7:42 AM, Mat  wrote:
hello togehter,

i have a little problem, maybe you can help me.

I have a data.frame like this one:

ID    Name
1     Andy
2     John
3     Amy

and a data.frame like this:

ID   Date            Value
1    2013-10-01    10
1    2013-10-02    15
2    2013-10-01    7
2    2013-10-03    10
2    2013-10-04    15
3    2013-10-01    10

the result should be this one:

ID    Name   First   Second    Third
1     Andy    10     15
2     John     7      10           15
3     Amy     10

maybe you can help me, to do this?

Thank you.

Mat



--
View this message in context: 
http://r.789695.n4.nabble.com/match-values-in-dependence-of-ID-and-Date-tp4678433.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Lattice xyplot: Fill Legend Points

2013-10-17 Thread Rich Shepard

On Thu, 17 Oct 2013, Richard M. Heiberger wrote:


> par.settings = list(
>   superpose.points = list(col = rainbow(7), pch = 19),
>   superpose.lines = list(col = rainbow(7))
> )


  I had tried that, too. Legend symbols stubbornly remain unfilled.

Thanks, Richard,

Rich



Re: [R] Lattice xyplot: Fill Legend Points

2013-10-17 Thread Richard M. Heiberger
par.settings = list(
   superpose.points = list(col = rainbow(7), pch = 19),
   superpose.lines = list(col = rainbow(7))
)

On Thu, Oct 17, 2013 at 11:48 AM, Rich Shepard  wrote:
> On Thu, 17 Oct 2013, Richard M. Heiberger wrote:
>
>> put the pch into the par.settings
>
>
> Richard,
>
>   Tried this again, but I'm not finding the proper location within
> par.settings.
>
>
> par.settings = list(superpose.points = list(col = rainbow(7)),
> superpose.lines = list(col = rainbow(7)), pch = 19)
>
>
> If I put it prior to the (list ... group there's an error of an extra = ;
> when I put it anywhere in the list (the above is one of my tries), it has no
> effect on the legend symbols: they remain as outlines.
>
>   What have I missed?
>
>
> Thanks,
>
> Rich
>
> --
> Richard B. Shepard, Ph.D.  |  Have knowledge, will travel.
> Applied Ecosystem Services, Inc.   |
>  Voice: 503-667-4517  Fax: 503-667-8863
>


Re: [R] match values in dependence of ID and Date

2013-10-17 Thread arun
Hi,

I think based on your title, the output you provided is not clear. If it 
depends on Date, there should be four columns.
library(reshape2)

res1 <- dcast(merge(dat,dat2,by="ID"),ID+Name~Date,value.var="Value")
 colnames(res1)[3:6] <- c("First", "Second", "Third", "Fourth")
 rownames(res1) <- 1:nrow(res1)


#or
res2 <- 
reshape(merge(dat,dat2,by="ID"),idvar=c("ID","Name"),timevar="Date",direction="wide")
 dimnames(res2) <- dimnames(res1)

 res2
#  ID Name First Second Third Fourth
#1  1 Andy    10     15    NA     NA
#2  2 John     7     NA    10     15
#3  3  Amy    10     NA    NA     NA


A.K.








[R] Incorporate Julia into R

2013-10-17 Thread Timo Schmid
Hi,

I have some code in R with a lot of matrix multiplication and inversion. R can
be very slow for larger matrices like 5000x5000.
I have seen the new programming language Julia (www.julialang.org), which is
quite fast at matrix algebra. So my idea is to set up the simulations in
R and start the first calculations there, then pass some objects to Julia,
do some matrix algebra there, and pass the results back to R.
Is this possible, and does anybody know how to do it? Is there a package
available?
A short example with some lines of code would also be very helpful.

Thanks in advance,
Timo
  


[R] representing points in 3D space with trajectories over time

2013-10-17 Thread Umut Toprak
Dear all,

I have a problem where I must represent points with XYZ coordinates
changing over time. I will do a number of operations on this data such as
calculating the YZ-projection distance of the points to the origin over
time, the frequency spectrum of the X-T data etc. I am trying to find a
good way of representing this data with an appropriate data structure.

It appears like higher-dimensional data frames are not allowed and I do not
know if I should use a list of data frames or if there is a better
solution, possibly as part of an external package.
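
One base-R option, sketched here under made-up assumptions about the data
(4 points, 64 equally spaced time samples, random coordinates), is a
three-dimensional array indexed as point x coordinate x time. Slicing it by
name then gives the pieces needed for the operations mentioned above.

```r
set.seed(1)
n_points <- 4     # hypothetical number of tracked points
n_times  <- 64    # hypothetical number of time samples
time_idx <- 0:(n_times - 1)

# point x coordinate x time
traj <- array(rnorm(n_points * 3 * n_times),
              dim = c(n_points, 3, n_times),
              dimnames = list(point = paste0("p", 1:n_points),
                              coord = c("X", "Y", "Z"),
                              time  = NULL))

# Give point p1 a clean 5-cycle oscillation in X so its spectrum has a peak.
traj["p1", "X", ] <- sin(2 * pi * 5 * time_idx / n_times)

# Distance of the YZ projection from the origin: an n_points x n_times matrix.
yz_dist <- sqrt(traj[, "Y", ]^2 + traj[, "Z", ]^2)

# Amplitude spectrum of X over time for one point; bin k + 1 holds the
# component with k cycles over the whole record.
x_spec   <- Mod(fft(traj["p1", "X", ]))
peak_bin <- which.max(x_spec[1:(n_times / 2)])

dim(yz_dist)
peak_bin
```

A long-format data frame with columns point, time, X, Y, Z would be a
reasonable alternative if the points are not all observed at the same times.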

Thank you for your time
Umut Toprak



[R] Singular Matrix 'a' in solve

2013-10-17 Thread CL Tee
Hi,


I have a data set named “invest” consisting of 450 observations (75
countries, 6 years) on 7 variables (set as I, pop, inv, gov, c, life, d;
each of which is “numeric[450]”). The procedure is modified from code provided
by B.E. Hansen at http://www.ssc.wisc.edu/~bhansen/progs/ecnmt_00.html.



Then the variables are transformed as follows:

y     <- lag_v(i,0)
cf    <- lag_v(c,0)
lpop  <- lag_v(pop,0)
linv  <- lag_v(inv,0)
lgov  <- lag_v(gov,0)
d1    <- lag_v(d,0)
llife <- lag_v(life,0)
yt    <- tr(y)
ct    <- tr(cf)



y, cf, lpop, linv, lgov, d1, llife each is in “375x1 double matrix”

yt and ct each is “300x1 double matrix”

(I use R Studio so these characteristics are stated).



The lag_v() and tr() process is as below:

max_lag <- 1
tt <- t-max_lag
ty <- n*(t-max_lag-1)

lag_v <- function(x,lagn){
  yl <- matrix(c(0),nrow=n,ncol=t)
  for (i in 1:n) {
    yl[i,]<-x[(1+(i-1)*t):(t*i)]
  }
  yl <- yl[,(1+max_lag-lagn):(t-lagn)]
  out <- matrix(t(yl),nrow=nrow(yl)*ncol(yl),ncol=1)
  out
}

tr <- function(y){
  yf <- matrix(c(0),nrow=n,ncol=tt)
  for (i in 1:n) {
    yf[i,]<-y[(1+(i-1)*tt):(tt*i)]
  }
  yfm <- yf- colMeans(t(yf))
  yfm <- yfm[,1:(tt-1)]
  out <- matrix(t(yfm),nrow=nrow(yfm)*ncol(yfm),ncol=1)
  out
}



Then, before the computation, a few things are set up:

x <- cbind(lpop, linv, lgov, llife, cf)

… (skipped, as I think it is unrelated to the problem encountered)

And, in the early stage of the computation:

sse_calc <- function(y,x){
  e <- y-x%*%qr.solve(x,y)
  out <- t(e)%*%e
  out
}

…



It comes out with:

Error in qr.solve(x, y) : singular matrix 'a' in solve



I thought only square matrices would have this kind of problem. Would qr()
help in this case? Or is there any other possible solution for this problem?
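
For what it is worth, qr.solve() also raises this error for a non-square
matrix whenever its columns are not linearly independent, so a first check
is to compare the rank reported by qr() with the number of columns. The
sketch below uses made-up matrices (x_ok, x_bad), not the poster's data,
to reproduce the message and show how lm.fit() flags the redundant column.

```r
set.seed(1)
y <- rnorm(20)

# Two independent columns, then a third that is an exact sum of the first two.
x_ok  <- cbind(a = rnorm(20), b = rnorm(20))
x_bad <- cbind(x_ok, ab = x_ok[, "a"] + x_ok[, "b"])

# Rank smaller than the number of columns means the columns are collinear.
qr(x_bad)$rank
ncol(x_bad)

# qr.solve() refuses rank-deficient input; capture the message instead of stopping.
msg <- tryCatch(qr.solve(x_bad, y), error = function(e) conditionMessage(e))
msg

# With full column rank the least-squares solution is returned as usual.
coef_ok <- qr.solve(x_ok, y)

# lm.fit() tolerates collinearity and reports NA for the aliased column.
fit_bad <- lm.fit(x_bad, y)
fit_bad$coefficients
```

Running qr(x)$rank on the real x, and inspecting which column lm.fit() sets
to NA, should point at the variable that is constant or a linear combination
of the others.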



Thanks.



[R] S4 base class

2013-10-17 Thread Michael Meyer
Greetings,

I have an S4 class "B" (Base) which defines a function f=f(this="B",...).
Derived from B we have a class D which also defines a function
f=f(this="D",...).

In the definition of D::f we want to call the version B::f and could do this by 
simply calling

f(baseClassObject(this),...)

The question is the following:

How do I refer to the base class object from the derived class?



Many thanks 

 
Michael Meyer



[R] Weighted regression markers on scatter plots

2013-10-17 Thread Msugarman
Hi all,

I'm trying to graph the results of a weighted regression analysis. Is anyone
aware of a way to make my markers appear at different sizes, consistent
with their respective weights?
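
One simple approach in base graphics, sketched on simulated data (the
variables x, y, w and the scaling rule are assumptions, not anything from
the poster's analysis), is to pass a vector to the cex argument of plot().
Taking the square root of the relative weight makes marker area, rather than
diameter, roughly proportional to the weight.

```r
set.seed(2)
n <- 30
x <- runif(n, 0, 10)
w <- runif(n, 0.5, 5)                        # hypothetical regression weights
y <- 1 + 2 * x + rnorm(n, sd = 3 / sqrt(w))  # noisier where the weight is low

fit <- lm(y ~ x, weights = w)

# One cex value per observation, scaled so an average weight gives cex = 1.
marker_cex <- sqrt(w / mean(w))

plot(x, y, cex = marker_cex, pch = 19, col = "grey40",
     main = "Marker size reflects regression weight")
abline(fit)
```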

Thanks,
-Mike Sugarman
Wayne State University



--
View this message in context: 
http://r.789695.n4.nabble.com/Weighted-regression-markers-on-scatter-plots-tp4678370.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] map with inset

2013-10-17 Thread markw
Hi David,

That worked brilliantly! Many thanks. I also had trouble getting subplot()
to work with either TeachingDemos or Hmisc.

Best,
Mark



--
View this message in context: 
http://r.789695.n4.nabble.com/map-with-inset-tp4678341p4678426.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Subseting a data.frame

2013-10-17 Thread Charles Determan Jr
Katherine,

There are multiple ways to do this and I highly recommend you look into a
basic R manual or search the forums.  One quick example would be:

mysub <- subset(mydat, basel_asset_class > 2)
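
Note that the line above filters on the value of basel_asset_class, which
happens to give the requested rows for this toy data. If the condition is
meant to be on the number of records per class, as the question reads, a
sketch along the following lines may be closer; the helper name n_per_class
is made up for the illustration.

```r
mydat <- data.frame(basel_asset_class = c(2, 8, 8, 8),
                    defa_frequency = c(0.15, 0.07, 0.03, 0.001))

# For every row, the number of rows that share its basel_asset_class.
n_per_class <- ave(seq_len(nrow(mydat)), mydat$basel_asset_class, FUN = length)

# Keep only the classes with more than 2 records.
mydat_a <- mydat[n_per_class > 2, ]
rownames(mydat_a) <- NULL
mydat_a
```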

Cheers,
Charles


On Thu, Oct 17, 2013 at 1:55 AM, Katherine Gobin
wrote:

> Dear Forum,
>
> I have a data frame as
>
> mydat = data.frame(basel_asset_class = c(2, 8, 8 ,8), defa_frequency =
> c(0.15, 0.07, 0.03, 0.001))
>
> > mydat
>   basel_asset_class defa_frequency
> 1                 2          0.150
> 2                 8          0.070
> 3                 8          0.030
> 4                 8          0.001
>
>
> I need to get the subset of this data.frame where no of records for the
> given basel_asset_class is > 2, i.e. I need to obtain subset of above
> data.frame as (since there is only 1 record, against basel_asset_class = 2,
> I want to filter it)
>
> > mydat_a
>   basel_asset_class defa_frequency
> 1                 8          0.070
> 2                 8          0.030
> 3                 8          0.001
>
> Kindly guide
>
> Katherine


-- 
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota



Re: [R] Lattice xyplot: Fill Legend Points

2013-10-17 Thread Rich Shepard

On Thu, 17 Oct 2013, Richard M. Heiberger wrote:


> put the pch into the par.settings


Richard,

  Tried this again, but I'm not finding the proper location within
par.settings.

par.settings = list(superpose.points = list(col = rainbow(7)),
superpose.lines = list(col = rainbow(7)), pch = 19)


If I put it prior to the (list ... group there's an error of an extra = ;
when I put it anywhere in the list (the above is one of my tries), it has no
effect on the legend symbols: they remain as outlines.

  What have I missed?

Thanks,

Rich

--
Richard B. Shepard, Ph.D.  |  Have knowledge, will travel.
Applied Ecosystem Services, Inc.   |
 Voice: 503-667-4517  Fax: 503-667-8863



Re: [R] Comparing two groups

2013-10-17 Thread Andrej
>So why not start with some statistical textbook? There are plenty of them
available in CRAN. 

I wasn't implying, that I haven't read any textbook, or didn't do any
research. I read some textbooks/Papers/etc. during the research about what
to do and came across the wilcox test. I meant to imply that I could have
problems understanding some of the answers, and that maybe additional
explaining would be necessary.

My doubts stem from the fact, that the wilcox test is a - as far as I know -
ranking test, that states if two groups are different. My assumption is, due
to the fact that the second group has a much higher sample size, it is clear
that it differs from the first group. I performed a t-test (just to see; I
am aware that I am not allowed to perform it, because my samples aren't
normally distributed) and it gave me a p-value of 0.3.
Actually I am not even entirely sure, if wilcox is the right test. I just
want to know if the means of the two groups are significantly different.



--
View this message in context: 
http://r.789695.n4.nabble.com/Comparing-two-groups-tp4678190p4678277.html
Sent from the R help mailing list archive at Nabble.com.



[R] Subseting a data.frame

2013-10-17 Thread Katherine Gobin
Dear Forum,

I have a data frame as 

mydat = data.frame(basel_asset_class = c(2, 8, 8 ,8), defa_frequency = c(0.15, 
0.07, 0.03, 0.001))

> mydat
  basel_asset_class defa_frequency
1                 2          0.150
2                 8          0.070
3                 8          0.030
4                 8          0.001


I need to get the subset of this data.frame where the number of records for the
given basel_asset_class is > 2, i.e. I need to obtain the subset of the above
data.frame shown below (since there is only 1 record against
basel_asset_class = 2, I want to filter it out)

> mydat_a
  basel_asset_class defa_frequency
1                 8          0.070
2                 8          0.030
3                 8          0.001

Kindly guide

Katherine


[R] S4 base class

2013-10-17 Thread Michael Meyer
Quote

By the way, your use of the syntax D::f and B::f suggests that you're 
thinking from a C++ point of view.  That's very likely to lead to 
frustration:  the S4 object system is very different from C++.  Methods 
don't belong to classes, they belong to generics. There is no such thing 
as D::f or B::f, only f methods with different signatures.

Duncan Murdoch 


#---#

I am aware of this.
We can probably agree that we should use S4 classes and generic functions to 
duplicate more usual object oriented architecture as far as possible while 
remaining conscious of the regrettable differences.

For example we can pretend we are defining a virtual function in class Base by 
writing:

setGeneric("F",
function(this) standardGeneric("F")
)

where the code for Base  is, even though it has nothing to do with the class 
Base. 
We can even use it in other functions "defined in class Base" by writing 


setGeneric("G",
function(this) standardGeneric("G")
)
setMethod("G",
signature(this="Base"),
definition=function(this){

F(this)
})

which will work on all derived classes which implement F in some fashion:

setMethod("F",
signature(this="Derived"),
definition=function(this){

# do something appropriate for derived.
})

With this we can reproduce some semblance of object oriented programming.
However, apparently we cannot solve in this manner a common problem of object
oriented programming (from now on in C++ parlance):

Suppose you have a base class "Base" which implements a function "Base::F"
which works in most contexts but not in the context of the "ComplicatedDerived"
class, where some preparation has to happen before this very same function can be
called.

You would then define

void ComplicatedDerived::F(...){

preparation();
Base::F();
}

You can nearly duplicate this in R via

setMethod("F",
signature(this="ComplicatedDerived"),
definition=function(this){

preparation(this)
F(as(this,"Base"))
})

but it will fail whenever F uses virtual functions (i.e. generics) which are 
only defined
for derived classes of Base, whereas this is not a problem at all in normal 
object oriented
languages.

This is not a contrived problem but is rather basic.
I wonder if you can do it in R in some other way.
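
One candidate is callNextMethod() from the methods package: it runs the
inherited method without coercing the object, so generics called inside the
Base method still dispatch on the derived class. The sketch below uses
made-up generics run and label in place of F and G (a function named F would
mask the logical constant F), and a made-up slot ready to stand in for the
preparation step.

```r
library(methods)

setClass("Base", representation(name = "character"))
setClass("ComplicatedDerived", contains = "Base",
         representation(ready = "logical"))

setGeneric("label", function(this) standardGeneric("label"))
setGeneric("run",   function(this) standardGeneric("run"))

# 'label' is implemented only for the derived class (a "pure virtual" in Base).
setMethod("label", "ComplicatedDerived", function(this) "derived")

# The Base method of 'run' relies on 'label'.
setMethod("run", "Base", function(this) paste("running", label(this)))

# The derived method prepares the object, then hands over to the Base method.
setMethod("run", "ComplicatedDerived", function(this) {
  this@ready <- TRUE        # stand-in for preparation(this)
  callNextMethod(this)      # object keeps its class, so label() still works
})

obj <- new("ComplicatedDerived", name = "x", ready = FALSE)
result <- run(obj)
result
```

With F(as(this, "Base")) the object is truncated to a plain Base, which is
why generics defined only for derived classes can no longer be found.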


Many thanks,

Michael

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with combining functions

2013-10-17 Thread Carl Witthoft
This is the world-famous "fizzbuzz" problem.   You should be able to find
lots of implementations by Googling that word.  Here's a pointless
collection I wrote once:

# a really dumb fizzbuzz alg competition
# fbfun1 is 2.5x faster than fbfun2
# fbfun3 is 10x faster than fbfun1
# fbfun1 is 2x faster than fbfun4
# fbfun5 is 20x faster than fbfun3
# Those are user times; in most cases the system time is very small indeed. 

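fbfun1 <- function(xfoo) {
  xfoo <- 1:xfoo
  # code each number: 1 = neither, 2 = multiple of 3 only,
  # 3 = multiple of 5 only, 4 = multiple of both
  # (mod() is not a base R function, so the %% operator is used here)
  fbfoo <- 1 + (!as.logical(xfoo %% 3)) * (as.logical(xfoo %% 5)) +
    2 * (as.logical(xfoo %% 3)) * (!as.logical(xfoo %% 5)) +
    3 * (!as.logical(xfoo %% 3)) * (!as.logical(xfoo %% 5))
  fbbar <- unlist(lapply(fbfoo, function(x)
    switch(x, 0, 'fizz', 'buzz', 'fizzbuzz')))
  return(fbbar)
}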


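fbfun3 <- function(xfoo) {
  xfoo <- 1:xfoo
  # same 1..4 coding as in fbfun1, with %% in place of the undefined mod()
  fbfoo <- 1 + (!as.logical(xfoo %% 3)) * (as.logical(xfoo %% 5)) +
    2 * (as.logical(xfoo %% 3)) * (!as.logical(xfoo %% 5)) +
    3 * (!as.logical(xfoo %% 3)) * (!as.logical(xfoo %% 5))
  # look the labels up in a small table instead of calling switch() per element
  fbtab <- cbind(1:4, c('', 'fizz', 'buzz', 'fizzbuzz'))
  fbbar <- fbtab[fbfoo, 2]
  return(fbbar)
}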

# can I do it with recycled vectors, e.g. c('','','fizz') and
# c('','','','','buzz') ?
fbfun4 <- function(xfoo) {
  fiz <- rep(c('','','fizz'), length.out=xfoo)
  buz <- rep(c('','','','','buzz'), length.out=xfoo)
  # paste0() avoids the separating space that paste() would insert
  fbbar <- unlist(lapply(1:xfoo, function(j) paste0(fiz[j], buz[j])))
  return(fbbar)
}

# or completely sleazy:
fbfun5 <- function(xfoo) {
  # one full 15-element cycle, recycled to the requested length
  fiz <- rep(c('','','fizz','','buzz','fizz','','','fizz','buzz','','fizz','','','fizzbuzz'),
             length.out=xfoo)
  return(fiz)
}





--
View this message in context: 
http://r.789695.n4.nabble.com/Help-with-combining-functions-tp4678212p4678272.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] saveXML() prefix argument

2013-10-17 Thread Duncan Temple Lang
Milan is correct.
The prefix is used when saving the XML content that is represented in
a different format in R.

To get the prefix 
 
on the XML content that you save, use a document object

doc = newXMLDoc()
root = newXMLNode("foo", doc = doc)

saveXML(doc)





Sorry for the confusion.
 D

On 10/17/13 2:36 AM, Milan Bouchet-Valat wrote:
> Le mercredi 16 octobre 2013 à 23:45 -0400, Earl Brown a écrit :
>> I'm using the "XML" package and specifically the saveXML() function but I 
>> can't get the "prefix" argument of saveXML() to work:
>>
>> library("XML")
>> concepts <- c("one", "two", "three")
>> info <- c("info one", "info two", "info three")
>> root <- newXMLNode("root")
>> for (i in 1:length(concepts)) {
>>  cur.concept <- concepts[i]
>>  cur.info <- info[i]
>>  cur.tip <- newXMLNode("tip", attrs = c(id = i), parent = root)
>>  newXMLNode("h1", cur.concept, parent = cur.tip)
>>  newXMLNode("p", cur.info, parent = cur.tip)
>> }
>>
>> # None of the following output a prefix on the first line of the exported 
>> document
>> saveXML(root)
>> saveXML(root, file = "test.xml")
>> saveXML(root, file = "test.xml", prefix = '\n')
>>
>> Am I missing something obvious? Any ideas?
> It looks like the function XML:::saveXML.XMLInternalNode() does not use
> the 'prefix' parameter at all. So it won't be taken into account when
> calling saveXML() on objects of class XMLInternalNode.
> 
> I think you should report this to Duncan Temple Lang, as this is
> probably an oversight.
> 
> 
> Regards
> 
> 
>> Thanks in advance. Earl Brown
>>
>> -
>> Earl K. Brown, PhD
>> Assistant Professor of Spanish Linguistics
>> Advisor, TEFL MA Program
>> Department of Modern Languages
>> Kansas State University
>> www-personal.ksu.edu/~ekbrown
>>


Re: [R] Lattice xyplot: Fill Legend Points

2013-10-17 Thread Richard M. Heiberger
put the pch into the par.settings
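
A minimal sketch of that suggestion, not a tested fix for the original call. The data frame `dat` and its columns are made up for illustration and stand in for `ffg.st`. It assumes that lattice's setting for grouped points is `superpose.symbol`, and it uses `auto.key` for the legend on the assumption that the key is then built from the same settings as the panel points.

```r
library(lattice)

# Hypothetical data: three groups, ten points each
set.seed(1)
dat <- data.frame(
  x   = rep(1:10, times = 3),
  y   = runif(30),
  grp = factor(rep(c("a", "b", "c"), each = 10))
)

cols <- rainbow(3)

# pch and col both live in par.settings, so the panel and the legend
# read the same symbol definition
p <- xyplot(y ~ x, data = dat, groups = grp, type = "p",
            par.settings = list(superpose.symbol = list(pch = 19, col = cols)),
            auto.key = list(space = "right", points = TRUE, lines = FALSE))

# print(p) would draw the plot on the current device
```

The plot is stored in `p` rather than printed so that the settings can be inspected before drawing.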

On Thu, Oct 17, 2013 at 11:17 AM, Rich Shepard  wrote:
>   When I specify pch = 19 for a scatter plot the points are filled circles.
> Despite reading ?points and trial-and-error experimentation I have not found
> how to have the legend symbols (now open circles) filled.
>
>   An example command is:
>
> xyplot(pct.quant ~ sampdate, data = ffg.st, groups = func_feed_grp, type =
> 'p', pch = 19, key = simpleKey(text = levels(ffg.st$func_feed_grp), space =
> 'right', points = T, lines = F),par.settings = list(superpose.points =
> list(col = rainbow(7)), superpose.lines = list(col = rainbow(7))), main =
> 'Functional Feeding Groups (Individuals)', xlab = 'Year', ylab = 'Proportion
> of Individuals')
>
>   Please pass me a pointer on how to fill the legend points.
>
> TIA,
>
> Rich
>


[R] Lattice xyplot: Fill Legend Points

2013-10-17 Thread Rich Shepard

  When I specify pch = 19 for a scatter plot the points are filled circles.
Despite reading ?points and trial-and-error experimentation I have not found
how to have the legend symbols (now open circles) filled.

  An example command is:

xyplot(pct.quant ~ sampdate, data = ffg.st, groups = func_feed_grp, type =
'p', pch = 19, key = simpleKey(text = levels(ffg.st$func_feed_grp), space =
'right', points = T, lines = F),par.settings = list(superpose.points =
list(col = rainbow(7)), superpose.lines = list(col = rainbow(7))), main =
'Functional Feeding Groups (Individuals)', xlab = 'Year', ylab = 'Proportion
of Individuals')

  Please pass me a pointer on how to fill the legend points.

TIA,

Rich



Re: [R] flatten a list of lists

2013-10-17 Thread Michael Friendly

Thanks to all who replied.

Here are two versions of a function (sans sanity checks) that do what I 
want:


foo1 <- multifoo(1:2, "A")
foo2 <- multifoo(1:2, "B")

mfoo <- list(A=foo1, B=foo2)
class(mfoo) <- c("foolist", "list")

#' flatten a list of lists

# from Duncan Murdoch
flatten <- function(list, unname=TRUE) {
res <- do.call(c, if(unname) unname(list) else list)
class(res) <- class(list)
res
}

# from David Carlson
flatten2 <- function(list, unname=TRUE) {
res <- unlist(list, recursive = FALSE)
if(unname) names(res) <- unlist(lapply(list, names))
class(res) <- class(list)
res
}

mflat1 <- flatten(mfoo)
mflat2 <- flatten2(mfoo)
all.equal(mflat1,mflat2)

> all.equal(mflat1,mflat2)
[1] TRUE

-Michael

On 10/17/2013 9:39 AM, David Carlson wrote:

Does this get you the rest of the way?


mfoo2 <- unlist(mfoo, recursive = FALSE)
names(mfoo2) <- unlist(lapply(mfoo, names))
class(mfoo2) <- "foolist"
str(mfoo2)

List of 4
  $ A1:List of 2
   ..$ x: int 3
   ..$ y: int 10
   ..- attr(*, "class")= chr "foo"
  $ A2:List of 2
   ..$ x: int [1:2] 6 4
   ..$ y: int [1:2] 8 9
   ..- attr(*, "class")= chr "foo"
  $ B1:List of 2
   ..$ x: int 2
   ..$ y: int 2
   ..- attr(*, "class")= chr "foo"
  $ B2:List of 2
   ..$ x: int [1:2] 3 6
   ..$ y: int [1:2] 4 2
   ..- attr(*, "class")= chr "foo"
  - attr(*, "class")= chr "foolist"

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352




--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University  Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA



Re: [R] flatten a list of lists

2013-10-17 Thread David Carlson
Does this get you the rest of the way?

> mfoo2 <- unlist(mfoo, recursive = FALSE)
> names(mfoo2) <- unlist(lapply(mfoo, names))
> class(mfoo2) <- "foolist"
> str(mfoo2)
List of 4
 $ A1:List of 2
  ..$ x: int 3
  ..$ y: int 10
  ..- attr(*, "class")= chr "foo"
 $ A2:List of 2
  ..$ x: int [1:2] 6 4
  ..$ y: int [1:2] 8 9
  ..- attr(*, "class")= chr "foo"
 $ B1:List of 2
  ..$ x: int 2
  ..$ y: int 2
  ..- attr(*, "class")= chr "foo"
 $ B2:List of 2
  ..$ x: int [1:2] 3 6
  ..$ y: int [1:2] 4 2
  ..- attr(*, "class")= chr "foo"
 - attr(*, "class")= chr "foolist"

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352




-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of Ista Zahn
Sent: Thursday, October 17, 2013 8:23 AM
To: Michael Friendly
Cc: R-help
Subject: Re: [R] flatten a list of lists

unlist(mfoo, recursive = FALSE) gets you pretty close.

Best,
Ista

On Thu, Oct 17, 2013 at 9:15 AM, Michael Friendly
 wrote:
> I have functions that generate lists objects of class "foo"
and lists of
> lists of these, of class
> "foolist", similar to what is shown below.
>
> How can I flatten something like this to remove the top-level
list
> structure, i.e.,
> return a single-level list of "foo" objects, of class
"foolist"?
>
> foo <- function(n) {
> result <- list(x=sample(1:10,n), y=sample(1:10,n))
> class(result) <- "foo"
> result
> }
>
> multifoo <- function(vec, label, ...) {
> result <- lapply(vec, foo, ...)
> names(result) <- paste0(label, vec)
> class(result) <- "foolist"
> result
> }
>
> foo1 <- multifoo(1:2, "A")
> foo2 <- multifoo(1:2, "B")
>
> mfoo <- list(A=foo1, B=foo2)
>
> str(mfoo, 2)
>
>> str(mfoo, 2)
> List of 2
>  $ A:List of 2
>   ..$ A1:List of 2
>   .. ..- attr(*, "class")= chr "foo"
>   ..$ A2:List of 2
>   .. ..- attr(*, "class")= chr "foo"
>   ..- attr(*, "class")= chr "foolist"
>  $ B:List of 2
>   ..$ B1:List of 2
>   .. ..- attr(*, "class")= chr "foo"
>   ..$ B2:List of 2
>   .. ..- attr(*, "class")= chr "foo"
>   ..- attr(*, "class")= chr "foolist"
>
> In this case, what is wanted is a single-level list, of 4 foo
objects, A1,
> A2, B1, B2,
> all of class "foolist"
>
> --
> Michael Friendly Email: friendly AT yorku DOT ca
> Professor, Psychology Dept. & Chair, Quantitative Methods
> York University  Voice: 416 736-2100 x66249 Fax: 416
736-5814
> 4700 Keele StreetWeb:   http://www.datavis.ca
> Toronto, ONT  M3J 1P3 CANADA
>


Re: [R] flatten a list of lists

2013-10-17 Thread Duncan Murdoch

On 17/10/2013 9:15 AM, Michael Friendly wrote:

I have functions that generate lists objects of class "foo" and lists of
lists of these, of class
"foolist", similar to what is shown below.


You can use c() to join lists.  So in the example below,

c(mfoo$A, mfoo$B)

will give you a list with the components you want, though the class
won't be set.   More generally, do.call(c, unname(mfoo)) will join any
number of components.  (Without unname(), the names at the top level
will be combined with the component names; maybe you'd actually want
that, but your example didn't do it.)

This won't work if your list doesn't have the regular "list of lists" 
structure, e.g. if it mixes foo objects with foolist objects at the same 
level.  Then you probably need a more complicated recursive approach.  
You might be able to do it with rapply().
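
For the irregular case, one possible recursive approach could look like the sketch below. It is a hand-written recursion rather than rapply(), and the helper names `flatten_foo` and `make_foo` are hypothetical. It assumes every leaf is an object of class "foo" and that anything else is a list to descend into.

```r
# Hypothetical constructor, only used to build test data
make_foo <- function(v) structure(list(x = v, y = v), class = "foo")

flatten_foo <- function(x) {
  # A "foo" object is a leaf: wrap it so that c() keeps it as one element
  if (inherits(x, "foo")) return(list(x))
  pieces <- lapply(seq_along(x), function(i) {
    el <- x[[i]]
    piece <- flatten_foo(el)
    # Leaves take their name from the enclosing list;
    # nested lists keep the names of their own components
    if (inherits(el, "foo")) names(piece) <- names(x)[i]
    piece
  })
  res <- do.call(c, pieces)
  if (is.null(res)) res <- list()   # an empty input gives an empty foolist
  class(res) <- "foolist"
  res
}

# A list that mixes a foolist and a bare foo object at the same level
mixed <- list(
  A  = structure(list(A1 = make_foo(1), A2 = make_foo(2)), class = "foolist"),
  C1 = make_foo(3)
)
flat <- flatten_foo(mixed)
names(flat)
```

As in the do.call(c, unname(mfoo)) version, the top-level names of nested lists are dropped and only the component names survive.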


Duncan Murdoch



How can I flatten something like this to remove the top-level list
structure, i.e.,
return a single-level list of "foo" objects, of class "foolist"?

foo <- function(n) {
  result <- list(x=sample(1:10,n), y=sample(1:10,n))
  class(result) <- "foo"
  result
}

multifoo <- function(vec, label, ...) {
  result <- lapply(vec, foo, ...)
  names(result) <- paste0(label, vec)
  class(result) <- "foolist"
  result
}

foo1 <- multifoo(1:2, "A")
foo2 <- multifoo(1:2, "B")

mfoo <- list(A=foo1, B=foo2)

str(mfoo, 2)

  > str(mfoo, 2)
List of 2
   $ A:List of 2
..$ A1:List of 2
.. ..- attr(*, "class")= chr "foo"
..$ A2:List of 2
.. ..- attr(*, "class")= chr "foo"
..- attr(*, "class")= chr "foolist"
   $ B:List of 2
..$ B1:List of 2
.. ..- attr(*, "class")= chr "foo"
..$ B2:List of 2
.. ..- attr(*, "class")= chr "foo"
..- attr(*, "class")= chr "foolist"

In this case, what is wanted is a single-level list, of 4 foo objects,
A1, A2, B1, B2,
all of class "foolist"





Re: [R] flatten a list of lists

2013-10-17 Thread Ista Zahn
unlist(mfoo, recursive = FALSE) gets you pretty close.

Best,
Ista

On Thu, Oct 17, 2013 at 9:15 AM, Michael Friendly  wrote:
> I have functions that generate lists objects of class "foo" and lists of
> lists of these, of class
> "foolist", similar to what is shown below.
>
> How can I flatten something like this to remove the top-level list
> structure, i.e.,
> return a single-level list of "foo" objects, of class "foolist"?
>
> foo <- function(n) {
> result <- list(x=sample(1:10,n), y=sample(1:10,n))
> class(result) <- "foo"
> result
> }
>
> multifoo <- function(vec, label, ...) {
> result <- lapply(vec, foo, ...)
> names(result) <- paste0(label, vec)
> class(result) <- "foolist"
> result
> }
>
> foo1 <- multifoo(1:2, "A")
> foo2 <- multifoo(1:2, "B")
>
> mfoo <- list(A=foo1, B=foo2)
>
> str(mfoo, 2)
>
>> str(mfoo, 2)
> List of 2
>  $ A:List of 2
>   ..$ A1:List of 2
>   .. ..- attr(*, "class")= chr "foo"
>   ..$ A2:List of 2
>   .. ..- attr(*, "class")= chr "foo"
>   ..- attr(*, "class")= chr "foolist"
>  $ B:List of 2
>   ..$ B1:List of 2
>   .. ..- attr(*, "class")= chr "foo"
>   ..$ B2:List of 2
>   .. ..- attr(*, "class")= chr "foo"
>   ..- attr(*, "class")= chr "foolist"
>
> In this case, what is wanted is a single-level list, of 4 foo objects, A1,
> A2, B1, B2,
> all of class "foolist"
>
> --
> Michael Friendly Email: friendly AT yorku DOT ca
> Professor, Psychology Dept. & Chair, Quantitative Methods
> York University  Voice: 416 736-2100 x66249 Fax: 416 736-5814
> 4700 Keele StreetWeb:   http://www.datavis.ca
> Toronto, ONT  M3J 1P3 CANADA
>


Re: [R] S4 base class

2013-10-17 Thread Duncan Murdoch

On 17/10/2013 9:01 AM, Michael Meyer wrote:

Sorry,

if the previous message seems without context.
Indeed, the first message was bounced by filtering rules (triggered by a subject
heading than which nothing could be more benign or less liable to suspicion).
It was:

Greetings,

I have an S4 class "B" (Base) which defines a function f=f(this="B",...)
Derived from B, we have a class D which also defines a function
f=f(this="D",...)

In the definition of D::f we want to call the version B::f and could do this by 
simply calling

f(baseClassObject(this),...)

The question is the following:

How do I refer to the base class object from the derived class?


You're asking the wrong question.  You should be asking how to call the
method for the inherited class.  callNextMethod() is the answer to that
question.


By the way, your use of the syntax D::f and B::f suggests that you're 
thinking from a C++ point of view.  That's very likely to lead to 
frustration:  the S4 object system is very different from C++.  Methods 
don't belong to classes, they belong to generics. There is no such thing 
as D::f or B::f, only f methods with different signatures.
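
A small sketch of how callNextMethod() could be used here, under some assumptions: the class names "Base" and "Derived", the slot `label`, and the second generic `h` are all hypothetical and only serve the illustration. "Base" is virtual and its f method relies on h, which only the derived class implements.

```r
library(methods)

# "Base" is virtual; its f method uses h, which has no method for "Base" itself
setClass("Base", representation("VIRTUAL"))
setClass("Derived", contains = "Base",
         representation = representation(label = "character"))

setGeneric("h", function(this) standardGeneric("h"))
setGeneric("f", function(this) standardGeneric("f"))

setMethod("h", "Derived", function(this) paste("h for", this@label))

# The inherited method: written once for Base, dispatches h on the actual object
setMethod("f", "Base", function(this) paste("Base f sees:", h(this)))

# The specialised method: do some preparation, then defer to the inherited one
setMethod("f", "Derived", function(this) {
  # any preparation specific to Derived would go here
  callNextMethod()
})

obj <- new("Derived", label = "d1")
result <- f(obj)
result
```

In this sketch the object is not coerced to "Base" before the inherited method runs, so the call to h(this) inside the "Base" method still dispatches on class "Derived".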


Duncan Murdoch



[R] flatten a list of lists

2013-10-17 Thread Michael Friendly
I have functions that generate list objects of class "foo" and lists of
lists of these, of class "foolist", similar to what is shown below.

How can I flatten something like this to remove the top-level list
structure, i.e., return a single-level list of "foo" objects, of class
"foolist"?

foo <- function(n) {
result <- list(x=sample(1:10,n), y=sample(1:10,n))
class(result) <- "foo"
result
}

multifoo <- function(vec, label, ...) {
result <- lapply(vec, foo, ...)
names(result) <- paste0(label, vec)
class(result) <- "foolist"
result
}

foo1 <- multifoo(1:2, "A")
foo2 <- multifoo(1:2, "B")

mfoo <- list(A=foo1, B=foo2)

str(mfoo, 2)

> str(mfoo, 2)
List of 2
 $ A:List of 2
  ..$ A1:List of 2
  .. ..- attr(*, "class")= chr "foo"
  ..$ A2:List of 2
  .. ..- attr(*, "class")= chr "foo"
  ..- attr(*, "class")= chr "foolist"
 $ B:List of 2
  ..$ B1:List of 2
  .. ..- attr(*, "class")= chr "foo"
  ..$ B2:List of 2
  .. ..- attr(*, "class")= chr "foo"
  ..- attr(*, "class")= chr "foolist"

In this case, what is wanted is a single-level list of 4 foo objects,
A1, A2, B1, B2, all of class "foolist".

--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University  Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA



[R] S4 base class

2013-10-17 Thread Michael Meyer
Sorry,

if the previous message seems without context.
Indeed, the first message was bounced by filtering rules (triggered by a subject
heading than which nothing could be more benign or less liable to suspicion).
It was:

Greetings,

I have an S4 class "B" (Base) which defines a function f=f(this="B",...)
Derived from B, we have a class D which also defines a function
f=f(this="D",...)

In the definition of D::f we want to call the version B::f and could do this by 
simply calling

f(baseClassObject(this),...)

The question is the following:

How do I refer to the base class object from the derived class?



Many thanks 


Michael Meyer



[R] S4 base class

2013-10-17 Thread Michael Meyer
Greetings,

Meanwhile I have figured out how to do it, only to find out that I have more
serious problems.
Generally, calling Base::f on the base class object is not what you want;
instead you want to call
Base::f on the full object for the following reasons:

If the base class is virtual, then Base::f might use virtual functions 
(not defined in Base but defined in derived classes).

If you then call Base::f on an object of class "Base" the call will fail.

Is it possible in R to call Base::f from within Derived (when there is also 
Derived::f) on the full object this?

I suspect not, which would be a serious drawback to the R class mechanism.


Thanks,


Michael Meyer



Re: [R] How to obtain restricted estimates from coxph()?

2013-10-17 Thread Andrews, Chris
Consider the function f(x) = x on the open interval (0,1).  It does not have a 
maximum.
That is what your likelihood function will look like.  The MLE does not exist.
Chris

(Although if everything is continuous and you are okay with limits there is an 
extension that gets you to Terry's original answer.)
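
The two-step recipe quoted below can be sketched as follows, with the caveat discussed in this thread that a value on the boundary is a limit and not an MLE on the open interval. The wrapper name `fit_restricted` is hypothetical, the `lung` data set from the survival package is used purely as an example, and the sketch assumes a single covariate. The quoted "iter=0" is spelled out here as `iter.max = 0`.

```r
library(survival)

fit_restricted <- function(formula, data, lower = 0, upper = 1) {
  # Step 1: unrestricted fit; keep it if the coefficient is inside (lower, upper)
  fit <- coxph(formula, data = data)
  b <- unname(coef(fit))
  if (b > lower && b < upper) return(fit)
  # Step 2: otherwise hold the coefficient at the nearest bound;
  # with zero iterations coxph reports the log-likelihood etc. at init
  bound <- if (b >= upper) upper else lower
  coxph(formula, data = data, init = bound, iter.max = 0)
}

# Inside (0, 1): the unrestricted fit is returned as is
free <- fit_restricted(Surv(time, status) ~ age, lung)

# Artificially tight upper bound, to exercise the fixed-coefficient branch
capped <- fit_restricted(Surv(time, status) ~ age, lung, upper = 0.01)
coef(capped)
```

The standard errors and tests reported for the fixed fit should be read with care, since no estimation took place for that coefficient.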

-Original Message-
From: Y [mailto:yuhan...@gmail.com] 
Sent: Wednesday, October 16, 2013 7:08 PM
To: Göran Broström
Cc: r-help@r-project.org
Subject: Re: [R] How to obtain restricted estimates from coxph()?

Thanks very much for your help, Terry and Göran!

As pointed out by Göran, the difficult part is that it's an open set. How
to obtain a valid MLE in this case?


Thanks,
YH







On Wed, Oct 16, 2013 at 9:55 AM, Göran Broström wrote:

>
>
> On 2013-10-16 14:33, Terry Therneau wrote:
>
>>
>>
>> On 10/16/2013 05:00 AM, r-help-requ...@r-project.org wrote:
>>
>>> Hello,
>>>
>>> I'm trying to use coxph() function to fit a very simple Cox proportional
>>> hazards regression model (only one covariate) but the parameter space is
>>> restricted to an open set (0, 1). Can I still obtain a valid estimate by
>>> using coxph function in this scenario? If yes, how? Any suggestion would
>>> be
>>> greatly appreciated. Thanks!!!
>>>
>>
>> Easily:
>>  1.  Fit the unrestricted model.  If the solution is in 0-1 you are
>> done.
>>  2.  If it is outside, fix the coefficient.  Say that the solution is
>> 1.73, then the
>> optimal solution under constraint is 1.
>>
>
> OK, except for the small annoyance that 1 is not a member of the open set
> (interval) (0, 1). Maybe the answer is "No" in this case? Depends on what
> lies in the word 'valid'. If 'MLE', the answer is No.
>
>>   Redo the fit adding the parameters "init=1, iter=0".  This
>> forces the program to
>> give the loglik etc. for the fixed coefficient of 1.0.
>>
>> Terry Therneau
>>


Re: [R] How would i sum the number of NA's in multiple vectors

2013-10-17 Thread Joshua Wiley
Or faster (both computational speed and amount of code):

colSums(is.na(rbind(c1, c2, c3)))
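
Applied to the example vectors from the question, this can be checked directly. The second variant, which adds the logical vectors without building a matrix, is an additional suggestion and not part of the original replies; the variable names `by_matrix` and `by_sum` are made up.

```r
c1 <- c(1, 2, NA, 3, 4)
c2 <- c(NA, 1, 2, 3, 4)
c3 <- c(NA, 1, 2, 3, 4)

# Stack the vectors as rows, flag the NAs, and count per column (position)
by_matrix <- colSums(is.na(rbind(c1, c2, c3)))

# Same count without a matrix: TRUE/FALSE add up as 1/0
by_sum <- is.na(c1) + is.na(c2) + is.na(c3)

by_matrix   # 2 0 1 0 0
```

Both variants assume the vectors have the same length.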


On Thu, Oct 17, 2013 at 4:34 AM, Carl Witthoft  wrote:

> mattbju2013 wrote
> > Hi guys this is my first post, i need help summing the number of NA's in
> a
> > few vectors
> >
> > for example..
> >
> > c1<-c(1,2,NA,3,4)
> > c2<-c(NA,1,2,3,4)
> > c3<-c(NA,1,2,3,4)
> >
> > how would i get a result that only sums the number of NA's in the vector?
> > the.result.i.want<-c(2,0,1,0,0)
>
> See ?is.na .
> Now, if I can interpret your question correctly, you're actually looking
> for
> the number of NA per *position* in the vectors, so let's make them into a
> matrix first.
>
> cmat<-rbind(c1,c2,c3)
> then use apply over columns
> apply(cmat,2,function(k)sum(is.na(k)))
>
>
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/How-would-i-sum-the-number-of-NA-s-in-multiple-vectors-tp4678411p4678432.html
> Sent from the R help mailing list archive at Nabble.com.
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://joshuawiley.com/
Senior Analyst - Elkhart Group Ltd.
http://elkhartgroup.com



[R] match values in dependence of ID and Date

2013-10-17 Thread Mat
Hello everyone,

I have a small problem; maybe you can help me.

I have a data.frame like this one:

ID  Name
1   Andy
2   John
3   Amy

and a data.frame like this:

ID  Date        Value
1   2013-10-01  10
1   2013-10-02  15
2   2013-10-01  7
2   2013-10-03  10
2   2013-10-04  15
3   2013-10-01  10

The result should look like this:

ID  Name  First  Second  Third
1   Andy  10     15
2   John  7      10      15
3   Amy   10

Could you help me do this?

Thank you.

Mat
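
One way this could be approached in base R is sketched below. It is only an illustration under stated assumptions: the object names `people`, `vals` and `pos` are made up, the values are ranked by date within each ID, and the column labels First/Second/Third are hardcoded on the assumption that no ID has more than three values.

```r
people <- data.frame(ID = 1:3, Name = c("Andy", "John", "Amy"))
vals <- data.frame(
  ID    = c(1, 1, 2, 2, 2, 3),
  Date  = as.Date(c("2013-10-01", "2013-10-02", "2013-10-01",
                    "2013-10-03", "2013-10-04", "2013-10-01")),
  Value = c(10, 15, 7, 10, 15, 10)
)

# Number the values within each ID in date order: 1, 2, 3, ...
vals <- vals[order(vals$ID, vals$Date), ]
vals$pos <- ave(seq_along(vals$ID), vals$ID, FUN = seq_along)

# One row per ID, one column per position; missing positions become NA
wide <- reshape(vals[, c("ID", "pos", "Value")],
                idvar = "ID", timevar = "pos", direction = "wide")
names(wide)[-1] <- c("First", "Second", "Third")

# Attach the names
result <- merge(people, wide, by = "ID")
result
```

Positions with no value come out as NA rather than as blank cells.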



--
View this message in context: 
http://r.789695.n4.nabble.com/match-values-in-dependence-of-ID-and-Date-tp4678433.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] How would i sum the number of NA's in multiple vectors

2013-10-17 Thread Carl Witthoft
mattbju2013 wrote
> Hi guys this is my first post, i need help summing the number of NA's in a
> few vectors
> 
> for example..
> 
> c1<-c(1,2,NA,3,4)
> c2<-c(NA,1,2,3,4)
> c3<-c(NA,1,2,3,4)
> 
> how would i get a result that only sums the number of NA's in the vector?
> the.result.i.want<-c(2,0,1,0,0)

See ?is.na .   
Now, if I can interpret your question correctly, you're actually looking for
the number of NA per *position* in the vectors, so let's make them into a
matrix first.

cmat<-rbind(c1,c2,c3)
then use apply over columns
apply(cmat,2,function(k)sum(is.na(k)))





--
View this message in context: 
http://r.789695.n4.nabble.com/How-would-i-sum-the-number-of-NA-s-in-multiple-vectors-tp4678411p4678432.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] extract column's from different dataframe

2013-10-17 Thread Jim Lemon

On 10/17/2013 06:17 PM, catalin roibu wrote:

Dear R users,

I want to extract column's from different data frame with different row
length.
How can I do this in R?


Hi catalin,
If I understand your question, which I think is:

I want to extract columns from different data frames with differing 
numbers of rows and store them in a single object.


The answer is probably to use a list:

datalist<-list()
datalist[[1]]<-dataframe1[,"variable1"]
datalist[[2]]<-dataframe2[,"variable3"]
...

where each element of "datalist" may have different numbers of values.
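
A runnable version of the same idea, with made-up data frames (the contents of `dataframe1` and `dataframe2` are hypothetical) and with the list elements named after the columns they came from:

```r
# Two data frames with different numbers of rows
dataframe1 <- data.frame(variable1 = 1:3, variable2 = c("a", "b", "c"))
dataframe2 <- data.frame(variable3 = c(2.5, 3.5, 4.5, 5.5, 6.5))

# A list can hold columns of different lengths side by side
datalist <- list(
  variable1 = dataframe1[, "variable1"],
  variable3 = dataframe2[, "variable3"]
)

sapply(datalist, length)
```

Naming the elements lets them be retrieved later as datalist$variable1 instead of by position.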

Jim



Re: [R] Extract a predictors form constparty object (CHAID output) in R

2013-10-17 Thread Christiaan Pauw
For the record, I have found a possible solution:

nn <- nodeapply(z)
n.names <- names(unlist(nn[[1]]))
# keep only the names that refer to a split variable
ext <- unlist(sapply(n.names, function(x) grep("split.varid.", x, value = TRUE)))
# strip the prefixes, leaving the variable names
ext <- gsub("kids.split.varid.", "", ext)
ext <- gsub("split.varid.", "", ext)
dep.var <- as.character(terms(z)[1][[2]])
plus <- paste(ext, collapse = " + ")
mul <- paste(ext, collapse = " * ")
shortform <- as.formula(paste(dep.var, plus, sep = " ~ "))
satform <- as.formula(paste(dep.var, mul, sep = " ~ "))
mosaic(shortform, data = ContraceptiveChoice)
# stp <- step(glm(satform, data = ContraceptiveChoice, family = binomial),
#             direction = "both")
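The formula-building step can be tried on its own, without CHAID installed. The sketch below assumes a hypothetical character vector of extracted names (with one duplicate, since a variable may be used in more than one split) and uses base R's reformulate() in place of paste(); it is an illustration, not a change to the solution above.

```r
# Hypothetical result of the extraction step; names taken from the example data
ext <- c("wifes_education", "husbands_occupation", "wifes_education")
dep.var <- "contraceptive_method_used"

# A variable can appear in several splits, so drop duplicates first
ext <- unique(ext)

# reformulate() builds response ~ term1 + term2 + ...
shortform <- reformulate(ext, response = dep.var)

# Saturated version with all interactions, built with paste() as before
satform <- as.formula(paste(dep.var, "~", paste(ext, collapse = " * ")))

# Main-effect and interaction terms of the saturated formula
sat_terms <- attr(terms(satform), "term.labels")
```

Calling unique() before building the formula is harmless when there are no duplicates and keeps the printed formula short when there are.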


On 16 October 2013 20:18, Christiaan Pauw  wrote:

> I have a large dataset (questionnaire results) of mostly categorical
> variables. I have tested for dependency between the variables using
> chi-square test. There are an incomprehensible number of dependencies.
> I used the chaid() function in the CHAID package to detect
> interactions and separate out (what I hope to be) the underlying
> structure of these dependencies for each variable. What typically
> happens is that the chi-square test will reveal a large number of
> dependencies (say 10-20) for a variable and the chaid function will
> reduce this to something much more comprehensible (say 3-5). What I
> want to do is to extract the names of those variables that were shown
> to be relevant in the chaid() results.
>
> The chaid() output is in the form of a constparty object. My question
> is how to extract the variable names associated with the nodes in such
> an object.
>
> Here is a self contained code example:
>
> library(evtree) # for the ContraceptiveChoice dataset
> library(CHAID)
> library(vcd)
> library(MASS)
>
> data("ContraceptiveChoice")
> longform <- formula(contraceptive_method_used ~ wifes_education +
>  husbands_education +  wifes_religion + wife_now_working +
>  husbands_occupation + standard_of_living_index +
> media_exposure)
> z <- chaid(longform, data = ContraceptiveChoice)
> # plot(z)
> z
> # This is the part I want to do programmatically
> shortform <- formula(contraceptive_method_used ~ wifes_education +
> husbands_occupation)
> # The thing I want is a programmatic way to extract 'shortform' from 'z'
>
> # Examples of use of 'shortform'
> loglm(shortform, data = ContraceptiveChoice)
>
> Thanks in advance
> Christiaan
> --
> Christiaan Pauw
> Nova Institute
> www.nova.org.za
>



-- 
Christiaan Pauw
Nova Institute
www.nova.org.za



Re: [R] Constraint on regression parameters

2013-10-17 Thread S Ellison


> -Original Message-
> I am doing a polynomial linear regression with 2 independent variables
> such as :
> 
> lm(A ~ B + I(B^2) + I(B^3) + C, data=Dataset)
> 
> R returns a coefficient for each independent variable, and I need
> the coefficient of C to equal 1.

Leaving aside the question of fitting simple polynomial coefficients instead of 
orthogonal polynomials - generally frowned upon, but not always serious - the 
problem you describe is one in which you are not fitting C at all; you're 
assuming C enters with a coefficient of exactly 1. What you're really fitting 
is the difference between A and C. 

Try fitting 
A-C ~ B + I(B^2) + I(B^3) 

to obtain the coefficients you're looking for. But be aware that you will still 
have a constant intercept, so the model you will have fitted is

A = b0 + b1.B +b2.B^2 +b3.B^3 + C + error
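A minimal sketch of this on simulated data (the data frame and its true coefficients are made up for illustration). It also shows offset(), a standard formula term in lm() that fixes a coefficient at 1; that second fit is an addition here, not something proposed in the thread.

```r
# Simulated data; the true coefficients are arbitrary
set.seed(1)
Dataset <- data.frame(B = runif(50, -2, 2), C = rnorm(50))
Dataset$A <- 2 + 0.5 * Dataset$B - Dataset$B^2 + 0.3 * Dataset$B^3 +
  Dataset$C + rnorm(50, sd = 0.1)

# Fit the difference A - C, as suggested above
fit_diff <- lm(I(A - C) ~ B + I(B^2) + I(B^3), data = Dataset)

# Equivalent fit: keep A as the response and give C a fixed coefficient of 1
fit_offset <- lm(A ~ B + I(B^2) + I(B^3) + offset(C), data = Dataset)

# Both fits estimate the same b0..b3; no coefficient is estimated for C
same_coefs <- all.equal(coef(fit_diff), coef(fit_offset))
```

The offset form has the practical advantage that fitted() and predict() stay on the scale of A rather than A - C.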

S Ellison




[R] Constraint on regression parameters

2013-10-17 Thread Robert U
Dear all,

I have been trying to find a simple solution to my problem without success, 
though I have a feeling a simple syntax detail could do the job.

I am doing a polynomial linear regression with 2 independent variables, such as:

lm(A ~ B + I(B^2) + I(B^3) + C, data=Dataset)

R returns a coefficient for each independent variable, and I need the 
coefficient of C to equal 1. 


I've been looking at "parameter constraints" on the internet, but it's always 
much more complicated than just "removing" the fit of a coefficient (or setting 
it to 1). 


I know many packages allow you to "not fit" an intercept with a "-1" term in 
the syntax; does something similar exist for independent variables?

Regards,


Re: [R] saveXML() prefix argument

2013-10-17 Thread Milan Bouchet-Valat
On Wednesday 16 October 2013 at 23:45 -0400, Earl Brown wrote:
> I'm using the "XML" package and specifically the saveXML() function but I 
> can't get the "prefix" argument of saveXML() to work:
> 
> library("XML")
> concepts <- c("one", "two", "three")
> info <- c("info one", "info two", "info three")
> root <- newXMLNode("root")
> for (i in 1:length(concepts)) {
>   cur.concept <- concepts[i]
>   cur.info <- info[i]
>   cur.tip <- newXMLNode("tip", attrs = c(id = i), parent = root)
>   newXMLNode("h1", cur.concept, parent = cur.tip)
>   newXMLNode("p", cur.info, parent = cur.tip)
> }
> 
> # None of the following output a prefix on the first line of the exported document
> saveXML(root)
> saveXML(root, file = "test.xml")
> saveXML(root, file = "test.xml", prefix = '\n')
> 
> Am I missing something obvious? Any ideas?
It looks like the function XML:::saveXML.XMLInternalNode() does not use
the 'prefix' parameter at all. So it won't be taken into account when
calling saveXML() on objects of class XMLInternalNode.

I think you should report this to Duncan Temple Lang, as this is
probably an oversight.
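Until that is fixed, one possible workaround is to write the prefix yourself. The sketch below assumes that saveXML() on an internal node returns the serialized XML as a character string when no file is given; the prefix string and the temporary file name are placeholders of mine, not values from the original post.

```r
library(XML)

# Small stand-in for the document built in the question
root <- newXMLNode("root")
tip <- newXMLNode("tip", attrs = c(id = 1), parent = root)
newXMLNode("h1", "one", parent = tip)
newXMLNode("p", "info one", parent = tip)

# Placeholder prefix; substitute whatever first line is actually needed
prefix <- '<?xml version="1.0"?>\n'

# Serialize the node to a string, then write prefix and XML together
xml_text <- saveXML(root)
out_file <- tempfile(fileext = ".xml")
cat(prefix, xml_text, "\n", file = out_file, sep = "")

first_line <- readLines(out_file)[1]
```

This sidesteps the 'prefix' argument entirely, so it should not matter whether the method honours it.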


Regards


> Thanks in advance. Earl Brown
> 
> -
> Earl K. Brown, PhD
> Assistant Professor of Spanish Linguistics
> Advisor, TEFL MA Program
> Department of Modern Languages
> Kansas State University
> www-personal.ksu.edu/~ekbrown
> 


Re: [R] Plot time series data irregularly hourly-spaced

2013-10-17 Thread Charles Novaes de Santana
Wow!! Thank you so much for your suggestions! For now, A.K's suggestion #1
is perfect for me!

Thank you very much!

Best,

Charles


On Thu, Oct 17, 2013 at 2:34 AM, William Dunlap  wrote:

> You could bump up the day each time an hour was less than the previous
> one.  E.g.,
>   testtime <- c("20:00:00","22:10:00","22:20:00","23:15:00","23:43:00",
>                 "00:00:00","00:51:00","01:00:00")
>   var <- seq_along(testtime) # so you know what the plot should look like
>   # turn it into a POSIXlt object so you can do arithmetic on it
>   t <- strptime(testtime,format="%H:%M:%S")
>   # now add a day each time t[i] < t[i-1]
>   td <- t + .difftime(cumsum(c(FALSE, diff(t)<0)), units="days")
>   # compare plots
>   par(mfrow=c(2,1))
>   plot(t,var,type="b",xlab="Time",ylab="Var")
>   plot(td,var,type="b",xlab="Time",ylab="Var")
> This is dicey because you may have skipped more than one day.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
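The day-rollover idea quoted above can be checked without plotting. This is a sketch under two assumptions of mine: an arbitrary fixed base date and the UTC time zone, chosen so that the result does not depend on the current day or on daylight-saving changes.

```r
testtime <- c("20:00:00", "22:10:00", "22:20:00", "23:15:00", "23:43:00",
              "00:00:00", "00:51:00", "01:00:00")

# Attach an arbitrary base date (assumption) so the times become full date-times
tt <- as.POSIXct(paste("2013-10-16", testtime), tz = "UTC")

# Count how many times the clock has wrapped past midnight so far
rolled <- cumsum(c(FALSE, diff(as.numeric(tt)) < 0))

# Shift each time forward by one day (86400 seconds) per wrap
td <- tt + rolled * 86400

# Elapsed time from the first to the last observation
elapsed_hours <- as.numeric(difftime(td[length(td)], td[1], units = "hours"))
```

As noted in the quoted message, this only detects a wrap when a time is smaller than the one before it, so gaps of more than one day go unnoticed.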
>
>
> > -Original Message-
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf
> > Of Law, Jason
> > Sent: Wednesday, October 16, 2013 5:04 PM
> > To: Charles Novaes de Santana; r-help@r-project.org
> > Subject: Re: [R] Plot time series data irregularly hourly-spaced
> >
> > You just need the date, otherwise how would it know what time comes first?
> > In strptime(), a date is being assumed.
> >
> > Try this:
> >
> > testtime <- c("20:00:00","22:10:00","22:20:00","23:15:00","23:43:00",
> >               "00:00:00","00:51:00","01:00:00")
> > testday <- rep(Sys.Date() - c(1,0), times = c(5,3))
> > plot(as.POSIXct(paste(testday, testtime)), var)
> >
> > Jason
> >
> > -Original Message-
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf
> > Of Charles Novaes de Santana
> > Sent: Wednesday, October 16, 2013 2:58 PM
> > To: r-help@r-project.org
> > Subject: [R] Plot time series data irregularly hourly-spaced
> >
> > Dear all,
> >
> > I have a time series of data that I would like to represent in a plot. But I
> > am having trouble because the time is recorded in hours only, the series can
> > start on one day and end on the next, and it is not regularly spaced.
> >
> > My problem is that when I plot my data, my X-axis always starts from the
> > lowest value of my time data. For example, I would like to plot data that
> > starts at 20:00:00 and ends at 01:00:00, but R considers 01:00:00 to be
> > lower than 21:00:00 and my plot is kind of "crossed over time".
> >
> > Please try this example to see it graphically:
> >
> > testtime <- c("20:00:00","22:10:00","22:20:00","23:15:00","23:43:00",
> >               "00:00:00","00:51:00","01:00:00")
> > var <- runif(length(testtime),0,1)
> > plot(strptime(testtime,format="%H:%M:%S"),var,type="b",xlab="Time",ylab="Var")
> >
> > In this case, I would like to have a plot that starts at 20:00:00 and ends
> > at 01:00:00.
> >
> > Does anybody know how to make R understand that 00:00:00 comes after
> > 20:00:00 in this case? Or at least does anybody know a tip for making a plot
> > with this kind of X-axis?
> >
> > Thanks for your time and thanks in advance for any help.
> >
> > Best regards,
> >
> > Charles
> > --
> > Um axé! :)
> >
> > --
> > Charles Novaes de Santana, PhD
> > http://www.imedea.uib-csic.es/~charles
> >
>



-- 
Um axé! :)

--
Charles Novaes de Santana, PhD
http://www.imedea.uib-csic.es/~charles



[R] extract column's from different dataframe

2013-10-17 Thread catalin roibu
Dear R users,

I want to extract column's from different data frame with different row
length.
How can I do this in R?

Thank you very much!

best regards!

CR

-- 
---
Catalin-Constantin ROIBU
Lecturer PhD, Forestry engineer
Forestry Faculty of Suceava
Str. Universitatii no. 13, Suceava, 720229, Romania
office phone +4 0230 52 29 78, ext. 531
mobile phone   +4 0745 53 18 01
   +4 0766 71 76 58
FAX:+4 0230 52 16 64
silvic.usv.ro
