Re: [R] Drop values of one dataframe based on the value of another

2012-06-01 Thread Ethan Brown
Before using ddply, try adding an id variable to uniquely identify each
record (this is a good data integrity practice anyway). Then you can simply
create the new data frame by using all the ids that aren't in your
'To_remove' subset.

Here's the code for your example:

library(plyr)
library(outliers)

## A dataframe with some obviously extreme values
dfa <- data.frame(Mins=runif(15, 0,1),
                  Fac=rep(c("Test1","Test2","Test3"), each=5))
df.out <- data.frame(Mins=c(3,4,5), Fac=c("Test1","Test2","Test3"))
df <- rbind(dfa, df.out)
df$Meta <- runif(18,4,5)

##
## add an id variable
df$id <- 1:nrow(df)
##

## Dataframe with the extreme value
To_remove <- ddply(df, c("Fac"), subset, Mins==outlier(Mins)); To_remove

##
## create dataframe without ids that are in To_remove
To_keep <- df[!(df$id %in% To_remove$id),]

## or, more compactly since in this case the ids are row numbers,
To_keep <- df[-To_remove$id,]
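
For anyone without plyr or outliers installed, the same id-based idea can be sketched in base R. This is only an illustration under assumptions: the ave() step is my stand-in for outlier() (it picks the value with the largest absolute difference from its group mean), and set.seed(42) is added purely for reproducibility.

```r
set.seed(42)  # assumption: fixed seed so the example is repeatable

# Same shape as the example data: 15 ordinary values plus 3 extreme ones
df <- data.frame(
  Mins = c(runif(15, 0, 1), 3, 4, 5),
  Fac  = c(rep(c("Test1", "Test2", "Test3"), each = 5),
           "Test1", "Test2", "Test3")
)
df$Meta <- runif(18, 4, 5)
df$id   <- seq_len(nrow(df))

# Absolute deviation of each value from its own group mean
dev <- ave(df$Mins, df$Fac, FUN = function(v) abs(v - mean(v)))

# Flag the row(s) with the largest deviation within each group
is_extreme <- dev == ave(dev, df$Fac, FUN = max)

To_remove <- df[is_extreme, ]
To_keep   <- df[!(df$id %in% To_remove$id), ]
```

One caveat about the compact negative-index form: if To_remove were ever empty, df[-To_remove$id, ] would return zero rows rather than all of them, because -integer(0) is still integer(0). The %in% form does not have that problem.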

Best,
Ethan

P.S. Your email address and Google picture are so epic!


statisfactions.com -- the sounds of data and whimsy



On Fri, Jun 1, 2012 at 2:40 PM, Sam Albers tonightstheni...@gmail.com wrote:

 Hello all,

 Let me first say that this isn't a question about outliers. I am using
 the outlier function from the outliers package but I am using it only
 because it is a convenient wrapper to determine the values that have the
 largest difference from the sample mean. Where I am
 running into problems is that I have several groups where I want to
 calculate the outlier within that group. Then I want to create two
 data.frames, one with the outliers and the other with those values
 dropped. Both dataframes need to include the additional columns of
 data present before the subset. The first case is easy but I can't
 seem to figure out how to do the second. So for example:

 library(plyr)
 library(outliers)

 ## A dataframe with some obviously extreme values
 dfa <- data.frame(Mins=runif(15, 0,1),
                   Fac=rep(c("Test1","Test2","Test3"), each=5))
 df.out <- data.frame(Mins=c(3,4,5), Fac=c("Test1","Test2","Test3"))
 df <- rbind(dfa, df.out)
 df$Meta <- runif(18,4,5); df

 ## Dataframe with the extreme value
 To_remove <- ddply(df, c("Fac"), subset, Mins==outlier(Mins)); To_remove

 So now my question is how can I use this dataframe (To_remove) to
 remove all these values from df and create a new dataframe. Given a df
 (To_remove) with a list of values, how can I choose all values of
 another dataframe (df) that aren't those values in the To_remove
 dataframe? There is a rm.outliers function in this same package but I
 am having trouble with that and would like to try another approach.

 Thanks in advance!

 Sam

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




Re: [R] determine size (width and height) of a graphics file via R - how?

2012-05-14 Thread Ethan Brown
Hi Mark,

You can do this easily with the identify command from ImageMagick
(http://www.imagemagick.org). Install it, and then from within an R
session:

system2("identify", "yourimagename.jpg")

...and it should give you something like this:

yourimagename.jpg JPEG 800x533 800x533+0+0 8-bit DirectClass 378KB
0.000u 0:00.019

...which is overkill but does include the dimensions.

If you're on Windows you need an extra argument:

system2("identify", "yourimagename.jpg", invisible = FALSE)

to make sure it actually shows you the result.
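
If you want the dimensions as numbers rather than as printed text, one option is to capture the output and pull out the WIDTHxHEIGHT token. A rough sketch, assuming identify is on your PATH; the helper names parse_identify and image_size are made up for illustration, and the regular expression simply takes the first digits-x-digits token, so it would be fooled by a file name that itself contains one.

```r
# Extract c(width, height) from one line of identify output
parse_identify <- function(line) {
  token <- regmatches(line, regexpr("[0-9]+x[0-9]+", line))
  dims  <- as.integer(strsplit(token, "x", fixed = TRUE)[[1]])
  c(width = dims[1], height = dims[2])
}

# Run identify on a file and parse the first line it prints
# (hypothetical wrapper; needs ImageMagick installed)
image_size <- function(path) {
  out <- system2("identify", shQuote(path), stdout = TRUE)
  parse_identify(out[1])
}

# Parsing the sample output shown above, without calling ImageMagick
parse_identify("yourimagename.jpg JPEG 800x533 800x533+0+0 8-bit DirectClass 378KB")
```

Keeping the parsing separate from the system call makes it possible to check the text handling on a machine where ImageMagick is not installed.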

EBImage is an R interface to imagemagick but is probably more trouble
than it's worth for the simple task you're trying to do.

Hope this helps,
Ethan

On Sun, May 13, 2012 at 6:57 AM, Mark Heckmann mark.heckm...@gmx.de wrote:
 Hi,

 is there a way to determine the size (width, height) of a graphics file saved 
 on my hard disk, e.g. a .bmp, via R?
 What I want is basically the same information on the dimensions of the 
 graphic file that I get from my file browser.

 Thanks
 Mark

 PS. Why: I use the R2PPT and I need to determine the size of the original 
 graphic before adding it to a slide.

 --
 Mark Heckmann
 Blog: www.markheckmann.de
 R-Blog: http://ryouready.wordpress.com













Re: [R] write data using xlsReadWrite

2012-05-14 Thread Ethan Brown
You're trying to write an object that you've never created. If you
want to write `varHL2y`, which it appears you do, substitute it
for `mydata` in your command.
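
If the goal is to save all of the variances at once, one way is to collect them into a data frame (which you could name mydata) and write that object. A sketch only: the numbers are placeholders rather than real results, the column names are mine, and write.csv() to a temporary file stands in for write.xls() so that the example runs without xlsReadWrite.

```r
# Placeholder values standing in for the variances computed in the script
varLH2x <- 0.11; varLH2y <- 0.12
varHH2x <- 0.21; varHH2y <- 0.22
varLL2x <- 0.31; varLL2y <- 0.32
varHL2x <- 0.41; varHL2y <- 0.42

# Create the object first, then write it
mydata <- data.frame(
  band  = c("LH2", "HH2", "LL2", "HL2"),
  var_x = c(varLH2x, varHH2x, varLL2x, varHL2x),
  var_y = c(varLH2y, varHH2y, varLL2y, varHL2y)
)

out_file <- file.path(tempdir(), "mydata.csv")  # assumption: temporary location
write.csv(mydata, out_file, row.names = FALSE)
```

The point is only the order of operations: mydata has to exist in the workspace before any write function can find it.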

Best,
Ethan


On Sun, May 13, 2012 at 1:33 AM, diyanah yanad...@gmail.com wrote:
 Hi, I'm trying to write the variance output from the code below to an Excel
 file. The directory where I store the data is
 D:\FYP\image
 but I receive an error message:

 Error in write.xls(mydata, "D:\\FYP\\image.mydata.xls") :
   object 'mydata' not found

 Here is my code; can you give me some advice or ideas about my problem?

 library(biOps)
 library(waveslim)
 library(xlsReadWrite)
 x <- readTiff("D:\\FYP\\image\\SignatureImage\\user186g1.tif")
 y <- imgBlockMedianFilter(x, 5)
 #Plot image
 #plot(y)
 y.modwt <- modwt.2d(y, "la8", 2)
 ## Level 2 decomposition
 par(mfrow=c(2,2), pty="s")
 ##Plot wavelets
 image(y.modwt$LH2, col=rainbow(128), axes=FALSE, main="LH2")
 image(y.modwt$HH2, col=rainbow(128), axes=FALSE, main="HH2")
 image(y.modwt$LL2, col=rainbow(128), axes=FALSE, main="LL2")
 image(y.modwt$HL2, col=rainbow(128), axes=FALSE, main="HL2")
 #---#
 ##Get the dimension
 ##LH2
 dimLH2 <- dim(y.modwt$LH2)
 dimLH2x <- dimLH2[1]
 dimLH2y <- dimLH2[2]
 varLH2xlist <- c(rep(0, dimLH2x))
 varLH2ylist <- c(rep(0, dimLH2y))
 ##Loop to get variance from x axis
 for(i in seq(dimLH2x)){
    varLH2xlist[i] <- var(y.modwt$LH2[i,])
 }
 ##Get the variance from the overall x variance
 varLH2x <- var(varLH2xlist)
 ##Loop to get variance from y axis
 for(i in seq(dimLH2y)){
    varLH2ylist[i] <- var(y.modwt$LH2[,i])
 }
 ##Get the variance from the overall y variance
 varLH2y <- var(varLH2ylist)
 #-#
 ##Get the dimension
 ##HH2
 dimHH2 <- dim(y.modwt$HH2)
 dimHH2x <- dimHH2[1]
 dimHH2y <- dimHH2[2]
 varHH2xlist <- c(rep(0, dimHH2x))
 varHH2ylist <- c(rep(0, dimHH2y))
 ##Loop to get variance from x axis
 for(i in seq(dimHH2x)){
    varHH2xlist[i] <- var(y.modwt$HH2[i,])
 }
 ##Get the variance from the overall x variance
 varHH2x <- var(varHH2xlist)
 ##Loop to get variance from y axis
 for(i in seq(dimHH2y)){
    varHH2ylist[i] <- var(y.modwt$HH2[,i])
 }
 ##Get the variance from the overall y variance
 varHH2y <- var(varHH2ylist)
 #-#
 ##Get the dimension
 ##LL2
 dimLL2 <- dim(y.modwt$LL2)
 dimLL2x <- dimLL2[1]
 dimLL2y <- dimLL2[2]
 varLL2xlist <- c(rep(0, dimLL2x))
 varLL2ylist <- c(rep(0, dimLL2y))
 ##Loop to get variance from x axis
 for(i in seq(dimLL2x)){
    varLL2xlist[i] <- var(y.modwt$LL2[i,])
 }
 ##Get the variance from the overall x variance
 varLL2x <- var(varLL2xlist)
 ##Loop to get variance from y axis
 for(i in seq(dimLL2y)){
    varLL2ylist[i] <- var(y.modwt$LL2[,i])
 }
 ##Get the variance from the overall y variance
 varLL2y <- var(varLL2ylist)
 #-#
 ##Get the dimension
 ##HL2
 dimHL2 <- dim(y.modwt$HL2)
 dimHL2x <- dimHL2[1]
 dimHL2y <- dimHL2[2]
 varHL2xlist <- c(rep(0, dimHL2x))
 varHL2ylist <- c(rep(0, dimHL2y))
 ##Loop to get variance from x axis
 for(i in seq(dimHL2x)){
    varHL2xlist[i] <- var(y.modwt$HL2[i,])
 }
 ##Get the variance from the overall x variance
 varHL2x <- var(varHL2xlist)
 ##Loop to get variance from y axis
 for(i in seq(dimHL2y)){
    varHL2ylist[i] <- var(y.modwt$HL2[,i])
 }
 ##Get the variance from the overall y variance
 varHL2y <- var(varHL2ylist)
 #-#
 ##write excel file
 write.xls(mydata, "D:\\FYP\\image.mydata.xls")

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/write-data-using-xlsReadWrite-tp4629825.html
 Sent from the R help mailing list archive at Nabble.com.



Re: [R] Strange Error: subscript out of bounds

2012-05-11 Thread Ethan Brown
Hi, this looks like a typo to me. The name of the argument to your
function is 'pre.mat', but you're trying to print an object called
'pred.mat' (with an extra 'd') that never appears before.
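
One possible reason the message is "subscript out of bounds" rather than "object not found" (a guess on my part, since I can't see your workspace): if an object called pred.mat happens to exist in your global environment, the function will quietly use it instead of the argument. A small sketch with made-up matrices and a hypothetical function name:

```r
# Hypothetical leftover object in the workspace, with only 2 rows
pred.mat <- matrix(1:10, nrow = 2)

show_row <- function(pre.mat, k) {
  # Typo on purpose: this looks up the global 'pred.mat',
  # not the argument 'pre.mat'
  pred.mat[k, ]
}

big <- matrix(1:1000, nrow = 200)   # row 193 exists in this matrix

# The call fails even though 193 is a legal row of 'big'
res <- tryCatch(show_row(big, 193),
                error = function(e) conditionMessage(e))
res
```

Running rm(pred.mat) before calling the function would turn this into an "object 'pred.mat' not found" error, which points more directly at the typo.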

It's easier to help when you give a reproducible example that we can
execute on our own computers, as recommended by the posting guide.

Best,
Ethan


statisfactions.com -- The Sounds of Data and Whimsy

On Fri, May 11, 2012 at 9:41 AM, Petri Lankoski
petri.lanko...@gmail.com wrote:

 Dear all,


 I am trying to write a function for visualizing ordinal model results. The
 function works fine with some values, but then I get "Error: subscript out
 of bounds" even though the index should be pointing to a legal item. The code
 is below, as well as an example of the failure:

 plotProb <- function(pre.mat, parts, split, titles, xlab="") {
        par(mfrow = split)
        n <- 1
        for(k in parts) {
                print("debug")
                print(k)
                print(n)
                print(length(pre.mat))
                print(pred.mat[k,])
                print(titles[n])
                plot(1:5, pred.mat[k,], lty=2, type="l", ylim=c(0,1),
 xlab=xlab, axes=FALSE, ylab="Probability", las=1, main=titles[n])
                axis(1)
                axis(2)
                lines(1:5, pred.mat[k+1,], lty=1)
                lines(1:5, pred.mat[k+2,], lty=3)
                legend("topright", c("avg. player", "5th %-tile player",
 "95th %-tile player"), lty=1:3, bty="n")
                n<-n+1
        }
 }

 > plotProb(q8.pred.mat.v13, c(193, 196, 199, 205, 217,241,289), c(3,3),
 q8.titles3, "identification")
 [1] "debug"
 [1] 193
 [1] 1
 [1] 1920
 Error in print(pred.mat[k, ]) :
   error in evaluating the argument 'x' in selecting a method for function
 'print': Error: subscript out of bounds
 > print(q8.pred.mat.v13[193,])
 [1] 0.28478261 0.39674035 0.22653917 0.07992141 0.01201645

 (However, a call, plotProb(q8.pred.mat.v13, c(1,4,7,13,25,49,97), c(3,3),
 q8.titles2, "identification"), seems to work ok.)


 Any hints how to tackle this are appreciated.


 --
 Petri Lankoski, petri.lanko...@iki.fi
 www.iki.fi/petri.lankoski



Re: [R] (no subject)

2011-06-20 Thread Ethan Brown
Hi Ungku, it's really difficult for us to take a huge block of code and
understand where an error happened. There are several things that can help us
help you:

1) First and foremost, what is the error message or undesired behavior
you're experiencing?
2) Second, please pare down the code to the place where you're experiencing
a problem. Maybe just generate some simple data and try making the plot from
there without all the calculations and formatting options, and see if it
works then. If it still doesn't, post that simplified code and someone here
will be much more able to help you.
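
For example, a pared-down version might look something like the sketch below. The data are simulated (nothing here comes from your input5.csv), and the variable names are mine:

```r
set.seed(1)                      # assumption: fixed seed for repeatability
n <- 50
log2_fold <- rnorm(n)            # simulated log2 fold changes
p_value   <- runif(n)            # simulated t-test p-values

# Bare-bones volcano plot: fold change against significance
plot(log2_fold, -log10(p_value),
     xlab = "log2 Fold Change",
     ylab = "-log10 t-Test P-value",
     main = "Volcano Plot",
     pch = 20)
abline(h = -log10(0.05), lty = 2)   # horizontal line at P = 0.05
abline(v = c(-1, 1), lty = 3)       # vertical lines at 2-fold change
```

If this draws a plot on your machine but the full script does not, the problem is somewhere in the data handling or in the steps between the calculations and the plot call, and that narrows down what to post.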

The posting guide, http://www.R-project.org/posting-guide.html,
has some further tips.

Best,
Ethan

On Sun, Jun 19, 2011 at 7:05 PM, Ungku Akashah kasla...@yahoo.com wrote:

 Hello, anybody... could you help me check the code below for a volcano plot?
 What is the mistake?
 Why does the plot not display?







 #volcano_plot.r
 #
 #Author:Amsha Nahid, Jairus Bowne, Gerard Murray
 #Purpose:Produces a volcano plot
 #
 #Input:Data matrix as specified in Data-matrix-format.pdf
 #Output:Plots log2(fold change) vs log10(t-test P-value)
 #
 #Notes:Group value for control must be alphanumerically first
 #  Script will return an error if there are more than 2 groups

 #
 #Load the data matrix
 #
 # Read in the .csv file
 data<-read.csv("input5.csv", sep=",", row.names=1, header=TRUE)
 # Get groups information
 groups<-data[,1]
 # Get levels for groups
 grp_levs<-levels(groups)
 if (length(levels(groups)) > 2)
    print("Number of groups is greater than 2!") else {

    #
    #    Split the matrix by group
    #
    new_mats<-c()
    for (ii in 1:length(grp_levs))
        new_mats[ii]<-list(data[which(groups==levels(groups)[ii]),])

    #
    #    Calculate the means
    #
    # For each matrix, calculate the averages per column
    submeans<-c()
    # Preallocate a matrix for the means
    means<-matrix(
        nrow = 2,
        ncol = length(colnames(data[,-1])),
        dimnames = list(grp_levs,colnames(data[,-1]))
        )
    # Calculate the means for each variable per sample
    for (ii in 1:length(new_mats))
        {submeans[ii]<-list(apply(new_mats[[ii]][,-1],2,mean,na.rm=TRUE))
        means[ii,]<-submeans[[ii]]}

    #
    #    Calculate the fold change
    #
    folds<-matrix(
        nrow=length(means[,1]),
        ncol=length(means[1,]),
        dimnames=list(rownames(means),colnames(means))
        )
    for (ii in 1:length(means[,1]))
        for (jj in 1:length(means[1,]))
            folds[ii,jj]<-means[ii,jj]/means[1,jj]

    #
    #    t-test P value data
    #
    pvals<-matrix(nrow=ncol(data[,-1]),ncol=1,dimnames=list(colnames(data[-1]),"P-Value"))

    #
    #    Perform the t-Test
    #
    for(ii in 1:nrow(pvals)) {
        pvals[ii,1]<-t.test(new_mats[[1]][,ii+1],new_mats[[2]][,ii+1])$p.value
    }

    m<-length(pvals)
    x_range<-range(c(
        min(
            range(log2(folds[2,])),
            range(c(-1.5,1.5))
            ),
        max(
            range(log2(folds[2,])),
            range(c(-1.5,1.5))
            )
        ))
    y_range<-range(c(
        min(range(-log10(pvals)),
            range(c(0,2))
            ),
        max(range(-log10(pvals)),
            range(c(0,2))
            )
        ))

    #
    #    Plot data
    #
    # Define a function, since it's rather involved
    volcano_plot<-function(fold, pval)
        {plot(x_range,                      # x-dim
            y_range,                        # y-dim
            type="n",                       # empty plot
            xlab="log2 Fold Change",        # x-axis title
            ylab="-log10 t-Test P-value",   # y-axis title
            main="Volcano Plot",            # plot title
            )
        abline(h=-log10(0.05),col="green",lty="44")   # horizontal line at P=0.05
        abline(v=c(-1,1),col="violet",lty="1343")     # vertical lines at 2-fold
        # Plot points based on their values:
        for (ii in 1:m)
            # If it's below 0.05, we're not overly interested: purple.
            if (-log10(pvals[ii])>(-log10(0.05))) {
                # Otherwise, more checks;
                # if it's greater than 2-fold decrease: blue
                if (log2(folds[2,][ii])>(-1)) {
                    # If it's significant but didn't change much: orange
                    if (log2(folds[2,][ii])<1) {
                        points(
                            log2(folds[2,][ii]),
                            -log10(pvals[ii]),
                            col="orange",
                            pch=20
                            )
                        # Otherwise, greater than 2-fold increase: red
                    } else {
                        points(

Re: [R] (no subject)

2011-06-20 Thread Ethan Brown
=title)                           # title of plot
  #
  #     dev.off()
  #     }
  # pic_jpg("LDA.jpg", lda_result, "Linear Discriminant Analysis")
  # end jpeg #
 
  # png #
  # pic_png<-function(filename, matrix, title, cex_val=1)
  #     {# Start png device with basic settings
  #     png(filename,
  #         bg="white",                             # background colour
  #         res=300,                                # image resolution (dpi)
  #         units="in", width=8.3, height=5.8       # image dimensions (inches)
  #         )
  #     par(mgp=c(5,2,0),                           # axis margins
  #                                                 #  (title, labels, line)
  #         mar=c(7,4,4,2),                         # plot margins (b,l,t,r)
  #         las=1                                   # horizontal labels
  #         )
  #     # Draw the plot
  #     plot(matrix,                                # data to plot
  #         cex=cex_val,                            # font size
  #         dimen=2                                 # dimensions to plot
  #         )
  #     title(main=title)                           # title of plot
  #
  #     dev.off()
  #     }
  # pic_png("LDA.png", lda_result, "Linear Discriminant Analysis")
  # end png #
 
  # tiff #
  # pic_tiff<-function(filename, matrix, title, cex_val=1)
  #     {# Start tiff device with basic settings
  #     tiff(filename,
  #         bg="white",                             # background colour
  #         res=300,                                # image resolution (dpi)
  #         units="in", width=8.3, height=5.8,      # image dimensions (inches)
  #         compression="none"                      # image compression
  #                                                 #  (one of none, lzw, zip)
  #         )
  #     par(mgp=c(5,2,0),                           # axis margins
  #                                                 #  (title, labels, line)
  #         mar=c(7,4,4,2),                         # plot margins (b,l,t,r)
  #         las=1                                   # horizontal labels
  #         )
  #     # Draw the plot
  #     plot(matrix,                                # data to plot
  #         cex=cex_val,                            # font size
  #         dimen=2                                 # dimensions to plot
  #         )
  #     title(main=title)                           # title of plot
  #
  #     dev.off()
  #     }
  # pic_tiff("LDA.tif", lda_result, "Linear Discriminant Analysis")
  # end tiff #
 

 
 From: Ethan Brown ethancbr...@gmail.com
 To: Ungku Akashah kasla...@yahoo.com; r-help@r-project.org
 Sent: Tue, June 21, 2011 5:48:55 AM
 Subject: Re: [R] (no subject)

 Hi Ungku, it's really difficult for us to take a huge block of code and 
 understand where an error happened. There are several things that can help us 
 help you:

 1) First and foremost, what is the error message or undesired behavior you're 
 experiencing?
 2) Second, please pare down the code to the place where you're experiencing a 
 problem. Maybe just generate some simple data and try making the plot from 
 there without all the calculations and formatting options, and see if it 
 works then. If it still doesn't, post that simplified code and someone here 
 will be much more able to help you.

 The posting guide, http://www.R-project.org/posting-guide.html, has some 
 further tips.

 Best,
 Ethan


Re: [R] Help with Median test and Coxon-Mann-Whittney test

2011-06-11 Thread Ethan Brown
Hi Rob,

This list is primarily intended for questions about how to do things in R,
so you're more likely to get a helpful response elsewhere. You might want to
try some place like the Cross-Validated web site (
http://stats.stackexchange.com/) for general statistics and data analysis
questions.

Before you do that, you might want to clarify your question. It sounds like
you ran a couple of tests on a very small dataset, and they came up with
different results. They're not even tests for the same thing: the
t-test tests whether the means are different, while the Mann-Whitney
tests whether values in one group tend to be larger than in the other (often
loosely described as a difference in medians). And in any case, it's hard to
imagine coming up with any kind of useful inference on this tiny sample.
When you post on Cross-Validated (or another site), people are going to want
to know the problem you are trying to solve, which isn't at all clear at
this point.
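
To make the sample-size point concrete, here is a small sketch comparing one sample with the control. The numbers are the values as I read them from your table for sample 5 and the control, so treat them as an assumption; the variable names are mine.

```r
sample5 <- c(290.9, 446, 150.3)   # assumed reading of column 5
control <- c(53.9, 39.6)          # assumed reading of column C

# With 3 and 2 observations there are only choose(5, 2) = 10 ways to
# assign the ranks, so an exact one-sided p-value cannot go below 1/10
min_p <- 1 / choose(length(sample5) + length(control), length(control))

w  <- wilcox.test(sample5, control, alternative = "greater")
tt <- t.test(sample5, control, alternative = "greater")

c(wilcoxon = w$p.value, t_test = tt$p.value, smallest_possible = min_p)
```

Even when every sample value is larger than every control value, the rank-based test cannot reach the usual 0.05 level with groups this small, so a non-significant result there says more about the design than about the data.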

Best,
Ethan

On Fri, Jun 10, 2011 at 12:26 PM, Robert Johnson robjoh...@gmail.com wrote:

 Hi All,

 I have the following dataset (triplicate values of 5 independent
 measurements and duplicate values of a control):

     1      2      3      4      5      C
 181.8   58.2  288.9  273.2  290.9   53.9
 120.3  116.8  108.9  281.3  446     39.6
  86.1  148.5   52.9  126    150.3

 My aim was to find if mean values of Samples 1 - 5 are significantly higher
 than the mean value for C (control).

 At first, I carried out mean, SD and t-test (one-tail). Although SD error
 bars are large, two of the samples have mean values that are significantly
 higher than that of C.

 Second, I carried out median and Wilcoxon-Mann-Whitney tests because of my
 worry about the size of my data and the high variations in the replicates.
 I was surprised, however, that the median and Wilcoxon-Mann-Whitney tests did
 not reveal statistically significant results.

 I will be happy if anyone could advise me on the best way to analyse this
 dataset.

 Regards,

 Rob



Re: [R] giving factor names

2011-06-10 Thread Ethan Brown
Hi Kieran,

I'm not very familiar with lattice, but here's a workaround that works for
me. Basically, I just created a new data.frame column that was a factor
(combo$zf), and forced its levels to be what you're looking for here.

require(lattice)

x<-c(1,2,3)
y<-c(2,4,6)
z<-c(0.1,0.5,2)
combo<-expand.grid(x,y,z)
combo<-data.frame(combo)
names(combo)<-c("x","y","z")
outcome<-function(l)
{
(l[1]*l[2])/l[3]
}
resp<-apply(combo,1,outcome)

## Create new column and assign levels
combo$zf <- as.factor(combo$z)
levels(combo$zf) <- paste("z=", levels(combo$zf), sep="")

## Now I use the new variable as the conditioning variable in the plot
levelplot(resp~x*y|zf, data=combo
,pretty=TRUE,region=TRUE,contour=FALSE)

In the future, it would help if you could specify the packages you're using,
since I had to do a little research to find where the levelplot function
is from.

Hope this helps,
Ethan



On Tue, Jun 7, 2011 at 9:30 AM, kieran martin kieranjmar...@gmail.com wrote:

 Hi,

 I've been driving myself insane with this problem. I have a trellis plot of
 contours, and I want each level to have something like "z=value" for each
 one. I can get each one to say z, or each one to say the value (by using
 as.factor) but not both. Here's an artificial example to show what I mean
 (as my actual data set is much larger!)

 x<-c(1,2,3)
 y<-c(2,4,6)
 z<-c(0.1,0.5,2)
 combo<-expand.grid(x,y,z)
 combo<-data.frame(combo)
 names(combo)<-c("x","y","z")
 outcome<-function(l)
 {
 (l[1]*l[2])/l[3]
 }
 resp<-apply(combo,1,outcome)
 levelplot(resp~x*y|z,data=combo
 ,pretty=TRUE,region=TRUE,contour=FALSE)

 So in this final graph I want "z=0.1", "z=0.5" and "z=2" in turn.

 Thanks,

 Kieran Martin
 University of Southampton



Re: [R] Problem loading packages in R 2.13.0 on Mac

2011-06-10 Thread Ethan Brown
Hi Adrienne,

I'm not familiar with your interface, but it sounds like R thinks the
package mvtnorm is not installed. You can see the packages you've
installed with:

row.names(installed.packages())

Is mvtnorm in the output of that command? You could test with the command,

"mvtnorm" %in% row.names(installed.packages())

If the result of the above command is FALSE, you can install it with:

install.packages("mvtnorm")
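
If you need to check several packages at once, the same test can be wrapped in a small helper. A sketch only: missing_packages is a name I made up, and the install call is left commented out so that nothing is downloaded unless you choose to run it.

```r
# Return the names in 'pkgs' that are not currently installed
missing_packages <- function(pkgs) {
  setdiff(pkgs, row.names(installed.packages()))
}

needed  <- c("mvtnorm", "lawstat")
to_get  <- missing_packages(needed)
to_get

# Install whatever is missing, along with the packages it depends on:
# if (length(to_get) > 0) install.packages(to_get, dependencies = TRUE)
```

Asking install.packages() for dependencies as well should cover the case where a package such as lawstat is present but something it requires is not.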

Best,
Ethan

On Fri, Jun 10, 2011 at 12:00 PM, Adrienne Keller 
adrienne.kel...@umontana.edu wrote:

 I am having problems loading packages in the newest version of R (2.13.0) on
 my Mac. I have tried to install various packages (e.g. lawstat, Rcmdr, car)
 and load them using the Package Installer and Package Manager menu options
 but I get the following error:

 > library(lawstat)
 Loading required package: mvtnorm
 Error: package 'mvtnorm' could not be loaded
 In addition: Warning message:
 In library(pkg, character.only = TRUE, logical.return = TRUE, lib.loc =
 lib.loc) :
  there is no package called 'mvtnorm'

 When I click on the box for loading lawstat in the Package Manager, it
 immediately reverts back to an unchecked box.

 I have tried to load mvtnorm and then load lawstat but I get the same
 error.

 Help?

 Thanks,

 Adrienne



Re: [R] Sorting DataFrames

2011-06-07 Thread Ethan Brown
Hello Samantha, I'm having some trouble understanding your question in
terms of what's happening in R. Are these bins columns of a
data.frame? Rows?

It's helpful for us to have a small example to look at--for instance,
you could create a small subset of your data called x, then type the
command

dump("x", file=stdout())

which will print an expression that will recreate the object x.
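
As an illustration of what that looks like, here is a sketch with an invented three-row data frame (the column names bin, count and depth are my guess at your layout, not something from your post). It prints the dump and then checks that the dumped text really does rebuild the same object.

```r
# Invented example data; replace with a small subset of your own
x <- data.frame(bin   = c(1, 1, 2),
                count = c(4, 9, 3),
                depth = c(10, 20, 15))

# Print the expression that recreates x (this is what you would paste
# into an email)
dump("x", file = stdout())

# Round trip: write the dump to a temporary file and read it back
# into a separate environment
tmp <- tempfile(fileext = ".R")
dump("x", file = tmp)
e <- new.env()
source(tmp, local = e)
all.equal(e$x, x)
```

Anyone on the list can then paste the printed expression into their own session and work with exactly the same object.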

Best,
Ethan

On Tue, Jun 7, 2011 at 9:42 AM, Cox, Samantha Lucy
s.cox...@aberdeen.ac.uk wrote:
 I am a new user, and I am trying to sort out a data frame.



 I have, for example, bins of data.  Within each bin I have multiple counts of 
 animals and the depths at which these counts were taken.  How would I 
 summarise this to being only the maximum count per bin alongside the 
 corresponding height (but not the maximum depth - I want the depth at which 
 the maximum number of animals occurs)?



 Thank you



Re: [R] Question with RExcel

2011-06-06 Thread Ethan Brown
It's hard to see where the problem is from this information.

I would suggest subscribing to and asking this question of the RExcel
mailing list (accessible from http://rcom.univie.ac.at/) and providing
more detail of what you're trying to do, what is going wrong, error
messages (is it R or Excel giving the error?) and so on. For all I
know, it may well not be an R issue at all but a problem somewhere in
your Excel or VBA setup.

Best,
Ethan

On Mon, Jun 6, 2011 at 12:39 PM, Maria Helena Mourino Silva Nunes
mhnu...@fc.ul.pt wrote:
 Dear all,
 I’m doing some simulation studies in order to compare the estimates (and 
 estimated standard deviations) from the ARMA(2,1) Model with an estimator 
 that I’ve constructed. For carrying out the simulations I created a VBA 
 project within Excel.
 Now, I’m using the RExcel tool for running the R commands in the VBA project. 
 I ran 2500 simulations using the “arima” function from R and it worked! 
 Nevertheless, the constant was badly estimated. So, I decided to use the 
 “arma” function from R, and the parameters are now well estimated. However, I 
 cannot run the 2500 simulations. It can only do 46 simulations! I’ve already 
 tried to run the program on another computer, but I’ve got the same problem.

 Do you have any suggestions?
 Thanks for your attention.
 Helena Mouriño.



Re: [R] Merge two columns of a data frame

2011-06-06 Thread Ethan Brown
Another possibility:

dfs <- list(df1, df2, df3)
df.1.2.3 <- as.data.frame(unlist(sapply(dfs, function(x) do.call(paste, x))))
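
Here is the same idea as a self-contained sketch that rebuilds the three data frames from the original post first. I use lapply() rather than sapply() and name the result column Var1; both are my choices for the illustration rather than anything required.

```r
prefix <- c("cheap", "budget")
roots  <- c("car insurance", "auto insurance")
suffix <- c("quote", "quotes")

df1 <- expand.grid(prefix, roots, suffix)
df2 <- expand.grid(prefix, roots)
df3 <- expand.grid(roots, suffix)

dfs <- list(df1, df2, df3)

# For each data frame, paste its columns together row by row,
# then flatten the three character vectors into one
phrases <- unlist(lapply(dfs, function(d) do.call(paste, d)))

df.1.2.3 <- data.frame(Var1 = phrases, stringsAsFactors = FALSE)
df.1.2.3
```

Because do.call(paste, d) passes however many columns d has, the same line works for the three-column and the two-column data frames without any special cases.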

On Mon, Jun 6, 2011 at 2:37 PM, Ista Zahn iz...@psych.rochester.edu wrote:
 Hi Abraham,
 Just take it step by step. Paste the values together, combine them,
 and assign them to a data.frame column. Like this perhaps:

 df.1.2.3 <- data.frame(Var1 =
        c(with(df1, paste(Var1, Var2, Var3)),
          with(df2, paste(Var1, Var2)),
          with(df3, paste(Var1, Var2))))

 Best,
 Ista

 On Mon, Jun 6, 2011 at 12:22 PM, Abraham Mathew abra...@thisorthat.com 
 wrote:
 I have the following data:

 prefix <- c("cheap", "budget")
 roots <- c("car insurance", "auto insurance")
 suffix <- c("quote", "quotes")

 prefix2 <- c("cheap", "budget")
 roots2 <- c("car insurance", "auto insurance")

 roots3 <- c("car insurance", "auto insurance")
 suffix3 <- c("quote", "quotes")

 df1 <- expand.grid(prefix, roots, suffix)
 df2 <- expand.grid(prefix2, roots2)
 df3 <- expand.grid(roots3, suffix3)
 df1; df2; df3

 df1, df2, and df3 are separate data structures with separate columns for
 root, prefix, and suffix.

  Var1           Var2   Var3
 1  cheap  car insurance  quote
 2 budget  car insurance  quote
 3  cheap auto insurance  quote
 4 budget auto insurance  quote
 5  cheap  car insurance quotes
 6 budget  car insurance quotes
 7  cheap auto insurance quotes
 8 budget auto insurance quotes
    Var1           Var2
 1  cheap  car insurance
 2 budget  car insurance
 3  cheap auto insurance
 4 budget auto insurance
            Var1   Var2
 1  car insurance  quote
 2 auto insurance  quote
 3  car insurance quotes
 4 auto insurance quotes


 I want to merge df1, df2, and df3, into one data frame column which looks
 like.

                    Var1
  'cheap  car insurance  quote'
  'budget  car insurance  quote'
  'cheap auto insurance  quote'
  'budget auto insurance  quote'
  'cheap  car insurance quotes'
  'budget  car insurance quotes'
  'cheap auto insurance quotes'
  'budget auto insurance quotes'
         'cheap  car insurance'
         'budget  car insurance'
        'cheap auto insurance'
        'budget auto insurance'
        'car insurance  quote'
         'auto insurance  quote'
        'car insurance quotes'
        'auto insurance quotes'


 Help!




 --
 Ista Zahn
 Graduate student
 University of Rochester
 Department of Clinical and Social Psychology
 http://yourpsyche.org



Re: [R] Question about example function

2011-06-05 Thread Ethan Brown
Hi Abhilash,

From ?example, under arguments:

local: logical: if ‘TRUE’ evaluate locally, if ‘FALSE’ evaluate in
the workspace.

So all you need to do is:

> x <- 0
> example(mean, local=TRUE)

mean> x <- c(0:10, 50)

mean> xm <- mean(x)

mean> c(xm, mean(x, trim = 0.10))
[1] 8.75 5.50

mean> mean(USArrests, trim = 0.2)
  Murder  Assault UrbanPop     Rape
    7.42   167.60    66.20    20.16
> x
[1] 0

and nothing in your workspace is changed.
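
If you would rather check this programmatically than read the transcript, something along these lines should do. This is only a sketch: echo = FALSE is there purely to suppress the printed transcript, and the exists() check assumes a fresh session in which you have not created an xm of your own.

```r
x <- 0

# Run the examples for mean() locally and without echoing them
example(mean, local = TRUE, echo = FALSE)

x              # still the value assigned above
exists("xm")   # FALSE in a fresh session: the example's xm was not left behind
```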

Best,
Ethan

On Sun, Jun 5, 2011 at 11:13 AM, Abhilash Balakrishnan
balaab...@gmail.com wrote:
 Dear Sirs,

 I am exploring the R package and its documentation.  I find there is the
 function example which runs examples from documentation pages.  What
 confuses me is that running example interferes with the variables I have in
 my workspace.

 > x <- 0
 > example(mean)
 > x
 Now x is a vector of some values coming from the example.

 Am I using example in the wrong way?  In a situation like the one above,
 running example apparently corrupts existing data, pollutes the workspace
 with variables I didn't create myself, and also leaves allocated data that
 consumes memory.  Is there a way to run example that avoids this?  I tried
 the following:

 > x <- 0
 > local(example(mean))
 > x
 Still x is corrupted with example data.

 Thank you for support.
 Abhilash B.



Re: [R] VLOOKUP in R - tried everything.

2011-06-03 Thread Ethan Brown
Even after I discovered match(), it took me a little while to figure
out how to use it for this task. So, to add on to Peter's comment: to
add a column holding the total for each value of coll.minus.release,
try the following:

data$ParasitoidMatch <-
  data$ParasitoidTotal[match(data$coll.minus.release,
                             data$release.days)]

Note also that match() only returns the position of the first match it
finds, and by default returns NA when there is no match.
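
Here is a small self-contained sketch of that lookup, using the first twelve rows of the table from the original post. The data frame name dat is hypothetical (chosen so as not to mask utils::data()). The last part illustrates why the unlist(sapply(...)) attempt comes back shorter than the input.

```r
# Toy data copied from the first twelve rows shown in the question
dat <- data.frame(
  coll.minus.release = c(-12, 8, 8, 28, 41, 77, 105, 105, 125, 125, 125, 125),
  release.days       = c(-266, -259, -225, -216, -28, -12, 0, 8, 28, 41, 77, 97),
  ParasitoidTotal    = c(1700, 1000, 1000, 1000, 1148, 1144, 1160, 972,
                         1146, 1004, 1003, 1010)
)

# For each coll.minus.release, match() gives the row of the first equal
# release.days (or NA), so the lookup always has one value per input row
idx <- match(dat$coll.minus.release, dat$release.days)
dat$ParasitoidMatch <- dat$ParasitoidTotal[idx]
dat[, c("coll.minus.release", "ParasitoidMatch")]

# The sapply() version drops values with no match: each miss contributes
# a zero-length vector, which unlist() silently discards
via_sapply <- unlist(sapply(dat$coll.minus.release,
                            function(x) dat$ParasitoidTotal[dat$release.days == x]))
length(via_sapply)   # fewer than nrow(dat)
```

In this toy data, 105 and 125 never appear in release.days, so they get NA from match() but simply vanish from the sapply() result.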





On Fri, Jun 3, 2011 at 10:43 AM, peter dalgaard pda...@gmail.com wrote:

 On Jun 3, 2011, at 16:59 , bjmjarrett wrote:

 I am attempting to emulate the VLOOKUP function from Excel in R.

 I want to compare one column (coll.minus.release) with another
 (release.days) to get the number of parasitoid released at that time
 (TotalParasitoids).

 for example:

 coll.minus.release    release.days    ParasitoidTotal
 -12                                -266                    1700
 8                                  -259                    1000
 8                                  -225                    1000
 28                                 -216                    1000
 41                                 -28                     1148
 77                                  -12                    1144
 105                                  0                     1160
 105                                  8                      972
 125                                 28                     1146
 125                                 41                     1004
 125                                 77                     1003
 125                                 97                     1010
 
 2772                                NA                       NA
 2801                                NA                       NA
 2834                                NA                       NA


 # column 6, as I have three other columns that are not of interest
 vlookup <- function(x) data[data$release.days==x,6]

 vlookup(-12) = 1144, and so on, which is great.

 However, when I try:

 unlist(sapply(coll.minus.release,vlookup)) to apply it to the whole
 coll.minus.release

 it works up to a point, as it doesn't give me 132 values for the 132 values
 of coll.minus.release. Is this because the table of release.days and
 TotalParasitoid has fewer values than coll.minus.release (108 compared to
 132)? To fill the gap I put in 0, and as none of the coll.minus.release
 values = 0 I think it wouldn't affect it.


 I wager that a look at setdiff(coll.minus.release,release.days) and vice 
 versa would be illuminating. Notice that with your definition, 
 vlookup(31415926) or any other number absent from release.days gives a 
 zero-length vector.

 Presumably, you are looking for match().



 Other things I have tried include findInterval and match.

 data[findInterval(x=data$coll.minus.release,vec=data$release.days,ParasitoidTotal)]

 didn't work as it said vec must be sorted non-decreasingly and didn't work
 when I randomised the release.days and ParasitoidTotal columns as it doesn't
 matter which order they are in.

 Thanks for reading all the way through - I wanted all the information I felt
 you might need to help me in it.

 Any help will be greatly appreciated.

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/VLOOKUP-in-R-tried-everything-tp3571107p3571107.html
 Sent from the R help mailing list archive at Nabble.com.


 --
 Peter Dalgaard
 Center for Statistics, Copenhagen Business School
 Solbjerg Plads 3, 2000 Frederiksberg, Denmark
 Phone: (+45)38153501
 Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] Writing R code for moments

2010-10-08 Thread Ethan Brown
Hi July,

You might want to check out the moments package. It has a couple of
functions, all.moments() and moment(), that will compute these for
you.
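
If you'd rather not depend on a package, the definitions are short enough to write directly in base R. A minimal sketch, assuming the usual 1/n sample moments (which, as far as I know, is also what the moments package uses); the vector x and the names raw_moments and central_moments are made up for illustration:

```r
x <- c(2, 4, 4, 5, 10)   # example data, not from the original post
orders <- 1:4

# (1) Moments around zero: the mean of x^k
raw_moments <- sapply(orders, function(k) mean(x^k))

# (2) Moments around the mean: the mean of (x - mean(x))^k
central_moments <- sapply(orders, function(k) mean((x - mean(x))^k))

raw_moments
central_moments
```

The first central moment is always zero, and the second is the variance with divisor n rather than the n - 1 that var() uses.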

By the way, RSeek.org is a great place to find packages like this; I
searched for "sample moments" and clicked on the "functions" tab to
find this.

HTH,
Ethan

On Fri, Oct 8, 2010 at 1:19 AM, July iown...@gmail.com wrote:

 Dear Experts,

 If I have a vector of numbers x.

 (1) How can I write R code to compute the moments around zero of order one
 to four?

 (2) How can I write R code to compute the moments around the mean of order
 one to four?

 Thank you very much!
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Writing-R-code-for-moments-tp2967946p2967946.html
 Sent from the R help mailing list archive at Nabble.com.



Re: [R] using a package function inside another function

2010-10-07 Thread Ethan Brown
Hi Alison,

By default, a function in R creates a copy of the variable that you
pass into it. insertRow() looks to be unusual in that it actually
changes the variable you pass into the function.

So if you run your insert_row_test(x), the function will create a copy
of x, insert a row into it using insertRow(), and then return that new
matrix.  So all you need to do is assign the output of your function
to the original object, like so:

> x <- matrix(1:9, 3, 3)

> x
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

> x <- insert_row_test(x)

> x
     [,1] [,2] [,3]
        1    4    7
        2    5    8
        3    6    9
test    0    0    0


See 
http://cran.r-project.org/doc/manuals/R-intro.html#Writing-your-own-functions
for more on this topic.
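
The copy-then-assign pattern can be sketched without micEcon at all. The helper add_row() below is hypothetical and uses base rbind() as a stand-in for insertRow(); it is only meant to show that the caller's matrix stays as it was until the result is assigned back.

```r
# Hypothetical stand-in for insert_row_test(), using base rbind()
add_row <- function(m) {
  rbind(m, test = 0)   # builds and returns a new matrix
}

x <- matrix(1:9, 3, 3)

add_row(x)                 # prints a 4-row matrix...
rows_before <- nrow(x)     # ...but x itself still has 3 rows

x <- add_row(x)            # assign the result to keep the change
rows_after <- nrow(x)

c(rows_before, rows_after)
```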

HTH,
Ethan Brown


On Thu, Oct 7, 2010 at 9:47 AM, Alison Callahan
alison.calla...@gmail.com wrote:
 Hello all,

 I am trying to use the micEcon 'insertRow' function inside a function
 I have written. For example:

 insert_row_test <- function(m){

  insertRow(m,nrow(m)+1,v=0,rName="test")

 }

 However, when I try to call the 'insert_row_test' function (after
 loading the micEcon package), it does not insert a row into the matrix
 I pass in. When I call the insertRow function exactly as above in the
 R console, it works with no problem.

 Can anyone tell me why this is, and how to fix this problem?

 Thank you!

 Alison Callahan
 PhD candidate
 Department of Biology
 Carleton University



Re: [R] Converting scraped data

2010-10-06 Thread Ethan Brown
Hi Simon,

You'll notice the test data.frame has a whole mix of characters in
the columns you're interested in, including a "-" for missing values,
and that those columns are in fact factors.

as.numeric() on a factor returns the integer codes of the levels, not
the values of the levels (see ?levels and ?factor); that's why it's
giving you those irrelevant integers. I always end up using something
like this handy code snippet to deal with the situation:

unfactor <- function(factors)
# From http://psychlab2.ucr.edu/rwiki/index.php/R_Code_Snippets#unfactor
# Transform a factor back into its factor names
{
   return(levels(factors)[factors])
}
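
To see the difference on a toy factor (the values here are made up for illustration, including a "-" standing in for a missing value):

```r
f <- factor(c("10", "25", "-", "3"))

# Level codes: positions in levels(f), not the numbers you see printed
codes <- as.numeric(f)

# Going through the level labels recovers the numbers; "-" becomes NA
# (suppressWarnings() hides the expected coercion warning)
vals <- suppressWarnings(as.numeric(levels(f)[f]))

codes
vals
```

as.numeric(as.character(f)) gives the same result as the levels(f)[f] form when f is a single factor.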

Then, to get your data to where you want it, I'd do this:

require(XML)
theurl <- "http://www.queensu.ca/cora/_trends/mip_2006.htm"
tables <- readHTMLTable(theurl)
n.rows <- unlist(lapply(tables, function(t) dim(t)[1]))
class(tables)
test <- data.frame(tables, stringsAsFactors=FALSE)


result <- test[11:42, 1:5] # Extract the actual data we want
names(result) <- c("Response", "Q1", "Q2", "Q3", "Q4")
for(i in 2:5) {
# Convert the factor columns to numeric via their level labels
  result[,i] <- as.numeric(unfactor(result[,i]))
}
result

From here you should be able to plot or do whatever else you want.

Hope this helps,
Ethan Brown


On Wed, Oct 6, 2010 at 9:52 AM, Simon Kiss sjk...@gmail.com wrote:
 Dear Colleagues,
 I used this code to scrape data from the URL contained within.  This code
 should be reproducible.

 require(XML)
 library(XML)
 theurl <- "http://www.queensu.ca/cora/_trends/mip_2006.htm"
 tables <- readHTMLTable(theurl)
 n.rows <- unlist(lapply(tables, function(t) dim(t)[1]))
 class(tables)
 test <- data.frame(tables, stringsAsFactors=FALSE)
 test[16,c(2:5)]
 as.numeric(test[16,c(2:5)])
 quartz()
 plot(c(1:4), test[15, c(2:5)])

 Calling the values from the row of interest using test[16, c(2:5)] brings
 them up as represented on the screen, but plotting them or coercing them to
 numeric changes the values in a way that doesn't make sense to me. My
 intuition is that there is something going on with the way the characters
 are coded or classed when they're scraped into R.  I've looked around the
 help files for converting from character to numeric but can't find a
 solution.

 I also tried this:

 as.numeric(as.character(test[16,c(2:5)])) and that also changed the values
 from what they originally were.

 I'm grateful for any suggestions.
 Yours, Simon Kiss



 *
 Simon J. Kiss, PhD
 Assistant Professor, Wilfrid Laurier University
 73 George Street
 Brantford, Ontario, Canada
 N3T 2C9
 Cell: +1 519 761 7606
