date:20100207

Re: [R] problem with Tinn-R

2010-02-07 Thread Dieter Menne

Roslina Zakaria wrote:
> 
> 
> I installed Tinn-R 2.3.4.4, and when I try to execute a calculation, it
> gives me this error:
>  
> The preferred Rterm not defined.
>  
> 

Set the path to Rterm in

Options/Application/R/Path

Dieter
-- 
View this message in context: 
http://n4.nabble.com/problem-with-Tinn-R-tp1472562p1472633.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ggplot2 stacked line plot

2010-02-07 Thread Jim Lemon


On 02/08/2010 03:40 PM, Liam Blanckenberg wrote:

Hi all,

I have been hunting around for hours trying to figure out how to
generate a stacked line chart using ggplot2. This type of chart can be
generated in excel 2007 by selecting: Chart type>  Line>  Stacked
line. I can generate a stacked area chart using the following code:

p<- ggplot2(~, aes(x = ~, y = ~, colour = Type)) +
geom_area(aes(position = 'stack', fill = Type))

However, when I try and replicate this using the following code for geom_line:

p<- ggplot(~, aes(x = ~, y = ~, colour = Type)) +
geom_line(aes(position = 'stack'))

the resulting plot is not stacked - i.e. each 'Type' is plotted at its
actual value rather than cumulatively to form a stacked chart... I
have pored through Hadley's ggplot2 book (ggplot2: elegant graphics
for data analysis), the R help list and also done general google
searching but cannot find a way to generate this type of plot.

R version: 2.9.2
ggplot2 version: 0.8.5
OS: windows 7 (64-bit).


Hi Liam,
Are you looking for something like stackpoly (plotrix package)? It's not 
ggplot, but it might do what you want.


Jim


Re: [R] Help with apply()

2010-02-07 Thread Jim Lemon

On 02/08/2010 12:26 PM, Nathan S. Watson-Haigh wrote:

I have a 2 column data.frame:

 > d[1:5,]
a b
1 80015 C
2 80016 B
3 80023 C
4 80062 B
5 80069 B

I want to apply a function across each row:

 > for(i in 1:nrow(d)) {
+ myFun(con, d[i,]$a, d[i,]$b)
+ }

How do I do this using apply()? I'm unsure how to tell apply() to pass
data from columns a and b for a given row as arguments to the function
myFun().

Hi Nathan,
apply doesn't work with data frames unless they can be coerced to 
matrices or arrays (and sometimes not even then). What's wrong with 
using the code you have above?

Jim


Re: [R] specifying colors in a heatmap/image -like plot

2010-02-07 Thread Jim Lemon


On 02/08/2010 08:57 AM, kerimcan wrote:


Hi,

I have searched for a solution but I failed to find an answer. I am hoping
you may be able to help me.

I have a data set where I have observations for a number of units (n =~40)
over a period of time (t =~100) and I have a variable (Z) that codes a
categorical variable for each observation. I want to produce a 2D plot where
time is on the x-axis and units are on the y-axis. Then each block on the
2-d plot should take a color depending on variable Z. Z is not ordered so
using a scale (like in heatmaps) does not make sense. In fact the values of
Z have meanings that are intuitively related to colors (e.g. Z=3 means
involvement by the "United Nations" so I want its color to be "blue"). Below
is some code that gives an example of what I am aiming to do and why
"heatmap" and "image" functions don't work for me. Thanks in advance for
your help.


# Example: Suppose Z had 3 values (0,1,2) and I had 8 observations.

hitmep <- matrix(c(0,2,1,0,2,1,1,0), 2, 4)

# Graph 1:
heatmap(hitmep, Rowv = NA, Colv = NA, labRow = NULL, scale = "none")
# Graph 2:
image(t(hitmep), axes = FALSE)

# I like the layout of the plots. My problem with these is that I don't want
Z's values (0,1,2) to have colors on a scale. I want to specify, for
example, 1="blue", 2="yellow" and 3="green". Do you know how to do this?



Hi Kerim,
You can do this with color2D.matplot (plotrix) as well as with image or 
heatmap. Just pass the desired color vector as the "cellcolors" argument.
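A minimal sketch of the idea, using only base R's image(): map each value of Z to a fixed colour instead of a colour scale. The value-to-colour lookup below (0 = blue, 1 = yellow, 2 = green) and the axis labels are assumptions for illustration. The same character matrix of colours could presumably be handed to color2D.matplot as "cellcolors", as suggested above.

```r
# Example data from the question: Z takes the values 0, 1, 2
hitmep <- matrix(c(0, 2, 1, 0, 2, 1, 1, 0), 2, 4)

# Assumed lookup from category value to colour (not from the original post)
z_cols <- c("0" = "blue", "1" = "yellow", "2" = "green")

# One colour name per cell, same shape as the data
cellcolors <- matrix(z_cols[as.character(hitmep)], nrow = nrow(hitmep))

# With image(), put one break between each pair of category values so that
# every value falls into its own colour bin
image(x = seq_len(ncol(hitmep)), y = seq_len(nrow(hitmep)), z = t(hitmep),
      col = unname(z_cols), breaks = c(-0.5, 0.5, 1.5, 2.5),
      axes = FALSE, xlab = "time", ylab = "unit")
```

The breaks vector needs one more element than the colour vector, so adding a category means adding both a colour and a break.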


Jim


Re: [R] ggplot2 stacked line plot

2010-02-07 Thread Steve Lianoglou

Hi,

On Sun, Feb 7, 2010 at 11:40 PM, Liam Blanckenberg wrote:
> Hi all,
>
> I have been hunting around for hours trying to figure out how to
> generate a stacked line chart using ggplot2. This type of chart can be
> generated in excel 2007 by selecting: Chart type > Line > Stacked
> line. I can generate a stacked area chart using the following code:
>
>   p <- ggplot2(~, aes(x = ~, y = ~, colour = Type)) +
> geom_area(aes(position = 'stack', fill = Type))
>
> However, when I try and replicate this using the following code for geom_line:
>
>   p <- ggplot(~, aes(x = ~, y = ~, colour = Type)) +
> geom_line(aes(position = 'stack'))
>
> the resulting plot is not stacked - i.e. each 'Type' is plotted at its
> actual value rather than cumulatively to form a stacked chart... I
> have pored through Hadley's ggplot2 book (ggplot2: elegant graphics
> for data analysis), the R help list and also done general google
> searching but cannot find a way to generate this type of plot.
>
> R version: 2.9.2
> ggplot2 version: 0.8.5
> OS: windows 7 (64-bit).
>
> Any suggestions or assistance would be greatly appreciated.

Are you trying to show a graph that looks like Figure 4.5 from this page?

http://learnr.wordpress.com/2009/07/02/ggplot2-version-of-figures-in-lattice-multivariate-data-visualization-with-r-part-4/

sans the coord_flip(), perhaps?

That website is a good resource for ggplot graphics. He ran a whole
series recreating the graphs in the lattice graphics book with
ggplot2. His final post on that subject included a link to a pdf with
the code and graphics for all the posts in that series for easy
scanning, too. If this isn't the graph you wanted, perhaps you can
skim that document to see if there's a graphic that resembles what
you're after.
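For reference, a hedged sketch of one way to get stacked lines. In ggplot2, position is a layer argument rather than an aesthetic, so it goes outside aes(). The toy data frame and its column names are made up for illustration, and whether ggplot2 0.8.5 accepts this call exactly as written is not confirmed here. The base R part computes the cumulative values by hand, which works regardless of package version.

```r
# Toy data (hypothetical): two series observed at the same x values
df <- data.frame(x    = rep(1:4, times = 2),
                 Type = rep(c("A", "B"), each = 4),
                 y    = c(1, 2, 3, 4,  2, 2, 1, 3))

# Manual stacking: running total of y within each x, in row order
df$y_stacked <- ave(df$y, df$x, FUN = cumsum)

# ggplot2 version, only attempted if the package is installed.
# Note that position = "stack" is passed to geom_line(), not to aes().
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(df, ggplot2::aes(x = x, y = y, colour = Type)) +
    ggplot2::geom_line(position = "stack")
}
```

The order in which ggplot2 stacks the groups may differ from the row order used in the manual version, so the series on the bottom need not be the same in both.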

-steve
-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


[R] problem with Tinn-R

2010-02-07 Thread Roslina Zakaria

Hi,

I installed Tinn-R 2.3.4.4, and when I try to execute a calculation, it gives 
me this error:
 
The preferred Rterm not defined.
 
Thank you so much for any help given.


  

[R] ggplot2 stacked line plot

2010-02-07 Thread Liam Blanckenberg

Hi all,

I have been hunting around for hours trying to figure out how to
generate a stacked line chart using ggplot2. This type of chart can be
generated in excel 2007 by selecting: Chart type > Line > Stacked
line. I can generate a stacked area chart using the following code:

   p <- ggplot2(~, aes(x = ~, y = ~, colour = Type)) +
geom_area(aes(position = 'stack', fill = Type))

However, when I try and replicate this using the following code for geom_line:

   p <- ggplot(~, aes(x = ~, y = ~, colour = Type)) +
geom_line(aes(position = 'stack'))

the resulting plot is not stacked - i.e. each 'Type' is plotted at its
actual value rather than cumulatively to form a stacked chart... I
have pored through Hadley's ggplot2 book (ggplot2: elegant graphics
for data analysis), the R help list and also done general google
searching but cannot find a way to generate this type of plot.

R version: 2.9.2
ggplot2 version: 0.8.5
OS: windows 7 (64-bit).

Any suggestions or assistance would be greatly appreciated.

Regards,

Liam


Re: [R] Contributed packages

2010-02-07 Thread David Winsemius



On Feb 7, 2010, at 11:27 PM, Vadlamani, Satish {FLNA} wrote:


Folks:
If you wanted to find out about what are the contributed packages  
and classify them, how would you go about it? For someone new like  
me, I would like to know what the possibilities are. When I click on  
"install packages" on my Windows version of R, it gives me a list  
but it is hard to figure out from that list what is the purpose of  
each package and to what class it belongs (for example, class of  
regular expressions).


What is the equivalent of CPAN.org for Perl in R where you can  
browse Perl modules by category? Thanks.


The CRAN Task Views.


Satish


Re: [R] dataframe question

2010-02-07 Thread David Winsemius



On Feb 7, 2010, at 11:15 PM, Vadlamani, Satish {FLNA} wrote:


David:
Thanks for the idea. Both the one that you suggested and the one  
that Bill Venables suggested are very good. Unfortunately, this  
statement is creating out of memory issues like below (system  
limitations).


When I had padded white space before the number, read.csv.sql is  
correctly treating it as a factor. I am going to take out the  
padding so that it treats it as numeric and then I can proceed with  
further steps.


Idea: Write the dataframe and all other useful data to a csv or tab  
delimited file. Save all other useful data as well. Exit without  
saving the workspace. Restart and read data in with correct format  
using colClasses argument.


three_wk_out <- read.csv(file= "somename.csv", colClasses =  
rep("numeric", 209) )
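A small self-contained sketch of the colClasses idea. The temporary file, its three columns, and the padded values are invented for illustration; the real file would have 209 columns.

```r
# Write a tiny CSV with space-padded numbers to a temporary file
csv_path <- tempfile(fileext = ".csv")
writeLines(c("wk1,wk2,wk3",
             "  10,  20,  30",
             "   1,   2,   3"), csv_path)

# Declare every column numeric up front so nothing is read in as a factor
three_wk_out <- read.csv(file = csv_path, colClasses = rep("numeric", 3))
str(three_wk_out)

unlink(csv_path)  # clean up the temporary file
```

Declaring the classes at read time also avoids a second, converted copy of the data in memory.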


Of course if it's that big, you may have problems doing anything
useful with it in the space you have available. Details of your
machine would be helpful, especially if you are using one of the
Windows variants and have 4 GB of physical memory. There is information
about this condition in the R-Win FAQ.


--
David.




Satish

Out of memory warning
Reached total allocation of 1535Mb: see help(memory.size)
34: In ans[[i]] <- tmp :
 Reached total allocation of 1535Mb: see help(memory.size)


Bill Venables's suggestion below


week_list <- paste("wk", 1:209, sep="")
### no need for c(...)

for(week in week_list)
three_wk_out[[week]] <- as.numeric(three_wk_out[[week]])

### no need for '{...}'

Bill Venables
CSIRO/CMIS Cleveland Laboratories


-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Sunday, February 07, 2010 8:51 PM
To: Vadlamani, Satish {FLNA}
Cc: r-help@r-project.org help
Subject: Re: [R] dataframe question


On Feb 7, 2010, at 8:14 PM, David Winsemius wrote:



On Feb 7, 2010, at 7:51 PM, Vadlamani, Satish {FLNA} wrote:


Folks:
Good day. Please see the code below. three_wk_out is a dataframe
with columns wk1 through wk209. I want to change the format of the
columns. I am trying the code below but it does not work.  I need
$week in the for loop interpreted as wk1, wk2, etc. Could you
please help? Thanks.
Satish

R code below
week_list <- paste("wk",c(1:209),sep="")



Or more "functionally":

three_wk_out <- as.data.frame( lapply(three_wk_out, some_function) )


Or if you wanted to just change the particular columns that matched
the "wk" pattern:

idx <- grep("wk", names(three_wk_out))
three_wk_out[, idx ] <- apply( three_wk_out[, idx ], 2, as.numeric)


(I probably should have used apply( ___ , 2,  fn) in the prior effort
rather than coercing a list back to a dataframe.)




E.g.:





a b c x
1 1 0 0 1
2 2 3 2 4
3 1 2 1 5
4 2 0 3 2


df <- as.data.frame(lapply(df, "^", 2))
df

 a  b  c   x
1  1  0  0   1
2 16 81 16 256
3  1 16  1 625
4 16  0 81  16



for (week in week_list)
{
 three_wk_out$week <- as.numeric(three_wk_out$week)
}


[R] Contributed packages

2010-02-07 Thread Vadlamani, Satish {FLNA}

Folks:
If you wanted to find out about what are the contributed packages and classify 
them, how would you go about it? For someone new like me, I would like to know 
what the possibilities are. When I click on "install packages" on my Windows 
version of R, it gives me a list but it is hard to figure out from that list 
what is the purpose of each package and to what class it belongs (for example, 
class of regular expressions).

What is the equivalent of CPAN.org for Perl in R where you can browse Perl 
modules by category? Thanks.
Satish


Re: [R] dataframe question

2010-02-07 Thread Vadlamani, Satish {FLNA}

David:
Thanks for the idea. Both the one that you suggested and the one that Bill 
Venables suggested are very good. Unfortunately, this statement is creating out 
of memory issues like below (system limitations).

When I had padded white space before the number, read.csv.sql is correctly 
treating it as a factor. I am going to take out the padding so that it treats 
it as numeric and then I can proceed with further steps.

Satish

Out of memory warning
Reached total allocation of 1535Mb: see help(memory.size)
34: In ans[[i]] <- tmp :
  Reached total allocation of 1535Mb: see help(memory.size)

>> Bill Venables's suggestion below

week_list <- paste("wk", 1:209, sep="")  
### no need for c(...)

for(week in week_list) 
three_wk_out[[week]] <- as.numeric(three_wk_out[[week]]) 

### no need for '{...}'
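To illustrate why the `[[week]]` form works where `$week` does not: `$` takes the name literally, so it looks for a column called "week", while `[[` evaluates the variable and uses the string it holds. A minimal sketch with a made-up two-column data frame:

```r
# Hypothetical stand-in for three_wk_out, with numbers stored as text
df <- data.frame(wk1 = c("1", "2"), wk2 = c("3", "4"),
                 stringsAsFactors = FALSE)

week <- "wk1"
df$week       # NULL: there is no column literally named "week"
df[[week]]    # the wk1 column

# The original loop body therefore fails; capture the error rather than stop
dollar_result <- tryCatch({
  df$week <- as.numeric(df$week)
  "ok"
}, error = function(e) "error")

# The [[ ]] version converts each named column in place
for (week in c("wk1", "wk2"))
  df[[week]] <- as.numeric(df[[week]])
```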

Bill Venables
CSIRO/CMIS Cleveland Laboratories


-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Sunday, February 07, 2010 8:51 PM
To: Vadlamani, Satish {FLNA}
Cc: r-help@r-project.org help
Subject: Re: [R] dataframe question


On Feb 7, 2010, at 8:14 PM, David Winsemius wrote:

>
> On Feb 7, 2010, at 7:51 PM, Vadlamani, Satish {FLNA} wrote:
>
>> Folks:
>> Good day. Please see the code below. three_wk_out is a dataframe  
>> with columns wk1 through wk209. I want to change the format of the  
>> columns. I am trying the code below but it does not work.  I need  
>> $week in the for loop interpreted as wk1, wk2, etc. Could you  
>> please help? Thanks.
>> Satish
>>
>> R code below
>> week_list <- paste("wk",c(1:209),sep="")
>
>
> Or more "functionally":
>
> three_wk_out <- as.data.frame( lapply(three_wk_out, some_function) )

Or if you wanted to just change the particular columns that matched  
the "wk" pattern:

idx <- grep("wk", names(three_wk_out))
three_wk_out[, idx ] <- apply( three_wk_out[, idx ], 2, as.numeric)


(I probably should have used apply( ___ , 2,  fn) in the prior effort  
rather than coercing a list back to a dataframe.)


>
> E.g.:
> >

>  a b c x
> 1 1 0 0 1
> 2 2 3 2 4
> 3 1 2 1 5
> 4 2 0 3 2
>
> > df <- as.data.frame(lapply(df, "^", 2))
> > df
>   a  b  c   x
> 1  1  0  0   1
> 2 16 81 16 256
> 3  1 16  1 625
> 4 16  0 81  16
>
>
>> for (week in week_list)
>> {
>>   three_wk_out$week <- as.numeric(three_wk_out$week)
>> }
>>

Re: [R] Help with apply()

2010-02-07 Thread David Winsemius

On Feb 7, 2010, at 8:26 PM, Nathan S. Watson-Haigh wrote:

I have a 2 column data.frame:

> d[1:5,]
      a b
1 80015 C
2 80016 B
3 80023 C
4 80062 B
5 80069 B

I want to apply a function across each row:

> for(i in 1:nrow(d)) {
+   myFun(con, d[i,]$a, d[i,]$b)
+ }

How do I do this using apply()? I'm unsure how to tell apply() to  
pass data from columns a and b for a given row as arguments to the  
function myFun().

apply(d, 1, function(x) myFun(con, x[1], x[2]) )

The reason you cannot use the "$" operator is that the row is passed  
to the function as a vector, rather than as a list.
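One thing to watch with apply() here, sketched below with a stand-in myFun and con (both hypothetical): because column b is character, apply() first turns the data frame into a character matrix, so the values of a arrive as strings. mapply() walks the two columns in parallel and keeps their types.

```r
d <- data.frame(a = c(80015, 80016, 80023),
                b = c("C", "B", "C"),
                stringsAsFactors = FALSE)

# Hypothetical stand-ins for the poster's connection object and function
con   <- "conn"
myFun <- function(con, a, b) paste(con, a, b)

# apply(): each row is a character vector, so a is no longer numeric
row_types <- apply(d, 1, function(x) class(x[["a"]]))

# mapply(): a stays numeric and b stays character
col_types <- mapply(function(a, b) class(a), d$a, d$b)
res       <- mapply(function(a, b) myFun(con, a, b), d$a, d$b)
```

If myFun is called only for its side effects, the original for loop is just as reasonable.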

--
David

Thanks in advance for any pointers,
Nathan

--

Dr. Nathan S. Watson-Haigh
OCE Post Doctoral Fellow
CSIRO Livestock Industries
University Drive
Townsville, QLD 4810
Australia

Tel: +61 (0)7 4753 8548
Fax: +61 (0)7 4753 8600
Web: http://www.csiro.au/people/Nathan.Watson-Haigh.html


Re: [R] mboost: Interpreting coefficients from glmboost if center=TRUE

2010-02-07 Thread Kyle Werner

Thanks for your reply. In fact, I do use the predict method for model
assessment, and it shows that centering leads to a substantial
improvement using even the bluntest of assessments of 'goodness'
(i.e., binary categorization accuracy). So I agree that the package
authors must have internal tools to reverse the effects of centering
the variables, at least within the predict method. But it seems to me
that the coefficients that I get out should be related to the values
that I input, not to the centered values. In other words, centering
seems like it should be done "invisibly;" unless I center the
variables myself, I would expect the coefficients to be applicable to
the original data.

I extract the coefficients returned by the model and store them in a
database which is web accessible. I reconstruct models periodically,
and track various statistics associated with these models in the
database. This is why I highly value the fact that mboost has
glmboost, which can return linearly interpretable coefficients. It is
also why I do not directly call upon R every time I want to query a
model. (As an aside, if I were to use R directly, I might consider the
gamboost or blackboost methods, which do not return scalar
coefficients that are readily extractable.)
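For the record, the intercept arithmetic from my original post (quoted below) can be written out as follows. This is only a sketch of the back-transformation I do by hand, not a description of what mboost does internally. The factor of 2 is an assumption: it is what reproduces the -11.285 figure, and it is consistent with the Binomial family working on half the log-odds scale. The offset is left out of the sketch.

```r
# Values reported with center=TRUE
b0_centered <- -0.04543632   # (Intercept)
b1          <-  0.007553608  # painscore
x_mean      <-  741          # mean of painscore

# Centering replaces x by (x - x_mean), so on the original scale
#   b0_centered + b1 * (x - x_mean) = (b0_centered - b1 * x_mean) + b1 * x
b0_original <- b0_centered - b1 * x_mean

# Assumed conversion to the log-odds scale (see the note above)
b0_logodds <- 2 * b0_original

# Both forms give the same linear predictor for any painscore, e.g. 800
lp_centered <- b0_centered + b1 * (800 - x_mean)
lp_original <- b0_original + b1 * 800
```

So coefficients stored in a database would need either the adjusted intercept or the column means stored alongside them.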

On Sun, Feb 7, 2010 at 6:31 PM, David Winsemius  wrote:
>
> On Feb 7, 2010, at 5:03 PM, Kyle Werner wrote:
>
>> I'm running R 2.10.1 with mboost 2.0 in order to build predictive
>> models. I am performing prediction on a binomial outcome, using a
>> linear function (glmboost). However, I am running into some confusion
>> regarding centering. (I am not aware of an mboost-specific mailing
>> list, so if the main R list is not the right place for this topic,
>> please let me know.)
>>
>> The boost_control() function allows for the choice between center=TRUE
>> and center=FALSE. If I select center=FALSE, I am able to interpret the
>> coefficients just like those from standard logistic regression.
>> However, if I select center=TRUE, this is no longer the case. In
>> theory and in practice with my data, centering improves the
>> predictions made by the model, so this is an issue worth pursuing for
>> me.
>>
>> Below is output from running the exact same data in exactly the same
>> way, only differing by whether the "center" bit is flipped or not:
>>
>> Output with center=TRUE:
>> [(Intercept)] => -0.04543632
>> [painscore] => 0.007553608
>> [Offset] => -0.546520621809327
>>
>> Output with center=FALSE:
>> [(Intercept)] => -0.989742
>> [painscore] => 0.001342585
>> [Offset] => -0.546520621809327
>>
>> The mean of painscore is 741. It seems to me that for center=TRUE,
>> mboost should modify the intercept by subtracting 741*0.007553608 from
>> it (thus intercept should = -11.285). If I manually do this, the
>> output is credible, and in the ballpark of that given by other methods
>> (e.g., lrm or glm with a Binomial link function). If I don't do this,
>> then the inverse logistic interpretation of the output is off by
>> orders of magnitude.
>>
>> In the end, if I use "center=TRUE" and want to make a prediction based
>> on the coefficients returned by mboost, the results only make sense if
>> I manually rescale my independent variables prior to making a
>> prediction. Is this the desired behavior, or am I doing something
>> wrong?
>
> I don't know, but my question is ... why aren't you using the predict method
> for that sort of object? Presumably the authors of the package know how to
> recognize the differences in the objects. Testing confirms this to be the
> case with the first example in the glmboost help page.
>
>
>>
>> Many thanks.
>>
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
>


Re: [R] split plot with aov

2010-02-07 Thread RICHARD M. HEIBERGER

The dummy variables for the factors in balanced designs are orthogonal.
The treatment dummy variables are not orthogonal to the block dummy variables
for unbalanced designs.  That is essentially what the term "balanced" means.
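A small illustration of that statement, as a sketch only: the 3 x 2 layout, the factor names block and trt, and the use of sum-to-zero contrasts are assumptions chosen to make the orthogonality visible. The cross-products between the block columns and the treatment column are all zero in the balanced layout, and stop being zero once a single cell is dropped.

```r
# Assumed coding: sum-to-zero contrasts for both factors
sum_contrasts <- list(block = "contr.sum", trt = "contr.sum")

balanced   <- expand.grid(block = factor(1:3), trt = factor(1:2))
unbalanced <- balanced[-1, ]          # drop one cell (block 1, trt 1)

# Cross-products between the block dummy columns and the treatment dummy
cross_block_trt <- function(d) {
  X <- model.matrix(~ block + trt, data = d, contrasts.arg = sum_contrasts)
  crossprod(X[, c("block1", "block2")], X[, "trt1", drop = FALSE])
}

cp_balanced   <- cross_block_trt(balanced)     # all zeros
cp_unbalanced <- cross_block_trt(unbalanced)   # not all zeros
```

When the columns are not orthogonal, part of the treatment information is confounded with blocks, which is why a factor can show up in more than one stratum of the aov table.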


[R] Help with apply()

2010-02-07 Thread Nathan S. Watson-Haigh


I have a 2 column data.frame:

> d[1:5,]
      a b
1 80015 C
2 80016 B
3 80023 C
4 80062 B
5 80069 B

I want to apply a function across each row:

> for(i in 1:nrow(d)) {
+   myFun(con, d[i,]$a, d[i,]$b)
+ }

How do I do this using apply()? I'm unsure how to tell apply() to pass 
data from columns a and b for a given row as arguments to the function 
myFun().


Thanks in advance for any pointers,
Nathan

--

Dr. Nathan S. Watson-Haigh
OCE Post Doctoral Fellow
CSIRO Livestock Industries
University Drive
Townsville, QLD 4810
Australia

Tel: +61 (0)7 4753 8548
Fax: +61 (0)7 4753 8600
Web: http://www.csiro.au/people/Nathan.Watson-Haigh.html


[R] split plot with aov

2010-02-07 Thread Penny B


I have a factor SAMPLES which is at the lowest level (within) of a split plot
anova model  but this factor also appears in the ANOVA table at the block
level. This happens for unbalanced responses but not for balanced responses.
I would be grateful for an explanation of this.

The block error term is Day:Treatment:Temp and the within error term is
Day:Treatment:Temp:Samples.
Day is the main plot, Treatment the first split, Temp is within Treatment
and then Samples within Temp.

Thanks,
Penny B.




-- 
View this message in context: 
http://n4.nabble.com/split-plot-with-aov-tp1472521p1472521.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] dataframe question

2010-02-07 Thread David Winsemius

On Feb 7, 2010, at 8:14 PM, David Winsemius wrote:

On Feb 7, 2010, at 7:51 PM, Vadlamani, Satish {FLNA} wrote:

Folks:
Good day. Please see the code below. three_wk_out is a dataframe  
with columns wk1 through wk209. I want to change the format of the  
columns. I am trying the code below but it does not work.  I need  
$week in the for loop interpreted as wk1, wk2, etc. Could you  
please help? Thanks.

Satish

R code below
week_list <- paste("wk",c(1:209),sep="")

Or more "functionally":

three_wk_out <- as.data.frame( lapply(three_wk_out, some_function) )

Or if you wanted to just change the particular columns that matched  
the "wk" pattern:

idx <- grep("wk", names(three_wk_out))
three_wk_out[, idx ] <- apply( three_wk_out[, idx ], 2, as.numeric)

(I probably should have used apply( ___ , 2,  fn) in the prior effort  
rather than coercing a list back to a dataframe.)
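A variant that stays within the data frame, sketched on made-up data: assigning the result of lapply() back into the selected columns avoids the intermediate matrix that apply() builds. It also goes through as.character() first, on the assumption that some columns may have been read in as factors, in which case as.numeric() alone would return the internal level codes rather than the values.

```r
# Hypothetical miniature of three_wk_out: an id column plus factor "wk" columns
three_wk_out <- data.frame(id  = c("a", "b", "c"),
                           wk1 = factor(c(" 10", " 20", " 10")),
                           wk2 = factor(c("5", "15", "25")),
                           stringsAsFactors = FALSE)

# Pitfall: as.numeric() on a factor gives level codes, not the numbers
codes <- as.numeric(factor(c("10", "20", "10")))   # 1 2 1

# Convert only the columns whose names start with "wk"
idx <- grep("^wk", names(three_wk_out))
three_wk_out[idx] <- lapply(three_wk_out[idx],
                            function(col) as.numeric(as.character(col)))
```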

E.g.:
>

 a b c x
1 1 0 0 1
2 2 3 2 4
3 1 2 1 5
4 2 0 3 2

> df <- as.data.frame(lapply(df, "^", 2))
> df
  a  b  c   x
1  1  0  0   1
2 16 81 16 256
3  1 16  1 625
4 16  0 81  16

for (week in week_list)
{
  three_wk_out$week <- as.numeric(three_wk_out$week)
}


Re: [R] Noval numbers

2010-02-07 Thread RICHARD M. HEIBERGER

The attached file gives functions to go both directions.  I have used
it in class for many years.

This is very useful when studying machine representations of numbers,
for understanding mixed-radix number systems, for example time (days, hours,
minutes, seconds) or British money (pounds, shillings, pence), and for unique
indexing of cells in designed experiments.

Rich
## base
## Richard M. Heiberger

## See Section 12.1.4.2 of
## Richard M. Heiberger
## Computation for the Analysis of Designed Experiments
## Wiley, 1989



## defaults to 8 bit binary

base <- function(x, basis=c(2,2,2,2,2,2,2,2)) {
  cb <- rev(cumprod(c(1,basis)))
  xx <- x
  y <- rep(0, length(cb))
  for (i in 1:length(cb)) {
    yy <- xx %/% cb[i]
    if (yy > 0) {
      y[i] <- yy
      xx <- xx %% cb[i]
    }
  }
  names(y) <- cb
  y
}

baseinv <- function(y, basis=c(2,2,2,2,2,2,2,2)) {
  sum(y * rev(cumprod(c(1,basis))))
}


base(200)
baseinv(.Last.value)


## British money
basis <- c(12,20)  ## 12 pence per shilling, 20 shillings per pound sterling
base(498, basis)
baseinv(.Last.value, basis)

## American weight
base(100, 16)  ## 16 ounces per pound avoirdupois
baseinv(.Last.value, 16)

## time
basis <- c(60,60,24) ## 60 seconds per minute, 60 minutes per hour, 24 hours per day
x <- c(1, 2, 3, 40)
y <- baseinv(x, basis)
y
base(y, basis)


## binary arithmetic with 8 bits

basis <- c(2,2,2,2,2,2,2,2)
x <- 100

y <- base(x, basis)
y
baseinv(y, basis)

base(1)
baseinv(.Last.value)

base(200)
baseinv(.Last.value)

base(1000)
baseinv(.Last.value)




## IEEE with 53 base 2 digits
x <- c(  101,   102,   103,
        1001,  1002,  1003,
       10001, 10002, 10003   ## the last three values illustrate
       )                     ## the effects of .Machine$double.eps
x
sprintf("%17.0f", x)

y <- sapply(x, base, basis=rep(2,54))
y

print(digits=17,
  apply(y, 2, baseinv, basis=rep(2,54))
  )



## base 9
a <- base(132, c(9,9,9))
b <- base(125, c(9,9,9))
a
b
a+b
baseinv(a+b, c(9,9,9))
base(baseinv(a+b, c(9,9,9)), c(9,9,9))

Re: [R] metafor package: effect sizes are not fully independent

2010-02-07 Thread Mike Cheung

Dear Gang,

It seems that it is possible to use a univariate meta-analysis to
handle your multivariate effect sizes. If you want to calculate a
weighted average first, Hedges and Olkin (1985) have discussed this
approach.

Hedges, L. V., & Olkin, I. (1985). Statistical methods for
meta-analysis. Orlando, FL: Academic Press.
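As a rough sketch of what such a weighted average could look like for correlated estimates from one source (this is a generic generalized least squares average, not necessarily the exact procedure in the book): the function name pooled_effect, the example numbers, and the single common correlation r between samples are all assumptions for illustration.

```r
# Pool several correlated estimates y (standard errors se) from one source.
# r is an assumed common correlation between the sampling errors.
pooled_effect <- function(y, se, r) {
  k <- length(y)
  R <- r + (1 - r) * diag(k)          # 1 on the diagonal, r elsewhere
  V <- outer(se, se) * R              # sampling covariance matrix
  w <- solve(V, rep(1, k))            # V^{-1} 1
  list(estimate = sum(w * y) / sum(w),
       se       = sqrt(1 / sum(w)))
}

y  <- c(0.30, 0.45, 0.38)   # made-up effect sizes
se <- c(0.10, 0.12, 0.15)   # made-up standard errors

pooled_dep   <- pooled_effect(y, se, r = 0.5)
pooled_indep <- pooled_effect(y, se, r = 0)   # usual inverse-variance average
```

With r = 0 this reduces to the ordinary inverse-variance weighted mean. With positive correlation the pooled standard error is larger than the independent case, which is the reason for not treating the samples as independent.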

Regards,
Mike
-- 
-
 Mike W.L. Cheung   Phone: (65) 6516-3702
 Department of Psychology   Fax:   (65) 6773-1843
 National University of Singapore
 http://courses.nus.edu.sg/course/psycwlm/internet/
-

On Mon, Feb 8, 2010 at 6:48 AM, Gang Chen  wrote:
> Dear Mike,
>
> Thanks a lot for the kind help!
>
> Actually a few months ago I happened to read a couple of your posts on
> the R-help archive when I was exploring the possibility of using lme()
> in R for meta analysis.
>
> First of all, I didn't specify the meta analysis model for my cases
> correctly in my previous message. Currently I'm only interested in
> random- or mixed-effects meta analysis. So what you've suggested is
> directly relevant to what I've been looking for, especially for case
> (2). I'll try to gather those references you listed, and figure out
> the details.
>
> Also I think I didn't state my case (1) clearly in my previous post.
> In that case, all the effect sizes are the same and in the same
> condition too (e.g., happy), but each source has multiple samples of
> the measurement (and also measurement error, or standard error). Could
> this still be handled as a multivariate meta analysis since the
> samples for the same source are correlated? Or can the
> multiple measures from the same source be summarized
> (weighted average?) before the meta analysis?
>
> Your suggestions are highly appreciated.
>
> Best wishes,
> Gang
>
>
> On Sun, Feb 7, 2010 at 10:39 AM, Mike Cheung  wrote:
>> Dear Gang,
>>
>> Here are just some general thoughts. Wolfgang Viechtbauer will be in a
>> better position to answer questions related to metafor.
>>
>> For multivariate effect sizes, we first have to estimate the
>> asymptotic sampling covariance matrix among the effect sizes. Formulas
>> for some common effect sizes are provided by Gleser and Olkin (2009).
>>
>> If a fixed-effects model is required, it is quite easy to write your
>> own GLS function to conduct the multivariate meta-analysis (see e.g.,
>> Becker, 1992). If a random-effects model is required, it is more
>> challenging in R. SAS Proc MIXED can do the work (e.g., van
>> Houwelingen, Arends, & Stijnen, 2002).
>>
>> Sometimes, it is possible to transform the multivariate effect sizes
>> into independent effect sizes (Kalaian & Raudenbush, 1996; Raudenbush,
>> Becker, & Kalaian, 1988). Then univariate meta-analysis, e.g., with
>> the metafor package, can be performed on the transformed effect sizes. This
>> approach works if it makes sense to pool the multivariate effect sizes
>> as in your case (2)- the effect sizes are the same but in different
>> conditions (happy, sad, and neutral). However, this approach does not
>> work if the multivariate effect sizes are measuring different
>> concepts, e.g., verbal achievement and mathematical achievement.
>>
>> Hope this helps.
>>
>> Becker, B. J. (1992). Using results from replicated studies to
>> estimate linear models. Journal of Educational Statistics, 17,
>> 341-362.
>> Gleser, L. J., & Olkin, I. (2009). Stochastically dependent effect
>> sizes. In H. Cooper, L. V. Hedges, and J. C. Valentine (Eds.), The
>> handbook of research synthesis and meta-analysis, 2nd edition (pp.
>> 357-376). New York: Russell Sage Foundation.
>> Kalaian, H. A., & Raudenbush, S. W. (1996). A multivariate mixed
>> linear model for meta-analysis. Psychological Methods, 1, 227-235.
>> Raudenbush, S. W., Becker, B. J., & Kalaian, H. (1988). Modeling
>> multivariate effect sizes. Psychological Bulletin, 103, 111-120.
>> van Houwelingen, H.C., Arends, L.R., & Stijnen, T. (2002). Advanced
>> methods in meta-analysis: multivariate approach and meta-regression.
>> Statistics in Medicine, 21, 589-624.
>>
>> Regards,
>> Mike
>> --
>> -
>>  Mike W.L. Cheung               Phone: (65) 6516-3702
>>  Department of Psychology       Fax:   (65) 6773-1843
>>  National University of Singapore
>>  http://courses.nus.edu.sg/course/psycwlm/internet/
>> -
>>
>> On Sat, Feb 6, 2010 at 6:07 AM, Gang Chen  wrote:
>>> In a classical meta analysis model y_i = X_i * beta_i + e_i, data
>>> {y_i} are assumed to be independent effect sizes. However, I'm
>>> encountering the following two scenarios:
>>>
>>> (1) Each source has multiple effect sizes, thus {y_i} are not fully
>>> independent with each other.
>>> (2) Each source has multiple effect sizes, and each of the effect size
>>> from a source

Re: [R] dataframe question

2010-02-07 Thread David Winsemius

On Feb 7, 2010, at 7:51 PM, Vadlamani, Satish {FLNA} wrote:

Folks:
Good day. Please see the code below. three_wk_out is a dataframe  
with columns wk1 through wk209. I want to change the format of the  
columns. I am trying the code below but it does not work.  I need  
$week in the for loop interpreted as wk1, wk2, etc. Could you please  
help? Thanks.

Satish

R code below
week_list <- paste("wk",c(1:209),sep="")

Or more "functionally":

three_wk_out <- as.data.frame( lapply(three_wk_out, some_function) )

E.g.:
> df
  a b c x
1 1 0 0 1
2 2 3 2 4
3 1 2 1 5
4 2 0 3 2

> df <- as.data.frame(lapply(df, "^", 2))
> df
  a b c  x
1 1 0 0  1
2 4 9 4 16
3 1 4 1 25
4 4 0 9  4

for (week in week_list)
{
   three_wk_out$week <- as.numeric(three_wk_out$week)
}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 3D plot of following data

2010-02-07 Thread Paul Murrell


Hi


Jim Lemon wrote:

On 02/02/2010 11:01 PM, walter.dju...@chello.at wrote:

Hello R-experts,

I am having difficulties with 3D plotting (i.e. the evolution of
various forward curves through time).

I have two comma separated files both ordered by date (in the first
column) one containing contracts (meaning forward delivery months
from YEAR_&  Letter "F" ... January through letter "Z" ...
December) and the other holding the closing price of the respective
contract on the day also defined in the first column (see
attachments).

What I would like to do is plot a three dimensional figure with
trade day (date) on the X-axis, contract on the Y-axis and the
price of the forward contract being the z-value. I am quite a
newbie and did not manage to merge these two files in a logical way,
so that R could do a 3D plot.


Has anyone tried to program Hans Rosling's time evolution graphs in
R?



Take a look at 
http://www.omegahat.org/SVGAnnotation/JSSPaper.html#fig:gapMSS


Paul



Jim

__ R-help@r-project.org
mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do
read the posting guide http://www.R-project.org/posting-guide.html 
and provide commented, minimal, self-contained, reproducible code.


--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
p...@stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/


Re: [R] dataframe question

2010-02-07 Thread Peter Alspach

Tena koe Satish

Try using

three_wk_out[,week] <- as.numeric(three_wk_out[,week])
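
A minimal sketch of the loop with that indexing, using a made-up two-column stand-in for three_wk_out: `$week` looks for a column literally named "week", whereas `[[week]]` (or `[, week]`) uses the string stored in week. The column contents here are assumptions for illustration only.

```r
# Hypothetical stand-in for three_wk_out with character columns
three_wk_out <- data.frame(wk1 = c("1", "2"),
                           wk2 = c("3.5", "4"),
                           stringsAsFactors = FALSE)
week_list <- paste("wk", 1:2, sep = "")

for (week in week_list) {
  # [[week]] selects the column whose name is the value of 'week'
  three_wk_out[[week]] <- as.numeric(three_wk_out[[week]])
}

str(three_wk_out)
```

If the columns are factors rather than character vectors, as.numeric(as.character(...)) is needed to get the printed values instead of the internal level codes.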

HTH 

Peter Alspach 

> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Vadlamani, 
> Satish {FLNA}
> Sent: Monday, 8 February 2010 1:51 p.m.
> To: r-help@r-project.org
> Subject: [R] dataframe question
> 
> Folks:
>  Good day. Please see the code below. three_wk_out is a 
> dataframe with columns wk1 through wk209. I want to change 
> the format of the columns. I am trying the code below but it 
> does not work.  I need $week in the for loop interpreted as 
> wk1, wk2, etc. Could you please help? Thanks.
> Satish
> 
> R code below
> week_list <- paste("wk",c(1:209),sep="")
> for (week in week_list) {
> three_wk_out$week <- as.numeric(three_wk_out$week)
> }
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 


[R] dataframe question

2010-02-07 Thread Vadlamani, Satish {FLNA}

Folks:
 Good day. Please see the code below. three_wk_out is a dataframe with columns 
wk1 through wk209. I want to change the format of the columns. I am trying the 
code below but it does not work.  I need $week in the for loop interpreted as 
wk1, wk2, etc. Could you please help? Thanks.
Satish

R code below
week_list <- paste("wk",c(1:209),sep="")
for (week in week_list)
{
three_wk_out$week <- as.numeric(three_wk_out$week)
}


Re: [R] Reading hierarchical data

2010-02-07 Thread Gabor Grothendieck

Here is a further simplification.  We use the colClasses= argument
with "NULL" for the columns we do not want so we do not have to later
remove those columns.

# record type ("1" or "2")
rectype <- substr(input, 7, 7)

# read in record type "1"
input1 <- input[rectype == "1"]
DF1 <- read.fwf(textConnection(input1), widths = c(5, 1, 1, 1, 1),
col.names = c("familyid", "", "", "", "dwelling"),
colClasses = c("numeric", "NULL", "NULL", "NULL", "numeric"))

# read in record type "2"
input2 <- input[rectype == "2"]
DF2 <- read.fwf(textConnection(input2), widths = c(5, 1, 1, 2, 1, 1),
col.names = c("personalid", "", "", "age", "", "sex"),
colClasses = c("numeric", "NULL", "NULL", "numeric", "NULL", "numeric"))

# ix is the index in DF1 of family row corresponding to each personal row in DF2
ix <- cumsum(rectype == "1")[rectype == "2"]
DF <- cbind(DF1[ix,], DF2)

DF
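
Since input comes from an earlier post, here is a self-contained sketch of the same indexing idea on the first few records of the sample data. It uses substr() rather than read.fwf purely to keep the example short; that substitution is mine, not part of the approach above.

```r
# First two families from the sample data (fixed-width records)
input <- c("06470 1 1",
           "    1 232 0",
           "    2 230 1",
           "07470 1 0",
           "    1 240 1")

rectype <- substr(input, 7, 7)           # "1" = family record, "2" = personal record
fam <- input[rectype == "1"]
per <- input[rectype == "2"]

DF1 <- data.frame(familyid = as.numeric(substr(fam, 1, 5)),
                  dwelling = as.numeric(substr(fam, 9, 9)))
DF2 <- data.frame(personalid = as.numeric(substr(per, 1, 5)),
                  age        = as.numeric(substr(per, 8, 9)),
                  sex        = as.numeric(substr(per, 11, 11)))

# cumsum() counts family records seen so far, so for each personal record
# it gives the row of DF1 holding that person's family
ix <- cumsum(rectype == "1")[rectype == "2"]
DF <- cbind(DF1[ix, ], DF2)
rownames(DF) <- NULL
DF
```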


On Sun, Feb 7, 2010 at 6:30 PM, Gabor Grothendieck
 wrote:
> Try this. It uses input defined in Jim's post and defines the rectype
> of each row ("1" or "2").  It then reads the rectype "1" records into
> DF1 using read.fwf and the rectype "2" records into DF2 also using
> read.fwf.  ix is defined to have one component per personal record
> giving the row number in DF1 of the corresponding family.  We combine
> DF1 and DF2 using ix and remove the columns whose names start with "X".
>
> # record type ("1" or "2")
> rectype <- substr(input, 7, 7)
>
> # read in record type "1"
> input1 <- input[rectype == "1"]
> DF1 <- read.fwf(textConnection(input1), widths = c(5, 1, 1, 1, 1),
>        col.names = c("familyid", "X", "X", "X", "dwelling"))
>
> # read in record type "2"
> input2 <- input[rectype == "2"]
> DF2 <- read.fwf(textConnection(input2), widths = c(5, 1, 1, 2, 1, 1),
>        col.names = c("personalid", "X", "X", "age", "X", "sex"))
>
> # ix is the index in DF1 of family row corresponding to each personal row in 
> DF2
> ix <- cumsum(rectype == "1")[rectype == "2"]
> DF <- cbind(DF1[ix,], DF2)
> DF <- DF[substr(names(DF), 1, 1) != "X"]
>
> so DF looks like this:
>
>> DF
>    familyid dwelling personalid age sex
> 1       6470        1          1  32   0
> 1.1     6470        1          2  30   1
> 2       7470        0          1  40   1
> 3       8470        0          1  27   0
> 4       9470        0          1  13   1
> 4.1     9470        0          2  22   0
> 4.2     9470        0          3  24   1
> 5      10470        1          1  20   0
> 5.1    10470        1          2  11   1
> 6      11470        0          1  17   0
> 6.1    11470        0          2  10   1
> 6.2    11470        0          3  26   1
>
> On Sun, Feb 7, 2010 at 10:57 AM, Saba(Home)  wrote:
>>
>> I would like to read the following hierarchical data set. There is a family
>> record followed by one or more personal records.
>> If col. 7 is "1" it is a family record. If it is "2" it is a personal
>> record.
>> The family record is formatted as follows:
>> col. 1-5     family id
>> col. 7        "1"
>> col. 9        dwelling type code
>> The personal record is formatted as follows:
>> col. 1-5        personal id
>> col. 7   "2"
>> col. 8-9        age
>> col. 11 sex code
>>
>> The first six family and accompanying personal records look like this:
>> 06470 1 1
>>    1 232 0
>>    2 230 1
>> 07470 1 0
>>    1 240 1
>> 08470 1 0
>>    1 227 0
>> 09470 1 0
>>    1 213 1
>>    2 222 0
>>    3 224 1
>> 10470 1 1
>>    1 220 0
>>    2 211 1
>> 11470 1 0
>>    1 217 0
>>    2 210 1
>>    3 226 1
>>
>> I want to create a dataset containing
>> . family ID
>> . dwelling code
>> . person ID
>> . age
>> . sex code
>> The dataset will contain one observation per person, with the family
>> information repeated for people in the same family.
>> Can anyone help?
>> Thanks,
>> Richard Saba
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>


Re: [R] Interactively editing point labels in a graph

2010-02-07 Thread Gabor Grothendieck

Create your plot and save it in wmf format, e.g.

DF <- as.data.frame(state.x77)
plot(Income ~ log(Population), DF, pch = 20)
with(DF, text(log(Population), Income, rownames(state.x77), cex = 0.5, pos = 4))
savePlot("states.wmf")

Then insert it into Microsoft Word, right click the image, choose Edit
and you can edit all the text labels.

On Wed, Feb 3, 2010 at 2:57 AM, trece por ciento
 wrote:
> Dear experts,
> I would like to be able to interactively (if possible, with mouse and click)
> edit point labels in graphs, particularly in multivariate graphs, such as the
> biplots you get after a correspondence analysis (with, for example, package
> ca), where labels tend to overlap. The graph aspect ratio is relevant (it
> needs to be maintained). And I'm working with Windows XP.
> In this kind of graph, points are identified with labels,
> generally long (see, for example:
> http://www.white-history.com/Greece_files/hlafreq.jpg), and sometimes -as in
> the example- it is good to group certain points within ellipses.
> Do you know if a package exists that can do this task?
> Thanks in advance,
> Hug
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] mboost: Interpreting coefficients from glmboost if center=TRUE

2010-02-07 Thread David Winsemius



On Feb 7, 2010, at 5:03 PM, Kyle Werner wrote:


I'm running R 2.10.1 with mboost 2.0 in order to build predictive
models . I am performing prediction on a binomial outcome, using a
linear function (glmboost). However, I am running into some confusion
regarding centering. (I am not aware of an mboost-specific mailing
list, so if the main R list is not the right place for this topic,
please let me know.)

The boost_control() function allows for the choice between center=TRUE
and center=FALSE. If I select center=FALSE, I am able to interpret the
coefficients just like those from standard logistic regression.
However, if I select center=TRUE, this is no longer the case. In
theory and in practice with my data, centering improves the
predictions made by the model, so this is an issue worth pursuing for
me.

Below is output from running the exact same data in exactly the same
way, only differing by whether the "center" bit is flipped or not:

Output with center=TRUE:
[(Intercept)] => -0.04543632
[painscore] => 0.007553608
[Offset] => -0.546520621809327

Output with center=FALSE:
[(Intercept)] => -0.989742
[painscore] => 0.001342585
[Offset] => -0.546520621809327

The mean of painscore is 741. It seems to me that for center=TRUE,
mboost should modify the intercept by subtracting 741*0.007553608 from
it (thus intercept should = -11.285). If I manually do this, the
output is credible, and in the ballpark of that given by other methods
(e.g., lrm or glm with a Binomial link function). If I don't do this,
then the inverse logistic interpretation of the output is off by
orders of magnitude.

In the end, with "center=TRUE", and I want to make a prediction based
on the coefficients returned by mboost, the results only make sense if
I manually rescale my independent variables prior to making a
prediction. Is this the desired behavior, or am I doing something
wrong?


I don't know, but my question is ... why aren't you using the predict  
method for that sort of object? Presumably the authors of the package  
know how to recognize the differences in the objects. Testing confirms  
this to be the case with the first example in the glmboost help page.
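
For reference, the intercept adjustment described in the question can be written out as plain arithmetic. This is only a sketch of that calculation, not a statement about what mboost does internally: the doubling step assumes the coefficients are reported on a half log-odds scale, which is my assumption and happens to reproduce the -11.285 quoted above. The offset is ignored here, as it is in the question.

```r
# Coefficients reported with center=TRUE (copied from the output above)
b0_centered <- -0.04543632
b1          <- 0.007553608
x_mean      <- 741               # mean of painscore, as stated in the question

# Intercept on the original (uncentered) scale of painscore
b0_uncentered <- b0_centered - x_mean * b1

# Assumption: coefficients are on a half log-odds scale, so doubling
# gives an intercept comparable to one from glm(..., family = binomial)
b0_logit <- 2 * b0_uncentered

b0_uncentered
b0_logit
```

Using predict() on the fitted object avoids having to do this by hand.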





Many thanks.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] Reading hierarchical data

2010-02-07 Thread Gabor Grothendieck

Try this. It uses input defined in Jim's post and defines the rectype
of each row ("1" or "2").  It then reads the rectype "1" records into
DF1 using read.fwf and the rectype "2" records into DF2 also using
read.fwf.  ix is defined to have one component per personal record
giving the row number in DF1 of the corresponding family.  We combine
DF1 and DF2 using ix and remove the columns whose names start with "X".

# record type ("1" or "2")
rectype <- substr(input, 7, 7)

# read in record type "1"
input1 <- input[rectype == "1"]
DF1 <- read.fwf(textConnection(input1), widths = c(5, 1, 1, 1, 1),
col.names = c("familyid", "X", "X", "X", "dwelling"))

# read in record type "2"
input2 <- input[rectype == "2"]
DF2 <- read.fwf(textConnection(input2), widths = c(5, 1, 1, 2, 1, 1),
col.names = c("personalid", "X", "X", "age", "X", "sex"))

# ix is the index in DF1 of family row corresponding to each personal row in DF2
ix <- cumsum(rectype == "1")[rectype == "2"]
DF <- cbind(DF1[ix,], DF2)
DF <- DF[substr(names(DF), 1, 1) != "X"]

so DF looks like this:

> DF
    familyid dwelling personalid age sex
1       6470        1          1  32   0
1.1     6470        1          2  30   1
2       7470        0          1  40   1
3       8470        0          1  27   0
4       9470        0          1  13   1
4.1     9470        0          2  22   0
4.2     9470        0          3  24   1
5      10470        1          1  20   0
5.1    10470        1          2  11   1
6      11470        0          1  17   0
6.1    11470        0          2  10   1
6.2    11470        0          3  26   1

On Sun, Feb 7, 2010 at 10:57 AM, Saba(Home)  wrote:
>
> I would like to read the following hierarchical data set. There is a family
> record followed by one or more personal records.
> If col. 7 is "1" it is a family record. If it is "2" it is a personal
> record.
> The family record is formatted as follows:
> col. 1-5     family id
> col. 7        "1"
> col. 9        dwelling type code
> The personal record is formatted as follows:
> col. 1-5        personal id
> col. 7   "2"
> col. 8-9        age
> col. 11 sex code
>
> The first six family and accompanying personal records look like this:
> 06470 1 1
>    1 232 0
>    2 230 1
> 07470 1 0
>    1 240 1
> 08470 1 0
>    1 227 0
> 09470 1 0
>    1 213 1
>    2 222 0
>    3 224 1
> 10470 1 1
>    1 220 0
>    2 211 1
> 11470 1 0
>    1 217 0
>    2 210 1
>    3 226 1
>
> I want to create a dataset containing
> . family ID
> . dwelling code
> . person ID
> . age
> . sex code
> The dataset will contain one observation per person, with the family
> information repeated for people in the same family.
> Can anyone help?
> Thanks,
> Richard Saba
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] Why does smoothScatter clip when xlim and ylim increased?

2010-02-07 Thread Jennifer Lyon

On Sat, Feb 6, 2010 at 6:15 AM, Duncan Murdoch  wrote:
> On 06/02/2010 7:51 AM, Jennifer Lyon wrote:
>> Hi:
>>
>> Is there a way to get smoothScatter to not clip when I increase the xlim
>> and
>> ylim parameters?
>> Consider the following example:
>>
>> set.seed(17)
>> x1<-rnorm(100)
>> x2<-rnorm(100)
>> smoothScatter(x1,x2)
>>
>> #Now if I increase xlim and ylim notice that the plot seems to be clipped
>> at
>> the former xlim, and ylim boundaries:
>>
>> smoothScatter(x1,x2, xlim=c(-5,5), ylim=c(-5,5))
>
> If you follow the links on the help page, you'll see that smoothScatter uses
> bkde2D, which has a range.x argument to control the range of the smoothing.
>  The smoothScatter function never passes the xlim and ylim values to bkde2D,
> only to the plotting functions, presumably because the author expected you
> to use them to limit the range, not extend it.
>
> You can get the behaviour you want with specified xlim and ylim by modifying
> one line in smoothScatter:
>
> map <- grDevices:::.smoothScatterCalcDensity(x, nbin, bandwidth)
>
> should become
>
> map <- grDevices:::.smoothScatterCalcDensity(x, nbin, bandwidth, list(xlim,
> ylim))
>
> (You can use fix(smoothScatter) to edit your own local copy of smoothScatter
> and make this change.)
>
> However, this messes up the default plot, so a better patch would be needed
> to permanently fix this.
>
> Duncan Murdoch
>
Ah. A very helpful explanation. Further exploration led to the
realization that if I passed in par("usr") instead of xlim and ylim
that both the case I care about and the default case display without
clipping. Of course I discovered (very reasonably) that par("usr")
doesn't exist until plot() is called, so I ended up calling plot and
then modifying the image call with add=T. Along the lines of:
plot(NA, NA, xlab = xlab, ylab = ylab, xlim = xlim, ylim = ylim,
     xaxs = xaxs, yaxs = yaxs, type = "n", ...)
usr <- par("usr")
map <- grDevices:::.smoothScatterCalcDensity(x, nbin, bandwidth,
                                             list(usr[1:2], usr[3:4]))
...
image(xm, ym, z = dens, col = colramp(256), xlab = xlab,
      ylab = ylab, xlim = xlim, ylim = ylim, xaxs = xaxs, yaxs = yaxs,
      add = TRUE, ...)

This is somewhat wasteful, as bkde2D is computing densities at grid
points well away from where the data is located, so I'll just have to
increase the number of grid points.
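
A stand-alone sketch of the same idea, outside smoothScatter, in case it is useful to others: set up the empty plot first, read par("usr"), and hand that range to bkde2D from the KernSmooth package. The bandwidth, grid size and colour ramp are arbitrary choices for illustration, not the values smoothScatter uses.

```r
set.seed(17)
x1 <- rnorm(100)
x2 <- rnorm(100)

# Empty plot first, so that par("usr") reflects the requested limits
plot(NA, NA, xlim = c(-5, 5), ylim = c(-5, 5), type = "n",
     xlab = "x1", ylab = "x2")
usr <- par("usr")

# Density estimate over the whole plotting region (bandwidth chosen arbitrarily)
est <- KernSmooth::bkde2D(cbind(x1, x2), bandwidth = c(0.3, 0.3),
                          gridsize = c(128L, 128L),
                          range.x = list(usr[1:2], usr[3:4]))

# Draw the density on top of the existing plot
image(est$x1, est$x2, est$fhat,
      col = colorRampPalette(c("white", "blue"))(256), add = TRUE)
```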

Thank you for your help.

Jen


Re: [R] using a variable name stored in another variable?

2010-02-07 Thread Peter Alspach

Tena koe Chris

Does the following help?

dfName <- 'myDf'
save(dfName, file='test1')
save('dfName', file='test2')
save('myDf', file='test3')
save(myDf, file='test4')

Peter Alspach 

> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Chris Seidel
> Sent: Monday, 8 February 2010 12:05 p.m.
> To: r-help@r-project.org
> Subject: Re: [R] using a variable name stored in another variable?
> 
> Hi Charlie,
> 
> get() will return the contents (value) of a variable. But 
> what I want is to save the named object. Something like 
> save(get(myobjectname), ...) doesn't work. 
> 
> In the environment, there is the object of interest, and a
> variable which holds the name of the object of interest. If
> you don't know the name of the object, but only the variable
> which contains its name, how do you use that information to
> save the object?
> 
> -Chris
> 
> From: r-help-boun...@r-project.org 
> [r-help-boun...@r-project.org] On Behalf Of Sharpie 
> [ch...@sharpsteen.net]
> Sent: Sunday, February 07, 2010 4:13 PM
> To: r-help@r-project.org
> Subject: Re: [R] using a variable name stored in another variable?
> 
> Chris Seidel wrote:
> >
> > Hello,
> >
> > I'm trying to figure out how to create a data object, and 
> then save it 
> > with a user-defined name that is input as a command line 
> argument. I 
> > know how to create the object and assign it the new name, 
> however, I 
> > can't figure out how to refer to the new name for a future 
> operation 
> > such as save().
> >
> > ..snip..
> >
> >
> 
> You probably want the get() function:
> 
>   get( myobjectname )
> 
> The help page for get() has a note which states that it is 
> the complement of assign().  Perhaps a similar note should be
> added to the help page for assign...
> 
> Hope this helps!
> 
> -Charlie
> --
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 


Re: [R] using a variable name stored in another variable?

2010-02-07 Thread Duncan Murdoch


On 07/02/2010 6:05 PM, Chris Seidel wrote:

Hi Charlie,

get() will return the contents (value) of a variable. But what I want is
to save the named object. Something like save(get(myobjectname), ...)
doesn't work. 


I think you want

save(list=myobjectname, file= ...)

assuming that the object has already been created with that name.  If it 
hasn't, you'll need two steps:


assign( myobjectname, value)
save(list=myobjectname, file=...)

These could be wrapped in local( { ... } ) if you are worried that 
myobjectname might be the name of an object you want to keep.  For example,


x <- 1  # Create a variable I don't want to mess with
name <- "x"  # choose a name to save under
local({ assign(name, 2) ; save(list=name, file="test.Rdata") })
# That created test.Rdata with x equal to 2
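
To check the result, the saved file can be loaded into a fresh environment. This is a small round-trip sketch of the example above; the temporary file and the environment e are names introduced just for the illustration.

```r
x <- 1                                   # a variable we do not want to disturb
name <- "x"                              # the name to save under
f <- tempfile(fileext = ".RData")        # hypothetical file location

# assign() creates "x" inside the local environment only, then saves it
local({ assign(name, 2); save(list = name, file = f) })

e <- new.env()
load(f, envir = e)                       # restore into a separate environment

x                                        # still 1
get(name, envir = e)                     # 2, the saved value
```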


Duncan Murdoch


In the environment, there is the object of interest, and a variable which
holds the name of the object of interest. If you don't know the name of
the object, but only the variable which contains its name, how do you
use that information to save the object?

-Chris

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On
Behalf Of Sharpie [ch...@sharpsteen.net]
Sent: Sunday, February 07, 2010 4:13 PM
To: r-help@r-project.org
Subject: Re: [R] using a variable name stored in another variable?

Chris Seidel wrote:

Hello,

I'm trying to figure out how to create a data object, and then save it
with a user-defined name that is input as a command line argument. I
know how to create the object and assign it the new name, however, I
can't figure out how to refer to the new name for a future operation
such as save().

..snip..




You probably want the get() function:

  get( myobjectname )

The help page for get() has a note which states that it is the complement of
assign().  Perhaps a similar note should be added to the help page for
assign...

Hope this helps!

-Charlie
--

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] specifying colors in a heatmap/image -like plot

2010-02-07 Thread David Winsemius



On Feb 7, 2010, at 4:57 PM, kerimcan wrote:



Hi,

I have searched for a solution but I failed to find an answer. I am hoping
you may be able to help me.

I have a data set where I have observations for a number of units (n =~40)
over a period of time (t =~100) and I have a variable (Z) that codes a
categorical variable for each observation. I want to produce a 2D plot where
time is on the x-axis and units are on the y-axis. Then each block on the
2-d plot should take a color depending on variable Z. Z is not ordered so
using a scale (like in heatmaps) does not make sense. In fact the values of
Z have meanings that are intuitively related to colors (e.g. Z=3 means
involvement by the "United Nations" so I want its color to be "blue"). Below
is some code that gives an example of what I am aiming to do and why
"heatmap" and "image" functions don't work for me. Thanks in advance for
your help.


# Example: Suppose Z had 3 values (0,1,2) and I had 8 observations.

hitmep <- matrix(c(0,2,1,0,2,1,1,0),2,4)

# Graph 1:
heatmap(hitmep2, Rowv =NA, Colv =NA, labrow =NULL, scale ="none")
# Graph 2:
image(t(hitmep2), axes =FALSE)

# I like the layout of the plots. My problem with these is that I don't want
Z's values (0,1,2) to have colors on a scale. I want to specify, for
example, 1="blue", 2="yellow" and 3="green". Do you know how to do this?


Well, if you fix the name of your data vector and add the glaringly  
obvious color argument, it seems to "work":


hitmep2 <- matrix(c(0,2,1,0,2,1,1,0),2,4)

# Graph 1:
heatmap(hitmep2, col=c("red", "green", "blue"), Rowv =NA, Colv =NA,
        labRow =NULL, scale ="none")

# Graph 2:
image(t(hitmep2), col=c("red", "green", "blue"), axes =FALSE)

Unless I don't understand what you wanted... always a possibility.
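
If the codes need to be tied to particular colours regardless of which values happen to occur, one option is to give image() explicit breaks. This is a sketch on the same toy matrix; the colour assignment and the half-way breaks are my choices for illustration.

```r
z <- matrix(c(0, 2, 1, 0, 2, 1, 1, 0), nrow = 2, ncol = 4)   # units x time

codes <- c(0, 1, 2)                      # category codes in Z
cols  <- c("red", "green", "blue")       # one colour per code, same order

# Breaks half-way between codes, so each code falls in its own interval
breaks <- c(codes[1] - 0.5, codes + 0.5)

# image() expects rows of its z argument along the x-axis, hence t(z)
image(x = seq_len(ncol(z)), y = seq_len(nrow(z)), z = t(z),
      col = cols, breaks = breaks, axes = FALSE,
      xlab = "time", ylab = "unit")
axis(1, at = seq_len(ncol(z)))
axis(2, at = seq_len(nrow(z)))
```

With breaks fixed this way, a code keeps its colour even when another code is absent from the data.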

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] using a variable name stored in another variable?

2010-02-07 Thread Chris Seidel

Hi Charlie,

get() will return the contents (value) of a variable. But what I want is
to save the named object. Something like save(get(myobjectname), ...)
doesn't work. 

In the environment, there is the object of interest, and a variable which
holds the name of the object of interest. If you don't know the name of
the object, but only the variable which contains its name, how do you
use that information to save the object?

-Chris

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On
Behalf Of Sharpie [ch...@sharpsteen.net]
Sent: Sunday, February 07, 2010 4:13 PM
To: r-help@r-project.org
Subject: Re: [R] using a variable name stored in another variable?

Chris Seidel wrote:
>
> Hello,
>
> I'm trying to figure out how to create a data object, and then save it
> with a user-defined name that is input as a command line argument. I
> know how to create the object and assign it the new name, however, I
> can't figure out how to refer to the new name for a future operation
> such as save().
>
> ..snip..
>
>

You probably want the get() function:

  get( myobjectname )

The help page for get() has a note which states that it is the complement of
assign().  Perhaps a similar note should be added to the help page for
assign...

Hope this helps!

-Charlie
--


[R] Out-of-sample prediction with VAR

2010-02-07 Thread peter

Good day,

I'm using a VAR model to forecast sales with some extra variables (google
trends data). I have divided my dataset into a trainingset (weekly sales +
vars in 2006 and 2007) and a holdout set (2008).
It is unclear to me how I should predict the out-of-sample data, because
using the predict() function in the vars package seems to estimate my
google trends vars as well. However, I want to forecast the sales figures,
with knowledge of the actual google trends data.

My questions:
1. How should I do this? I currently extract the linear model generated by
the VAR(3) function to predict the holdout set, but that seems
inappropriate?
2. If I am doing it right, how is it possible that an
automatically fitted model with more variables actually performs worse
(in terms of MAPE)? Shouldn't it at least predict just as well as the
simple AR(3) by finding that the extra variables have no added value?

My code:

ts_Y <- ts(log_residuals[1:104]); # detrended sales data
ts_XGG <- ts(salesmodeldata$gtrends_global[1:104]);
ts_XGL <- ts(salesmodeldata$gtrends_local[1:104]);
training_matrix <- data.frame(ts_Y, ts_XGG, ts_XGL);

### Try VAR(3)
var_model <- VAR(y=training_matrix, p=3, type="both", season=NULL,
                 exogen=NULL, lag.max=NULL);

## Out of sample forecasting
var.lm = lm(var_model$varresult$ts_Y); # the generated LM

ts_Y <- ts(log_residuals[105:155]);
ts_XGG <- ts(salesmodeldata$gtrends_global[105:155]);
ts_XGL <- ts(salesmodeldata$gtrends_local[105:155]);

# Notice how I manually create the lagged values to be used in the
# Linear Model
holdout_matrix <- na.omit(data.frame(ts.union(ts_Y, ts_XGG, ts_XGL,
    ts_Y.l1 = lag(ts_Y,-1), ts_Y.l2 = lag(ts_Y,-2), ts_Y.l3 = lag(ts_Y,-3),
    ts_XGG.l1 = lag(ts_XGG,-1), ts_XGG.l2 = lag(ts_XGG,-2),
    ts_XGG.l3 = lag(ts_XGG,-3),
    ts_XGL.l1 = lag(ts_XGL,-1), ts_XGL.l2 = lag(ts_XGL,-2),
    ts_XGL.l3 = lag(ts_XGL,-3), const=1, trend=0.0001514194)));

var.predict = predict(object=var_model, n.ahead=52, dumvar=holdout_matrix);

## Assess accuracy
calc_mape (holdout_matrix$ts_Y, var.predict, islog=T, print=T)
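
The manual lag construction above can also be sketched with base R's embed(), which may be easier to check by eye. This is only an illustration on simulated data (the series x and y below are made up, not the sales or trends data), and it fits a single-equation linear model with lm() rather than using predict() from the vars package. It produces one-step-ahead forecasts in which every lag is an observed value, which seems to be what "with knowledge of the actual google trends data" is asking for.

```r
# Simulated stand-ins for the real series (hypothetical data).
set.seed(1)
n <- 155
x <- rnorm(n)                       # an extra regressor, e.g. a trends index
y <- numeric(n)
for (i in 2:n) y[i] <- 0.6 * y[i - 1] + 0.5 * x[i - 1] + rnorm(1, sd = 0.5)

p <- 3
# embed(v, p + 1): column 1 is v[t], columns 2..(p + 1) are v[t-1] ... v[t-p].
Y <- embed(y, p + 1)
X <- embed(x, p + 1)
lagged <- data.frame(y    = Y[, 1],
                     y.l1 = Y[, 2], y.l2 = Y[, 3], y.l3 = Y[, 4],
                     x.l1 = X[, 2], x.l2 = X[, 3], x.l3 = X[, 4])

# Row i of `lagged` corresponds to time t = i + p.
train   <- lagged[1:(104 - p), ]          # t = 4 .. 104
holdout <- lagged[(105 - p):(n - p), ]    # t = 105 .. 155

fit  <- lm(y ~ ., data = train)
pred <- predict(fit, newdata = holdout)   # one-step-ahead, using observed lags
```

Building the lags on the full series before splitting means the first holdout rows can use lags from the end of the training period, so no holdout observations are lost to na.omit().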

Some context:
For my Master's thesis I'm using R to test the predictive power of web
metrics (such as google trends data & pageviews) in sales forecasting. To
properly assess this, I employ a simple AR model (for time series without
the extra variables) and a VAR model for the predictions with the extra
variables. I also develop a random forest with, and without the buzz
variables and see if MAPE improves.

Many thanks in advance!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Interactively editing point labels in a graph

2010-02-07 Thread Felix Andrews

The built-in R graphics system was not designed for interactivity --
there is no [feasible] way to detect the data point coordinates in a
base graphics plot. The playwith package tries to figure out the
coordinates from the data objects given in the call: this works for
simple scatterplots etc, but is non-trivial for your CA plot. You
*could* define functions to enable playwith to work correctly in this
case: the functions would be called something like
"plotCoords.plot.ca" and possibly "case.names.ca" (if
case.names.default does not already work).

Regards
-Felix

On 6 February 2010 23:11, trece por ciento  wrote:
> Many thanks, Felix
> It worked, simply importing the emf into PowerPoint!
> By the way, as you are the maintainer of playwith, a question: Why is 
> playwith unable to cope with it?
> I liked very much the playwith option because it is easy to use, and has all 
> the basic capabilities that I need.
> Best regards,
> Hug
>
> --- On Wed, 2/3/10, Felix Andrews  wrote:
>
>> From: Felix Andrews 
>> Subject: Re: [R] Interactively editing point labels in a graph
>> To: "trece por ciento" 
>> Cc: "Liviu Andronic" , r-help@r-project.org
>> Date: Wednesday, February 3, 2010, 4:51 PM
>> For your situation, perhaps the best
>> option is to save the plot in a
>> vector format like WMF, PDF or SVG, and open it with an
>> external
>> editor. Inkscape is a good one.
>>
>>
>> On 4 February 2010 06:46, trece por ciento 
>> wrote:
>> > Thanks, Liviu
>> > In a first look it seems OK. Two questions:
>> > 1. Playwith accept directly the plots created by the
>> ca package, but it seems unable to identify the point
>> labels
>> > For example:
>> > data(smoke)
>> > smoke
>> > ca(smoke)
>> > plot(ca(smoke))
>> > playwith(plot(ca(smoke)))
>> > Then, if I try to identify a label playwith gives the
>> message "Sorry, can not guess the data point coordinates.
>> Please contact the maintainer with suggestions".
>> > If I ask to select the label from a table playwith
>> sends the following message to RGui: "Error in
>> data.frame(..., check.names = FALSE) :
>> > arguments imply differing number of rows: 2, 0"
>> > 2. Can playwith draw ellipses or any other figure
>> around selected points?
>> >
>> > (For the first question it seems my fault, but I don't
>> know how to fix it)
>> >
>> > Hug
>> >
>> > --- On Wed, 2/3/10, Liviu Andronic 
>> wrote:
>> >
>> >> From: Liviu Andronic 
>> >> Subject: Re: [R] Interactively editing point
>> labels in a graph
>> >> To: "trece por ciento" 
>> >> Cc: r-help@r-project.org
>> >> Date: Wednesday, February 3, 2010, 3:49 AM
>> >> Hello
>> >>
>> >> On 2/3/10, trece por ciento 
>> >> wrote:
>> >> > Dear experts,
>> >> >  I would like to be able to interactively
>> (if
>> >> possible, with mouse and click) edit point labels
>> in graphs,
>> >> >
>> >> Try playwith.
>> >> Liviu
>> >>
>> >> > particularly in multivariate graphs, such as
>> the
>> >> biplots you get after a correspondence analysis
>> (with, for
>> >> example, package ca), where labels tend to
>> overlap. The
>> >> graph aspect ratio is relevant (it needs to be
>> maintained).
>> >> And I'm working with Windows XP.
>> >> >  In this kind of graphs points in the graph
>> are
>> >> identified with labels, generally long (see, for
>> example: http://www.white-history.com/Greece_files/hlafreq.jpg),
>> >> and sometimes -as in the example- it is good to
>> group
>> >> certain points within ellipses.
>> >> >  Do you know if exists some package able to
>> do
>> >> this task?
>> >> >  Thanks in advance,
>> >> >  Hug
>> >> >
>> >> >
>> __
>> >> >  R-help@r-project.org
>> >> mailing list
>> >> >  https://stat.ethz.ch/mailman/listinfo/r-help
>> >> >  PLEASE do read the posting guide 
>> >> >http://www.R-project.org/posting-guide.html
>> >> >  and provide commented, minimal,
>> self-contained,
>> >> reproducible code.
>> >> >
>> >>
>> >>
>> >> --
>> >> Do you know how to read?
>> >> http://www.alienetworks.com/srtest.cfm
>> >> http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
>> >> Do you know how to write?
>> >> http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail
>> >>
>> >
>> >
>> >
>> >
>> > __
>> > R-help@r-project.org
>> mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide 
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained,
>> reproducible code.
>> >
>>
>>
>>
>> --
>> Felix Andrews / 安福立
>> Postdoctoral Fellow
>> Integrated Catchment Assessment and Management (iCAM)
>> Centre
>> Fenner School of Environment and Society [Bldg 48a]
>> The Australian National University
>> Canberra ACT 0200 Australia
>> M: +61 410 400 963
>> T: + 61 2 6125 4670
>> E: felix.andr...@anu.edu.au
>> CRICOS Provider No. 00120C
>> --
>> http://www.neurofractal.org/felix/
>>
>
>
>
>



-- 
Felix Andrews / 安福立
Postdoctoral Fellow
Integrated Catchment Assessmen

[R] specifying colors in a heatmap/image -like plot

2010-02-07 Thread kerimcan


Hi,

I have searched for a solution but I failed to find an answer. I am hoping
you may be able to help me. 

I have a data set where I have observations for a number of units (n =~40)
over a period of time (t =~100) and I have a variable (Z) that codes a
categorical variable for each observation. I want to produce a 2D plot where
time is on the x-axis and units are on the y-axis. Then each block on the
2-d plot should take a color depending on variable Z. Z is not ordered so
using a scale (like in heatmaps) does not make sense. In fact the values of
Z have meanings that are intuitively related to colors (e.g. Z=3 means
involvement by the "United Nations" so I want its color to be "blue"). Below
is some code that gives an example of what I am aiming to do and why
"heatmap" and "image" functions don't work for me. Thanks in advance for
your help.


# Example: Suppose Z had 3 values (0,1,2) and I had 8 observations.

hitmep <- matrix(c(0,2,1,0,2,1,1,0),2,4)

# Graph 1:
heatmap(hitmep, Rowv =NA, Colv =NA, labRow =NULL, scale ="none")
# Graph 2:
image(t(hitmep), axes =FALSE)

# I like the layout of the plots. My problem with these is that I don't want
# Z's values (0,1,2) to have colors on a scale. I want to specify, for
# example, 1="blue", 2="yellow" and 3="green". Do you know how to do this?
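
One possible approach, sketched here with base graphics only, is to give image() an explicit vector of colours together with breaks placed halfway between the category codes, so that each code falls in its own bin. The pairing of codes 0, 1, 2 with blue, yellow, green below is an arbitrary assumption for illustration.

```r
# Small example matrix: rows are units, columns are time points.
z <- matrix(c(0, 2, 1, 0, 2, 1, 1, 0), nrow = 2, ncol = 4)

# One colour per category, in the order of the category codes.
# The code-to-colour pairing is an assumption made for this sketch.
codes   <- c(0, 1, 2)
colours <- c("blue", "yellow", "green")

# Breaks halfway between codes: one more break than there are colours.
breaks <- c(codes - 0.5, max(codes) + 0.5)

# image() expects z with one row per x value, hence the transpose.
image(x = seq_len(ncol(z)), y = seq_len(nrow(z)), z = t(z),
      col = colours, breaks = breaks,
      xlab = "time", ylab = "unit", axes = FALSE)
axis(1, at = seq_len(ncol(z)))
axis(2, at = seq_len(nrow(z)))
box()
legend("topright", legend = codes, fill = colours, title = "Z", bg = "white")
```

Because the colours are tied to the codes through the breaks, the order of the codes carries no visual meaning; any colour can be assigned to any category.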

Thanks in advance,
Kerim Can Kavakli



-- 
View this message in context: 
http://n4.nabble.com/specifying-colors-in-a-heatmap-image-like-plot-tp1472388p1472388.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Data views (Re: (Another) Bates fortune?)

2010-02-07 Thread Emmanuel Charpentier

Note : this post has been motivated more by the "hierarchical data"
subject than the aside joke of Douglas Bates, but might be of interest
to its respondents.

On Friday 05 February 2010 at 21:56 +0100, Peter Dalgaard wrote:
> Peter Ehlers wrote:
> > I vote to 'fortunize' Doug Bates on
> > 
> >  Hierarchical data sets: which software to use?
> > 
> > "The widespread use of spreadsheets or SPSS data sets or SAS data sets
> > which encourage the "single table with a gargantuan number of columns,
> > most of which are missing data in most cases" approach to organization
> > of longitudinal data is regrettable."
> > 
> > http://n4.nabble.com/Hierarchical-data-sets-which-software-to-use-td1458477.html#a1470430
> >  
> > 
> > 
> 
> Hmm, well, it's not like "long format" data frames (which I actually 
> think are more common in connection with SAS's PROC MIXED) are much 
> better. Those tend to replicate base data unnecessarily - "as if rats 
> change sex with millisecond resolution".

[ Note to Achim Zeileis : the "rats changing sex with millisecond
resolution" quote is well worth a nomination to "fortune" fame ; it
seems it is not one already... ]

>   The correct data structure 
> would be a relational database with multiple levels of tables, but, to 
> my knowledge, no statistical software, including R, is prepared to deal 
> with data in that form.

Well, I can think of two exceptions :

- BUGS, in its various incarnations (WinBUGS, OpenBUGS, JAGS), does not
require its data to come from the same source. For example, while
programming a hierarchical model (a. k. a. mixed-effect model),
individual level variables may come from one source and various group
level variables may come from other sources. Quite handy : no previous
merge() required. Now, writing (and debugging !) such models in BUGS
is another story...

- SAS has had this concept of "data view" for a long time, its most
useful incarnation being a "data view" of an SQL view. Again, this
avoids the need to actually merge the datasets (which, AFAICR, is a
serious piece of pain in the @$$ in SAS (maybe that's the *real*
etymology of the name ?)).

This problem has bugged me for a while. I think that the concept of a
"data view" is right (after all, that's one of the core concepts of SQL
for a reason...), but that implementing it *cleanly* in R is probably
hard work. Using a DBMS for maintaining tables and views and querying
them "just at the right time" does help, but the ability of using these
DBMS data without importing them in R is, AFAIK, currently lacking.

Once upon a time, a very old version of RPgSQL (a Bioconductor package)
aimed at such a representation : it created objects inheriting from
data.frame to represent Postgres-based data, allowing these data to be used
"transparently". This package dropped into oblivion when its creator and
sole maintainer became unable to maintain it further.

As far as I understand it, the DBI specification *might* allow the
creation of such objects, but I am not aware of any driver actually
implementing that.

In fact, there are two elements of solution to this problem :
a) creation of (abstract) objects representing data collections as data
frames, with the same properties, but not requesting the creation of an
actual data frame. As far as my (very poor) object-oriented knowledge
goes, these objects should, in C++/Python parlance, inherit from
data.frame.
b) creation of objects implementing various realizations of the objects
created in a) : DBMS querying, actual data.frame querying (here I'm
thinking of sqldf, which does this in the reverse direction, allowing
R data frames to be queried in SQL. Quite handy...), etc ...

I tried my hand once at building such a representation (for
DBMS-deposited data), with partial success (read-only was OK, read-write
was seriously buggy). But my S3 object-oriented code stinks, my Python
is pytiful, and, as a public health measure,  I won't even try to
qualify my C++... So I leave implementation to better programmers as an
exercise (a term project, or even a master's thesis subject is probably
closer to truth...).

A third, much larger, (implementation) element, is lacking in this
picture : the algorithms used on these data. SAS is notoriously good (in
some simple cases, such as ordinary regression) at handling datasets
larger than available memory because the algorithms have been written
with punched cards (maybe even paper tape) in mind : *one* *sequential*
read of the data was the only *practical* way to go back in those days.
So all the matrices and vectors necessary to the computation
(notionally, X'X and X'Y) were built in memory in *one* step.
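
To make the one-pass idea concrete, here is a minimal sketch in base R that accumulates X'X and X'y chunk by chunk and then solves the normal equations. The data are simulated and the "chunks" are just row ranges of an in-memory data frame standing in for sequential reads from disk; this illustrates the principle only and makes no claim about how SAS implements it.

```r
set.seed(42)
n <- 1000
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - 3 * dat$x2 + rnorm(n)

chunk_size <- 100            # pretend only this many rows fit in memory
XtX <- matrix(0, 3, 3)
Xty <- numeric(3)

for (start in seq(1, n, by = chunk_size)) {
  rows  <- start:min(start + chunk_size - 1, n)
  chunk <- dat[rows, ]                    # stands in for reading a block from disk
  X <- cbind(1, chunk$x1, chunk$x2)       # design matrix with intercept
  XtX <- XtX + crossprod(X)               # accumulate X'X
  Xty <- Xty + crossprod(X, chunk$y)      # accumulate X'y
}

# Each row was visited exactly once; only 3 x 3 and 3 x 1 objects were kept.
beta_onepass <- drop(solve(XtX, Xty))
beta_lm      <- unname(coef(lm(y ~ x1 + x2, data = dat)))
```

Solving the normal equations directly is numerically less careful than the QR decomposition used by lm(), so this should be read as a sketch of the data flow rather than a recommended fitting method.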

Such an organization is probably impossible with most "modern"
algorithms : see Douglas Bates' description of the lmer() algorithms for
a nice, big counter-example, or consider MCMC... But coming closer to
such an organization *seems* possible : see for example biglm.

So I think t

Re: [R] using a variable name stored in another variable?

2010-02-07 Thread Sharpie

Chris Seidel wrote:
> 
> Hello,
> 
> I'm trying to figure out how to create a data object, and then save it
> with a user-defined name that is input as a command line argument. I
> know how to create the object and assign it the new name, however, I
> can't figure out how to refer to the new name for a future operation
> such as save(). 
> 
> ..snip..
> 
> 

You probably want the get() function:

  get( myobjectname )

The help page for get() has a note which states that it is the complement of
assign().  Perhaps a similar note should be added to the help page for
assign...
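
For the save() step specifically, a sketch along the following lines may be closer to what is needed: get() returns the object's value, whereas save() wants object names, and it accepts them as a character vector through its list argument. The name "MyName" stands in for the command-line argument, and writing to tempdir() is an assumption made only to keep the example tidy.

```r
myobjectname <- "MyName"                  # stands in for the command-line argument
somedata     <- matrix(rnorm(100), 10, 10)
filename     <- file.path(tempdir(), paste(myobjectname, ".RData", sep = ""))

# Bind the data to the user-supplied name.
assign(myobjectname, somedata)

# `list =` takes the *names* of the objects to save as character strings,
# so the matrix is stored under the name "MyName".
save(list = myobjectname, file = filename)

# Check: load into a fresh environment and see which names come back.
e <- new.env()
loaded <- load(filename, envir = e)
```

After load(), the object is available under the user-supplied name, which is what a later session would see.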

Hope this helps!

-Charlie
-- 
View this message in context: 
http://n4.nabble.com/using-a-variable-name-stored-in-another-variable-tp1472371p1472400.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] mboost: Interpreting coefficients from glmboost if center=TRUE

2010-02-07 Thread Kyle Werner

I'm running R 2.10.1 with mboost 2.0 in order to build predictive
models. I am performing prediction on a binomial outcome, using a
linear function (glmboost). However, I am running into some confusion
regarding centering. (I am not aware of an mboost-specific mailing
list, so if the main R list is not the right place for this topic,
please let me know.)

The boost_control() function allows for the choice between center=TRUE
and center=FALSE. If I select center=FALSE, I am able to interpret the
coefficients just like those from standard logistic regression.
However, if I select center=TRUE, this is no longer the case. In
theory and in practice with my data, centering improves the
predictions made by the model, so this is an issue worth pursuing for
me.

Below is output from running the exact same data in exactly the same
way, only differing by whether the "center" bit is flipped or not:

Output with center=TRUE:
[(Intercept)] => -0.04543632
[painscore] => 0.007553608
[Offset] => -0.546520621809327

Output with center=FALSE:
[(Intercept)] => -0.989742
[painscore] => 0.001342585
[Offset] => -0.546520621809327

The mean of painscore is 741. It seems to me that for center=FALSE,
mboost should modify the intercept by subtracting 741*0.007553608 from
it (thus intercept should = -11.285). If I manually do this, the
output is credible, and in the ballpark of that given by other methods
(e.g., lrm or glm with a Binomial link function). If I don't do this,
then the inverse logistic interpretation of the output is off by
orders of magnitude.
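
The general relationship between coefficients fitted on a centered predictor and those on the original scale can be checked with an ordinary glm(); the sketch below does that on simulated data. It deliberately does not use mboost and says nothing about what glmboost does internally. Only the predictor mean of 741 echoes the numbers above; everything else is made up.

```r
set.seed(7)
n <- 500
painscore <- rnorm(n, mean = 741, sd = 100)     # simulated predictor
outcome   <- rbinom(n, 1, plogis(-6 + 0.008 * painscore))

fit_raw      <- glm(outcome ~ painscore, family = binomial)
centered     <- painscore - mean(painscore)
fit_centered <- glm(outcome ~ centered, family = binomial)

b_raw <- unname(coef(fit_raw))
b_cen <- unname(coef(fit_centered))

# The slope is unchanged by centering; the intercept shifts by
# slope * mean(x), so it can be mapped back to the original scale.
intercept_back <- b_cen[1] - b_cen[2] * mean(painscore)
```

Here intercept_back agrees with the intercept of the uncentered fit, which is the kind of adjustment described above: coefficients from a centered fit apply to centered inputs unless the intercept is shifted back.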

In the end, if I use center=TRUE and want to make a prediction based
on the coefficients returned by mboost, the results only make sense if
I manually rescale my independent variables prior to making the
prediction. Is this the desired behavior, or am I doing something
wrong?

Many thanks.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Noval numbers

2010-02-07 Thread Duncan Murdoch


On 07/02/2010 4:25 PM, Mag. Ferri Leberl wrote:

> Dear everybody,
> How can I transform numbers to a positional system with the base of, e.g.,
> nine, and do further operations with them?


I don't understand what you want.  Decimal, noval or binary are just 
ways to represent numbers as strings of characters.  It doesn't make 
sense to me to say you are "transforming them" to a particular 
representation.  You can represent them in a variety of ways:  10 
(decimal), 11 (noval), 1010 (binary), ten (English), but it's still the 
same number.


It does make sense to ask if you can convert numbers to one of these 
representations, or convert the representation back to the number; is 
that what you meant?  Erich Neuwirth posted a function to do one way 
conversions:


http://finzi.psych.upenn.edu/Rhelp10/2008-September/175003.html

With his functions you can do

> makeDigitSeq(numberInBase(10, 9))
[1] "11"
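
For readers without those helper functions at hand, a rough base R equivalent might look like the following. Here to_base() is a hypothetical helper written for this sketch and is not part of the linked post; strtoi() is base R and handles the reverse direction. Arithmetic is done on ordinary R numbers, and only the display is converted.

```r
# Convert a non-negative whole number to its digit string in a given base (2..36).
to_base <- function(n, base) {
  stopifnot(n >= 0, n == floor(n), base >= 2, base <= 36)
  digits <- c(0:9, letters)          # "0".."9", then "a".."z"
  if (n == 0) return("0")
  out <- character(0)
  while (n > 0) {
    out <- c(digits[n %% base + 1], out)   # prepend the least significant digit
    n <- n %/% base
  }
  paste(out, collapse = "")
}

to_base(10, 9)             # "11"
to_base(10, 2)             # "1010"
strtoi("11", base = 9L)    # 10, back to an ordinary R integer

# "Further operations": parse, compute as usual, convert back for display.
total <- strtoi("11", base = 9L) + strtoi("8", base = 9L)   # 10 + 8
to_base(total, 9)          # "20"
```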

Duncan Murdoch




> Thank you in advance
> Yours, sincerely
> Mag. Ferri Leberl
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] contour & persp

2010-02-07 Thread Andrew Wang

I have a data set in which x and y are ordered vectors of length 600 and 700
respectively; z is a 600 by 700 matrix whose entry z[i,j] is either a missing
value (indicated by 'NaN') or a real number between 0 and 1.  The contour
function 

contour(x,y,z) 

gives me a blank picture. I guess the reason is that most of z-entries are 
missing, only less than 1% are non missing. 

Question (1) 

Is there a way that I could manipulate the data or function to have the 
non-missing values plotted?

Also, trying function "persp" gives me this error message

persp(x,y,z) 

Error in persp.default(x, y, z) : invalid 'z' limits

I look at the manual of "persp". I guess, the error message comes from its 
internal call

zlim = range(z, na.rm = TRUE)

it appears to me that "persp" can't handle missing value yet its manual states 
clearly

z: a matrix containing the values to be plotted ('NA's are
  allowed).  Note that ‘x’ can be used instead of ‘z’
  for convenience.


Question (2)

Can "persp" handle missing values in z? If the answer is a resounding "yes",
what should I do in my case?
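
For Question (1), one option when so few cells are filled is to skip the surface altogether and plot the non-missing cells as points, shaded by their z value. The sketch below uses a smaller simulated matrix as a stand-in for the real 600 by 700 data and does not attempt to explain the persp() error.

```r
set.seed(3)
x <- seq(0, 1, length.out = 60)      # smaller than 600 x 700 to keep the sketch quick
y <- seq(0, 1, length.out = 70)
z <- matrix(NaN, nrow = length(x), ncol = length(y))
filled <- sample(length(z), 40)      # under 1% of the cells get a value
z[filled] <- runif(length(filled))

# Row/column indices of the cells that actually hold a value.
idx <- which(!is.na(z), arr.ind = TRUE)
pts <- data.frame(x = x[idx[, "row"]], y = y[idx[, "col"]], z = z[idx])

# Shade each point by its z value (darker = larger); the 0.8 factor
# keeps the smallest values from disappearing into a white background.
shade <- gray(0.8 * (1 - pts$z))
plot(pts$x, pts$y, pch = 19, col = shade, xlab = "x", ylab = "y")
```

The data frame pts holds one row per observed cell, so it could equally be passed to an interpolation routine if a filled-in surface is wanted later.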

Please help, Thanks!

Your frustrated

Andrew




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] using a variable name stored in another variable?

2010-02-07 Thread Chris Seidel

Hello,

I'm trying to figure out how to create a data object, and then save it
with a user-defined name that is input as a command line argument. I
know how to create the object and assign it the new name, however, I
can't figure out how to refer to the new name for a future operation
such as save(). The code below creates an object and uses assign() to
give it the user supplied name "MyName". However, since I don't know
what the new name is in advance, how do I refer to it in the save()
command? (the example below only saves an object with the name, not the
object itself).

Is it some kind of dereference? Any ideas?

command: 

cat myscript.r | R --vanilla --args MyName

script: 

# get the command-line argument for the variable name
myobjectname <- commandArgs()[4]

# make some data
somedata <- matrix(rnorm(100),10,10)

# make a filename for the saved object
filename <- paste(myobjectname, ".RData", sep="")

# assign data to the new name
assign(myobjectname, somedata)

# save the object to disk
save(myobjectname, file=filename)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Noval numbers

2010-02-07 Thread jim holtman

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

It would be useful if you could at least provide some examples of what
you want to do.  There are various ways of converting numbers back and
forth.  Are these integers or floating point?  What type of operations
do you want to do on them?

On Sun, Feb 7, 2010 at 4:25 PM, Mag. Ferri Leberl  wrote:
> Dear everybody,
> How can I transform numbers to a positional system with the base of, e.g., 
> nine, and do further operations with them?
> Thank you in advance
> Yours, sincerely
> Mag. Ferri Leberl
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Noval numbers

2010-02-07 Thread Mag. Ferri Leberl

Dear everybody,
How can I transform numbers to a positional system with the base of, e.g., 
nine, and do further operations with them?
Thank you in advance
Yours, sincerely
Mag. Ferri Leberl

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] x-axis plot problem

2010-02-07 Thread Jorge Ivan Velez

Hi abotaha,

Modify your matplot() call as

matplot(model, pch = c(1,22,17,16), type = "o", lty = c(2,2,2,5),
        col = c("gray10","gray10","gray10","gray10"),
        xlab = "Month-Year", ylab = "Zinth",
        xaxt = "n", yaxs = "i", main = "Model Output")


and then add


 axis(1, 1:6, time)


HTH,
Jorge


On Sun, Feb 7, 2010 at 3:22 PM, abotaha <> wrote:

>
> Hi all,
> I tried to plot several vectors in one plot and got a nice plot, but I
> have a problem with the x-axis. I want only the month and year (Jul.07
> means July 2007) on the x-axis, without other numbers appearing behind them.
>
> I would appreciate any help.
>
> The R code:
>
> F<-c(7.49,6.91,6.78,6.99,7.44,7.42)
> M<-c(4.81,4.51,5.21,4.65,4.75,3.86)
> P<-c(7.49,15.03,15.19,15.32,15.42,15.45)
> B<-c(16.24,15.87,12.94,11.82,10.86,9.61)
>
> time<-c("Jul/07","Aug/07","Sep/07","Oct/07","Nov/07","Dec/07")
> model<-data.frame(F,M,P,B)
> row.names(model)<-c("Jul07","Aug07","Sep07","Oct07","Nov07","Dec007")
> model
>
> par(mgp=c(2, 1, 0),bty="o" )
> matplot(model, pch = c(1,22,17,16), type = "o",lty=c(2,2,2,5), col
> =c("gray10"," gray10","gray10","gray10"),xlab="Month-Year",ylab="Zinth",
> xaxs = "i", yaxs = "i",main="Model Output")
> legend("topleft", legend = c("F", "M","P","B"),text.width =
> strwidth("1,000,000,"),pch=c(1,22,17,16),col =c("gray10","
> gray10","gray10","gray10"),lty=c(2,2,2,5), xjust = 1, yjust = 1, bty="n",
> cex=0.8, ncol=2)
> axis(1, 1:6, row.names(model))
> --
> View this message in context:
> http://n4.nabble.com/x-axis-plot-problem-tp1472286p1472286.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] x-axis plot problem

2010-02-07 Thread Rolf Turner

I think you just need to set axes=FALSE in your call to matplot().
You'll then need to add the y-axis manually --- do axis(2) in
addition to your call which draws the x axis.

You'll also need to do box() if you want a box around your graph.
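
Put together, the steps above might look like this. The data are the values from the original post; the label vector uses "Dec07" on the assumption that "Dec007" there was a typo.

```r
model <- data.frame(F = c(7.49, 6.91, 6.78, 6.99, 7.44, 7.42),
                    M = c(4.81, 4.51, 5.21, 4.65, 4.75, 3.86),
                    P = c(7.49, 15.03, 15.19, 15.32, 15.42, 15.45),
                    B = c(16.24, 15.87, 12.94, 11.82, 10.86, 9.61))
labels <- c("Jul07", "Aug07", "Sep07", "Oct07", "Nov07", "Dec07")

# axes = FALSE suppresses both axes and the box, so each is added back by hand.
matplot(model, type = "o", pch = c(1, 22, 17, 16), lty = c(2, 2, 2, 5),
        col = "gray10", axes = FALSE,
        xlab = "Month-Year", ylab = "Zinth", main = "Model Output")
axis(1, at = seq_along(labels), labels = labels)   # custom x-axis labels
axis(2)                                            # default y-axis
box()                                              # frame around the plot
```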

cheers,

Rolf Turner

P. S. You are clearly a Good Person!  A relative newbie who has
read the Posting Guide! :-)

R. T.

On 8/02/2010, at 9:22 AM, abotaha wrote:

> 
> Hi all, 
> I tried to plot several vectors in one plot and got a nice plot, but I
> have a problem with the x-axis. I want only the month and year (Jul.07
> means July 2007) on the x-axis, without other numbers appearing behind them.
>
> I would appreciate any help.
> 
> The R code:
> 
> F<-c(7.49,6.91,6.78,6.99,7.44,7.42)
> M<-c(4.81,4.51,5.21,4.65,4.75,3.86)
> P<-c(7.49,15.03,15.19,15.32,15.42,15.45)
> B<-c(16.24,15.87,12.94,11.82,10.86,9.61)
> 
> time<-c("Jul/07","Aug/07","Sep/07","Oct/07","Nov/07","Dec/07")
> model<-data.frame(F,M,P,B)
> row.names(model)<-c("Jul07","Aug07","Sep07","Oct07","Nov07","Dec007")
> model
> 
> par(mgp=c(2, 1, 0),bty="o" )
> matplot(model, pch = c(1,22,17,16), type = "o",lty=c(2,2,2,5), col
> =c("gray10"," gray10","gray10","gray10"),xlab="Month-Year",ylab="Zinth",
> xaxs = "i", yaxs = "i",main="Model Output")
> legend("topleft", legend = c("F", "M","P","B"),text.width =
> strwidth("1,000,000,"),pch=c(1,22,17,16),col =c("gray10","
> gray10","gray10","gray10"),lty=c(2,2,2,5), xjust = 1, yjust = 1, bty="n",
> cex=0.8, ncol=2)
> 

> -- 
> View this message in context: 
> http://n4.nabble.com/x-axis-plot-problem-tp1472286p1472286.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

##
Attention: 
This e-mail message is privileged and confidential. If you are not the 
intended recipient please delete the message and notify the sender. 
Any views or opinions presented are solely those of the author.

This e-mail has been scanned and cleared by MailMarshal 
www.marshalsoftware.com
##

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] x-axis plot problem

2010-02-07 Thread David Winsemius



On Feb 7, 2010, at 3:22 PM, abotaha wrote:

> Hi all,
> I tried to plot several vectors in one plot and got a nice plot, but I
> have a problem with the x-axis. I want only the month and year (Jul.07
> means July 2007) on the x-axis, without other numbers appearing behind them.

I'm going to assume that you did not want that period between the Mon
and Yr since you did not include it in your label strings.

> I would appreciate any help.
>
> The R code:
>
> F<-c(7.49,6.91,6.78,6.99,7.44,7.42)
> M<-c(4.81,4.51,5.21,4.65,4.75,3.86)
> P<-c(7.49,15.03,15.19,15.32,15.42,15.45)
> B<-c(16.24,15.87,12.94,11.82,10.86,9.61)
>
> time<-c("Jul/07","Aug/07","Sep/07","Oct/07","Nov/07","Dec/07")
> model<-data.frame(F,M,P,B)
> row.names(model)<-c("Jul07","Aug07","Sep07","Oct07","Nov07","Dec007")
> model
>
> par(mgp=c(2, 1, 0),bty="o" )
> matplot(model, pch = c(1,22,17,16), type = "o",lty=c(2,2,2,5), col
> =c("gray10"," gray10","gray10","gray10"),xlab="Month-Year",ylab="Zinth",
> xaxs = "i", yaxs = "i",main="Model Output")

# Change the xaxs="i" to xaxt="n" to suppress the numbers 1:6 from
being stuck under the labels you later lay down with the axis command.

> legend("topleft", legend = c("F", "M","P","B"),text.width =
> strwidth("1,000,000,"),pch=c(1,22,17,16),col =c("gray10","
> gray10","gray10","gray10"),lty=c(2,2,2,5), xjust = 1, yjust = 1, bty="n",
> cex=0.8, ncol=2)
> axis(1, 1:6, row.names(model))
> --
> View this message in context:
> http://n4.nabble.com/x-axis-plot-problem-tp1472286p1472286.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] x-axis plot problem

2010-02-07 Thread abotaha


Hi all, 
I tried to plot several vectors in one plot and got a nice plot, but I
have a problem with the x-axis. I want only the month and year (Jul.07
means July 2007) on the x-axis, without other numbers appearing behind them.

I would appreciate any help.

The R code:

F<-c(7.49,6.91,6.78,6.99,7.44,7.42)
M<-c(4.81,4.51,5.21,4.65,4.75,3.86)
P<-c(7.49,15.03,15.19,15.32,15.42,15.45)
B<-c(16.24,15.87,12.94,11.82,10.86,9.61)

time<-c("Jul/07","Aug/07","Sep/07","Oct/07","Nov/07","Dec/07")
model<-data.frame(F,M,P,B)
row.names(model)<-c("Jul07","Aug07","Sep07","Oct07","Nov07","Dec007")
model

par(mgp=c(2, 1, 0),bty="o" )
matplot(model, pch = c(1,22,17,16), type = "o",lty=c(2,2,2,5), col
=c("gray10"," gray10","gray10","gray10"),xlab="Month-Year",ylab="Zinth",
xaxs = "i", yaxs = "i",main="Model Output")
legend("topleft", legend = c("F", "M","P","B"),text.width =
strwidth("1,000,000,"),pch=c(1,22,17,16),col =c("gray10","
gray10","gray10","gray10"),lty=c(2,2,2,5), xjust = 1, yjust = 1, bty="n",
cex=0.8, ncol=2)
axis(1, 1:6, row.names(model))
-- 
View this message in context: 
http://n4.nabble.com/x-axis-plot-problem-tp1472286p1472286.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] conditioned xyplot, many y variables

2010-02-07 Thread hadley wickham

On Sun, Feb 7, 2010 at 11:32 AM, Jacob Wegelin  wrote:
>
> The example below creates parallel time-series plots of three different y
> variables conditioned by a dichotomous factor. In the graphical layout,
>
>        •       Each y variable inhabits its own row and is plotted on its
> own distinct scale.
>
>        •       Each level of the factor has its own column, but within each
> row the scale is held constant across columns.
>
>        •       The panels fit tightly (as they do in lattice) without
> superfluous whitespace or ticks.
>
> Currently I know of no lattice solution to this problem, only a traditional
> graphics solution. Can one solve this problem elegantly using lattice?

It's easy with ggplot2:

library(ggplot2)
library(reshape2)  # provides melt(); older ggplot2 releases attached reshape for you
JUNKm <- melt(JUNK, measure.vars = c("ppp", "QQQ", "z"))

ggplot(JUNKm, aes(TIME, value, group = ID)) +
  geom_line() +
  geom_point() +
  facet_grid(variable ~ Species, scales = "free_y") +
  scale_y_log10()

Hadley

-- 
http://had.co.nz/


Re: [R] convert R plots into annotated web-graphics

2010-02-07 Thread Barry Rowlingson

On Sun, Feb 7, 2010 at 2:35 PM, Rainer Tischler  wrote:

> If you have alternative ideas for interlinking tabular annotations with 
> plotted data points, I would appreciate any recommendation/suggestion.
> (I work with R 2.8.1 on different 32-bit PCs with both Linux and Windows 
> operating systems).

 As an alternative suggestion to my imagemap package, you could use a
javascript chart plotting library and just generate a data file and
the html from R. Maybe flot:

http://code.google.com/p/flot/

I find the R 'brew' package ideal for creating JS or HTML output files
from R objects.

 Warning: this answer contains small parts. Some assembly required.
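
A minimal base-R sketch of the "generate a data file and the html from R"
idea might look like the following. It uses neither brew nor flot's own
API; the data frame `pts`, its column names and the row ids are invented
for illustration, and the nested array of [x, y] pairs is only an
assumption about the layout a JavaScript plotting library would read.

```r
# Hypothetical annotated points; ids, coordinates and notes are invented.
pts <- data.frame(
  id   = c("p1", "p2"),
  x    = c(1.5, 2),
  y    = c(3, 4.25),
  note = c("first point", "second point"),
  stringsAsFactors = FALSE
)

# One HTML table row per point; the id attribute lets JavaScript find the row.
rows <- sprintf('<tr id="%s"><td>%s</td><td>%s</td><td>%s</td></tr>',
                pts$id, pts$x, pts$y, pts$note)
html <- c("<table>", rows, "</table>")

# The same coordinates as a nested array of [x, y] pairs.
json <- sprintf("[%s]",
                paste(sprintf("[%s, %s]", pts$x, pts$y), collapse = ", "))

# Write both pieces to temporary files that a web page could load.
html_file <- tempfile(fileext = ".html")
json_file <- tempfile(fileext = ".json")
writeLines(html, html_file)
writeLines(json, json_file)
```

The JavaScript that links a clicked point to its table row (and back) is
the "assembly required" part and is not shown here.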

Barry

-- 
blog: http://geospaced.blogspot.com/
web: http://www.maths.lancs.ac.uk/~rowlings
web: http://www.rowlingson.com/
twitter: http://twitter.com/geospacedman
pics: http://www.flickr.com/photos/spacedman


Re: [R] convert R plots into annotated web-graphics

2010-02-07 Thread Barry Rowlingson

On Sun, Feb 7, 2010 at 2:35 PM, Rainer Tischler  wrote:
> Dear all,
>
> I would like to make a large scatter plot created with R available as an 
> interactive web graphic, in combination with additional text-annotations for 
> each data point in the plot. The idea is to present the text-annotations in 
> an HTML-table and inter-link the data points in the plot with their 
> corresponding entries in the table, i.e. when clicking on a data point in the 
> plot, the corresponding entry in the table should be highlighted or centered 
> and vice-versa, when clicking on a table-entry, the corresponding point in 
> the plot should be highlighted.
>
> I have seen that CRAN contains various R-packages for SVG-based output of 
> interactive graphics (with hyperlinks and tool-tip annotations for each data 
> point); however, SVG is not supported by all browsers. Is anybody aware of 
> another solution for this problem (maybe based on image-maps and javascript)?
> If you have alternative ideas for interlinking tabular annotations with 
> plotted data points, I would appreciate any recommendation/suggestion.
> (I work with R 2.8.1 on different 32-bit PCs with both Linux and Windows 
> operating systems).
>

 My 'imagemap' package?

https://r-forge.r-project.org/projects/imagemap/

Barry


Re: [R] Why does aggregate fail?

2010-02-07 Thread David Winsemius



On Feb 7, 2010, at 1:57 PM, James Rome wrote:

> On 2/7/2010 1:35 PM, David Winsemius wrote:
>> But to answer your question:
>>
>> apply(d, 1, function(z) aggregate(z, by=list(s), FUN=sum) )
>
> David,
>
> That works, but I do not understand why I could not use aggregate
> directly. And the answer comes out as a list, which thus far baffles me.

It comes out as a data.frame, ... just as promised in the help page.

> How do I get the answer as a matrix in my original code, which I
> modified to use apply?

You could coerce either maxrdf or the aggregate returns to a matrix
with as.matrix or data.matrix.

> ha = matrix(nrow=7, ncol=24)
> colnames(ha) = as.character(c(0:23))
> rownames(ha) = rownames(maxrdf)
> for(j in 1:7) {
>    x = apply(maxrdf[j,], 1, function(z) aggregate(z, by=list(s),
> FUN=sum) )
>    ha[j,] = x[[1]][2]
> }
>
> Unfortunately, ha gets converted into a list, and then I can't use it
> for my plots. And you can probably educate me on how to get what I am
> aiming for (a matrix with the rows as the days, the columns as the
> hours, and the content as the hourly sum of the 15-minute chunks)
> without using the above for loop.


apply( data.matrix(maxrdf), 1,  # loops over the rows
  function(z) aggregate(z, by=list(s), sum)
  )
# Gives you a bunch of dataframes produced by the serial application of
# aggregate.


sapply(apply(maxrdf, 1, function(z) aggregate(z, by=list(s), sum) ),  
'[', 2)


# Gives you a list of vectors by day ... almost what you wanted ...
# the  ' "[", 2 ' part is the extraction of the second column of the
# dataframe


# And what I think you were asking for:

do.call(rbind,
   sapply(apply(maxrdf, 1,
  function(z) aggregate(z, by=list(s), sum) ), '[', 2) )

# ... as a matrix with named rows
            [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17]
Sunday.x       1    0    0    0    2    0    0    0   15     0     6     5    30    24     3     1     8
Monday.x       4    1    2    0    3    0    0   12   21    26    21    31    25     1     0     0    13
Tuesday.x      1    4    2    1    8   22   23   34   26    22    17    23    37    13    32     0    15
Wednesday.x    7    4    2    1   11   24   27   39   19    20     8    20    24     1    20    12     7
Thursday.x     0    0    1    0    0   13   22    0   16    17    13     7     0    13    39    30    31
Friday.x       6    3    7    0    9    0   18   25   19    22    10    27    25     5     1     0     9
Saturday.x     7    3    1    1    8   20   31   41   23    19    17    26     0     1     1     1    11

            [,18] [,19] [,20] [,21] [,22] [,23] [,24]
Sunday.x       39    37    28    21    20     1    11
Monday.x       35     7     0    28     6     5     0
Tuesday.x      37    37    37    27    23    22    15
Wednesday.x    18    13    28    30    18     0     0
Thursday.x     34    37    35    29    31    19    17
Friday.x       28    32    36    32    29    20    20
Saturday.x     30    32    32     7     0     0     0




> Thanks for the help,
> Jim


David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] Why does aggregate fail?

2010-02-07 Thread James Rome

This works. But I wish I could write it without a lot of trial and
error. :-(

ha = matrix(nrow=7, ncol=24)
colnames(ha) = as.character(c(0:23))
rownames(ha) = rownames(maxrdf)
m = as.matrix(maxrdf)
for(j in 1:7) {
x = aggregate(m[j,], by=list(s), FUN=sum)   
ha[j,] = x[[2]]
}
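
For what it is worth, the loop can probably be avoided altogether. The
sketch below uses base R's rowsum() on a made-up 2 x 8 matrix (two days,
eight 15-minute bins) rather than the real maxrdf; the toy values and the
object names `m` and `ha` are assumptions for illustration only.

```r
# Toy stand-in for as.matrix(maxrdf): 2 days x 8 quarter-hour bins.
m <- matrix(1:16, nrow = 2, byrow = TRUE,
            dimnames = list(c("Sunday", "Monday"), as.character(0:7)))

# Hour index of each 15-minute bin: 0 0 0 0 1 1 1 1
s <- floor(seq(0, 7) / 4)

# rowsum() adds up the rows of a matrix by group, so transpose first
# (bins become rows), sum by hour, and transpose back to days x hours.
ha <- t(rowsum(t(m), group = s))
ha
```

With the real data the same idea would read
t(rowsum(t(as.matrix(maxrdf)), s)), which should give a 7 x 24 matrix
with the days as rows and the hours as columns.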


On 2/7/2010 1:57 PM, James Rome wrote:
> On 2/7/2010 1:35 PM, David Winsemius wrote:
>> But to answer your question:
>>
>> apply(d, 1, function(z) aggregate(z, by=list(s), FUN=sum) )
>
> David,
>
> That works, but I do not understand why I could not use aggregate
> directly. And the answer comes out as a list, which thus far baffles me.
> How do I get the answer as a matrix in my original code, which I
> modified to use apply?
>
> ha = matrix(nrow=7, ncol=24)
> colnames(ha) = as.character(c(0:23))
> rownames(ha) = rownames(maxrdf)
> for(j in 1:7) {
> x = apply(maxrdf[j,], 1, function(z) aggregate(z, by=list(s),
> FUN=sum) )
> ha[j,] = x[[1]][2]
> }
>
> Unfortunately, ha gets converted into a list, and then I can't use it
> for my plots. And you can probably educate me on how to get what I am
> aiming for (a matrix with the rows as the days, the columns as the
> hours, and the content as the hourly sum of the 15-minute chunks)
> without using the above for loop.
>
> Thanks for the help,
> Jim


Re: [R] conditioned xyplot, many y variables

2010-02-07 Thread Deepayan Sarkar

On Sun, Feb 7, 2010 at 9:32 AM, Jacob Wegelin  wrote:
>
> The example below creates parallel time-series plots of three different y
> variables conditioned by a dichotomous factor. In the graphical layout,
>
>        •       Each y variable inhabits its own row and is plotted on its
> own distinct scale.
>
>        •       Each level of the factor has its own column, but within each
> row the scale is held constant across columns.
>
>        •       The panels fit tightly (as they do in lattice) without
> superfluous whitespace or ticks.
>
> Currently I know of no lattice solution to this problem, only a traditional
> graphics solution. Can one solve this problem elegantly using lattice?

Yes, for some definition of "elegantly". See below.

> The difficulty is to lock the levels of the factor (the columns) into the
> same scale for each y variable (for each row), while allowing the scales to
> differ between the y variables (between the rows).

This is not generally possible, as this makes sense only when rows and
columns correspond to conditioning variables, which is not always
true. (It is true for two conditioning variables with the default
layout, but lattice does not treat that case specially.)

However, you can start with the relation="free" version, and
(1) modify the limits to get "same" limits across rows,
(2) remove the labels for the second column, and
(3) remove the space allocated for those labels
to get what you want:

## assign the trellis object to a variable for further manipulation

fplot <-
xyplot ( ppp + QQQ + z ~ TIME | Species
   , group=ID
   , data=JUNK
   , ylab=c("ppp (mg/L)", "QQQ (pg/L)", "z (mIU/mL)")
   , xlab=c("Dog", "feline")
   , type="o"
   , strip= FALSE
   , outer=TRUE
   , layout=c(2,3)
   , scales=list(
   ppp=list( alternating=3)
   , y=list(
   relation="free"
   , alternating=3
   , rot=0
   , log=T
   )
   )
   )

## massage the limits (stored in fplot$y.limits) so that rows have the
## same limits.  The limits are stored as a linear list, and it is
## useful to make it an array first.

str(fplot$y.limits)

dim(fplot$y.limits) <- dim(fplot)

for (i in seq_len(ncol(fplot$y.limits)))
{
rng <- range(unlist(fplot$y.limits[,i]))
for (j in seq_len(nrow(fplot$y.limits)))
fplot$y.limits[j, i][[1]] <- rng
}

str(fplot$y.limits)

## Next, drop the y-axis labels for the second column, and zap the
## space allocated for them.

update(fplot,
       scales = list(y =
           list(at = rep(list(NA, numeric(0)), 3))),
       par.settings = list(layout.widths = list(axis.panel = c(1, 0))))


(Maybe I should wrap this up in a helper function.)
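
One way such a helper might look is sketched below. The function name
same_row_limits and the toy data are hypothetical (neither is part of
lattice); the body simply repeats the limit-massaging steps above for a
two-way layout drawn with relation="free", and leaves the axis-label
clean-up to a separate update() call.

```r
library(lattice)

# Hypothetical helper (not part of lattice): give every panel in a row the
# same y-limits, for a two-way layout drawn with relation = "free".
same_row_limits <- function(obj) {
  lims <- obj$y.limits
  dim(lims) <- dim(obj)            # columns of 'lims' = rows of the plot
  for (i in seq_len(ncol(lims))) {
    rng <- range(unlist(lims[, i]))
    for (j in seq_len(nrow(lims)))
      lims[j, i][[1]] <- rng
  }
  obj$y.limits <- lims
  obj
}

# Toy data (invented): two y variables, one two-level factor.
toy <- data.frame(x = rep(1:5, 2),
                  g = factor(rep(c("a", "b"), each = 5)))
toy$y1 <- toy$x * ifelse(toy$g == "a", 1, 10)
toy$y2 <- toy$x^2

p <- xyplot(y1 + y2 ~ x | g, data = toy, outer = TRUE, layout = c(2, 2),
            scales = list(y = list(relation = "free")))
p2 <- same_row_limits(p)
```

Printing p2 should then show each row of panels on a common y scale,
while the two rows keep scales of their own.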

-Deepayan



>
> Details:
>
> #       Toy data:
>
> N<-15
> TIME <- (1:N)/N
> ppp <- TIME^2
> QQQ <- exp(TIME)
> z <- ppp / QQQ
> JUNK<-data.frame( ppp=ppp, QQQ=QQQ, z=z, TIME=TIME)
> JUNK$ID<-1
> jank<-JUNK
> jank$ID<-2
> jank$ppp<-jank$ppp / 2
> jank$QQQ<-jank$QQQ / 2
> jank$z<-jank$ppp/jank$QQQ
> JUNK<-rbind(JUNK, jank)
> jank<-JUNK
> jank$ppp<-(jank$ppp) ^(1/4)
> jank$QQQ<-(jank$QQQ) / 10
> jank$z <- jank$ppp / jank$QQQ
> JUNK$Species<-"Dog"
> jank$Species<-"feline"
> JUNK<-rbind(JUNK, jank)
> JUNK$Species<-factor(JUNK$Species)
> JUNK$ID<-factor(JUNK$ID)
> summary(JUNK)


Re: [R] Why does aggregate fail?

2010-02-07 Thread James Rome

On 2/7/2010 1:35 PM, David Winsemius wrote:
> But to answer your question:
>
> apply(d, 1, function(z) aggregate(z, by=list(s), FUN=sum) )

David,

That works, but I do not understand why I could not use aggregate
directly. And the answer comes out as a list, which thus far baffles me.
How do I get the answer as a matrix in my original code, which I
modified to use apply?

ha = matrix(nrow=7, ncol=24)
colnames(ha) = as.character(c(0:23))
rownames(ha) = rownames(maxrdf)
for(j in 1:7) {
x = apply(maxrdf[j,], 1, function(z) aggregate(z, by=list(s),
FUN=sum) )   
ha[j,] = x[[1]][2]   
}

Unfortunately, ha gets converted into a list, and then I can't use it
for my plots. And you can probably educate me on how to get what I am
aiming for (a matrix with the rows as the days, the columns as the
hours, and the content as the hourly sum of the 15-minute chunks)
without using the above for loop.

Thanks for the help,
Jim


Re: [R] Why does aggregate fail?

2010-02-07 Thread James Rome

On 2/7/2010 1:32 PM, David Winsemius wrote:
> You have a dataframe with 96 columns and a single row named "Sunday".
> My guess is that was not your intent. How did "d" come to exist?

I was trying to make a simpler example. The actual code uses a data
frame maxrdf:
> dput(maxrdf)
structure(list(`0` = c(0, 1, 0, 3, 0, 2, 3), `1` = c(1, 1, 0,
2, 0, 1, 2), `2` = c(0, 1, 1, 1, 0, 2, 1), `3` = c(0, 1, 0, 1,
0, 1, 1), `4` = c(0, 1, 3, 2, 0, 1, 1), `5` = c(0, 0, 0, 1, 0,
1, 1), `6` = c(0, 0, 0, 1, 0, 1, 1), `7` = c(0, 0, 1, 0, 0, 0,
0), `8` = c(0, 1, 1, 0, 0, 2, 1), `9` = c(0, 0, 1, 2, 0, 3, 0
), `10` = c(0, 1, 0, 0, 1, 2, 0), `11` = c(0, 0, 0, 0, 0, 0,
0), `12` = c(0, 0, 0, 0, 0, 0, 1), `13` = c(0, 0, 1, 0, 0, 0,
0), `14` = c(0, 0, 0, 0, 0, 0, 0), `15` = c(0, 0, 0, 1, 0, 0,
0), `16` = c(0, 1, 1, 1, 0, 1, 0), `17` = c(2, 1, 1, 2, 0, 0,
1), `18` = c(0, 1, 3, 4, 0, 4, 2), `19` = c(0, 0, 3, 4, 0, 4,
5), `20` = c(0, 0, 5, 3, 1, 0, 4), `21` = c(0, 0, 5, 5, 2, 0,
4), `22` = c(0, 0, 5, 7, 0, 0, 5), `23` = c(0, 0, 7, 9, 10, 0,
7), `24` = c(0, 0, 6, 8, 4, 5, 9), `25` = c(0, 0, 6, 4, 5, 4,
7), `26` = c(0, 0, 4, 6, 5, 4, 5), `27` = c(0, 0, 7, 9, 8, 5,
10), `28` = c(0, 2, 9, 13, 0, 5, 14), `29` = c(0, 2, 10, 11,
0, 9, 11), `30` = c(0, 3, 9, 8, 0, 8, 9), `31` = c(0, 5, 6, 7,
0, 3, 7), `32` = c(5, 7, 7, 5, 4, 5, 7), `33` = c(5, 6, 8, 5,
7, 6, 5), `34` = c(5, 4, 5, 5, 4, 5, 6), `35` = c(0, 4, 6, 4,
1, 3, 5), `36` = c(0, 6, 5, 5, 4, 7, 5), `37` = c(0, 6, 6, 6,
5, 7, 5), `38` = c(0, 8, 6, 6, 5, 4, 5), `39` = c(0, 6, 5, 3,
3, 4, 4), `40` = c(0, 5, 2, 5, 3, 3, 2), `41` = c(0, 4, 5, 3,
4, 3, 4), `42` = c(6, 5, 6, 0, 3, 2, 5), `43` = c(0, 7, 4, 0,
3, 2, 6), `44` = c(3, 7, 6, 6, 5, 8, 4), `45` = c(1, 8, 5, 3,
2, 5, 9), `46` = c(0, 8, 7, 7, 0, 6, 5), `47` = c(1, 8, 5, 4,
0, 8, 8), `48` = c(6, 5, 8, 0, 0, 4, 0), `49` = c(8, 6, 13, 7,
0, 8, 0), `50` = c(9, 7, 8, 7, 0, 7, 0), `51` = c(7, 7, 8, 10,
0, 6, 0), `52` = c(9, 1, 8, 0, 4, 5, 0), `53` = c(10, 0, 1, 0,
1, 0, 1), `54` = c(5, 0, 3, 0, 3, 0, 0), `55` = c(0, 0, 1, 1,
5, 0, 0), `56` = c(1, 0, 10, 5, 10, 1, 0), `57` = c(0, 0, 8,
6, 12, 0, 1), `58` = c(1, 0, 8, 4, 11, 0, 0), `59` = c(1, 0,
6, 5, 6, 0, 0), `60` = c(1, 0, 0, 1, 4, 0, 0), `61` = c(0, 0,
0, 4, 9, 0, 1), `62` = c(0, 0, 0, 2, 5, 0, 0), `63` = c(0, 0,
0, 5, 12, 0, 0), `64` = c(1, 0, 0, 0, 9, 1, 1), `65` = c(0, 0,
0, 1, 7, 0, 0), `66` = c(0, 6, 8, 3, 6, 3, 4), `67` = c(7, 7,
7, 3, 9, 5, 6), `68` = c(10, 6, 7, 0, 6, 6, 6), `69` = c(9, 10,
9, 5, 9, 7, 8), `70` = c(9, 9, 10, 6, 8, 7, 8), `71` = c(11,
10, 11, 7, 11, 8, 8), `72` = c(11, 0, 9, 6, 10, 7, 7), `73` = c(8,
0, 9, 7, 8, 9, 12), `74` = c(8, 3, 9, 0, 9, 9, 7), `75` = c(10,
4, 10, 0, 10, 7, 6), `76` = c(7, 0, 8, 8, 9, 9, 11), `77` = c(6,
0, 12, 8, 9, 10, 7), `78` = c(7, 0, 9, 5, 9, 9, 7), `79` = c(8,
0, 8, 7, 8, 8, 7), `80` = c(7, 11, 7, 8, 4, 8, 5), `81` = c(4,
6, 9, 5, 7, 7, 1), `82` = c(4, 6, 5, 10, 9, 10, 1), `83` = c(6,
5, 6, 7, 9, 7, 0), `84` = c(5, 0, 8, 3, 8, 8, 0), `85` = c(5,
1, 5, 6, 8, 7, 0), `86` = c(5, 2, 5, 3, 7, 7, 0), `87` = c(5,
3, 5, 6, 8, 7, 0), `88` = c(0, 4, 6, 0, 6, 6, 0), `89` = c(0,
1, 8, 0, 4, 7, 0), `90` = c(0, 0, 5, 0, 7, 2, 0), `91` = c(1,
0, 3, 0, 2, 5, 0), `92` = c(6, 0, 5, 0, 6, 5, 0), `93` = c(2,
0, 4, 0, 5, 5, 0), `94` = c(3, 0, 3, 0, 2, 6, 0), `95` = c(0,
0, 3, 0, 4, 4, 0)), .Names = c("0", "1", "2", "3", "4", "5",
"6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16",
"17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27",
"28", "29", "30", "31", "32", "33", "34", "35", "36", "37", "38",
"39", "40", "41", "42", "43", "44", "45", "46", "47", "48", "49",
"50", "51", "52", "53", "54", "55", "56", "57", "58", "59", "60",
"61", "62", "63", "64", "65", "66", "67", "68", "69", "70", "71",
"72", "73", "74", "75", "76", "77", "78", "79", "80", "81", "82",
"83", "84", "85", "86", "87", "88", "89", "90", "91", "92", "93",
"94", "95"), row.names = c("Sunday", "Monday", "Tuesday", "Wednesday",
"Thursday", "Friday", "Saturday"), class = "data.frame")
> maxrdf[1,]
       0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
Sunday 0 1 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  2  0  0  0  0  0  0  0  0  0
       27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
Sunday  0  0  0  0  0  5  5  5  0  0  0  0  0  0  0  6  0  3  1  0  1  6  8  9
       51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74
Sunday  7  9 10  5  0  1  0  1  1  1  0  0  0  1  0  0  7 10  9  9 11 11  8  8
       75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
Sunday 10  7  6  7  8  7  4  4  6  5  5  5  5  0  0  0  1  6  2  3  0
>

And the code that fails is

ha = matrix(nrow=7, ncol=24)
colnames(ha) = as.character(c(0:23))
rownames(ha) = rownames(maxrdf)
for(j in 1:7) {
x = aggregate(maxrdf[j,], by=list(c(s)), FUN=sum)
ha[j,] = x[[2]]   
}


Re: [R] Why does aggregate fail?

2010-02-07 Thread David Winsemius



On Feb 7, 2010, at 1:32 PM, David Winsemius wrote:

> You have a dataframe with 96 columns and a single row named
> "Sunday". My guess is that was not your intent. How did "d" come to
> exist?


But to answer your question:

> apply(d, 1, function(z) aggregate(z, by=list(s), FUN=sum) )
$Sunday
   Group.1  x
1        0  1
2        1  0
3        2  0
4        3  0
5        4  2
6        5  0
7        6  0
8        7  0
9        8 15
10       9  0
11      10  6
12      11  5
13      12 30
14      13 24
15      14  3
16      15  1
17      16  8
18      17 39
19      18 37
20      19 28
21      20 21
22      21 20
23      22  1
24      23 11
24  23 11



--
David.


Re: [R] (Another) Bates fortune?

2010-02-07 Thread Achim Zeileis


On Fri, 5 Feb 2010, Peter Ehlers wrote:


I vote to 'fortunize' Doug Bates on

Hierarchical data sets: which software to use?

"The widespread use of spreadsheets or SPSS data sets or SAS data sets
which encourage the "single table with a gargantuan number of columns,
most of which are missing data in most cases" approach to organization
of longitudinal data is regrettable."

http://n4.nabble.com/Hierarchical-data-sets-which-software-to-use-td1458477.html#a1470430


Thanks, added to the devel-version on R-Forge.
Z


--
Peter Ehlers
University of Calgary

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Why does aggregate fail?

2010-02-07 Thread David Winsemius

You have a dataframe with 96 columns and a single row named "Sunday".  
My guess is that was not your intent. How did "d" come to exist?
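
To spell out why this matters: aggregate() on a data frame groups rows,
so each element of `by` has to be as long as the number of rows. The
sketch below uses an invented one-row, eight-column data frame (not the
real d) to show the mismatch and one possible way around it.

```r
# Toy one-row data frame (invented values): 8 quarter-hour columns.
d <- as.data.frame(matrix(c(0, 1, 0, 0, 2, 0, 3, 1), nrow = 1,
                          dimnames = list("Sunday", as.character(0:7))))
s <- floor(seq(0, 7) / 4)   # hour index per bin: 0 0 0 0 1 1 1 1

nrow(d)     # 1 observation (row) ...
length(s)   # ... but 8 grouping values, so aggregate(d, by = list(s), ...) stops

# Flatten the single row into a numeric vector of 8 observations first.
x <- aggregate(unlist(d), by = list(hour = s), FUN = sum)
x
```

Here unlist(d) is just one way of getting a plain vector; as.numeric(d[1, ])
or a transposed matrix would serve the same purpose.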


--
David.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] Why does aggregate fail?

2010-02-07 Thread James Rome

> dput(d)
structure(list(`0` = 0, `1` = 1, `2` = 0, `3` = 0, `4` = 0, `5` = 0,
`6` = 0, `7` = 0, `8` = 0, `9` = 0, `10` = 0, `11` = 0, `12` = 0,
`13` = 0, `14` = 0, `15` = 0, `16` = 0, `17` = 2, `18` = 0,
`19` = 0, `20` = 0, `21` = 0, `22` = 0, `23` = 0, `24` = 0,
`25` = 0, `26` = 0, `27` = 0, `28` = 0, `29` = 0, `30` = 0,
`31` = 0, `32` = 5, `33` = 5, `34` = 5, `35` = 0, `36` = 0,
`37` = 0, `38` = 0, `39` = 0, `40` = 0, `41` = 0, `42` = 6,
`43` = 0, `44` = 3, `45` = 1, `46` = 0, `47` = 1, `48` = 6,
`49` = 8, `50` = 9, `51` = 7, `52` = 9, `53` = 10, `54` = 5,
`55` = 0, `56` = 1, `57` = 0, `58` = 1, `59` = 1, `60` = 1,
`61` = 0, `62` = 0, `63` = 0, `64` = 1, `65` = 0, `66` = 0,
`67` = 7, `68` = 10, `69` = 9, `70` = 9, `71` = 11, `72` = 11,
`73` = 8, `74` = 8, `75` = 10, `76` = 7, `77` = 6, `78` = 7,
`79` = 8, `80` = 7, `81` = 4, `82` = 4, `83` = 6, `84` = 5,
`85` = 5, `86` = 5, `87` = 5, `88` = 0, `89` = 0, `90` = 0,
`91` = 1, `92` = 6, `93` = 2, `94` = 3, `95` = 0), .Names = c("0",
"1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12",
"13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23",
"24", "25", "26", "27", "28", "29", "30", "31", "32", "33", "34",
"35", "36", "37", "38", "39", "40", "41", "42", "43", "44", "45",
"46", "47", "48", "49", "50", "51", "52", "53", "54", "55", "56",
"57", "58", "59", "60", "61", "62", "63", "64", "65", "66", "67",
"68", "69", "70", "71", "72", "73", "74", "75", "76", "77", "78",
"79", "80", "81", "82", "83", "84", "85", "86", "87", "88", "89",
"90", "91", "92", "93", "94", "95"), row.names = "Sunday", class =
"data.frame")
On 2/7/2010 1:27 PM, David Winsemius wrote:
On Feb 7, 2010, at 1:08 PM, James Rome wrote:

> I am trying to get hourly totals, given 15-minute bins.
> s = seq(0, 95, 1)
> s = floor(s/4)   # 0  0  0  0  1  1  1  1  2  2  2  2  3  3  3  3  4 .
> . .
>
>> s
> [1]  0  0  0  0  1  1  1  1  2  2  2  2  3  3  3  3  4  4  4  4  5  5
> 5  5  6
> [26]  6  6  6  7  7  7  7  8  8  8  8  9  9  9  9 10 10 10 10 11 11 11
> 11 12 12
> [51] 12 12 13 13 13 13 14 14 14 14 15 15 15 15 16 16 16 16 17 17 17 17
> 18 18 18
> [76] 18 19 19 19 19 20 20 20 20 21 21 21 21 22 22 22 22 23 23 23 23
>
>> mode(d)
> [1] "list"
>> d
>   0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
> 25 26
> Sunday 0 1 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  2  0  0  0  0  0  0  0
> 0  0
>   27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
> 49 50
> Sunday  0  0  0  0  0  5  5  5  0  0  0  0  0  0  0  6  0  3  1  0  1
> 6  8  9
>   51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
> 73 74
> Sunday  7  9 10  5  0  1  0  1  1  1  0  0  0  1  0  0  7 10  9  9 11
> 11  8  8
>   75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
> Sunday 10  7  6  7  8  7  4  4  6  5  5  5  5  0  0  0  1  6  2  3  0
>> x = aggregate(d, by=list(s), FUN="sum")
> Error in FUN(X[[1L]], ...) : arguments must have same length

I don't know what sort of error is occurring. You have not created a
posting that easily lets us see what sort of object "d" really is. (And
it is not being displayed as though it were a simple list.)

 dput(d) would have allowed us to see what sort of attributes it has.
Your code works if one strips out the data and puts it into a vector.

> s = seq(0, 95, 1)
> s = floor(s/4)

> d <- scan()
1: 0 1 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  2  0  0  0  0  0  0  0
26: 0  0
28: 0  0  0  0  0  5  5  5  0  0  0  0  0  0  0  6  0  3  1  0  1
49: 6  8  9
52: 7  9 10  5  0  1  0  1  1  1  0  0  0  1  0  0  7 10  9  9 11
73: 11  8  8
76: 10  7  6  7  8  7  4  4  6  5  5  5  5  0  0  0  1  6  2  3  0
97:
Read 96 items
> x = aggregate(d, by=list(s), FUN="sum")
> x
   Group.1  x
1        0  1
2        1  0
3        2  0
4        3  0
5        4  2
6        5  0
7        6  0
8        7  0
9        8 15
10       9  0
11      10  6
12      11  5
13      12 30
14      13 24
15      14  3
16      15  1
17      16  8
18      17 39
19      18 37
20      19 28
21      20 21
22      21 20
23      22  1
24      23 11

>> length(s)
> [1] 96
>> length(d)
> [1] 96
>
> What am I doing wrong?
>
> Thanks in advance list,
> Jim Rome
>

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


[R] Why does aggregate fail?

2010-02-07 Thread James Rome

I am trying to get hourly totals, given 15-minute bins.
s = seq(0, 95, 1)
s = floor(s/4)   # 0  0  0  0  1  1  1  1  2  2  2  2  3  3  3  3  4 . . .

> s
 [1]  0  0  0  0  1  1  1  1  2  2  2  2  3  3  3  3  4  4  4  4  5  5 
5  5  6
[26]  6  6  6  7  7  7  7  8  8  8  8  9  9  9  9 10 10 10 10 11 11 11
11 12 12
[51] 12 12 13 13 13 13 14 14 14 14 15 15 15 15 16 16 16 16 17 17 17 17
18 18 18
[76] 18 19 19 19 19 20 20 20 20 21 21 21 21 22 22 22 22 23 23 23 23

> mode(d)
[1] "list"
> d
   0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
25 26
Sunday 0 1 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  2  0  0  0  0  0  0  0 
0  0
   27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
49 50
Sunday  0  0  0  0  0  5  5  5  0  0  0  0  0  0  0  6  0  3  1  0  1 
6  8  9
   51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
73 74
Sunday  7  9 10  5  0  1  0  1  1  1  0  0  0  1  0  0  7 10  9  9 11
11  8  8
   75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
Sunday 10  7  6  7  8  7  4  4  6  5  5  5  5  0  0  0  1  6  2  3  0
> x = aggregate(d, by=list(s), FUN="sum")
Error in FUN(X[[1L]], ...) : arguments must have same length
> length(s)
[1] 96
> length(d)
[1] 96

What am I doing wrong?

Thanks in advance list,
Jim Rome


[R] conditioned xyplot, many y variables

2010-02-07 Thread Jacob Wegelin



The example below creates parallel time-series plots of three different y 
variables conditioned by a dichotomous factor. In the graphical layout,

•   Each y variable inhabits its own row and is plotted on its own 
distinct scale.

•   Each level of the factor has its own column, but within each 
row the scale is held constant across columns.

•   The panels fit tightly (as they do in lattice) without 
superfluous whitespace or ticks.

Currently I know of no lattice solution to this problem, only a traditional 
graphics solution. Can one solve this problem elegantly using lattice?

The difficulty is to lock the levels of the factor (the columns) into the same 
scale for each y variable (for each row), while allowing the scales to differ 
between the y variables (between the rows).

Details:

#   Toy data:

N<-15
TIME <- (1:N)/N
ppp <- TIME^2
QQQ <- exp(TIME)
z <- ppp / QQQ
JUNK<-data.frame( ppp=ppp, QQQ=QQQ, z=z, TIME=TIME)
JUNK$ID<-1
jank<-JUNK
jank$ID<-2
jank$ppp<-jank$ppp / 2
jank$QQQ<-jank$QQQ / 2
jank$z<-jank$ppp/jank$QQQ
JUNK<-rbind(JUNK, jank)
jank<-JUNK
jank$ppp<-(jank$ppp) ^(1/4)
jank$QQQ<-(jank$QQQ) / 10
jank$z <- jank$ppp / jank$QQQ
JUNK$Species<-"Dog"
jank$Species<-"feline"
JUNK<-rbind(JUNK, jank)
JUNK$Species<-factor(JUNK$Species)
JUNK$ID<-factor(JUNK$ID)
summary(JUNK)

#   Traditional graphics solution:

par(mfrow=c(3,2),mar=c(0,0,0,0)+0.0,oma=c(4,4,4,1),xpd=FALSE, las=0)

varNamesAndLabels<-data.frame(
name=c("z", "QQQ", "ppp")
, label=c("z (mIU/mL)", "QQQ (pg/L)", "ppp (mg/L)")
)
rownames( varNamesAndLabels)<- varNamesAndLabels$name

count_y_variables<-0
for(this_y_name in rownames( varNamesAndLabels) ) {
    count_y_variables <- count_y_variables + 1

    countSpecies<-0
    for(thisSpecies in levels(JUNK$Species)) {
        countSpecies<-countSpecies + 1
        TEMPORARY<-JUNK[JUNK$Species==thisSpecies,]
        if(countSpecies==1) {
            plot(JUNK$TIME, JUNK[[this_y_name]], xlab="", ylab="",
                 type="n",xaxt='n', log="y")
            mtext( varNamesAndLabels[this_y_name,"label"], side=2,
                   line=2.5)
        }
        else
            plot(JUNK$TIME, JUNK[[this_y_name]] , xlab="", ylab="", type="n",xaxt='n',
                 log="y", yaxt="n")
        for( thisID in levels(TEMPORARY$ID)) {
            lines(TEMPORARY$TIME[TEMPORARY$ID==thisID],
                  TEMPORARY[[this_y_name]][TEMPORARY$ID==thisID], type="o")
        }
        if(count_y_variables == nrow(varNamesAndLabels)) mtext(
            thisSpecies, side=1, line=2.5)
    }
}

library("lattice")

#   The three lattice partial solutions below differ only in the value of
#   scales$y$relation.

#   scales$y$relation="same"
#   forces ppp, QQQ, and z to the same scale, which obscures signal,
#   especially for ppp. But at least it enables us to see that the range of
#   QQQ differs immensely between Dog and feline.
xyplot ( ppp + QQQ + z ~ TIME | Species
, group=ID
, data=JUNK
, ylab=c("ppp (mg/L)", "QQQ (pg/L)", "z (mIU/mL)")
, xlab=c("Dog", "feline")
, type="o"
, strip= FALSE
, outer=TRUE
, layout=c(2,3)
, scales=list(
ppp=list( alternating=3)
, y=list(
relation="same"
, alternating=3
, rot=0
, log=T
)
)
)


#	scales$y$relation="free" 
#	displays ppp, QQQ, and z on different scales, but it also allows

#   the scales for each variable to differ between Dog and feline.
#   This prevents us from visually comparing the species.
xyplot ( ppp + QQQ + z ~ TIME | Species
, group=ID
, data=JUNK
, ylab=c("ppp (mg/L)", "QQQ (pg/L)", "z (mIU/mL)")
, xlab=c("Dog", "feline")
, type="o"
, strip= FALSE
, outer=TRUE
, layout=c(2,3)
, scales=list(
ppp=list( alternating=3)
, y=list(
relation="free"
, alternating=3
, rot=0
, log=T
)
)
)


#	scales$y$relation="sliced" 
#	shows us that the difference max(z)-min(z) differs greatly between

#   Dog and feline.  But it obscures the fact that
#   QQQ differs wildly between Dog and feline, as we saw when
#   relation="same".
xyplot ( ppp + QQQ + z ~ TIME | Species
, group=ID
, data=JUNK
, ylab=c("ppp (mg/L)", "QQQ (pg/L)", "z (mIU/mL)")
, xlab=c("Dog", "feline")
, type="o"
, strip= FALSE
, outer=TRUE
, layout=c(2,3)
, scales=list(
ppp=list( alternating=3)
, y=list(
relation="sliced"

Re: [R] Reading hierarchical data

2010-02-07 Thread jim holtman

Will this do it for you:

> input <- readLines(textConnection("06470 1 1
+     1 232 0
+     2 230 1
+ 07470 1 0
+     1 240 1
+ 08470 1 0
+     1 227 0
+ 09470 1 0
+     1 213 1
+     2 222 0
+     3 224 1
+ 10470 1 1
+     1 220 0
+     2 211 1
+ 11470 1 0
+     1 217 0
+     2 210 1
+     3 226 1"))
> closeAllConnections()
> fid <- NULL
> dwell <- NULL
> result <- do.call(rbind, lapply(input, function(.line){
+ values <- as.integer(substring(.line, c(1, 7, 9), c(5, 7, 9)))
+ # column 7 is 1 for a family record, 2 for a personal record
+ if (values[2] == 1L){
+ # family record: remember its ID and dwelling code for the persons that follow
+ fid <<- values[1]
+ dwell <<- values[3]
+ return(NULL)
+ } else {
+ values <- as.integer(substring(.line, c(1, 7, 8, 11), c(5, 7, 9, 11)))
+ return(c(fid=fid, dwell=dwell, pid=values[1], age=values[3],
+ sex=values[4]))
+ }
+ }))
>
> result
        fid dwell pid age sex
 [1,]  6470     1   1  32   0
 [2,]  6470     1   2  30   1
 [3,]  7470     0   1  40   1
 [4,]  8470     0   1  27   0
 [5,]  9470     0   1  13   1
 [6,]  9470     0   2  22   0
 [7,]  9470     0   3  24   1
 [8,] 10470     1   1  20   0
 [9,] 10470     1   2  11   1
[10,] 11470     0   1  17   0
[11,] 11470     0   2  10   1
[12,] 11470     0   3  26   1
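
The same result can probably be obtained without the global assignments. The following is only a sketch: it assumes the fixed-width layout described in the question (personal IDs right-justified in columns 1-5) and that the first line is a family record. The variable names are made up for the example, and only the first two families are included.

```r
# Fixed-width records as described in the question (first two families only).
input <- c("06470 1 1",
           "    1 232 0",
           "    2 230 1",
           "07470 1 0",
           "    1 240 1")

is_family <- substr(input, 7, 7) == "1"   # column 7: "1" = family, "2" = person
family_no <- cumsum(is_family)            # running family counter for every line

fam   <- input[is_family]                 # family records
per   <- input[!is_family]                # personal records
owner <- family_no[!is_family]            # family number of each personal record

result <- data.frame(
  fid   = as.integer(substr(fam, 1, 5))[owner],
  dwell = as.integer(substr(fam, 9, 9))[owner],
  pid   = as.integer(substr(per, 1, 5)),
  age   = as.integer(substr(per, 8, 9)),
  sex   = as.integer(substr(per, 11, 11))
)
result
```

The cumsum() over the family flag is what carries the family information forward to each person, so no state has to be kept between lines.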


On Sun, Feb 7, 2010 at 10:57 AM, Saba(Home)  wrote:
>
> I would like to read the following hierarchical data set. There is a family
> record followed by one or more personal records.
> If col. 7 is "1" it is a family record. If it is "2" it is a personal
> record.
> The family record is formatted as follows:
> col. 1-5     family id
> col. 7        "1"
> col. 9        dwelling type code
> The personal record is formatted as follows:
> col. 1-5        personal id
> col. 7   "2"
> col. 8-9        age
> col. 11 sex code
>
> The first six family and accompanying personal records look like this:
> 06470 1 1
>    1 232 0
>    2 230 1
> 07470 1 0
>    1 240 1
> 08470 1 0
>    1 227 0
> 09470 1 0
>    1 213 1
>    2 222 0
>    3 224 1
> 10470 1 1
>    1 220 0
>    2 211 1
> 11470 1 0
>    1 217 0
>    2 210 1
>    3 226 1
>
> I want to create a dataset containing
> . family ID
> . dwelling code
> . person ID
> . age
> . sex code
> The dataset will contain one observation per person, and the with family
> information repeated for people in the same family.
> Can anyone help?
> Thanks,
> Richard Saba
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


Re: [R] predicting with stl() decomposition

2010-02-07 Thread Konrad Hoppe

Hi Dennis,

I've already found that matrix, but I want to predict the time series, for
example with predict.loess() on every component. But I'm unable to
extract the plotted series (trend and seasonal) as a loess object. This
representation is what I'm looking for.

At the moment I don't see any way to predict the individual
components other than with the loess prediction.

Do you have an idea how to extract this representation and not just the data?

Thanks in advance.

Konrad

  _  

From: Dennis Murphy [mailto:djmu...@gmail.com] 
Sent: Sunday, 7 February 2010 17:08
To: Konrad Hoppe
Subject: Re: [R] predicting with stl() decomposition

Hi:

> str(decomp)
List of 8
 $ time.series: mts [1:300, 1:3] 0.0928 0.2906 -0.0852 -0.1877 0.0347 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:3] "seasonal" "trend" "remainder"
  ..- attr(*, "tsp")= num [1:3] 1980 2005 12
  ..- attr(*, "class")= chr [1:2] "mts" "ts"
 $ weights: num [1:300] 1 1 1 1 1 1 1 1 1 1 ...
 $ call   : language stl(x = seriesTs, s.window = "periodic")
 $ win: Named num [1:3] 3001 19 13
  ..- attr(*, "names")= chr [1:3] "s" "t" "l"
 $ deg: Named int [1:3] 0 1 1
  ..- attr(*, "names")= chr [1:3] "s" "t" "l"
 $ jump   : Named num [1:3] 301 2 2
  ..- attr(*, "names")= chr [1:3] "s" "t" "l"
 $ inner  : int 2
 $ outer  : int 0
 - attr(*, "class")= chr "stl"

This tells you decomp$time.series is a matrix with columns 'seasonal',
'trend' and 'remainder'. You can extract that and go from there.
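
One way to "go from there", sketched under stated assumptions rather than as the method stl() intends: take the columns out of decomp$time.series, continue the seasonal column by repeating its last full cycle (reasonable only because s.window = "periodic" makes the seasonal pattern identical every year), and extend the trend with a simple straight line fitted to its most recent values. The simulated series, the 24-month horizon and the three-year window are all invented for the example.

```r
set.seed(42)
n  <- 300
tt <- seq_len(n)
# Invented monthly series: linear trend + yearly cycle + noise.
x <- ts(0.01 * tt + sin(2 * pi * tt / 12) + rnorm(n, 0, 0.5),
        start = c(1980, 1), frequency = 12)

decomp <- stl(x, s.window = "periodic")
comp <- decomp$time.series            # columns: seasonal, trend, remainder
seasonal <- as.numeric(comp[, "seasonal"])
trend    <- as.numeric(comp[, "trend"])

h <- 24                               # forecast horizon in months (arbitrary)

# Seasonal part: with s.window = "periodic" the pattern repeats every year,
# so continuing the last full cycle extends it into the future.
seasonal_fc <- rep(seasonal[(n - 11):n], length.out = h)

# Trend part: a straight line through the last three years of the trend
# (a deliberately crude choice; any other model could be used instead).
recent <- data.frame(tt = tt[(n - 35):n], tr = trend[(n - 35):n])
fit <- lm(tr ~ tt, data = recent)
trend_fc <- as.numeric(predict(fit, newdata = data.frame(tt = n + seq_len(h))))

# Recombine the two forecasts on the time scale of the original series.
fc <- ts(trend_fc + seasonal_fc, start = tsp(x)[2] + 1 / 12, frequency = 12)
fc
```

The remainder is left out of the forecast here, which amounts to treating it as zero-mean noise; if it shows autocorrelation, it could be modelled separately.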

HTH,
Dennis

On Sun, Feb 7, 2010 at 7:55 AM, Konrad Hoppe  wrote:

Hi,

yes, that error name is indeed kind of weird. But I think it's thrown due to
the missing robustness of the estimation, since every weight is one and hence
the fit is likely to be influenced by outliers in the provided data, which
should be just an example.

But do you have an idea how to extract the single components of the fit? I guess
there must be a way to predict from those stl models.

Cheers,

Konrad

  _  

From: Dennis Murphy [mailto:djmu...@gmail.com] 
Sent: Sunday, 7 February 2010 16:30
To: Konrad Hoppe
Subject: Re: [R] predicting with stl() decomposition

Hi:

When I ran your code, I got the following message in the first loess call:

> llrSaison <- loess(seriesTs~time , span=decomp$win[1] ,
+ degree=decomp$deg[1])
Warning messages:
1: Chernobyl! trL wrote:

Hi mailinglist members,

I'm currently working on a time series prediction, and my current approach is
to decompose the series first into a trend, a seasonal component and a
remainder. For that I'm using the stl() function. But I'm wondering how to
get the single components in order to predict the particular fitted series.

This code snippet illustrates my problem:

series <- vector(length=300)
noise <- rnorm(300,0,2)
time <- 1:300
series[1] <- noise[1]
for(i in 3:300){
   series[i] <- 0.5*series[i-1]+ noise[i] + 0.01*time[i]
}
seriesTs <- ts(series, start=c(1980,1), frequency=12)

decomp <- stl(seriesTs ,"periodic")
plot(decomp)

llrSaison <- loess(seriesTs~time , span=decomp$win[1] ,
degree=decomp$deg[1])
llrTrend  <- loess(seriesTs~time,  span=decomp$win[2] ,
degree=decomp$deg[2])

plot(llrSaison$fitted)

The last plot differs a lot from the seasonal plot in the plot(decomp) call.
So the llr estimator doesn't extract the seasonal component, but
how can I predict the single components in the end? Or is there a function
that can predict the values of the stl object? predict() doesn't work; I've
already tried it.

All the best,

Konrad Hoppe


[R] Reading hierarchical data

2010-02-07 Thread Saba(Home)


I would like to read the following hierarchical data set. There is a family
record followed by one or more personal records.
If col. 7 is "1" it is a family record. If it is "2" it is a personal
record.
The family record is formatted as follows:
col. 1-5  family id
col. 7    "1"
col. 9    dwelling type code
The personal record is formatted as follows:
col. 1-5  personal id
col. 7    "2"
col. 8-9  age
col. 11   sex code

The first six family and accompanying personal records look like this:
06470 1 1
    1 232 0
    2 230 1
07470 1 0
    1 240 1
08470 1 0
    1 227 0
09470 1 0
    1 213 1
    2 222 0
    3 224 1
10470 1 1
    1 220 0
    2 211 1
11470 1 0
    1 217 0
    2 210 1
    3 226 1

I want to create a dataset containing 
. family ID 
. dwelling code 
. person ID 
. age 
. sex code 
The dataset will contain one observation per person, with the family
information repeated for people in the same family.
Can anyone help?
Thanks,
Richard Saba


Re: [R] predicting with stl() decomposition

2010-02-07 Thread Konrad Hoppe

Hi,

yes, that error name is indeed kind of weird. But I think it's thrown due to
the missing robustness of the estimation, since every weight is one and hence
the fit is likely to be influenced by outliers in the provided data, which
should be just an example.

But do you have an idea how to extract the single components of the fit? I guess
there must be a way to predict from those stl models.

Cheers,

Konrad

  _  

From: Dennis Murphy [mailto:djmu...@gmail.com] 
Sent: Sunday, 7 February 2010 16:30
To: Konrad Hoppe
Subject: Re: [R] predicting with stl() decomposition

 

Hi:

When I ran your code, I got the following message in the first loess call:

> llrSaison <- loess(seriesTs~time , span=decomp$win[1] ,
+ degree=decomp$deg[1])
Warning messages:
1: Chernobyl! trL wrote:

Hi mailinglist members,



I'm currently working on a time series prediction, and my current approach is
to decompose the series first into a trend, a seasonal component and a
remainder. For that I'm using the stl() function. But I'm wondering how to
get the single components in order to predict the particular fitted series.

This code snippet illustrates my problem:

series <- vector(length=300)
noise <- rnorm(300,0,2)
time <- 1:300
series[1] <- noise[1]
for(i in 3:300){
   series[i] <- 0.5*series[i-1]+ noise[i] + 0.01*time[i]
}
seriesTs <- ts(series, start=c(1980,1), frequency=12)

decomp <- stl(seriesTs ,"periodic")
plot(decomp)

llrSaison <- loess(seriesTs~time , span=decomp$win[1] ,
degree=decomp$deg[1])
llrTrend  <- loess(seriesTs~time,  span=decomp$win[2] ,
degree=decomp$deg[2])

plot(llrSaison$fitted)

The last plot differs a lot from the seasonal plot in the plot(decomp) call.
So the llr estimator doesn't extract the seasonal component, but
how can I predict the single components in the end? Or is there a function
that can predict the values of the stl object? predict() doesn't work; I've
already tried it.



All the best,

Konrad Hoppe



Re: [R] metafor package: effect sizes are not fully independent

2010-02-07 Thread Mike Cheung

Dear Gang,

Here are just some general thoughts. Wolfgang Viechtbauer will be in a
better position to answer questions related to metafor.

For multivariate effect sizes, we first have to estimate the
asymptotic sampling covariance matrix among the effect sizes. Formulas
for some common effect sizes are provided by Gleser and Olkin (2009).

If a fixed-effects model is required, it is quite easy to write your
own GLS function to conduct the multivariate meta-analysis (see e.g.,
Becker, 1992). If a random-effects model is required, it is more
challenging in R. SAS Proc MIXED can do the work (e.g., van
Houwelingen, Arends, & Stijnen, 2002).
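
To make the fixed-effects GLS idea concrete, here is a minimal sketch in base R. It is not taken from Becker (1992) or from metafor; the function name gls_meta, the effect sizes and the within-study covariance values are all invented for illustration, and V is assumed to be known and block-diagonal by study.

```r
# Fixed-effects multivariate meta-analysis by generalized least squares:
#   beta_hat = (X' V^-1 X)^-1 X' V^-1 y,   Var(beta_hat) = (X' V^-1 X)^-1
# 'gls_meta' is a hypothetical helper written for this example.
gls_meta <- function(y, X, V) {
  Vinv   <- solve(V)
  XtVinv <- crossprod(X, Vinv)                 # X' V^-1
  vcov_b <- solve(XtVinv %*% X)
  beta   <- drop(vcov_b %*% XtVinv %*% y)
  resid  <- y - drop(X %*% beta)
  list(beta = beta,
       se   = sqrt(diag(vcov_b)),
       vcov = vcov_b,
       Q    = drop(crossprod(resid, Vinv %*% resid)),  # weighted residual sum of squares
       df   = length(y) - ncol(X))
}

# Invented data: two sources, each reporting three correlated effect sizes.
cond <- factor(rep(c("happy", "sad", "neutral"), times = 2))
y    <- c(0.50, 0.30, 0.10, 0.60, 0.20, 0.20)
X    <- model.matrix(~ 0 + cond)               # one dummy column per condition

# Assumed sampling covariance within a source (variance 0.04, covariance 0.02),
# independent across sources, hence block-diagonal.
V1 <- matrix(0.02, 3, 3)
diag(V1) <- 0.04
V  <- kronecker(diag(2), V1)

fit <- gls_meta(y, X, V)
fit$beta
fit$se

# Wald test of "no difference across the levels of the factor".
C  <- rbind(c(1, -1, 0),
            c(0, 1, -1))
Cb <- C %*% fit$beta
W  <- drop(t(Cb) %*% solve(C %*% fit$vcov %*% t(C)) %*% Cb)
p_value <- pchisq(W, df = nrow(C), lower.tail = FALSE)
p_value
```

The hard part in practice is obtaining V, i.e. the covariances among effect sizes from the same source; the algebra above is only as good as that input.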

Sometimes, it is possible to transform the multivariate effect sizes
into independent effect sizes (Kalaian & Raudenbush, 1996; Raudenbush,
Becker, & Kalaian, 1988). Then a univariate meta-analysis, e.g., with
the metafor package, can be performed on the transformed effect sizes.
This approach works if it makes sense to pool the multivariate effect
sizes, as in your case (2): the effect sizes are the same but in different
conditions (happy, sad, and neutral). However, this approach does not
work if the multivariate effect sizes are measuring different
concepts, e.g., verbal achievement and mathematical achievement.

Hope this helps.

Becker, B. J. (1992). Using results from replicated studies to
estimate linear models. Journal of Educational Statistics, 17,
341-362.
Gleser, L. J., & Olkin, I. (2009). Stochastically dependent effect
sizes. In H. Cooper, L. V. Hedges, and J. C. Valentine (Eds.), The
handbook of research synthesis and meta-analysis, 2nd edition (pp.
357-376). New York: Russell Sage Foundation.
Kalaian, H. A., & Raudenbush, S. W. (1996). A multivariate mixed
linear model for meta-analysis. Psychological Methods, 1, 227-235.
Raudenbush, S. W., Becker, B. J., & Kalaian, H. (1988). Modeling
multivariate effect sizes. Psychological Bulletin, 103, 111-120.
van Houwelingen, H.C., Arends, L.R., & Stijnen, T. (2002). Advanced
methods in meta-analysis: multivariate approach and meta-regression.
Statistics in Medicine, 21, 589-624.

Regards,
Mike
--
-
 Mike W.L. Cheung   Phone: (65) 6516-3702
 Department of Psychology   Fax:   (65) 6773-1843
 National University of Singapore
 http://courses.nus.edu.sg/course/psycwlm/internet/
-

On Sat, Feb 6, 2010 at 6:07 AM, Gang Chen  wrote:
> In a classical meta analysis model y_i = X_i * beta_i + e_i, data
> {y_i} are assumed to be independent effect sizes. However, I'm
> encountering the following two scenarios:
>
> (1) Each source has multiple effect sizes, thus {y_i} are not fully
> independent with each other.
> (2) Each source has multiple effect sizes, each of the effect size
> from a source can be categorized as one of a factor levels (e.g.,
> happy, sad, and neutral). Maybe better denote the data as y_ij, effect
> size at the j-th level from the i-th source. I can code the levels
> with dummy variables into the X_i matrix, but apparently the data from
> the same source are correlated with each other. In this case, I would
> like to run a few tests one of which is, for example, whether there is
> any difference across all the levels of the factor.
>
> Can metafor handle these two cases?
>
> Thanks,
> Gang
>

Re: [R] Non-linear regression

2010-02-07 Thread David Winsemius

It appears my suspicions about this being homework were unfounded.
Given the additional problems with excess zeroes, you may want to
examine the extremely informative material on the analysis of such
problems written by Zeileis, Kleiber and Jackman (easily found, in case
you have misplaced it, as I had, with a Google search for:
"r-project" zero-inflated hurdle models):

"Regression Models for Count Data in R"
http://cran.cnr.berkeley.edu/web/packages/pscl/vignettes/countreg.pdf

--
David.

On Feb 6, 2010, at 10:56 PM, kupz wrote:



Agreed, it would be simple to propose the relationship; however, the
regression is necessary to model the data properly. Unfortunately a simple
decay based on those two points does not have the proper shape.
This is due to an extreme amount of zero inflation in this fisheries data.

On another note, I have a working solution for the problem: I am excluding a
portion of the zero data based on some other a priori assumptions. Thanks
for your help though.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] R-Help

2010-02-07 Thread Douglas Bates

On Sat, Feb 6, 2010 at 2:46 PM, David Winsemius  wrote:
>
> On Feb 6, 2010, at 3:29 PM, Ravi Ramaswamy wrote:
>
>> Hi - I am not familiar with R.  Could I ask you a quick question?
>>
>> When I read a file like this, I get an error.  Not sure what I am doing
>> wrong.  I use a MAC.  How do I specify a full path name for a file in R?
>>  Or
>> do files have to reside locally?
>>
>>> KoreaAuto <- read.table(""/Users/

Especially when just starting using R the simplest approach is

KoreaAuto <- read.table(file.choose())

which brings up a file chooser panel so you can point and click your
way to the desired file.  If the file is tab-delimited, as appears to
be the case in the file you enclosed, you may want to use read.delim
instead of read.table.  The read.delim function sets up the defaults
for the many optional arguments to read.table specifically for
tab-delimited files with a header line of column names as you have
shown.
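
A small self-contained sketch of the non-interactive route: it writes a tab-delimited file to a temporary directory (so the example does not depend on anyone's home folder) and reads it back. The file name mirrors the one in the question, including the space; the three data rows are copied from the posted table without its row labels.

```r
# Write a small tab-delimited file to a temporary location so the example is
# self-contained; the path is built with file.path() rather than typed by hand.
path <- file.path(tempdir(), "HW1 Data.txt")
writeLines(c("Year\tAO\tGNP\tCP\tOP",
             "1974\t.0022\t183\t2322\t189",
             "1975\t.0024\t238\t2729\t206",
             "1976\t.0027\t319\t3069\t206"), path)

# One pair of quotes around the whole path; spaces inside the path are fine.
KoreaAuto <- read.delim(path)

# The equivalent read.table() call spells out what read.delim() assumes.
KoreaAuto2 <- read.table(path, header = TRUE, sep = "\t")

str(KoreaAuto)

# Interactively, file.choose() would take the place of 'path':
# KoreaAuto <- read.delim(file.choose())
```

On a Mac a full path such as "/Users/<name>/Documents/..." or the "~/Documents/..." shorthand can be used in place of the temporary path, again with a single pair of quotes.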

> I think the opening and closing quotes meant that you supplied an empty
> string to the file argument.
>
>> raviramaswamy/Documents/Rutgers/STT 586/HW1 Data.txt"")
>> Error: unexpected numeric constant in "KoreaAuto <-
>> read.table(""/Users/raviramaswamy/Documents/Rutgers/STT 586"
>>>
>>
>
> Using single instances of either sort of quote ( " or ' ) on the ends of
> strings should work. If you drag a file from a Finder window to the
> R-console you should get a fully specified file path and name.
>
>> Seems like the working directory is
>>>
>>> getwd()
>>
>> [1] "/Users/raviramaswamy"
>
>> rd <- read.table(file="/Users/davidwinsemius/Downloads/meminfo.csv",
>> sep=",", header=TRUE)
>> rd
>     time      RSS      VSZ  MEM
> 1       1  3027932  3141808  4.5
> 2       2  3028572  3141808  4.5
> 3       3  3030208  3141808  4.5
> 4       4  302  3150004  4.5
> 5       5  3035036  3150004  4.5
>
> You can also shorten the Users/ part to "~"
>> rd <- read.table(file="~/Downloads/meminfo.csv", sep=",", header=TRUE)
>
>
>>>
>>
>> so I said this and still got an error
>>
>>> KoreaAuto <- read.table(/Documents/Rutgers/HW1Data)
>>
>> Error: unexpected '/' in "KoreaAuto <- read.table(/"
>
> But using no quotes will definitely not work. (And that was not a full path
> name anyway.)
>
>>
>>
>> Could someone please help me with the correct syntax?
>>
>> Thanks
>>
>> Ravi
>>
>>   Year   AO      GNP          CP   OP
>> 01    1974 .0022     183          2322 189
>> 02    1975 .0024     238          2729 206
>> 03    1976 .0027     319          3069 206
>> 04    1977 .0035     408          2763 190
>> 05    1978 .0050     540          2414 199
>> 06    1979 .0064     676          2440 233
>> 07    1980 .0065     785          2430 630
>> 08    1981 .0069     944          2631 740
>> 09    1982 .0078     1036         3155 740
>> 10    1983 .0095     1171         3200 660
>>
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>

[R] convert R plots into annotated web-graphics

2010-02-07 Thread Rainer Tischler

Dear all,

I would like to make a large scatter plot created with R available as an 
interactive web graphic, in combination with additional text-annotations for 
each data point in the plot. The idea is to present the text-annotations in an 
HTML-table and inter-link the data points in the plot with their corresponding 
entries in the table, i.e. when clicking on a data point in the plot, the 
corresponding entry in the table should be highlighted or centered and 
vice-versa, when clicking on a table-entry, the corresponding point in the plot 
should be highlighted.

I have seen that CRAN contains various R-packages for SVG-based output of 
interactive graphics (with hyperlinks and tool-tip annotations for each data 
point); however, SVG is not supported by all browsers. Is anybody aware of 
another solution for this problem (maybe based on image-maps and javascript)?
If you have alternative ideas for interlinking tabular annotations with plotted 
data points, I would appreciate any recommendation/suggestion.
(I work with R 2.8.1 on different 32-bit PCs with both Linux and Windows 
operating systems).


Many thanks,
Rainer


Re: [R] Posting an 'S4-creating Package Problem'...

2010-02-07 Thread Martin Morgan

On 02/06/2010 03:39 PM, Daniel Kosztyla wrote:
> Hello R-Team,
> 
> May you help me to post a 'S4-creating Package Problem'?
> Thanks already now for supporting.
> The problem sounds like:
> 
> Hello R forum,
> 
> while compiling my R-package these 'Warnings' occur:
> 
> ...
> Warning in matchSignature(signature, fdef, where) :
>   in the method signature for function "plot" no definition for class:
> "prediction"
> Warning in matchSignature(signature, fdef, where) :
>   in the method signature for function "plot" no definition for class:
> "validation"
> ** help
> *** installing help indices
> ...
> 
> Maybe my NAMESPACE file looks wrong. Has anybody an idea how it has to
> look like to solve
> this problem? ( I use exportClasses(...), exportMethods(...). )
> 
> I have 3 classes: 'prediction', 'validation', 'nvalidation' which have a
> plot function.
> There's no warning for class 'nvalidation' but for the other two.
> Any suggestions?

Hi Dan

Files in a package are collated and then sourced. If your 'prediction'
class is in prediction.R, and your plot method is in plot.R, then the
files will be collated plot.R, prediction.R, and the class definition
for prediction will be unknown when the plot method is defined. Use
Collate: in the DESCRIPTION file, or put class (and generic) definitions
in files that collate early, e.g., AllClasses.R, AllGenerics.R.
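
As an illustration of the ordering, here is a sketch that can be pasted into a single R session. The slot name fitted and the file names in the comments are assumptions made for the example; only the class name "prediction" comes from the question. If the files are in fact named after the classes, alphabetical collation would also explain why 'nvalidation' produces no warning: nvalidation.R sorts before plot.R, while prediction.R and validation.R sort after it.

```r
library(methods)

# What should be evaluated first (e.g. in AllClasses.R): the class definitions.
# The slot 'fitted' is invented for this example.
setClass("prediction", representation(fitted = "numeric"))

# Next (e.g. in AllGenerics.R): make plot() an S4 generic.
setGeneric("plot")

# Only then the methods; by now "prediction" is a known class.
setMethod("plot", signature(x = "prediction", y = "missing"),
          function(x, y, ...) plot(seq_along(x@fitted), x@fitted, ...))

# Check that the method is registered for the intended signature.
existsMethod("plot", c("prediction", "missing"))

# Alternatively, keep the existing file names and state the order explicitly
# in the DESCRIPTION file (file names here are hypothetical):
#   Collate: prediction.R validation.R nvalidation.R plot.R
```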

Hope that helps,

Martin
> 
> Greetings. Dan
> 

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


[R] optimized R-selection and R-replacement inside a matrix need, strings coerced to factors

2010-02-07 Thread Christine SINOQUET


Hello,

I need to modify some huge arrays (2000 individuals x 50 000
variables).

To format the data, I think I should take advantage of R's optimized
selection and replacement inside a matrix and avoid a naive use of loops.

Thank you in advance for providing information about the following problem:

file A  :
2 000 individuals in rows
50 000 columns corresponding to 50 000 variables : each value belongs to
{0, 1, 2}


file B :
50 000 variables in rows
1st column : character (A,C,G,T) corresponding to code 0
2nd column : character corresponding to code 1

convention:
if A[,j]=0, one wants to replace 0 with  character in  B[j,1] twice
if A[,j]=1, one wants to replace 1 with  character in  B[j,1] and
character in B[j,2]
if A[,j]=2, one wants to replace 2 with  character in  B[j,2] and
character in B[j,2]

C <- matrix(0,2000,0) # initialization to void matrix

for(j in 1:2000){

c <- A[,j]
zeros <- which(c==0);
ones <- which(c==1);
twos <- which(c==2);
rm(c)

c1 <- matrix("Z",2000)
c2 <- matrix("Z",2000)
c1[zeros] <-  B$V1[j]; c2[zeros]  <-B$V1[j]
c1[ones]  <-  B$V1[j]; c2[ones]   <-B$V2[j]
c1[twos]  <-  B$V2[j]; c2[twos]   <-B$V2[j]

C <- cbind(C, cbind(c1,c2))
}

I do think some more elaborated solution might exist.
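
A loop-free version of the convention above might look like the sketch below. It uses tiny made-up data (3 individuals, 2 variables) in place of files A and B. Reading B with stringsAsFactors = FALSE keeps the letters as character strings, so they are not inserted as factor codes. The names `first`, `second`, `V1m` and `V2m` are introduced here for illustration only.

```r
# Toy stand-ins for file A (codes 0/1/2) and file B (two letters per variable)
A <- matrix(c(0, 1, 2,
              2, 1, 0), nrow = 3, ncol = 2)
B <- data.frame(V1 = c("A", "G"), V2 = c("C", "T"),
                stringsAsFactors = FALSE)

n <- nrow(A)
p <- ncol(A)

# Spread each variable's letters down its column so they line up with A
V1m <- matrix(B$V1, n, p, byrow = TRUE)
V2m <- matrix(B$V2, n, p, byrow = TRUE)

# First letter:  B[j,1] for codes 0 and 1, B[j,2] for code 2
# Second letter: B[j,1] for code 0, B[j,2] for codes 1 and 2
first  <- ifelse(A == 2, V2m, V1m)
second <- ifelse(A == 0, V1m, V2m)

# Interleave the columns: (first, second) for variable 1, then variable 2, ...
C <- matrix("", n, 2 * p)
C[, seq(1, 2 * p, by = 2)] <- first
C[, seq(2, 2 * p, by = 2)] <- second
C
```

On the full 2 000 x 50 000 data this builds several large character matrices at once, so it trades memory for speed; processing blocks of columns in the same way would be one possible compromise.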


Thanks in advance for your help.


[R] predicting with stl() decomposition

2010-02-07 Thread Konrad Hoppe

Hi mailinglist members,

 

I'm currently working on a time series prediction, and my approach is
to first decompose the series into a trend, a seasonal component and a
remainder, using the stl() function. But I'm wondering how to get the
individual components in order to predict each fitted series.

This code snippet illustrates my problem:

 

series <- vector(length=300)
noise <- rnorm(300,0,2)
time <- 1:300
series[1] <- noise[1]

for(i in 3:300){
series[i] <- 0.5*series[i-1]+ noise[i] + 0.01*time[i]
}
seriesTs <- ts(series, start=c(1980,1), frequency=12)

decomp <- stl(seriesTs ,"periodic")
plot(decomp)

llrSaison <- loess(seriesTs~time , span=decomp$win[1] ,
degree=decomp$deg[1])
llrTrend  <- loess(seriesTs~time,  span=decomp$win[2] ,
degree=decomp$deg[2])

plot(llrSaison$fitted)

 

The last plot differs considerably from the seasonal panel produced by
plot(decomp), because the llr estimator doesn't extract the seasonal
component. So how can I predict the individual components? Or is there a
function that can predict values from the stl object? predict() doesn't
work; I've already tried it.
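
For what it is worth, the fitted components are stored in the `time.series` element of the stl object, so they need not be re-estimated with loess(). The sketch below extracts them and builds a deliberately naive 12-month forecast: the periodic seasonal pattern is repeated, and the trend is extrapolated with a straight line fitted to its last 24 values. Both forecasting choices, the seed, and the names `comp`, `seasFc`, `trendFc` and `fc` are assumptions made for illustration, not an stl feature.

```r
set.seed(42)
n <- 300
noise <- rnorm(n, 0, 2)
series <- numeric(n)
series[1] <- noise[1]
for (i in 2:n) {
  series[i] <- 0.5 * series[i - 1] + noise[i] + 0.01 * i
}
seriesTs <- ts(series, start = c(1980, 1), frequency = 12)

decomp <- stl(seriesTs, s.window = "periodic")

# The three components are columns of decomp$time.series
comp      <- decomp$time.series
seasonal  <- comp[, "seasonal"]
trend     <- comp[, "trend"]
remainder <- comp[, "remainder"]

h <- 12  # forecast horizon: one year

# The series ends in December, so the last 12 seasonal values
# (January to December) line up with the next 12 months.
seasFc <- as.numeric(tail(seasonal, 12))

# Naive trend forecast: straight line through the last 24 trend values
recent  <- (n - 23):n
trendDf <- data.frame(t = recent, y = as.numeric(trend)[recent])
fit     <- lm(y ~ t, data = trendDf)
trendFc <- as.numeric(predict(fit, newdata = data.frame(t = n + seq_len(h))))

fc <- ts(trendFc + seasFc, start = c(2005, 1), frequency = 12)
fc
```

The remainder is left out of the forecast here; if it shows autocorrelation, a separate model for it could be added to the sum.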

 

All the best,

Konrad Hoppe 




Re: [R] The KJV

2010-02-07 Thread Ted Harding

On 07-Feb-10 12:49:23, Barry Rowlingson wrote:
> On Sun, Feb 7, 2010 at 8:28 AM, Ted Harding
>  wrote:
>>
>> Delightful! And fascinating in the detail too.
>>
>>  length(tt)
>>  # [1] 5078
>>
>> with slight changes like:
>>
>>  barplot(rev(tt[1:50]),horiz=TRUE,las=1,cex.names=0.6,log="x")
>>  # ...
>>  barplot(rev(tt[101:150]),horiz=TRUE,las=1,cex.names=0.6,log="x")
>>  # ...
>>
>> and see the likes of
>>
>>  tt["lord"]
>>  # lord
>>  # 1939
>>
>>  tt["god"]
>>  # god
>>  # 822
>>
>>  tt["men"]
>>  # men
>>  # 204
>>
>>  tt["women"]
>>  # women
>>  #    26
>>
>> I'm now wondering how it matches up with Zipf's Law (or perhaps
>> Fisher's logarithmic ... )
>>
>> Thanks, Ben!
> 
>  I'm wondering if someone is now going to write an R package to look
> for 'bible codes':
> 
> http://en.wikipedia.org/wiki/Bible_code
> 
>  it's all in there:
> 
> http://www.biblecodewisdom.com/code/model-goodness-fit-test
> 
> Barry

Barry, these things can become distracting! Like the "Weighing
Pennies Problem" (given N pennies, one of which has a different
weight from all the others, and a two-pan balance, what is the
minimum number of weighings required to determine which is the
one with the different weight?). With reference to the work of
British Defence scientists during World War II:

"It was said that the 'weighing-pennies' problem wasted 10,000
scientist-hours of war-work, and that there was a proposal to
drop it over Germany." [page 155 of the Bollobás edition of
Littlewood's "A Mathematician's Miscellany"].

And now, Baz, you come up with Bible Codes ...

Ted.

E-Mail: (Ted Harding) 
Fax-to-email: +44 (0)870 094 0861
Date: 07-Feb-10   Time: 13:47:09
-- XFMail --


Re: [R] The KJV

2010-02-07 Thread Barry Rowlingson

On Sun, Feb 7, 2010 at 8:28 AM, Ted Harding
 wrote:
>
> Delightful! And fascinating in the detail too.
>
>  length(tt)
>  # [1] 5078
>
> with slight changes like:
>
>  barplot(rev(tt[1:50]),horiz=TRUE,las=1,cex.names=0.6,log="x")
>  # ...
>  barplot(rev(tt[101:150]),horiz=TRUE,las=1,cex.names=0.6,log="x")
>  # ...
>
> and see the likes of
>
>  tt["lord"]
>  # lord
>  # 1939
>
>  tt["god"]
>  # god
>  # 822
>
>  tt["men"]
>  # men
>  # 204
>
>  tt["women"]
>  # women
>  #    26
>
> I'm now wondering how it matches up with Zipf's Law (or perhaps
> Fisher's logarithmic ... )
>
> Thanks, Ben!

 I'm wondering if someone is now going to write an R package to look
for 'bible codes':

http://en.wikipedia.org/wiki/Bible_code

 it's all in there:

http://www.biblecodewisdom.com/code/model-goodness-fit-test

Barry

-- 
blog: http://geospaced.blogspot.com/
web: http://www.maths.lancs.ac.uk/~rowlings
web: http://www.rowlingson.com/
twitter: http://twitter.com/geospacedman
pics: http://www.flickr.com/photos/spacedman


Re: [R] Is there an R implementation for the "Barnard's exact test" (a substitute for fisher.test) ?

2010-02-07 Thread Tal Galili

Hello all,
After almost half a year, I received a friendly e-mail from Peter Calhoun,
sharing his R implementation of Barnard's exact test. With his permission, I
posted his code here:
http://www.r-statistics.com/2010/02/barnards-exact-test-a-powerful-alternative-for-fishers-exact-test-implemented-in-r/

I hope others will find it useful.
Please note that the code is not as fast as it could be. If anyone would
like to contribute a faster version, please let me know and I'll gladly
post it.

Cheers,
Tal

Contact Details:
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--

On Sun, Jul 26, 2009 at 2:09 PM, Tal Galili  wrote:

> Hello R help members. Today I came across an article on Barnard's
> exact test (http://www.cytel.com/Papers/twobinomials.pdf), which is
> supposed to be more powerful than fisher.test because it doesn't assume
> that the row and column totals are known in advance. Any pointers to such
> a function? Thanks, Tal
>
>
>
>
> --
> --
>
>
> My contact information:
> Tal Galili
> Phone number: 972-50-3373767
> FaceBook: Tal Galili
> My Blogs:
> http://www.r-statistics.com/
> http://www.talgalili.com
> http://www.biostatistics.co.il
>
>
>



Re: [R] embedFonts with pdf files and Windows 7

2010-02-07 Thread Prof Brian Ripley


Do your systems actually have the fonts you are trying to embed?
I doubt it: Helvetica is a commercial font, and most likely the Linux 
system is embedding a substitute.  It would be better to do


pdf("test.pdf", family="NimbusSan", useDingbats=FALSE)
plot(matrix(rnorm(200),nc=2))
dev.off()
myCall <- embedFonts("test.pdf",outfile = "test-a.pdf")

This is what ?pdf says you should expect:

 Since ‘embedFonts’
 makes use of Ghostscript, it should be able to embed the URW-based
 families for use with other viewers.

If that does not work, you need to get help with your Ghostscript 
installation (it is all to do with how it is set up to handle font 
substitution, which the above should avoid).



On Sat, 6 Feb 2010, James M. Curran wrote:

I am trying to embed fonts in my PDF images so that they are embedded for the 
publisher of my book.


I am running:

Windows 7 - 64 Enterprise
R 2.10.1
Ghostscript 8.70
Ghostview 4.9
MiKTeX 2.8

I have this tiny test script:

pdf("test.pdf")
plot(matrix(rnorm(200),nc=2))
graphics.off()

myCall = embedFonts("test.pdf",outfile = "test-a.pdf")

which successfully issues this command to ghostscript:

> myCall
[1] "gswin32c.exe -dNOPAUSE -dBATCH -q -dAutoRotatePages=/None 
-sDEVICE=pdfwrite 
-sOutputFile=C:\\Users\\curran\\AppData\\Local\\Temp\\RtmpSkHosh\\Rembed136f65f3 
-sFONTPATH=  test.pdf"


The file test.pdf is about 9kb with no fonts embedded. The file test-a.pdf is 
about 4kb with no fonts embedded.


I have tried altering the options:

options = "-dEmbedAllFonts=true",

and the font path

fontpath = "C:\\Windows\\Fonts"

To no avail. The only way I can get embedFonts to work is to shift the work 
over to our Linux system.


Any help would be greatly appreciated.


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Re: [R] The KJV

2010-02-07 Thread Ted Harding

On 07-Feb-10 01:06:40, Ben Bolker wrote:
> Jim Lemon  bitwrit.com.au> writes:
> 
>> 
>> On 02/06/2010 06:57 PM, Charlotte Maia wrote:
>> > Hey all,
>> >
>> > Does anyone know if there are any R packages with a copy of the KJV?
>> > I'm guessing the answer is no...
>> >
>> > So the next question, and the more important one is:
>> > Does anyone think it would be useful (e.g. for text-mining
>> > purposes)?
>> > I know almost nothing about theology,
>> > so I'm not sure what kind of questions theologists might have (that
>> > R
>> > could answer).
>> >
>> > An alternative, that would achieve a similar result (I think),
>> > would be an R interface to another open source system, such as
>> > Sword.
>> >
>> Hi Charlotte,
>> Try
>> 
>> http://www.gutenberg.org/etext/10
>> 
>> Jim
>> 
> 
>  I couldn't help it:
> 
> x <- url("http://www.gutenberg.org/dirs/etext90/kjv10.txt",open="r")
> X <- readLines(x,n=2)
> z <- grep("First Book of Moses",X)
> X <- X[-(1:z)]
> X <- X[nchar(X)>0]
> length(X) ## 15058
> words <- tolower(unlist(strsplit(X,"[ .,:;()]")))
> words2 <- grep("[^0-9]",words,value=TRUE)
> tt <- rev(sort(table(words2)))
> barplot(rev(tt[1:100]),horiz=TRUE,las=1,cex.names=0.4,log="x")

Delightful! And fascinating in the detail too.

  length(tt)
  # [1] 5078

with slight changes like:

  barplot(rev(tt[1:50]),horiz=TRUE,las=1,cex.names=0.6,log="x")
  # ...
  barplot(rev(tt[101:150]),horiz=TRUE,las=1,cex.names=0.6,log="x")
  # ...

and see the likes of

  tt["lord"]
  # lord 
  # 1939 

  tt["god"]
  # god 
  # 822 

  tt["men"]
  # men 
  # 204 

  tt["women"]
  # women 
  #    26 

I'm now wondering how it matches up with Zipf's Law (or perhaps
Fisher's logarithmic ... )
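
One rough way to look at the Zipf question is to regress log frequency on log rank; Zipf's law would predict a slope near -1. The sketch below uses synthetic counts as a stand-in for the sorted table `tt`, so the numbers say nothing about the KJV itself; `freq`, `rnk` and `slope` are names introduced for illustration.

```r
# Synthetic stand-in for the sorted word counts: frequency ~ 1/rank.
# With the real data one would use freq <- as.numeric(tt) instead.
freq <- round(2000 / (1:200))

rnk <- seq_along(freq)

# Straight-line fit on log-log axes
fit <- lm(log(freq) ~ log(rnk))
slope <- unname(coef(fit)[2])
slope
```

A slope well away from -1, or clear curvature in a plot of log(freq) against log(rnk), would suggest that a different rank-frequency model fits better.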

Thanks, Ben!
Ted.


E-Mail: (Ted Harding) 
Fax-to-email: +44 (0)870 094 0861
Date: 07-Feb-10   Time: 08:28:30
-- XFMail --

