Re: [R] Simple parallel for loop

2012-05-14 Thread R. Michael Weylandt
Perhaps mcmapply from the parallel package? It's a parallel mapply to
complement mclapply.
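
A minimal sketch of that suggestion, assuming a hypothetical plot_one() that stands in for the poster's plotsCreate() and made-up case/filter values. mcmapply forks, so the sketch falls back to one core on Windows.

```r
library(parallel)

# Hypothetical stand-in for plotsCreate(); it only builds a label.
plot_one <- function(case, flt) {
  sprintf("%s-%s", case, flt)
}

# Illustrative inputs, not the poster's real caseList/filterList.
grid <- expand.grid(case = c("a", "b"), flt = c("x", "y"),
                    stringsAsFactors = FALSE)

# Forking is unavailable on Windows, where mc.cores must be 1.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Both columns are walked in lockstep, one pair per call.
res <- mcmapply(plot_one, grid$case, grid$flt,
                mc.cores = n_cores, USE.NAMES = FALSE)
```

Each call receives one element from each vector, so a two-argument function needs no wrapper.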

Michael

On Tue, May 15, 2012 at 2:28 AM, Alaios  wrote:
> Thanks Michael,
> last comment that I am trying to figure out, is that my called function has
> two inputs arguments, while foreach looks to be working with only on one,
> being also able to execute one command while I need two. ReadDataSet (based
> on iteration number). Call PlotFunction(based on two inputs that depend on
> iteration number).
>
> I am not quite sure how to pass with for each two input arguments in my
> function.
>
> I will also try to look for mclapply, but it also looks, at least for now,
> that only passed one input argument to my function.
>
> Cheers
> Alex
>
> 
> From: R. Michael Weylandt 
> To: Alaios 
> Cc: R help 
> Sent: Tuesday, May 15, 2012 8:24 AM
> Subject: Re: [R] Simple parallel for loop
>
> I haven't actually used foreach very much myself, but I would imagine
> that you could just take advantage of the fact that most plot
> functions return their arguments silently and then just throw the
> results away (i.e., don't assign them)
>
> Switching %do% to %dopar% automatically activates parallelization
> (dopar being "do in parallel")
>
> I believe you decide the number of cores to use when you set up your
> parallel backend (either multicore or snow)
>
> Hope this helps,
> Michael
>
> On Tue, May 15, 2012 at 2:20 AM, Alaios  wrote:
>> Hello Michael,
>> thanks for the answer, it looks like that the foreach package might do
>> what
>> I want. Few comments though
>>
>> The foreach loop asks for a way to combine results, which I do not want to
>> have any. AFter I load a dataset the subsequent function does plotting and
>> save the files as pdfs, nothing more.
>>
>> What is the difference between %do% and %dopar%, they look actually the
>> same.
>>
>> I do not see to be anyway to contol the number of used cores, like set to
>> use only 4, or 8 or 16.
>>
>> Regards
>> Alex
>>
>> 
>> From: R. Michael Weylandt 
>> To: Alaios 
>> Cc: R help 
>> Sent: Tuesday, May 15, 2012 8:00 AM
>> Subject: Re: [R] Simple parallel for loop
>>
>> Take a look at foreach() and %dopar% from the CRAN package foreach.
>>
>> Michael
>>
>> On Tue, May 15, 2012 at 1:57 AM, Alaios  wrote:
>>> Dear all,
>>> I am having a for loop that iterates a given number of measurements that
>>> I
>>> would like to split over 16 available cores. The code is in the following
>>> format
>>>
>>> inputForFunction<-expand.grid(caseList,filterList)
>>> for (i in c(1:length(inputForFunction$Var1))){#
>>>       FileList<-GetFileList(flag=as.vector(inputForFunction$Var1[i]));
>>>        print(sprintf("Calling the plotsCreate for %s
>>>
>>> and%s",as.vector(inputForFunction$Var1[i]),as.vector(inputForFunction$Var2[i])))
>>>
>>>
>>>
>>> plotsCreate(Folder=mainFolder,case=as.vector(inputForFunction$Var1[i]),DataList=FileList,DataFilter=as.vector(inputForFunction$Var2[i]))
>>>  }
>>>
>>> as you can see after the inputForFunction is calculated then my code
>>> iterates over the available combinations of caseList and filterList. It
>>> would be great, without major changes, split these "tasks" to all the
>>> available processors.
>>>
>>> Is there some way to do that?
>>>
>>> Regards
>>> Alex
>>>
>>>        [[alternative HTML version deleted]]
>>>
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple parallel for loop

2012-05-14 Thread Alaios
Thanks Michael,
One last thing I am trying to figure out: my function takes two input 
arguments, while foreach seems to work with only one. It also seems able to 
execute only one command, while I need two: ReadDataSet (based on the 
iteration number), then PlotFunction (based on two inputs that depend on the 
iteration number).

I am not quite sure how to pass two input arguments to my function with 
foreach.

I will also look into mclapply, but at least for now it also looks like it 
passes only one input argument to my function.
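
One way around the single-argument limit, sketched under assumptions: iterate over row indices and look both inputs up inside the worker function, which can also run several commands per iteration. read_data_set() and plot_function() are hypothetical stand-ins for ReadDataSet and PlotFunction, and the grid values are made up.

```r
library(parallel)

grid <- expand.grid(case = c("a", "b"), flt = c("x", "y"),
                    stringsAsFactors = FALSE)

# Hypothetical stand-ins for the two steps described in the post.
read_data_set <- function(case) paste0("data_", case)
plot_function <- function(data, flt) paste(data, flt, sep = ":")

# Forking is unavailable on Windows, where mc.cores must be 1.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# The single argument is the row index; both inputs are read from grid.
res <- mclapply(seq_len(nrow(grid)), function(i) {
  d <- read_data_set(grid$case[i])   # first command
  plot_function(d, grid$flt[i])      # second command
}, mc.cores = n_cores)
```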

Cheers
Alex




 From: R. Michael Weylandt 

Cc: R help  
Sent: Tuesday, May 15, 2012 8:24 AM
Subject: Re: [R] Simple parallel for loop

I haven't actually used foreach very much myself, but I would imagine
that you could just take advantage of the fact that most plot
functions return their arguments silently and then just throw the
results away (i.e., don't assign them)

Switching %do% to %dopar% automatically activates parallelization
(dopar being "do in parallel")

I believe you decide the number of cores to use when you set up your
parallel backend (either multicore or snow)

Hope this helps,
Michael


> Hello Michael,
> thanks for the answer, it looks like that the foreach package might do what
> I want. Few comments though
>
> The foreach loop asks for a way to combine results, which I do not want to
> have any. AFter I load a dataset the subsequent function does plotting and
> save the files as pdfs, nothing more.
>
> What is the difference between %do% and %dopar%, they look actually the
> same.
>
> I do not see to be anyway to contol the number of used cores, like set to
> use only 4, or 8 or 16.
>
> Regards
> Alex
>
> 
> From: R. Michael Weylandt 

> Cc: R help 
> Sent: Tuesday, May 15, 2012 8:00 AM
> Subject: Re: [R] Simple parallel for loop
>
> Take a look at foreach() and %dopar% from the CRAN package foreach.
>
> Michael
>

>> Dear all,
>> I am having a for loop that iterates a given number of measurements that I
>> would like to split over 16 available cores. The code is in the following
>> format
>>
>> inputForFunction<-expand.grid(caseList,filterList)
>> for (i in c(1:length(inputForFunction$Var1))){#
>>       FileList<-GetFileList(flag=as.vector(inputForFunction$Var1[i]));
>>        print(sprintf("Calling the plotsCreate for %s
>> and%s",as.vector(inputForFunction$Var1[i]),as.vector(inputForFunction$Var2[i])))
>>
>>
>> plotsCreate(Folder=mainFolder,case=as.vector(inputForFunction$Var1[i]),DataList=FileList,DataFilter=as.vector(inputForFunction$Var2[i]))
>>  }
>>
>> as you can see after the inputForFunction is calculated then my code
>> iterates over the available combinations of caseList and filterList. It
>> would be great, without major changes, split these "tasks" to all the
>> available processors.
>>
>> Is there some way to do that?
>>
>> Regards
>> Alex
>>
>>        [[alternative HTML version deleted]]
>>
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>


Re: [R] Simple parallel for loop

2012-05-14 Thread R. Michael Weylandt
I haven't actually used foreach very much myself, but I would imagine
that you could just take advantage of the fact that most plot
functions return their arguments silently and then just throw the
results away (i.e., don't assign them)

Switching %do% to %dopar% automatically activates parallelization
(dopar being "do in parallel")

I believe you decide the number of cores to use when you set up your
parallel backend (either multicore or snow)
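
A small sketch of those three points, assuming the doParallel backend (my choice for illustration; the thread itself only names multicore and snow) and made-up case/filter values. The loop body stands in for the plotting call.

```r
library(foreach)
library(doParallel)

# The number of workers is fixed when the backend is registered.
registerDoParallel(cores = 2)

# Illustrative inputs, not the poster's real caseList/filterList.
grid <- expand.grid(case = c("a", "b"), flt = c("x", "y"),
                    stringsAsFactors = FALSE)

# Two iteration variables advance in lockstep. With no .combine argument
# the results come back as a plain list, which can simply be ignored.
res <- foreach(case = grid$case, flt = grid$flt) %dopar% {
  paste(case, flt, sep = "-")
}

# Release any implicit cluster that registerDoParallel() created.
stopImplicitCluster()
```

Replacing %dopar% with %do% runs the same loop sequentially, which is handy for debugging.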

Hope this helps,
Michael

On Tue, May 15, 2012 at 2:20 AM, Alaios  wrote:
> Hello Michael,
> thanks for the answer, it looks like that the foreach package might do what
> I want. Few comments though
>
> The foreach loop asks for a way to combine results, which I do not want to
> have any. AFter I load a dataset the subsequent function does plotting and
> save the files as pdfs, nothing more.
>
> What is the difference between %do% and %dopar%, they look actually the
> same.
>
> I do not see to be anyway to contol the number of used cores, like set to
> use only 4, or 8 or 16.
>
> Regards
> Alex
>
> 
> From: R. Michael Weylandt 
> To: Alaios 
> Cc: R help 
> Sent: Tuesday, May 15, 2012 8:00 AM
> Subject: Re: [R] Simple parallel for loop
>
> Take a look at foreach() and %dopar% from the CRAN package foreach.
>
> Michael
>
> On Tue, May 15, 2012 at 1:57 AM, Alaios  wrote:
>> Dear all,
>> I am having a for loop that iterates a given number of measurements that I
>> would like to split over 16 available cores. The code is in the following
>> format
>>
>> inputForFunction<-expand.grid(caseList,filterList)
>> for (i in c(1:length(inputForFunction$Var1))){#
>>       FileList<-GetFileList(flag=as.vector(inputForFunction$Var1[i]));
>>        print(sprintf("Calling the plotsCreate for %s
>> and%s",as.vector(inputForFunction$Var1[i]),as.vector(inputForFunction$Var2[i])))
>>
>>
>> plotsCreate(Folder=mainFolder,case=as.vector(inputForFunction$Var1[i]),DataList=FileList,DataFilter=as.vector(inputForFunction$Var2[i]))
>>  }
>>
>> as you can see after the inputForFunction is calculated then my code
>> iterates over the available combinations of caseList and filterList. It
>> would be great, without major changes, split these "tasks" to all the
>> available processors.
>>
>> Is there some way to do that?
>>
>> Regards
>> Alex
>>
>>        [[alternative HTML version deleted]]
>>
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>



Re: [R] Simple parallel for loop

2012-05-14 Thread Alaios
Hello Michael,
thanks for the answer. It looks like the foreach package might do what I 
want. A few comments, though.

The foreach loop asks for a way to combine results, but I do not have any 
results to combine. After I load a dataset, the subsequent function does the 
plotting and saves the files as PDFs, nothing more.

What is the difference between %do% and %dopar%? They look the same.

I do not see any way to control the number of cores used, e.g. to use 
only 4, 8 or 16.

Regards
Alex




 From: R. Michael Weylandt 

Cc: R help  
Sent: Tuesday, May 15, 2012 8:00 AM
Subject: Re: [R] Simple parallel for loop

Take a look at foreach() and %dopar% from the CRAN package foreach.

Michael


> Dear all,
> I am having a for loop that iterates a given number of measurements that I 
> would like to split over 16 available cores. The code is in the following 
> format
>
> inputForFunction<-expand.grid(caseList,filterList)
> for (i in c(1:length(inputForFunction$Var1))){#
>       FileList<-GetFileList(flag=as.vector(inputForFunction$Var1[i]));
>        print(sprintf("Calling the plotsCreate for %s 
> and%s",as.vector(inputForFunction$Var1[i]),as.vector(inputForFunction$Var2[i])))
>
>  
> plotsCreate(Folder=mainFolder,case=as.vector(inputForFunction$Var1[i]),DataList=FileList,DataFilter=as.vector(inputForFunction$Var2[i]))
>  }
>
> as you can see after the inputForFunction is calculated then my code iterates 
> over the available combinations of caseList and filterList. It would be 
> great, without major changes, split these "tasks" to all the available 
> processors.
>
> Is there some way to do that?
>
> Regards
> Alex
>
>        [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] How to Un-group a grouped data set?

2012-05-14 Thread R. Michael Weylandt
It is a nifty and surprisingly useful construct whenever you need to
construct a function call programmatically or apply it to a list.

R-News 2/2 has some useful tips on this and related functions in the
Programmer's Note section if you're interested.
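
Two small illustrations of the construct, using only base R and made-up values: building a call from an argument list assembled at run time, and applying a function across the elements of a list.

```r
# Arguments assembled at run time, then passed as one list.
args <- list("a", "b", "c", sep = "-")
joined <- do.call(paste, args)   # same as paste("a", "b", "c", sep = "-")

# Applying a function across a list of pieces: stack matrices row-wise.
pieces <- list(matrix(1:4, nrow = 2), matrix(5:8, nrow = 2))
stacked <- do.call("rbind", pieces)
```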

Best,
Michael

On Tue, May 15, 2012 at 2:05 AM, Cheenghee AM Koh  wrote:
> Thank you so much!  I can't believe I spent the whole night by not knowing
> this one command "do.call"
> This is so handy!
> Best, Koh
>
>
> On Tue, May 15, 2012 at 12:52 AM, R. Michael Weylandt
>  wrote:
>>
>> Sorry -- I missed the bit about the AE in your original post. Perhaps
>> you can work with my bit for the repeats, but it looks like if you
>> want to use your function, it should suffice to do something like
>>
>> do.call("rbind", lapply(1:6, NewFuncName))
>>
>> Best,
>> Michael
>>
>> On Tue, May 15, 2012 at 1:50 AM, R. Michael Weylandt
>>  wrote:
>> > Don't use subset for a function name -- it's already the name of a
>> > rather important function as is data (but at least that one's not a
>> > function in your use so it's not quite so bad). Finally, use dput()
>> > when sending data so we get a plaintext reproducible version.
>> >
>> > I'd try something like this:
>> >
>> > dats <- structure(list(Study = c(1L, 1L, 2L, 2L, 3L, 3L), TX = c(1L,
>> > 0L, 1L, 0L, 1L, 0L), AEs = c(3L, 2L, 1L, 2L, 1L, 1L), N = c(5L,
>> > 7L, 10L, 7L, 8L, 4L)), .Names = c("Study", "TX", "AEs", "N"), class =
>> > "data.frame", row.names = c("1",
>> > "2", "3", "4", "5", "6"))
>> >
>> > # See how handy dput can be :-)
>> >
>> > dats[unlist(mapply(FUN = function(x,y) rep(x, y), 1:NROW(dats),
>> > dats$N)), -4]
>> >
>> > which isn't super elegant, but others might have something better.
>> >
>> > Best,
>> > Michael
>> >
>> > On Tue, May 15, 2012 at 1:24 AM, Cheenghee AM Koh 
>> > wrote:
>> >> Hello, R-fellows,
>> >>
>> >> I have a question that I really don't know how to solve. I have spent
>> >> hours
>> >> on line surfing for possible solutions but in veil. Please if anyone
>> >> could
>> >> help me handle this issue, you would be so appreciated!
>> >>
>> >> I have a "grouped" dataset like this:
>> >>
>> >>> data
>> >>  Study TX AEs   N
>> >> 1     1     1    3       5
>> >> 2     1     0    2       7
>> >> 3     2     1    1      10
>> >> 4     2     0    2       7
>> >> 5     3     1    1       8
>> >> 6     3     0    1       4
>> >>
>> >> where Study is the study id, TX is treatment, AEs is how many people in
>> >> this trial is positive, and N is the number of the subjects. Therefore,
>> >> for
>> >> the row 1, it stands for: It is the treatment arm for the study one,
>> >> where
>> >> there are 5 subjects and 3 of them are positive. The row 2 stands for:
>> >> It
>> >> is the control arm of the study 1 where there are 7 subjects and 2 of
>> >> them
>> >> are positive.
>> >>
>> >> Now I would like to "un-group them", make it like:
>> >>
>> >> Study  TX   AEs
>> >>   1         1      1
>> >>   1         1      1
>> >>   1         1      1
>> >>   1         1      0
>> >>   1         1      0
>> >>   1         0      1
>> >>   1         0      1
>> >>   1         0      0
>> >>   1         0      0
>> >>   1         0      0
>> >>   1         0      0
>> >>   1         0      0
>> >>   2         1      1
>> >>   .
>> >>  .
>> >>
>> >>
>> >> But I wasn't able to do it. In fact I wrote a small function, and use
>> >> "lapply" to get what I want. It worked well, and did give me what I
>> >> want.
>> >> But I wasn't able to collapse all the returns into one single data
>> >> frame
>> >> for subsequent analysis.
>> >>
>> >> The function I wrote:
>> >>
>> >> subset = function(i){
>> >> d = c(rep(data[i,1], data[i,4]), rep(data[i,2], data[i,4]), rep(0:1,
>> >> c(data[i,4] - data[i,3],data[i,3])))
>> >> d = matrix(d, data[i,4],3)
>> >> d
>> >> }
>> >>
>> >> then:
>> >>
>> >> Data = lapply(1:6, subset)
>> >> Data
>> >>
>> >> Therefore, I tried to write a loop. But no matter how I tried, I can't
>> >> get
>> >> what I want.
>> >>
>> >> Any idea?
>> >>
>> >> Thank you so much!
>> >>
>> >> Best,
>> >>
>> >>
>> >> --
>> >> Cheenghee Masaki Koh, MSW, MS(c), PhD Student
>> >> School of Social Service Administration
>> >> Department of Health Studies, Division of Biological Science
>> >> University of Chicago
>> >>
>> >>        [[alternative HTML version deleted]]
>> >>
>> >> __
>> >> R-help@r-project.org mailing list
>> >> https://stat.ethz.ch/mailman/listinfo/r-help
>> >> PLEASE do read the posting guide
>> >> http://www.R-project.org/posting-guide.html
>> >> and provide commented, minimal, self-contained, reproducible code.
>
>
>
>
> --
> Cheenghee Masaki Koh, MSW, MS(c), PhD Student
> School of Social Service Administration
> Department of Health Studies, Division of Biological Science
> University of Chicago
>


Re: [R] How to Un-group a grouped data set?

2012-05-14 Thread Cheenghee AM Koh
Thank you so much!  I can't believe I spent the whole night not knowing
this one command, "do.call".
This is so handy!
Best, Koh


On Tue, May 15, 2012 at 12:52 AM, R. Michael Weylandt <
michael.weyla...@gmail.com> wrote:

> Sorry -- I missed the bit about the AE in your original post. Perhaps
> you can work with my bit for the repeats, but it looks like if you
> want to use your function, it should suffice to do something like
>
> do.call("rbind", lapply(1:6, NewFuncName))
>
> Best,
> Michael
>
> On Tue, May 15, 2012 at 1:50 AM, R. Michael Weylandt
>  wrote:
> > Don't use subset for a function name -- it's already the name of a
> > rather important function as is data (but at least that one's not a
> > function in your use so it's not quite so bad). Finally, use dput()
> > when sending data so we get a plaintext reproducible version.
> >
> > I'd try something like this:
> >
> > dats <- structure(list(Study = c(1L, 1L, 2L, 2L, 3L, 3L), TX = c(1L,
> > 0L, 1L, 0L, 1L, 0L), AEs = c(3L, 2L, 1L, 2L, 1L, 1L), N = c(5L,
> > 7L, 10L, 7L, 8L, 4L)), .Names = c("Study", "TX", "AEs", "N"), class =
> > "data.frame", row.names = c("1",
> > "2", "3", "4", "5", "6"))
> >
> > # See how handy dput can be :-)
> >
> > dats[unlist(mapply(FUN = function(x,y) rep(x, y), 1:NROW(dats),
> dats$N)), -4]
> >
> > which isn't super elegant, but others might have something better.
> >
> > Best,
> > Michael
> >
> > On Tue, May 15, 2012 at 1:24 AM, Cheenghee AM Koh 
> wrote:
> >> Hello, R-fellows,
> >>
> >> I have a question that I really don't know how to solve. I have spent
> hours
> >> on line surfing for possible solutions but in veil. Please if anyone
> could
> >> help me handle this issue, you would be so appreciated!
> >>
> >> I have a "grouped" dataset like this:
> >>
> >>> data
> >>   Study TX AEs  N
> >> 1     1  1   3  5
> >> 2     1  0   2  7
> >> 3     2  1   1 10
> >> 4     2  0   2  7
> >> 5     3  1   1  8
> >> 6     3  0   1  4
> >>
> >> where Study is the study id, TX is treatment, AEs is how many people in
> >> this trial is positive, and N is the number of the subjects. Therefore,
> for
> >> the row 1, it stands for: It is the treatment arm for the study one,
> where
> >> there are 5 subjects and 3 of them are positive. The row 2 stands for:
> It
> >> is the control arm of the study 1 where there are 7 subjects and 2 of
> them
> >> are positive.
> >>
> >> Now I would like to "un-group them", make it like:
> >>
> >> Study  TX   AEs
> >>   1     1     1
> >>   1     1     1
> >>   1     1     1
> >>   1     1     0
> >>   1     1     0
> >>   1     0     1
> >>   1     0     1
> >>   1     0     0
> >>   1     0     0
> >>   1     0     0
> >>   1     0     0
> >>   1     0     0
> >>   2     1     1
> >>   .
> >>  .
> >>
> >>
> >> But I wasn't able to do it. In fact I wrote a small function, and use
> >> "lapply" to get what I want. It worked well, and did give me what I
> want.
> >> But I wasn't able to collapse all the returns into one single data frame
> >> for subsequent analysis.
> >>
> >> The function I wrote:
> >>
> >> subset = function(i){
> >> d = c(rep(data[i,1], data[i,4]), rep(data[i,2], data[i,4]), rep(0:1,
> >> c(data[i,4] - data[i,3],data[i,3])))
> >> d = matrix(d, data[i,4],3)
> >> d
> >> }
> >>
> >> then:
> >>
> >> Data = lapply(1:6, subset)
> >> Data
> >>
> >> Therefore, I tried to write a loop. But no matter how I tried, I can't
> get
> >> what I want.
> >>
> >> Any idea?
> >>
> >> Thank you so much!
> >>
> >> Best,
> >>
> >>
> >> --
> >> Cheenghee Masaki Koh, MSW, MS(c), PhD Student
> >> School of Social Service Administration
> >> Department of Health Studies, Division of Biological Science
> >> University of Chicago
> >>
> >>[[alternative HTML version deleted]]
> >>
> >> __
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Cheenghee Masaki Koh, MSW, MS(c), PhD Student
School of Social Service Administration
Department of Health Studies, Division of Biological Science
University of Chicago



Re: [R] Simple parallel for loop

2012-05-14 Thread R. Michael Weylandt
Take a look at foreach() and %dopar% from the CRAN package foreach.

Michael

On Tue, May 15, 2012 at 1:57 AM, Alaios  wrote:
> Dear all,
> I am having a for loop that iterates a given number of measurements that I 
> would like to split over 16 available cores. The code is in the following 
> format
>
> inputForFunction<-expand.grid(caseList,filterList)
> for (i in c(1:length(inputForFunction$Var1))){#
>       FileList<-GetFileList(flag=as.vector(inputForFunction$Var1[i]));
>        print(sprintf("Calling the plotsCreate for %s 
> and%s",as.vector(inputForFunction$Var1[i]),as.vector(inputForFunction$Var2[i])))
>
>  
> plotsCreate(Folder=mainFolder,case=as.vector(inputForFunction$Var1[i]),DataList=FileList,DataFilter=as.vector(inputForFunction$Var2[i]))
>  }
>
> as you can see after the inputForFunction is calculated then my code iterates 
> over the available combinations of caseList and filterList. It would be 
> great, without major changes, split these "tasks" to all the available 
> processors.
>
> Is there some way to do that?
>
> Regards
> Alex
>
>        [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



[R] Simple parallel for loop

2012-05-14 Thread Alaios
Dear all,
I have a for loop that iterates over a given number of measurements, and I 
would like to split the work across 16 available cores. The code has the 
following format:

inputForFunction <- expand.grid(caseList, filterList)
for (i in seq_len(nrow(inputForFunction))) {
  # Pull out the two inputs for this iteration
  case   <- as.vector(inputForFunction$Var1[i])
  filter <- as.vector(inputForFunction$Var2[i])
  FileList <- GetFileList(flag = case)
  print(sprintf("Calling the plotsCreate for %s and %s", case, filter))
  plotsCreate(Folder = mainFolder, case = case, DataList = FileList,
              DataFilter = filter)
}

As you can see, once inputForFunction is calculated, my code iterates 
over the available combinations of caseList and filterList. It would be great 
to split these "tasks" across all the available processors without major changes.

Is there some way to do that?

Regards
Alex



Re: [R] How to Un-group a grouped data set?

2012-05-14 Thread R. Michael Weylandt
Sorry -- I missed the bit about the AE in your original post. Perhaps
you can work with my bit for the repeats, but it looks like if you
want to use your function, it should suffice to do something like

do.call("rbind", lapply(1:6, NewFuncName))
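
Spelled out as a runnable sketch: expand_row() is a hypothetical replacement for the poster's function (renamed so it does not mask subset), and it returns data frames rather than matrices so the column names survive the rbind.

```r
# Grouped data as posted in the thread.
dats <- data.frame(Study = c(1L, 1L, 2L, 2L, 3L, 3L),
                   TX    = c(1L, 0L, 1L, 0L, 1L, 0L),
                   AEs   = c(3L, 2L, 1L, 2L, 1L, 1L),
                   N     = c(5L, 7L, 10L, 7L, 8L, 4L))

# Hypothetical helper: expands row i into N[i] subject-level rows,
# with the AEs[i] positives (1) listed first, then the negatives (0).
expand_row <- function(i) {
  n   <- dats$N[i]
  pos <- dats$AEs[i]
  data.frame(Study = rep(dats$Study[i], n),
             TX    = rep(dats$TX[i], n),
             AEs   = rep(c(1L, 0L), c(pos, n - pos)))
}

# lapply() yields one data frame per grouped row; do.call() stacks them.
ungrouped <- do.call("rbind", lapply(seq_len(nrow(dats)), expand_row))
```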

Best,
Michael

On Tue, May 15, 2012 at 1:50 AM, R. Michael Weylandt
 wrote:
> Don't use subset for a function name -- it's already the name of a
> rather important function as is data (but at least that one's not a
> function in your use so it's not quite so bad). Finally, use dput()
> when sending data so we get a plaintext reproducible version.
>
> I'd try something like this:
>
> dats <- structure(list(Study = c(1L, 1L, 2L, 2L, 3L, 3L), TX = c(1L,
> 0L, 1L, 0L, 1L, 0L), AEs = c(3L, 2L, 1L, 2L, 1L, 1L), N = c(5L,
> 7L, 10L, 7L, 8L, 4L)), .Names = c("Study", "TX", "AEs", "N"), class =
> "data.frame", row.names = c("1",
> "2", "3", "4", "5", "6"))
>
> # See how handy dput can be :-)
>
> dats[unlist(mapply(FUN = function(x,y) rep(x, y), 1:NROW(dats), dats$N)), -4]
>
> which isn't super elegant, but others might have something better.
>
> Best,
> Michael
>
> On Tue, May 15, 2012 at 1:24 AM, Cheenghee AM Koh  wrote:
>> Hello, R-fellows,
>>
>> I have a question that I really don't know how to solve. I have spent hours
>> on line surfing for possible solutions but in veil. Please if anyone could
>> help me handle this issue, you would be so appreciated!
>>
>> I have a "grouped" dataset like this:
>>
>>> data
>>  Study TX AEs   N
>> 1     1     1    3       5
>> 2     1     0    2       7
>> 3     2     1    1      10
>> 4     2     0    2       7
>> 5     3     1    1       8
>> 6     3     0    1       4
>>
>> where Study is the study id, TX is treatment, AEs is how many people in
>> this trial is positive, and N is the number of the subjects. Therefore, for
>> the row 1, it stands for: It is the treatment arm for the study one, where
>> there are 5 subjects and 3 of them are positive. The row 2 stands for: It
>> is the control arm of the study 1 where there are 7 subjects and 2 of them
>> are positive.
>>
>> Now I would like to "un-group them", make it like:
>>
>> Study  TX   AEs
>>   1         1      1
>>   1         1      1
>>   1         1      1
>>   1         1      0
>>   1         1      0
>>   1         0      1
>>   1         0      1
>>   1         0      0
>>   1         0      0
>>   1         0      0
>>   1         0      0
>>   1         0      0
>>   2         1      1
>>   .
>>  .
>>
>>
>> But I wasn't able to do it. In fact I wrote a small function, and use
>> "lapply" to get what I want. It worked well, and did give me what I want.
>> But I wasn't able to collapse all the returns into one single data frame
>> for subsequent analysis.
>>
>> The function I wrote:
>>
>> subset = function(i){
>> d = c(rep(data[i,1], data[i,4]), rep(data[i,2], data[i,4]), rep(0:1,
>> c(data[i,4] - data[i,3],data[i,3])))
>> d = matrix(d, data[i,4],3)
>> d
>> }
>>
>> then:
>>
>> Data = lapply(1:6, subset)
>> Data
>>
>> Therefore, I tried to write a loop. But no matter how I tried, I can't get
>> what I want.
>>
>> Any idea?
>>
>> Thank you so much!
>>
>> Best,
>>
>>
>> --
>> Cheenghee Masaki Koh, MSW, MS(c), PhD Student
>> School of Social Service Administration
>> Department of Health Studies, Division of Biological Science
>> University of Chicago
>>
>>        [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.



Re: [R] How to Un-group a grouped data set?

2012-05-14 Thread R. Michael Weylandt
Don't use subset for a function name -- it's already the name of a
rather important function as is data (but at least that one's not a
function in your use so it's not quite so bad). Finally, use dput()
when sending data so we get a plaintext reproducible version.

I'd try something like this:

dats <- structure(list(Study = c(1L, 1L, 2L, 2L, 3L, 3L), TX = c(1L,
0L, 1L, 0L, 1L, 0L), AEs = c(3L, 2L, 1L, 2L, 1L, 1L), N = c(5L,
7L, 10L, 7L, 8L, 4L)), .Names = c("Study", "TX", "AEs", "N"), class =
"data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))

# See how handy dput can be :-)

dats[unlist(mapply(FUN = function(x,y) rep(x, y), 1:NROW(dats), dats$N)), -4]

which isn't super elegant, but others might have something better.
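
The one-liner repeats each row N times but leaves AEs as the group count. A possible variant, offered as a sketch rather than as the intended answer, indexes with a vectorised rep() and then rebuilds AEs as a 0/1 indicator per subject:

```r
dats <- data.frame(Study = c(1L, 1L, 2L, 2L, 3L, 3L),
                   TX    = c(1L, 0L, 1L, 0L, 1L, 0L),
                   AEs   = c(3L, 2L, 1L, 2L, 1L, 1L),
                   N     = c(5L, 7L, 10L, 7L, 8L, 4L))

# Row i of dats appears N[i] times in the index vector.
idx <- rep(seq_len(nrow(dats)), times = dats$N)
out <- dats[idx, c("Study", "TX")]

# Per group: AEs ones followed by (N - AEs) zeros, in the same row order.
out$AEs <- unlist(mapply(function(pos, n) rep(c(1L, 0L), c(pos, n - pos)),
                         dats$AEs, dats$N, SIMPLIFY = FALSE))
rownames(out) <- NULL
```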

Best,
Michael

On Tue, May 15, 2012 at 1:24 AM, Cheenghee AM Koh  wrote:
> Hello, R-fellows,
>
> I have a question that I really don't know how to solve. I have spent hours
> on line surfing for possible solutions but in veil. Please if anyone could
> help me handle this issue, you would be so appreciated!
>
> I have a "grouped" dataset like this:
>
>> data
>  Study TX AEs   N
> 1     1     1    3       5
> 2     1     0    2       7
> 3     2     1    1      10
> 4     2     0    2       7
> 5     3     1    1       8
> 6     3     0    1       4
>
> where Study is the study id, TX is treatment, AEs is how many people in
> this trial is positive, and N is the number of the subjects. Therefore, for
> the row 1, it stands for: It is the treatment arm for the study one, where
> there are 5 subjects and 3 of them are positive. The row 2 stands for: It
> is the control arm of the study 1 where there are 7 subjects and 2 of them
> are positive.
>
> Now I would like to "un-group them", make it like:
>
> Study  TX   AEs
>   1         1      1
>   1         1      1
>   1         1      1
>   1         1      0
>   1         1      0
>   1         0      1
>   1         0      1
>   1         0      0
>   1         0      0
>   1         0      0
>   1         0      0
>   1         0      0
>   2         1      1
>   .
>  .
>
>
> But I wasn't able to do it. In fact I wrote a small function, and use
> "lapply" to get what I want. It worked well, and did give me what I want.
> But I wasn't able to collapse all the returns into one single data frame
> for subsequent analysis.
>
> The function I wrote:
>
> subset = function(i){
> d = c(rep(data[i,1], data[i,4]), rep(data[i,2], data[i,4]), rep(0:1,
> c(data[i,4] - data[i,3],data[i,3])))
> d = matrix(d, data[i,4],3)
> d
> }
>
> then:
>
> Data = lapply(1:6, subset)
> Data
>
> Therefore, I tried to write a loop. But no matter how I tried, I can't get
> what I want.
>
> Any idea?
>
> Thank you so much!
>
> Best,
>
>
> --
> Cheenghee Masaki Koh, MSW, MS(c), PhD Student
> School of Social Service Administration
> Department of Health Studies, Division of Biological Science
> University of Chicago
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to Un-group a grouped data set?

2012-05-14 Thread Cheenghee AM Koh
Hello, R-fellows,

I have a question that I really don't know how to solve. I have spent hours
searching online for possible solutions, but in vain. If anyone could help
me handle this issue, it would be much appreciated!

I have a "grouped" dataset like this:

> data
  Study TX AEs  N
1     1  1   3  5
2     1  0   2  7
3     2  1   1 10
4     2  0   2  7
5     3  1   1  8
6     3  0   1  4

where Study is the study id, TX is the treatment indicator, AEs is how many
people in this trial are positive, and N is the number of subjects. So row 1
stands for the treatment arm of study 1, where there are 5 subjects and 3 of
them are positive. Row 2 stands for the control arm of study 1, where there
are 7 subjects and 2 of them are positive.

Now I would like to "un-group them", make it like:

Study  TX  AEs
    1   1    1
    1   1    1
    1   1    1
    1   1    0
    1   1    0
    1   0    1
    1   0    1
    1   0    0
    1   0    0
    1   0    0
    1   0    0
    1   0    0
    2   1    1
    .
    .


But I wasn't able to do it. In fact I wrote a small function and used
"lapply" to get what I want. It worked well, and did give me what I want,
but I wasn't able to collapse all the returned pieces into one single data
frame for subsequent analysis.

The function I wrote:

subset = function(i){
d = c(rep(data[i,1], data[i,4]), rep(data[i,2], data[i,4]), rep(0:1,
c(data[i,4] - data[i,3],data[i,3])))
d = matrix(d, data[i,4],3)
d
}

then:

Data = lapply(1:6, subset)
Data

Therefore, I tried to write a loop. But no matter how I tried, I can't get
what I want.
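
One possible way to get a single data frame, sketched under the assumption that `data` has exactly the four columns shown above (Study, TX, AEs, N): build one small data frame per row with lapply and stack them with do.call(rbind, ...), or avoid the list altogether with rep(). The helper name `ungroup_row` is made up for illustration; positives are listed first to match the desired output.

```r
# Grouped data as posted (one row per study arm).
data <- data.frame(Study = c(1, 1, 2, 2, 3, 3),
                   TX    = c(1, 0, 1, 0, 1, 0),
                   AEs   = c(3, 2, 1, 2, 1, 1),
                   N     = c(5, 7, 10, 7, 8, 4))

# Hypothetical helper: expand row i into N rows, AEs of them with outcome 1.
ungroup_row <- function(i) {
  n   <- data$N[i]
  pos <- data$AEs[i]
  data.frame(Study = rep(data$Study[i], n),
             TX    = rep(data$TX[i], n),
             AEs   = rep(c(1, 0), c(pos, n - pos)))
}

# lapply returns a list of data frames; do.call(rbind, ...) stacks them.
long <- do.call(rbind, lapply(seq_len(nrow(data)), ungroup_row))

# Same result without the intermediate list: repeat each row index N times.
idx   <- rep(seq_len(nrow(data)), data$N)
long2 <- data.frame(Study = data$Study[idx],
                    TX    = data$TX[idx],
                    AEs   = unlist(Map(function(pos, n) rep(c(1, 0), c(pos, n - pos)),
                                       data$AEs, data$N)))

head(long, 12)
```

The row count of the result should equal sum(data$N), which is a quick sanity check before further analysis.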

Any idea?

Thank you so much!

Best,


-- 
Cheenghee Masaki Koh, MSW, MS(c), PhD Student
School of Social Service Administration
Department of Health Studies, Division of Biological Science
University of Chicago




Re: [R] how to create a new data given a vector of variable names

2012-05-14 Thread christy
oh, thank you very much!




 From: Michael Weylandt [via R] 
To: christy  
Sent: Tuesday, May 15, 2012 12:28 AM
Subject: Re: how to create a new data given a vector of variable names


I think you're making this too hard: 

x[, vars] 

should do it. 

For a smaller example, consider 

x <- data.frame(a = 1:5, b = rnorm(5), c = letters[1:5]) 

x[, c("b","a")] 

Best, 
Michael 

On Tue, May 15, 2012 at 1:12 AM, christy <[hidden email]> wrote: 

> hi, please help me on this. I'm very new to R.  I've been figuring out how 
> to 
> do this the whole day, and I could not get the correct R code. 
> 
> Suppose I have a dataframe called x and it consists of 10variables. 
> 
>>x 
> 
>          h 1        h 2        h 3         h 4        h 5        h 6 
> h 7        h 8        h 9 
> 1  0.38971928 0.62884802 0.32708216 0.093909834 0.57773251 0.41258918 
> 0.37360577 0.65259411 0.88204799 
> 2  0.51890830 0.15949863 0.75715149 0.871781822 0.06321826 0.91844114 
> 0.05692871 0.84588084 0.77173376 
> 3  0.94057256 0.16100731 0.80961141 0.239716639 0.55804412 0.42854829 
> 0.54987115 0.68416629 0.24353692 
> 4  0.19895720 0.52955693 0.98471869 0.378197899 0.16774788 0.68029534 
> 0.42039730 0.82217244 0.74397124 
> 5  0.27899679 0.29145024 0.07198476 0.732466508 0.14887818 0.90658800 
> 0.64186885 0.66542828 0.98182923 
> 6  0.69375077 0.05840897 0.77325437 0.866099979 0.75063858 0.94230759 
> 0.72182389 0.65574673 0.27406027 
> 7  0.35033643 0.22525597 0.81657974 0.000762193 0.88383211 0.98120966 
> 0.29471244 0.32119662 0.10313222 
> 8  0.40616362 0.37962815 0.80085463 0.919385580 0.47183711 0.15078169 
> 0.93693666 0.24638847 0.12288727 
> 9  0.07939773 0.39030956 0.50235863 0.516507293 0.49247563 0.30633870 
> 0.45665595 0.25479969 0.34689089 
> 10 0.68677267 0.32089352 0.61330153 0.444584299 0.15588483 0.30584289 
> 0.78482250 0.55628942 0.81763581 
> 11 0.47406350 0.75586693 0.19546691 0.698137899 0.47609057 0.56439955 
> 0.33120842 0.54064656 0.36384570 
> 12 0.73796417 0.32741375 0.60800036 0.249716033 0.21919825 0.14749886 
> 0.53495852 0.74101013 0.69063797 
> 13 0.87890769 0.77631054 0.76307442 0.561350947 0.73865259 0.58031305 
> 0.06972116 0.53286669 0.09135791 
> 14 0.91022993 0.52290742 0.21219953 0.209784849 0.90892801 0.03580675 
> 0.19870342 0.79300520 0.85703181 
> 15 0.11331488 0.67744821 0.96226396 0.350925439 0.32038355 0.39465379 
> 0.38653925 0.09538576 0.04436648 
> 16 0.71950535 0.77548893 0.60316799 0.123102348 0.1028 0.05392754 
> 0.17026972 0.17092818 0.35550621 
> 17 0.29593089 0.75526797 0.52088596 0.629731365 0.13592383 0.20219434 
> 0.63906356 0.55297375 0.30580842 
> 18 0.02915505 0.56244353 0.62397566 0.770202648 0.07929744 0.08574671 
> 0.36506494 0.47563923 0.84796898 
> 19 0.27369892 0.95739919 0.63443013 0.810165262 0.10230919 0.52165672 
> 0.84467928 0.60684813 0.02245486 
> 20 0.31494866 0.26169713 0.84314426 0.239598362 0.59996122 0.46954979 
> 0.99728261 0.28905422 0.91817317 
>          h 10 
> 1  0.552413907 
> 2  0.130387427 
> 3  0.523121318 
> 4  0.61351 
> 5  0.005378552 
> 6  0.275925081 
> 7  0.939273614 
> 8  0.152024143 
> 9  0.216325412 
> 10 0.577869906 
> 11 0.484999656 
> 12 0.686217251 
> 13 0.920351777 
> 14 0.924500707 
> 15 0.577019180 
> 16 0.824386203 
> 17 0.130089829 
> 18 0.539668426 
> 19 0.776488706 
> 20 0.992742685 
> 
> 
> and i have a vector of strings which I called vars. 
> 
>> vars 
> [1] "h 8"  "h 4"   "h 10" "h 1" 
> 
> 
> the variables inside vars are subset of the column names of x.  The order is 
> important. In my dataframe, I want to obtain the following: 
> 
>>newdata 
>           h 8        h 4        h 10         h 1 
>  [1,] 0.65259411 0.093909834 0.552413907 0.38971928 
>  [2,] 0.84588084 0.871781822 0.130387427 0.51890830 
>  [3,] 0.68416629 0.239716639 0.523121318 0.94057256 
>  [4,] 0.82217244 0.378197899 0.61351 0.19895720 
>  [5,] 0.66542828 0.732466508 0.005378552 0.27899679 
>  [6,] 0.65574673 0.866099979 0.275925081 0.69375077 
>  [7,] 0.32119662 0.000762193 0.939273614 0.35033643 
>  [8,] 0.24638847 0.919385580 0.152024143 0.40616362 
>  [9,] 0.25479969 0.516507293 0.216325412 0.07939773 
> [10,] 0.55628942 0.444584299 0.577869906 0.68677267 
> [11,] 0.54064656 0.698137899 0.484999656 0.47406350 
> [12,] 0.74101013 0.249716033 0.686217251 0.73796417 
> [13,] 0.53286669 0.561350947 0.920351777 0.87890769 
> [14,] 0.79300520 0.209784849 0.924500707 0.91022993 
> [15,] 0.09538576 0.350925439 0.577019180 0.11331488 
> [16,] 0.17092818 0.123102348 0.824386203 0.71950535 
> [17,] 0.55297375 0.629731365 0.130089829 0.29593089 
> [18,] 0.47563923 0.770202648 0.539668426 0.02915505 
> [19,] 0.60684813 0.810165262 0.776488706 0.27369892 
> [20,] 0.28905422 0.239598362 0.992742685 0.31494866 
> 
> I tried to do the following but it does not give me what I want: 
> 
> x[names(x)==names(x[names(x)%in%vars])

Re: [R] how to create a new data given a vector of variable names

2012-05-14 Thread R. Michael Weylandt
I think you're making this too hard:

x[, vars]

should do it.

For a smaller example, consider

x <- data.frame(a = 1:5, b = rnorm(5), c = letters[1:5])

x[, c("b","a")]
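
A small sketch of why indexing by the name vector works where a logical mask does not, assuming made-up random data with the same column names as in the post ("h 1" to "h 10"): character indexing returns columns in the order of `vars`, while `%in%` only marks which columns to keep and so preserves the original column order.

```r
set.seed(1)
x <- as.data.frame(matrix(runif(20), nrow = 2))
names(x) <- paste("h", 1:10)          # "h 1" ... "h 10", as in the post
vars <- c("h 8", "h 4", "h 10", "h 1")

# Character indexing: columns come back in the order given by vars.
newdata <- x[, vars]
names(newdata)

# Logical mask: columns come back in their original order instead.
masked <- x[names(x) %in% vars]
names(masked)
```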

Best,
Michael

On Tue, May 15, 2012 at 1:12 AM, christy  wrote:
> hi, please help me on this. I'm very new to R.  I've been figuring out how to
> do this the whole day, and I could not get the correct R code.
>
> Suppose I have a dataframe called x and it consists of 10variables.
>
>>x
>
>          h 1        h 2        h 3         h 4        h 5        h 6
> h 7        h 8        h 9
> 1  0.38971928 0.62884802 0.32708216 0.093909834 0.57773251 0.41258918
> 0.37360577 0.65259411 0.88204799
> 2  0.51890830 0.15949863 0.75715149 0.871781822 0.06321826 0.91844114
> 0.05692871 0.84588084 0.77173376
> 3  0.94057256 0.16100731 0.80961141 0.239716639 0.55804412 0.42854829
> 0.54987115 0.68416629 0.24353692
> 4  0.19895720 0.52955693 0.98471869 0.378197899 0.16774788 0.68029534
> 0.42039730 0.82217244 0.74397124
> 5  0.27899679 0.29145024 0.07198476 0.732466508 0.14887818 0.90658800
> 0.64186885 0.66542828 0.98182923
> 6  0.69375077 0.05840897 0.77325437 0.866099979 0.75063858 0.94230759
> 0.72182389 0.65574673 0.27406027
> 7  0.35033643 0.22525597 0.81657974 0.000762193 0.88383211 0.98120966
> 0.29471244 0.32119662 0.10313222
> 8  0.40616362 0.37962815 0.80085463 0.919385580 0.47183711 0.15078169
> 0.93693666 0.24638847 0.12288727
> 9  0.07939773 0.39030956 0.50235863 0.516507293 0.49247563 0.30633870
> 0.45665595 0.25479969 0.34689089
> 10 0.68677267 0.32089352 0.61330153 0.444584299 0.15588483 0.30584289
> 0.78482250 0.55628942 0.81763581
> 11 0.47406350 0.75586693 0.19546691 0.698137899 0.47609057 0.56439955
> 0.33120842 0.54064656 0.36384570
> 12 0.73796417 0.32741375 0.60800036 0.249716033 0.21919825 0.14749886
> 0.53495852 0.74101013 0.69063797
> 13 0.87890769 0.77631054 0.76307442 0.561350947 0.73865259 0.58031305
> 0.06972116 0.53286669 0.09135791
> 14 0.91022993 0.52290742 0.21219953 0.209784849 0.90892801 0.03580675
> 0.19870342 0.79300520 0.85703181
> 15 0.11331488 0.67744821 0.96226396 0.350925439 0.32038355 0.39465379
> 0.38653925 0.09538576 0.04436648
> 16 0.71950535 0.77548893 0.60316799 0.123102348 0.1028 0.05392754
> 0.17026972 0.17092818 0.35550621
> 17 0.29593089 0.75526797 0.52088596 0.629731365 0.13592383 0.20219434
> 0.63906356 0.55297375 0.30580842
> 18 0.02915505 0.56244353 0.62397566 0.770202648 0.07929744 0.08574671
> 0.36506494 0.47563923 0.84796898
> 19 0.27369892 0.95739919 0.63443013 0.810165262 0.10230919 0.52165672
> 0.84467928 0.60684813 0.02245486
> 20 0.31494866 0.26169713 0.84314426 0.239598362 0.59996122 0.46954979
> 0.99728261 0.28905422 0.91817317
>          h 10
> 1  0.552413907
> 2  0.130387427
> 3  0.523121318
> 4  0.61351
> 5  0.005378552
> 6  0.275925081
> 7  0.939273614
> 8  0.152024143
> 9  0.216325412
> 10 0.577869906
> 11 0.484999656
> 12 0.686217251
> 13 0.920351777
> 14 0.924500707
> 15 0.577019180
> 16 0.824386203
> 17 0.130089829
> 18 0.539668426
> 19 0.776488706
> 20 0.992742685
>
>
> and i have a vector of strings which I called vars.
>
>> vars
> [1] "h 8"  "h 4"   "h 10" "h 1"
>
>
> the variables inside vars are subset of the column names of x.  The order is
> important. In my dataframe, I want to obtain the following:
>
>>newdata
>           h 8        h 4        h 10         h 1
>  [1,] 0.65259411 0.093909834 0.552413907 0.38971928
>  [2,] 0.84588084 0.871781822 0.130387427 0.51890830
>  [3,] 0.68416629 0.239716639 0.523121318 0.94057256
>  [4,] 0.82217244 0.378197899 0.61351 0.19895720
>  [5,] 0.66542828 0.732466508 0.005378552 0.27899679
>  [6,] 0.65574673 0.866099979 0.275925081 0.69375077
>  [7,] 0.32119662 0.000762193 0.939273614 0.35033643
>  [8,] 0.24638847 0.919385580 0.152024143 0.40616362
>  [9,] 0.25479969 0.516507293 0.216325412 0.07939773
> [10,] 0.55628942 0.444584299 0.577869906 0.68677267
> [11,] 0.54064656 0.698137899 0.484999656 0.47406350
> [12,] 0.74101013 0.249716033 0.686217251 0.73796417
> [13,] 0.53286669 0.561350947 0.920351777 0.87890769
> [14,] 0.79300520 0.209784849 0.924500707 0.91022993
> [15,] 0.09538576 0.350925439 0.577019180 0.11331488
> [16,] 0.17092818 0.123102348 0.824386203 0.71950535
> [17,] 0.55297375 0.629731365 0.130089829 0.29593089
> [18,] 0.47563923 0.770202648 0.539668426 0.02915505
> [19,] 0.60684813 0.810165262 0.776488706 0.27369892
> [20,] 0.28905422 0.239598362 0.992742685 0.31494866
>
> I tried to do the following but it does not give me what I want:
>
> x[names(x)==names(x[names(x)%in%vars])]
>
>
> Thank you very much.
>
> ~Christy
>
>
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/how-to-create-a-new-data-given-a-vector-of-variable-names-tp4630024.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.

Re: [R] Inf and lazy evaluation

2012-05-14 Thread R. Michael Weylandt
The only place I know lazy evaluation really is visible and widely
used is in the passing of function arguments. It's what allows magic
like

zz <- 1:5
plot(zz)

to know your variable was called "zz." It can also show up in some
places through the promise mechanism, but you have to do a little bit
of work to see them:

zz <- lapply(1:3, function(i) function(x) x^i)

zz[[2]](2)

Without lazy evaluation this would have been 4. Sometimes this winds
up hurting folks -- I'm not sure if it has a "good reason" to be there
or if it's a consequence of lazy mechanisms elsewhere (which improve
overall performance).
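
A minimal sketch of the promise behaviour described above, using hypothetical helper names (`make_power`, `make_power_forced`). As far as I know, lapply() forces its arguments from R 3.2.0 onwards, so the lapply() example may return 4 on a current installation; the version below does not depend on lapply() and shows the same effect with an explicit variable.

```r
# i is a promise: it is not evaluated until x^i is first computed.
make_power <- function(i) function(x) x^i

# force(i) evaluates the promise immediately, when the closure is created.
make_power_forced <- function(i) {
  force(i)
  function(x) x^i
}

k <- 2
f_lazy   <- make_power(k)
f_forced <- make_power_forced(k)
k <- 3                      # changed before f_lazy is ever called

f_lazy(2)    # 8: the promise for i is evaluated now and sees k == 3
f_forced(2)  # 4: i was fixed at 2 when the closure was made
```

Calling force() on the argument is the usual way to avoid the surprise when building closures in a loop.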

But I don't believe R allows lazy constructors in any context.

Best,
Michael

On Mon, May 14, 2012 at 4:29 PM, J Toll  wrote:
> Thank you all for the replies.
>
> On Mon, May 14, 2012 at 2:45 PM, R. Michael Weylandt
>  wrote:
>> R is lazy, but not quite that lazy ;-)
>
> Oh, what is this world coming to when you can't count on laziness to
> be lazy. ;)  I should probably stop reading about Haskell and their
> lazy way of doing things.
>
> As a relatively naive observation, in R, it seems like argument
> recycling kind of breaks the power of lazy evaluation.
>
> Thanks for the suggestion of list.files()
>
>
> James



Re: [R] how to create a new data given a vector of variable names

2012-05-14 Thread Jorge I Velez
Hi Christy,

Try either of the following

subset(x, select = vars)
x[, vars]

HTH,
Jorge.-


On Tue, May 15, 2012 at 1:12 AM, christy < >wrote:

> hi, please help me on this. I'm very new to R.  I've been figuring out how
> to
> do this the whole day, and I could not get the correct R code.
>
> Suppose I have a dataframe called x and it consists of 10variables.
>
> >x
>
>          h 1        h 2        h 3         h 4        h 5        h 6
> h 7        h 8        h 9
> 1  0.38971928 0.62884802 0.32708216 0.093909834 0.57773251 0.41258918
> 0.37360577 0.65259411 0.88204799
> 2  0.51890830 0.15949863 0.75715149 0.871781822 0.06321826 0.91844114
> 0.05692871 0.84588084 0.77173376
> 3  0.94057256 0.16100731 0.80961141 0.239716639 0.55804412 0.42854829
> 0.54987115 0.68416629 0.24353692
> 4  0.19895720 0.52955693 0.98471869 0.378197899 0.16774788 0.68029534
> 0.42039730 0.82217244 0.74397124
> 5  0.27899679 0.29145024 0.07198476 0.732466508 0.14887818 0.90658800
> 0.64186885 0.66542828 0.98182923
> 6  0.69375077 0.05840897 0.77325437 0.866099979 0.75063858 0.94230759
> 0.72182389 0.65574673 0.27406027
> 7  0.35033643 0.22525597 0.81657974 0.000762193 0.88383211 0.98120966
> 0.29471244 0.32119662 0.10313222
> 8  0.40616362 0.37962815 0.80085463 0.919385580 0.47183711 0.15078169
> 0.93693666 0.24638847 0.12288727
> 9  0.07939773 0.39030956 0.50235863 0.516507293 0.49247563 0.30633870
> 0.45665595 0.25479969 0.34689089
> 10 0.68677267 0.32089352 0.61330153 0.444584299 0.15588483 0.30584289
> 0.78482250 0.55628942 0.81763581
> 11 0.47406350 0.75586693 0.19546691 0.698137899 0.47609057 0.56439955
> 0.33120842 0.54064656 0.36384570
> 12 0.73796417 0.32741375 0.60800036 0.249716033 0.21919825 0.14749886
> 0.53495852 0.74101013 0.69063797
> 13 0.87890769 0.77631054 0.76307442 0.561350947 0.73865259 0.58031305
> 0.06972116 0.53286669 0.09135791
> 14 0.91022993 0.52290742 0.21219953 0.209784849 0.90892801 0.03580675
> 0.19870342 0.79300520 0.85703181
> 15 0.11331488 0.67744821 0.96226396 0.350925439 0.32038355 0.39465379
> 0.38653925 0.09538576 0.04436648
> 16 0.71950535 0.77548893 0.60316799 0.123102348 0.1028 0.05392754
> 0.17026972 0.17092818 0.35550621
> 17 0.29593089 0.75526797 0.52088596 0.629731365 0.13592383 0.20219434
> 0.63906356 0.55297375 0.30580842
> 18 0.02915505 0.56244353 0.62397566 0.770202648 0.07929744 0.08574671
> 0.36506494 0.47563923 0.84796898
> 19 0.27369892 0.95739919 0.63443013 0.810165262 0.10230919 0.52165672
> 0.84467928 0.60684813 0.02245486
> 20 0.31494866 0.26169713 0.84314426 0.239598362 0.59996122 0.46954979
> 0.99728261 0.28905422 0.91817317
>          h 10
> 1  0.552413907
> 2  0.130387427
> 3  0.523121318
> 4  0.61351
> 5  0.005378552
> 6  0.275925081
> 7  0.939273614
> 8  0.152024143
> 9  0.216325412
> 10 0.577869906
> 11 0.484999656
> 12 0.686217251
> 13 0.920351777
> 14 0.924500707
> 15 0.577019180
> 16 0.824386203
> 17 0.130089829
> 18 0.539668426
> 19 0.776488706
> 20 0.992742685
>
>
> and i have a vector of strings which I called vars.
>
> > vars
> [1] "h 8"  "h 4"   "h 10" "h 1"
>
>
> the variables inside vars are subset of the column names of x.  The order
> is
> important. In my dataframe, I want to obtain the following:
>
> >newdata
>           h 8        h 4        h 10         h 1
>  [1,] 0.65259411 0.093909834 0.552413907 0.38971928
>  [2,] 0.84588084 0.871781822 0.130387427 0.51890830
>  [3,] 0.68416629 0.239716639 0.523121318 0.94057256
>  [4,] 0.82217244 0.378197899 0.61351 0.19895720
>  [5,] 0.66542828 0.732466508 0.005378552 0.27899679
>  [6,] 0.65574673 0.866099979 0.275925081 0.69375077
>  [7,] 0.32119662 0.000762193 0.939273614 0.35033643
>  [8,] 0.24638847 0.919385580 0.152024143 0.40616362
>  [9,] 0.25479969 0.516507293 0.216325412 0.07939773
> [10,] 0.55628942 0.444584299 0.577869906 0.68677267
> [11,] 0.54064656 0.698137899 0.484999656 0.47406350
> [12,] 0.74101013 0.249716033 0.686217251 0.73796417
> [13,] 0.53286669 0.561350947 0.920351777 0.87890769
> [14,] 0.79300520 0.209784849 0.924500707 0.91022993
> [15,] 0.09538576 0.350925439 0.577019180 0.11331488
> [16,] 0.17092818 0.123102348 0.824386203 0.71950535
> [17,] 0.55297375 0.629731365 0.130089829 0.29593089
> [18,] 0.47563923 0.770202648 0.539668426 0.02915505
> [19,] 0.60684813 0.810165262 0.776488706 0.27369892
> [20,] 0.28905422 0.239598362 0.992742685 0.31494866
>
> I tried to do the following but it does not give me what I want:
>
> x[names(x)==names(x[names(x)%in%vars])]
>
>
> Thank you very much.
>
> ~Christy
>
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/how-to-create-a-new-data-given-a-vector-of-variable-names-tp4630024.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducib

[R] how to create a new data given a vector of variable names

2012-05-14 Thread christy
Hi, please help me with this. I'm very new to R. I've been trying to figure
out how to do this the whole day, and I could not get the correct R code.

Suppose I have a data frame called x that consists of 10 variables.

>x

         h 1        h 2        h 3         h 4        h 5        h 6
h 7        h 8        h 9
1  0.38971928 0.62884802 0.32708216 0.093909834 0.57773251 0.41258918
0.37360577 0.65259411 0.88204799
2  0.51890830 0.15949863 0.75715149 0.871781822 0.06321826 0.91844114
0.05692871 0.84588084 0.77173376
3  0.94057256 0.16100731 0.80961141 0.239716639 0.55804412 0.42854829
0.54987115 0.68416629 0.24353692
4  0.19895720 0.52955693 0.98471869 0.378197899 0.16774788 0.68029534
0.42039730 0.82217244 0.74397124
5  0.27899679 0.29145024 0.07198476 0.732466508 0.14887818 0.90658800
0.64186885 0.66542828 0.98182923
6  0.69375077 0.05840897 0.77325437 0.866099979 0.75063858 0.94230759
0.72182389 0.65574673 0.27406027
7  0.35033643 0.22525597 0.81657974 0.000762193 0.88383211 0.98120966
0.29471244 0.32119662 0.10313222
8  0.40616362 0.37962815 0.80085463 0.919385580 0.47183711 0.15078169
0.93693666 0.24638847 0.12288727
9  0.07939773 0.39030956 0.50235863 0.516507293 0.49247563 0.30633870
0.45665595 0.25479969 0.34689089
10 0.68677267 0.32089352 0.61330153 0.444584299 0.15588483 0.30584289
0.78482250 0.55628942 0.81763581
11 0.47406350 0.75586693 0.19546691 0.698137899 0.47609057 0.56439955
0.33120842 0.54064656 0.36384570
12 0.73796417 0.32741375 0.60800036 0.249716033 0.21919825 0.14749886
0.53495852 0.74101013 0.69063797
13 0.87890769 0.77631054 0.76307442 0.561350947 0.73865259 0.58031305
0.06972116 0.53286669 0.09135791
14 0.91022993 0.52290742 0.21219953 0.209784849 0.90892801 0.03580675
0.19870342 0.79300520 0.85703181
15 0.11331488 0.67744821 0.96226396 0.350925439 0.32038355 0.39465379
0.38653925 0.09538576 0.04436648
16 0.71950535 0.77548893 0.60316799 0.123102348 0.1028 0.05392754
0.17026972 0.17092818 0.35550621
17 0.29593089 0.75526797 0.52088596 0.629731365 0.13592383 0.20219434
0.63906356 0.55297375 0.30580842
18 0.02915505 0.56244353 0.62397566 0.770202648 0.07929744 0.08574671
0.36506494 0.47563923 0.84796898
19 0.27369892 0.95739919 0.63443013 0.810165262 0.10230919 0.52165672
0.84467928 0.60684813 0.02245486
20 0.31494866 0.26169713 0.84314426 0.239598362 0.59996122 0.46954979
0.99728261 0.28905422 0.91817317
         h 10
1  0.552413907
2  0.130387427
3  0.523121318
4  0.61351
5  0.005378552
6  0.275925081
7  0.939273614
8  0.152024143
9  0.216325412
10 0.577869906
11 0.484999656
12 0.686217251
13 0.920351777
14 0.924500707
15 0.577019180
16 0.824386203
17 0.130089829
18 0.539668426
19 0.776488706
20 0.992742685


And I have a vector of strings which I called vars.

> vars
[1] "h 8"  "h 4"   "h 10" "h 1" 


The variables inside vars are a subset of the column names of x. The order is
important. From my data frame, I want to obtain the following:

>newdata
          h 8        h 4        h 10         h 1
 [1,] 0.65259411 0.093909834 0.552413907 0.38971928
 [2,] 0.84588084 0.871781822 0.130387427 0.51890830
 [3,] 0.68416629 0.239716639 0.523121318 0.94057256
 [4,] 0.82217244 0.378197899 0.61351 0.19895720
 [5,] 0.66542828 0.732466508 0.005378552 0.27899679
 [6,] 0.65574673 0.866099979 0.275925081 0.69375077
 [7,] 0.32119662 0.000762193 0.939273614 0.35033643
 [8,] 0.24638847 0.919385580 0.152024143 0.40616362
 [9,] 0.25479969 0.516507293 0.216325412 0.07939773
[10,] 0.55628942 0.444584299 0.577869906 0.68677267
[11,] 0.54064656 0.698137899 0.484999656 0.47406350
[12,] 0.74101013 0.249716033 0.686217251 0.73796417
[13,] 0.53286669 0.561350947 0.920351777 0.87890769
[14,] 0.79300520 0.209784849 0.924500707 0.91022993
[15,] 0.09538576 0.350925439 0.577019180 0.11331488
[16,] 0.17092818 0.123102348 0.824386203 0.71950535
[17,] 0.55297375 0.629731365 0.130089829 0.29593089
[18,] 0.47563923 0.770202648 0.539668426 0.02915505
[19,] 0.60684813 0.810165262 0.776488706 0.27369892
[20,] 0.28905422 0.239598362 0.992742685 0.31494866

I tried to do the following but it does not give me what I want: 

x[names(x)==names(x[names(x)%in%vars])]


Thank you very much.

~Christy




--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-create-a-new-data-given-a-vector-of-variable-names-tp4630024.html
Sent from the R help mailing list archive at Nabble.com.



[R] OpenStreetMap Library

2012-05-14 Thread Pierre-Olivier Chasset
Hello,

I am trying to use the OpenStreetMap library on Mac OS X Lion with Java 6 & 7.

Loading the library is not a problem:
> library("OpenStreetMap")
Loading required package: rJava
Loading required package: sp
Loading required package: maptools
Loading required package: foreign
Loading required package: lattice
Checking rgeos availability: TRUE
Loading required package: raster
raster 1.9-92 (1-May-2012)

I get an issue, when I try to use the openmap function:
> map <- openmap(c(52,-5), c(42,8))
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl,  : 
  java.lang.InternalError: Can't start the AWT because Java was started on the 
first thread.  Make sure StartOnFirstThread is not specified in your 
application's Info.plist or on the command line

Do you have any idea how to get it to work?

Best regards,

Pierre-Olivier





Re: [R] package rgl not installing on Mac OS X

2012-05-14 Thread Duncan Murdoch

On 14/05/2012 6:12 PM, Brenda McCowan wrote:

This is what I am getting:


install.packages("rgl")

Installing package(s) into ‘/Users/brenda/Library/R/2.15/library’
(as ‘lib’ is unspecified)
trying URL 'http://cran.cnr.Berkeley.edu/bin/macosx/leopard/contrib/2.15/rgl_0.92.880.tgz'
Content type 'application/x-gzip' length 10757858 bytes (10.3 Mb)
opened URL
=
downloaded 411 Kb

rgl/fonts/FreeSerif.ttf: Truncated tar archive
tar: Error exit delayed from previous errors.

The downloaded binary packages are in

/var/folders/YG/YGlE8Mx6HHKHOY6P5UdzNU+++TM/-Tmp-//RtmpambK9X/downloaded_packages
Warning messages:
1: In download.file(url, destfile, method, mode = "wb", ...) :
   downloaded length 421070 != reported length 10757858
2: 'tar' returned non-zero exit code 1


Sure looks like a download error.  I'd try again, or try a different 
mirror if you get the same error.


You could also try getting it from R-forge.r-project.org; it has a very 
slightly newer version (0.92.881) than CRAN.


Duncan Murdoch




Thanks!

Brenda



On 5/14/12 3:01 PM, "Steve Lianoglou"
wrote:


Hi,

Did you simply try:

install.packages("adehabitat")

or, if that doesn't work, maybe:

install.packages("adehabitat", type="source")

Or change your cran mirror?

-steve

On Mon, May 14, 2012 at 5:44 PM, Kristi Glover
  wrote:


Hi R- User,I tried to load package adehabitat for R version 2.15.0. But,  I
got error. even I download and save in local drive and tried to install from
the local file. but still i got the following errors
  install.packages("adehabitat", repos='C:/adehabitat_1.8.10.tgz')


package ‘adehabitat’ is not available (for R version 2.15.0)


install.packages("adehabitat",
repos='http://cran.skazkaforyou.com/bin/macosx/leopard/contrib/2.15/adehabitat_1.8.10.tgz')

Warning: unable to access index for repository
http://cran.skazkaforyou.com/bin/macosx/leopard/contrib/2.15/adehabitat_1.8.10.tgz/bin/macosx/leopard/contrib/2.15
Warning message:
package ‘adehabitat’ is not available (for R version 2.15.0)
I need to install it as I have been using a script that has used this package
too besides other packages.
Would any one help me how I can install the package on Mac OS X?
cheers,
KG









Re: [R] Scraping a web page.

2012-05-14 Thread J Toll
On Mon, May 14, 2012 at 4:17 PM, Keith Weintraub  wrote:
> Folks,
>  I want to scrape a series of web-page sources for strings like the following:
>
> "/en/Ships/A-8605507.html"
> "/en/Ships/Aalborg-8122830.html"
>
> which appear in an href inside an  tag inside a  tag inside a table.
>
> In fact all I want is the (exactly) 7-digit number before ".html".
>
> The good news is that as far as I can tell the  tag is always on its 
> own line so some kind of line-by-line grep should suffice once I figure out 
> the following:
>
> What is the best package/command to use to get the source of a web page. I 
> tried using something like:
> if(url.exists("http://www.omegahat.org/RCurl")) {
>  h = basicTextGatherer()
>  curlPerform(url = "http://www.omegahat.org/RCurl", writefunction = h$update)
>   # Now read the text that was cumulated during the query response.
>  h$value()
> }
>
> which works except that I get one long streamed html doc without the line 
> breaks.

You could use:

h <- readLines("http://www.omegahat.org/RCurl")

-- or --

download.file(url = "http://www.omegahat.org/RCurl", destfile = "tmp.html")
h = scan("tmp.html", what = "", sep = "\n")

and then use grep or the XML package for processing.
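
Once the page is available as a character vector of lines, the 7-digit numbers can be pulled out with grep() and sub(). This is only a sketch: the sample lines below are invented to resemble the hrefs quoted above, and the pattern assumes the number always sits between a hyphen and ".html".

```r
# Invented sample lines standing in for the result of readLines(url).
lines <- c('<td><a href="/en/Ships/A-8605507.html">A</a></td>',
           '<td><a href="/en/Ships/Aalborg-8122830.html">Aalborg</a></td>',
           '<td>no link here</td>')

# Keep only lines containing a hyphen, exactly 7 digits, then ".html".
hits <- grep("-[0-9]{7}\\.html", lines, value = TRUE)

# Replace each matching line by the captured 7-digit group.
ids <- sub(".*-([0-9]{7})\\.html.*", "\\1", hits)
ids
```

If a line could hold more than one link, regmatches() with gregexpr() would be the safer choice, since sub() keeps only one match per line.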

HTH

James



Re: [R] Interpreting Q-Q Plots

2012-05-14 Thread S Ellison
>And since I don't have the experience, the only way to gain it is by
>  learning from those with practice reading chicken entrails.
This can be hard on the chicken population.

Try comparing QQ plots for simulated random data from different distributions 
with something more immediately interpretable on the measurement scale,  such 
as dot plots, box plots and density plots. That should add up to a fair bit of 
experience quite quickly. 

#Example
par(mfrow=c(1,2))
qqnorm(x<-rlnorm(200, 1,0.5))
qqline(x)
plot(density(x))

qqnorm(x<-rnorm(200, sample(c(0,4), 200, replace=TRUE))) #bimodal
qqline(x)
plot(density(x))


and so on.

Notice that qqnorm's vertical scale by default corresponds to the horizontal 
scale in density plots and stripcharts. I personally prefer datax=TRUE, but 
really that's only a choice about whether to face north or east when reading 
the entrails.
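
To see the two orientations side by side, here is a short sketch using the same simulated lognormal data as in the example above (the seed and panel titles are arbitrary choices). With datax = TRUE the data go on the horizontal axis, so the plot lines up with a density plot of the same values.

```r
set.seed(1)
x <- rlnorm(200, 1, 0.5)

par(mfrow = c(1, 2))

# Default: theoretical quantiles on x, data on y.
q_default <- qqnorm(x, main = "datax = FALSE")
qqline(x)

# datax = TRUE: data on x, theoretical quantiles on y.
q_datax <- qqnorm(x, datax = TRUE, main = "datax = TRUE")
qqline(x, datax = TRUE)
```

qqnorm() returns the plotted coordinates invisibly, so the swap can also be checked from the returned list rather than by eye.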




Re: [R] package ‘adehabitat’ is not available (for R version 2.15.0) on Mac OS X

2012-05-14 Thread David Winsemius


On May 14, 2012, at 5:44 PM, Kristi Glover wrote:



Hi R- User,I tried to load package adehabitat for R version 2.15.0.  
But,  I got error. even I download and save in local drive and tried  
to install from the local file. but still i got the following errors

install.packages("adehabitat", repos='C:/adehabitat_1.8.10.tgz')


Now _that_ would be a strange directory specification for a Mac-sited  
installation of R. The 'C:' drive spec is usually specific to Windows  
devices.




package ‘adehabitat’ is not available (for R version 2.15.0)


install.packages("adehabitat", 
repos='http://cran.skazkaforyou.com/bin/macosx/leopard/contrib/2.15/adehabitat_1.8.10.tgz')


That repository had a binary copy as of a minute or two ago. Could it  
have been temporarily unavailable and you just now need to try again?
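
The warning quoted in this message shows R appending its own contrib path to the string given as `repos`, which suggests `repos` is meant to be a repository root rather than the URL of a package file. A small sketch with contrib.url() illustrates this; type = "source" is used only so that it runs on any platform, and the local file path in the comments is hypothetical.

```r
# The string passed as repos in the post: a file URL, not a repository root.
file_url  <- "http://cran.skazkaforyou.com/bin/macosx/leopard/contrib/2.15/adehabitat_1.8.10.tgz"
repo_root <- "http://cran.skazkaforyou.com"

# contrib.url() shows where R looks for the package index under a given repos.
bad_index  <- contrib.url(file_url,  type = "source")
good_index <- contrib.url(repo_root, type = "source")
bad_index
good_index

# Not run here (needs network access or a local file; the path is hypothetical):
# install.packages("adehabitat", repos = repo_root)
# install.packages("~/Downloads/adehabitat_1.8.10.tgz", repos = NULL)
```

For a file already on disk, the path goes in the first argument and `repos = NULL` tells install.packages() not to consult a repository index.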




Warning: unable to access index for repository 
http://cran.skazkaforyou.com/bin/macosx/leopard/contrib/2.15/adehabitat_1.8.10.tgz/bin/macosx/leopard/contrib/2.15
Warning message:
package ‘adehabitat’ is not available (for R version 2.15.0)
I need to install it as I have been using a script that has used  
this package too besides other packages.

Would any one help me how I can install the package on Mac OS X?
cheers,
KG


--

David Winsemius, MD
West Hartford, CT



Re: [R] Interpreting Q-Q Plots

2012-05-14 Thread Rich Shepard

On Tue, 15 May 2012, Peter Alspach wrote:


Probably highly skewed to the right, with discrete values (perhaps due to
the limitations in the accuracy of the assessment equipment).


Peter,

  Most of these data are near zero or the lower detection limit. A few
values are very much higher. I didn't think of skewness as a reason.


 But note:

library(fortunes)
fortune('chicken')


  And since I don't have the experience, the only way to gain it is by
learning from those with practice reading chicken entrails.

Thanks,

Rich



Re: [R] Error in names(x) <- value: 'names' attribute must be the same length as the vector

2012-05-14 Thread David Winsemius


On May 14, 2012, at 2:35 PM, Priya Bhatt wrote:


Dear R-helpers,

I am stuck on an error in R:  When I run my code (below), I get this  
error

back:

Error in names(x) <- value :
 'names' attribute must be the same length as the vector


Then when I use traceback(), R gives me back this in return:

`colnames<-`(`*tmp*`, value = c("Item", "Color", "Number", "Size"))



I'm not exactly sure how to fix this problem.  Any advice would be  
greatly

appreciated!


Why not throw in some print statements to figure out which file is  
causing trouble? ( And I would suggest not using "c" as a loop  
variable.)


Then you use str() to look at 'currentCSVFile' or  'newdf.int' after  
isolating the source of problem.


--
David.
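
The print-and-str() advice can be sketched as below. This is an illustration under assumptions, not Priya's actual data: the two temporary CSV files, the `wanted` vector and the `problem_files` name are invented for the example. One plausible cause of the error (a guess, since the real files are not shown) is that `currentCSVFile$Some.column` returns NULL when a file lacks that column, so data.frame() silently builds fewer than four columns and the later colnames() assignment fails.

```r
# Build two small CSV files in a temporary directory (invented test data).
tmp_dir <- tempfile("csvdemo")
dir.create(tmp_dir)
write.csv(data.frame(Item = "shirt", Color..type = "red",
                     Number..owned = 2, Size..shirt = "M"),
          file.path(tmp_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(Item = "hat", Color..type = "blue"),
          file.path(tmp_dir, "b.csv"), row.names = FALSE)

csvfiles <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)
wanted <- c("Item", "Color..type", "Number..owned", "Size..shirt")
problem_files <- character()

# Use a descriptive loop variable rather than `c`, which masks base::c().
for (csv_path in csvfiles) {
  message("Reading: ", csv_path)
  current <- read.csv(csv_path, header = TRUE)
  str(current)                                  # inspect what was read
  missing_cols <- setdiff(wanted, names(current))
  if (length(missing_cols) > 0) {
    message("  missing columns: ", paste(missing_cols, collapse = ", "))
    problem_files <- c(problem_files, basename(csv_path))
  }
}
print(problem_files)
```

Checking names() before selecting columns turns a confusing colnames() error into a message that names the offending file.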


Thanks,
Priya


MODIFIED CODE:
# Looping through a series of CSV files
for (c in csvfiles)
{
 #A DF (prevdf) was created based on an initial csv file..
 #so the condition below states that if there are rows with NAs or the
number of rows in prevdf is zero
 if( (apply(prevdf, 1, function(y) !sum(!is.na(y))==1) > 0) ||
(nrow(prevdf) == 0) )
 {
   #Open a new file
   currentCSVFile <- read.csv(c, header=TRUE)
   #pick only the few columns we want from the file
   currentCSVFile <- data.frame(currentCSVFile$Item,
currentCSVFile$Color..type , currentCSVFile$Number..owned,
currentCSVFile$Size..shirt)
   #rename the column names
   colnames(currentCSVFile) <- c("Item", "Color" ,"Number", "Size")

   #find the rows in prevdf that do not have any values. (sum should  
be 1

because the Item name is unique for every row)
   NArows <- prevdf[apply(prevdf, 1, function(y) sum(!is.na(y))==1),]

   #if NAs rows is not equal to zero
   if (nrow(NArows) != 0 )
   {
 #find the rows in the current CSV file where there is missing  
data in

prevdf (this info is in NArows)
 intersectItem<- intersect(currentCSVFile$Item, NArows$Item)

 #initiate another data frame to put the data in
 newdf.int <- data.frame(Item=c(), Color=c(), Number=c(),  
Size=c())



 print(nrow(currentCSVFile))
 for (i in 1:nrow(currentCSVFile))

 {
   print("In loop") # check for me
   row <- currentCSVFile[i,]

   if (row$Item %in% intersectItem){  # this is where the code  
stops

and throws back error
 .
 .
 .
  # do stuff to fill vectors named Item, Color, Number and  
Size

 .
 .
 .

 newdf.int <-rbind(newdf.int, c(Item, Color, Number, Size)
   }

   colnames(newdf.int) <- c("Item", "Color", "Number", "Size")
   prevdf <- merge(newdf.int, prevdf, by=c("Item", "Color",  
"Number",

"Size"), all=TRUE)
   prevdf <- prevdf[apply(prevdf, 1, function(y) !sum(! 
is.na(y))==1),]

   print("after removing row = 1")


 } # end of for loop

   } # end of NA rows condition

 } # end of main if statement

 else
 {
   break
 }


}



David Winsemius, MD
West Hartford, CT



[R] package rgl not installing on Mac OS X

2012-05-14 Thread Brenda McCowan
This is what I am getting:

> install.packages("rgl")
Installing package(s) into ‘/Users/brenda/Library/R/2.15/library’
(as ‘lib’ is unspecified)
trying URL 
'http://cran.cnr.Berkeley.edu/bin/macosx/leopard/contrib/2.15/rgl_0.92.880.tgz'
Content type 'application/x-gzip' length 10757858 bytes (10.3 Mb)
opened URL
=
downloaded 411 Kb

rgl/fonts/FreeSerif.ttf: Truncated tar archive
tar: Error exit delayed from previous errors.

The downloaded binary packages are in

/var/folders/YG/YGlE8Mx6HHKHOY6P5UdzNU+++TM/-Tmp-//RtmpambK9X/downloaded_packages
Warning messages:
1: In download.file(url, destfile, method, mode = "wb", ...) :
  downloaded length 421070 != reported length 10757858
2: 'tar' returned non-zero exit code 1


Thanks!

Brenda



On 5/14/12 3:01 PM, "Steve Lianoglou" 
wrote:

> Hi,
> 
> Did you simply try:
> 
> install.packages("adehabitat")
> 
> or, if that doesn't work, maybe:
> 
> install.packages("adehabitat", type="source")
> 
> Or change your cran mirror?
> 
> -steve
> 
> On Mon, May 14, 2012 at 5:44 PM, Kristi Glover
>  wrote:
>> 
>> Hi R- User,I tried to load package adehabitat for R version 2.15.0. But,  I
>> got error. even I download and save in local drive and tried to install from
>> the local file. but still i got the following errors
>>  install.packages("adehabitat", repos='C:/adehabitat_1.8.10.tgz')
>> 
>> 
>> package ‘adehabitat’ is not available (for R version 2.15.0)
>> 
>>> install.packages("adehabitat",
>>> repos='http://cran.skazkaforyou.com/bin/macosx/leopard/contrib/2.15/adehabitat_1.8.10.tgz')
>> Warning: unable to access index for repository
>> http://cran.skazkaforyou.com/bin/macosx/leopard/contrib/2.15/adehabitat_1.8.10.tgz/bin/macosx/leopard/contrib/2.15
>> Warning message:
>> package ‘adehabitat’ is not available (for R version 2.15.0)
>> I need to install it as I have been using a script that has used this package
>> too besides other packages.
>> Would any one help me how I can install the package on Mac OS X?
>> cheers,
>> KG
>> 
>> 
>> 
>> 
>> 
>>        [[alternative HTML version deleted]]
>> 
>> 
>> 
> 
> 



Re: [R] Interpreting Q-Q Plots

2012-05-14 Thread Peter Alspach
Tena koe Rich

Probably highly skewed to the right, with discrete values (perhaps due to the 
limitations in the accuracy of the assessment equipment).  But note:

library(fortunes)
fortune('chicken')

HTH .

Peter Alspach
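
A small simulation may make the horizontal runs concrete. The sketch below is an illustration only: the lognormal parameters, the detection limit of 0.02 and the two-decimal rounding are assumptions, not Rich's chromium data. Coarse measurement creates many tied values, and tied values share one y coordinate in a normal Q-Q plot, which is what draws the horizontal lines.

```r
set.seed(42)
raw <- rlnorm(200, meanlog = -3, sdlog = 1.2)    # right-skewed "true" values
detection_limit <- 0.02                           # assumed reporting floor
measured <- pmax(round(raw, 2), detection_limit)  # coarse, censored readings

# Coordinates of the Q-Q plot without drawing it
qq <- qqnorm(measured, plot.it = FALSE)

tie_share  <- mean(duplicated(measured))   # fraction of repeated values
n_distinct <- length(unique(measured))     # number of distinct readings

# To see the pattern, draw the plot:
# qqnorm(measured); qqline(measured)
```

A large `tie_share` together with a mean well above the median is consistent with the reading given above: skewed data recorded at limited resolution.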

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Rich Shepard
Sent: Tuesday, 15 May 2012 9:53 a.m.
To: r-help@r-project.org
Subject: [R] Interpreting Q-Q Plots

   My understanding of Q-Q plots is that if the tails of the plotted points 
fall above or below the x=y line the distribution of observed/measured values 
is under or over dispersed. But, how do I interpret measured values that are in 
horizontal lines? The attached plot illustrates this situation.

TIA,

Rich

The contents of this e-mail are confidential and may be subject to legal 
privilege.
 If you are not the intended recipient you must not use, disseminate, 
distribute or
 reproduce all or any part of this e-mail or attachments.  If you have received 
this
 e-mail in error, please notify the sender and delete all material pertaining 
to this
 e-mail.  Any opinion or views expressed in this e-mail are those of the 
individual
 sender and may not represent those of The New Zealand Institute for Plant and
 Food Research Limited.



Re: [R] package ‘adehabitat’ is not available (for R version 2.15.0) on Mac OS X

2012-05-14 Thread Steve Lianoglou
Hi,

Did you simply try:

install.packages("adehabitat")

or, if that doesn't work, maybe:

install.packages("adehabitat", type="source")

Or change your cran mirror?

-steve
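
The warning quoted below shows R appending "/bin/macosx/leopard/contrib/2.15" to whatever was passed as `repos`, which suggests the argument is meant to be a mirror's root URL rather than the path to one package file. A minimal sketch of that behaviour follows; example.org and the download path are placeholders, not real locations, and the install calls are left commented out because they need a network or a real file.

```r
# How R expands a repository root into the directory it actually searches
# (example.org is a placeholder, not a real mirror):
src_dir <- contrib.url("http://example.org", type = "source")
print(src_dir)

# Installing from a file already on disk: pass the path as the first
# argument and switch repository lookup off. The path is hypothetical.
# install.packages("~/Downloads/adehabitat_1.8.10.tgz", repos = NULL)

# Installing from a mirror: `repos` is the mirror root only.
# install.packages("adehabitat", repos = "http://cran.r-project.org")
```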

On Mon, May 14, 2012 at 5:44 PM, Kristi Glover
 wrote:
>
> Hi R- User,I tried to load package adehabitat for R version 2.15.0. But,  I 
> got error. even I download and save in local drive and tried to install from 
> the local file. but still i got the following errors
>  install.packages("adehabitat", repos='C:/adehabitat_1.8.10.tgz')
>
>
> package ‘adehabitat’ is not available (for R version 2.15.0)
>
>> install.packages("adehabitat", 
>> repos='http://cran.skazkaforyou.com/bin/macosx/leopard/contrib/2.15/adehabitat_1.8.10.tgz')
> Warning: unable to access index for repository 
> http://cran.skazkaforyou.com/bin/macosx/leopard/contrib/2.15/adehabitat_1.8.10.tgz/bin/macosx/leopard/contrib/2.15
> Warning message:
> package ‘adehabitat’ is not available (for R version 2.15.0)
> I need to install it as I have been using a script that has used this package 
> too besides other packages.
> Would any one help me how I can install the package on Mac OS X?
> cheers,
> KG
>
>
>
>
>
>        [[alternative HTML version deleted]]
>
>
>



-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



[R] Interpreting Q-Q Plots

2012-05-14 Thread Rich Shepard

  My understanding of Q-Q plots is that if the tails of the plotted points
fall above or below the x=y line the distribution of observed/measured
values is under or over dispersed. But, how do I interpret measured values
that are in horizontal lines? The attached plot illustrates this situation.

TIA,

Rich

chromium_norm.pdf
Description: Adobe PDF document


[R] package ‘adehabitat’ is not available (for R version 2.15.0) on Mac OS X

2012-05-14 Thread Kristi Glover

Hi R- User,I tried to load package adehabitat for R version 2.15.0. But,  I got 
error. even I download and save in local drive and tried to install from the 
local file. but still i got the following errors
 install.packages("adehabitat", repos='C:/adehabitat_1.8.10.tgz')


package ‘adehabitat’ is not available (for R version 2.15.0) 

> install.packages("adehabitat", 
> repos='http://cran.skazkaforyou.com/bin/macosx/leopard/contrib/2.15/adehabitat_1.8.10.tgz')
Warning: unable to access index for repository 
http://cran.skazkaforyou.com/bin/macosx/leopard/contrib/2.15/adehabitat_1.8.10.tgz/bin/macosx/leopard/contrib/2.15
Warning message:
package ‘adehabitat’ is not available (for R version 2.15.0) 
I need to install it as I have been using a script that has used this package 
too besides other packages. 
Would any one help me how I can install the package on Mac OS X?
cheers,
KG




  


Re: [R] select data

2012-05-14 Thread S Ellison


>  From: David L Carlson [dcarl...@tamu.edu]
>This overwrites the data so you might want to create a copy first.
> 
> example <- data.frame(V1=c(3, -1), V2=c(-2, 4), V3=c(4, 1))
> tf <- ifelse(example<0, TRUE, FALSE)
> example[tf] <- NA
> apply(example, 1, mean, na.rm=TRUE)

'simpler' to do something like
apply(example, 1, function(x) mean(x[x>=0]))

Also note that this is averaging _rows_; if you want the column (variable) 
means, which would be much more usual in a data frame, use apply(example, 2, 
function(x) mean(x[x>=0]))
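
For reference, here is the same toy data frame run both ways, so the difference between row and column means is visible. The variable names `row_means` and `col_means` are only for this sketch. Note that a row or column with no non-negative values would give NaN, because mean() of an empty vector is NaN.

```r
example <- data.frame(V1 = c(3, -1), V2 = c(-2, 4), V3 = c(4, 1))

# Mean of the non-negative values in each row (MARGIN = 1)
row_means <- apply(example, 1, function(x) mean(x[x >= 0]))

# Mean of the non-negative values in each column (MARGIN = 2)
col_means <- apply(example, 2, function(x) mean(x[x >= 0]))

print(row_means)
print(col_means)
```

Unlike the ifelse() approach, this leaves `example` itself unchanged.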

S Ellison


Re: [R] write data using xlsReadWrite

2012-05-14 Thread Ethan Brown
You're trying to write an object that you've never created. If you
want to write `varHL2y`, which it appears you do, you would replace
that for `mydata` in your command.

Best,
Ethan
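
If the intent was to write all the summary variances at once, one option is to collect them into a data frame called `mydata` before writing. The sketch below is hedged: the random matrix stands in for one wavelet band such as y.modwt$HL2, the column names are invented, and write.csv() to a temporary file is swapped in for write.xls() so the example does not depend on xlsReadWrite being installed.

```r
set.seed(1)
band <- matrix(rnorm(20), nrow = 4)   # stand-in for one wavelet band

# apply() replaces the explicit for-loops over rows and columns
row_vars <- apply(band, 1, var)       # variance of each row
col_vars <- apply(band, 2, var)       # variance of each column

# Create the object that the write call expects
mydata <- data.frame(band  = "HL2",
                     var_x = var(row_vars),
                     var_y = var(col_vars))

out_file <- tempfile(fileext = ".csv")
write.csv(mydata, out_file, row.names = FALSE)
```

With xlsReadWrite loaded, the last step would presumably become write.xls(mydata, "D:\\FYP\\image.mydata.xls"), as in the original call.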


On Sun, May 13, 2012 at 1:33 AM, diyanah  wrote:
> Hai, I'm trying to write these var output data from these codes inside excel
> file. My directory to store the data is
> /D:\FYP\image /
> but receive an error message :
>
> /Error in write.xls(mydata, "D:\\FYP\\image.mydata.xls") :
>  object 'mydata' not found/
>
> these are my codes, can you help give an advice or idea with my problem:
>
> /library("biOps")
> library("waveslim")
> library("xlsReadWrite")
> x <- readTiff("D:\\FYP\\image\\SignatureImage\\user186g1.tif")
> y <- imgBlockMedianFilter(x, 5)
> #Plot image
> #plot(y)
> y.modwt <- modwt.2d(y, "la8", 2)
> ## Level 2 decomposition
> par(mfrow=c(2,2), pty="s")
> ##Plot wavelets
> image(y.modwt$LH2, col=rainbow(128), axes=FALSE, main="LH2")
> image(y.modwt$HH2, col=rainbow(128), axes=FALSE, main="HH2")
> image(y.modwt$LL2, col=rainbow(128), axes=FALSE, main="LL2")
> image(y.modwt$HL2, col=rainbow(128), axes=FALSE, main="HL2")
> #---#
> ##Get the dimension
> ##LH2
> dimLH2 <- dim(y.modwt$LH2)
> dimLH2x <- dimLH2[1]
> dimLH2y <- dimLH2[2]
> varLH2xlist <- c(rep(0, dimLH2x))
> varLH2ylist <- c(rep(0, dimLH2y))
> ##Loop to get variance from x axis
> for(i in seq(dimLH2x)){
>    varLH2xlist[i] <- var(y.modwt$LH2[i,])
> }
> ##Get the variance from the overall x variance
> varLH2x <- var(varLH2xlist)
> ##Loop to get variance from y axis
> for(i in seq(dimLH2y)){
>    varLH2ylist[i] <- var(y.modwt$LH2[,i])
> }
> ##Get the variance from the overall y variance
> varLH2y <- var(varLH2ylist)
> #-#
> ##Get the dimension
> ##HH2
> dimHH2 <- dim(y.modwt$HH2)
> dimHH2x <- dimHH2[1]
> dimHH2y <- dimHH2[2]
> varHH2xlist <- c(rep(0, dimHH2x))
> varHH2ylist <- c(rep(0, dimHH2y))
> ##Loop to get variance from x axis
> for(i in seq(dimHH2x)){
>    varHH2xlist[i] <- var(y.modwt$HH2[i,])
> }
> ##Get the variance from the overall x variance
> varHH2x <- var(varHH2xlist)
> ##Loop to get variance from y axis
> for(i in seq(dimHH2y)){
>    varHH2ylist[i] <- var(y.modwt$HH2[,i])
> }
> ##Get the variance from the overall y variance
> varHH2y <- var(varHH2ylist)
> #-#
> ##Get the dimension
> ##LL2
> dimLL2 <- dim(y.modwt$LL2)
> dimLL2x <- dimLL2[1]
> dimLL2y <- dimLL2[2]
> varLL2xlist <- c(rep(0, dimLL2x))
> varLL2ylist <- c(rep(0, dimLL2y))
> ##Loop to get variance from x axis
> for(i in seq(dimLL2x)){
>    varLL2xlist[i] <- var(y.modwt$LL2[i,])
> }
> ##Get the variance from the overall x variance
> varLL2x <- var(varLL2xlist)
> ##Loop to get variance from y axis
> for(i in seq(dimLL2y)){
>    varLL2ylist[i] <- var(y.modwt$LL2[,i])
> }
> ##Get the variance from the overall y variance
> varLL2y <- var(varLL2ylist)
> #-#
> ##Get the dimension
> ##HL2
> dimHL2 <- dim(y.modwt$HL2)
> dimHL2x <- dimHL2[1]
> dimHL2y <- dimHL2[2]
> varHL2xlist <- c(rep(0, dimHL2x))
> varHL2ylist <- c(rep(0, dimHL2y))
> ##Loop to get variance from x axis
> for(i in seq(dimHL2x)){
>    varHL2xlist[i] <- var(y.modwt$HL2[i,])
> }
> ##Get the variance from the overall x variance
> varHL2x <- var(varHL2xlist)
> ##Loop to get variance from y axis
> for(i in seq(dimHL2y)){
>    varHL2ylist[i] <- var(y.modwt$HL2[,i])
> }
> ##Get the variance from the overall y variance
> varHL2y <- var(varHL2ylist)
> #-#
> ##write excel file
> write.xls(mydata, "D:\\FYP\\image.mydata.xls")/
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/write-data-using-xlsReadWrite-tp4629825.html
> Sent from the R help mailing list archive at Nabble.com.
>



Re: [R] determine size (width and height) of a graphics file via R - how?

2012-05-14 Thread Ethan Brown
Hi Mark,

You can do this easily with the "identify" command in ImageMagick.
Install it, and then from within an R session:

system2("identify", "yourimagename.jpg")

...and it should give you something like this:

yourimagename.jpg JPEG 800x533 800x533+0+0 8-bit DirectClass 378KB
0.000u 0:00.019

...which is overkill but does include the dimensions.

If you're on Windows you need an extra argument:

system2("identify", "yourimagename.jpg", invisible = FALSE)

to make sure it actually shows you the result.
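
To use the dimensions in code rather than read them off the console, the output line can be parsed. This is a rough sketch: `parse_identify` is a made-up helper name, and it simply takes the first "digits x digits" token, so a file name that itself contains such a token would confuse it. The sample line is the one shown above, so ImageMagick is not needed to try the parsing.

```r
# Hypothetical helper: pull width and height out of one line of
# `identify` output by taking the first "<digits>x<digits>" token.
parse_identify <- function(line) {
  geometry <- regmatches(line, regexpr("[0-9]+x[0-9]+", line))
  dims <- as.integer(strsplit(geometry, "x", fixed = TRUE)[[1]])
  c(width = dims[1], height = dims[2])
}

sample_line <- "yourimagename.jpg JPEG 800x533 800x533+0+0 8-bit DirectClass 378KB"
dims <- parse_identify(sample_line)
print(dims)

# With ImageMagick installed, the line can be captured instead of printed:
# out <- system2("identify", "yourimagename.jpg", stdout = TRUE)
# parse_identify(out[1])
```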

EBImage is an R interface to imagemagick but is probably more trouble
than it's worth for the simple task you're trying to do.

Hope this helps,
Ethan

On Sun, May 13, 2012 at 6:57 AM, Mark Heckmann  wrote:
> Hi,
>
> is there a way to determine the size (width, height) of a graphics file saved 
> on my hard disk, e.g. a .bmp, via R.
> What I want is basically the same information on the dimensions of the 
> graphic file that I get from my file browser.
>
> Thanks
> Mark
>
> PS. Why: I use the R2PPT and I need to determine the size of the original 
> graphic before adding it to a slide.
>
> --
> Mark Heckmann
> Blog: www.markheckmann.de
> R-Blog: http://ryouready.wordpress.com
>
>
>
>
>
>
>
>
>
>
>
>        [[alternative HTML version deleted]]
>
>
>



[R] Scraping a web page.

2012-05-14 Thread Keith Weintraub
Folks,
  I want to scrape a series of web-page sources for strings like the following:

"/en/Ships/A-8605507.html"
"/en/Ships/Aalborg-8122830.html"

which appear in an href inside an <a> tag inside a <td> tag inside a table.

In fact all I want is the (exactly) 7-digit number before ".html".

The good news is that as far as I can tell the <a> tag is always on its 
own line so some kind of line-by-line grep should suffice once I figure out the 
following:

What is the best package/command to use to get the source of a web page. I 
tried using something like:
if(url.exists("http://www.omegahat.org/RCurl")) {
  h = basicTextGatherer()
  curlPerform(url = "http://www.omegahat.org/RCurl", writefunction = h$update)
   # Now read the text that was cumulated during the query response.
  h$value()
}

which works except that I get one long streamed html doc without the line 
breaks.


Thanks in advance for your help,
KW
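
One possible approach, offered as a sketch rather than a tested scraper: base R's readLines() on a URL returns the page as a character vector with one element per line, and a regular expression can then pull out the digits. The three sample lines below are invented markup modelled on the strings in the question, and the example.org URL is a placeholder.

```r
# Invented sample lines standing in for the real page source
page_lines <- c(
  '<td><a href="/en/Ships/A-8605507.html">A</a></td>',
  '<td><a href="/en/Ships/Aalborg-8122830.html">Aalborg</a></td>',
  '<td>no link here</td>'
)

# Exactly seven digits, not preceded by another digit, followed by ".html"
pattern <- "(?<![0-9])[0-9]{7}(?=\\.html)"
ids <- unlist(regmatches(page_lines,
                         gregexpr(pattern, page_lines, perl = TRUE)))
print(ids)

# For a live page, readLines() keeps the line breaks (placeholder URL):
# page_lines <- readLines("http://example.org/ships.html", warn = FALSE)
```

Because gregexpr() searches within each element, the pattern also works if several links happen to share a line.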




Re: [R] Plot

2012-05-14 Thread Tyler Rinker

I noticed I was remiss in addressing your origin question:
Use the xlim and ylim arguments, setting the lower limit of each to 0.  Here's an 
example of this with the CO2 dataset:


plot(uptake~Plant, data=CO2)


plot(uptake~as.numeric(Plant), data=CO2)


plot(uptake~as.numeric(Plant), data=CO2, ylim=c(0, 50), xlim=c(0, 14))



Cheers,Tyler

> From: tyler_rin...@hotmail.com
> To: kellycoo...@yahoo.com; r-help@r-project.org
> Date: Mon, 14 May 2012 16:31:26 -0400
> Subject: Re: [R] Plot
>
>
>
> That is likely because ferm is a factor.  A scatterplot is two numeric 
> variables.  To make it a scatterplot wrap ferm with as.numeric.
> Cheers,
> Tyler
> 
> Date: Mon, 14 May 2012 12:21:12 -0700
> From: kellycoo...@yahoo.com
> To: r-help@r-project.org
> Subject: [R] Plot
>
>
> Hello,
>
> I am trying to make a plot of the rates of an enzyme against three different 
> protein concentrations (there are 45 rates in total and split up into 3 
> groups of 15, each receiving one of the 3 protein concentrations). When I 
> enter the following code I instead get 3 separate boxplots for each of the 
> three different protein concentrations ...
>
>
> plot(rate ~ ferm, data=LDH, col=LDH$rate, 
> pch=c(17,18,19)[(as.numeric(LDH$rate)%%3)+1])
>
>
> but I want a scatterplot showing 3 different lines indicating each of the 
> protein concentrations. I'm not sure if I need to tweak my data set in order 
> to get what I want?
>
> I was also wondering how to include the origin (0,0) in the plots. I'm not 
> sure if I'm missing something on the plot help page?
>
> Any help would be appreciated. Thanks so much.
  


Re: [R] Error in names(x) <- value: 'names' attribute must be the same length as the vector

2012-05-14 Thread Tyler Rinker


I'd throw a browser() in at that point and see what colnames(newdf.int) gives 
you.  If you have fewer columns than names this is likely the reason for the 
error.
You can get the same error with:

colnames(mtcars) <- LETTERS

Cheers,Tyler
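
A slightly longer sketch of the same point, with the error caught so the script keeps running. It also shows something that may matter for the posted code: data.frame(Item=c(), ...) has zero columns, because c() is NULL, whereas typed empty vectors give a four-column, zero-row frame. The names `empty_untyped` and `empty_typed` are only for this illustration.

```r
# Reproduce the error: 26 names for a data frame with 11 columns
df <- mtcars
msg <- tryCatch({
  colnames(df) <- LETTERS
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# c() is NULL, so this frame ends up with no columns at all
empty_untyped <- data.frame(Item = c(), Color = c(), Number = c(), Size = c())

# Typed empty vectors keep the four columns (with zero rows)
empty_typed <- data.frame(Item = character(), Color = character(),
                          Number = numeric(), Size = character(),
                          stringsAsFactors = FALSE)

print(ncol(empty_untyped))
print(ncol(empty_typed))

# Assigning four names now succeeds because there are four columns
colnames(empty_typed) <- c("Item", "Color", "Number", "Size")
```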

> Date: Mon, 14 May 2012 11:35:07 -0700
> From: bhatt...@gmail.com
> To: r-help@r-project.org
> Subject: [R] Error in names(x) <- value: 'names' attribute must be the same 
> length as the vector
>
> Dear R-helpers,
>
> I am stuck on an error in R: When I run my code (below), I get this error
> back:
>
> Error in names(x) <- value :
> 'names' attribute must be the same length as the vector
>
>
> Then when I use traceback(), R gives me back this in return:
>
> `colnames<-`(`*tmp*`, value = c("Item", "Color", "Number", "Size"))
>
>
>
> I'm not exactly sure how to fix this problem. Any advice would be greatly
> appreciated!
>
> Thanks,
> Priya
>
>
> MODIFIED CODE:
> # Looping through a series of CSV files
> for (c in csvfiles)
> {
> #A DF (prevdf) was created based on an initial csv file..
> #so the condition below states that if there are rows with NAs or the
> number of rows in prevdf is zero
> if( (apply(prevdf, 1, function(y) !sum(!is.na(y))==1) > 0) ||
> (nrow(prevdf) == 0) )
> {
> #Open a new file
> currentCSVFile <- read.csv(c, header=TRUE)
> #pick only the few columns we want from the file
> currentCSVFile <- data.frame(currentCSVFile$Item,
> currentCSVFile$Color..type , currentCSVFile$Number..owned,
> currentCSVFile$Size..shirt)
> #rename the column names
> colnames(currentCSVFile) <- c("Item", "Color" ,"Number", "Size")
>
> #find the rows in prevdf that do not have any values. (sum should be 1
> because the Item name is unique for every row)
> NArows <- prevdf[apply(prevdf, 1, function(y) sum(!is.na(y))==1),]
>
> #if NAs rows is not equal to zero
> if (nrow(NArows) != 0 )
> {
> #find the rows in the current CSV file where there is missing data in
> prevdf (this info is in NArows)
> intersectItem<- intersect(currentCSVFile$Item, NArows$Item)
>
> #initiate another data frame to put the data in
> newdf.int <- data.frame(Item=c(), Color=c(), Number=c(), Size=c())
>
>
> print(nrow(currentCSVFile))
> for (i in 1:nrow(currentCSVFile))
>
> {
> print("In loop") # check for me
> row <- currentCSVFile[i,]
>
> if (row$Item %in% intersectItem){ # this is where the code stops
> and throws back error
> .
> .
> .
> # do stuff to fill vectors named Item, Color, Number and Size
> .
> .
> .
>
> newdf.int <-rbind(newdf.int, c(Item, Color, Number, Size)
> }
>
> colnames(newdf.int) <- c("Item", "Color", "Number", "Size")
> prevdf <- merge(newdf.int, prevdf, by=c("Item", "Color", "Number",
> "Size"), all=TRUE)
> prevdf <- prevdf[apply(prevdf, 1, function(y) !sum(!is.na(y))==1),]
> print("after removing row = 1")
>
>
> } # end of for loop
>
> } # end of NA rows condition
>
> } # end of main if statement
>
> else
> {
> break
> }
>
>
> }
>
  


Re: [R] Aligning time series

2012-05-14 Thread Matthew Johnson
Very useful, however as i will be aligning large datasets, is it
possible to query a variable for the lead / lag for each series? So i
can more 'automatically' align?

ideally the lag is stored in d$align or something

Best

mj
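
If a single lead/lag number per series is what is wanted, one alternative to reading it off the dtw alignment is plain cross-correlation. To be clear, this swaps in a different technique from the dtw suggestion quoted below: ccf() from the stats package, which assumes one constant shift for the whole series. The helper name `best_lag` is made up, the noise is left out of the toy series so the result is repeatable, and series containing NA (like S4) would need separate handling.

```r
index <- seq(0, 2 * pi, length.out = 100)
S1 <- sin(index)                   # reference series
S3 <- 0.5 * sin(index + pi / 4)    # comparison series, noise omitted

# Hypothetical helper: lag at which the cross-correlation is largest.
# ccf(x, y) at lag k estimates the correlation of x[t + k] with y[t].
best_lag <- function(reference, other, lag.max = 30) {
  cc <- ccf(reference, other, lag.max = lag.max, plot = FALSE)
  lags <- as.vector(cc$lag)
  lags[which.max(as.vector(cc$acf))]
}

lag_S3 <- best_lag(S1, S3)
print(lag_S3)
```

The returned value could be stored alongside each series, for example in a named vector of lags, and used to shift the data before merging.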

Sent from my iPad

On 14/05/2012, at 9:36 PM, Gabor Grothendieck  wrote:

> On Mon, May 14, 2012 at 5:09 AM, Matthew Johnson  wrote:
>> Sir,
>>
>> I have large data sets of economic indicators and would like to align
>> them to a reference series - say the unemployment rate or industrial
>> production.
>>
>> Is there a canned routine that returns the optimal lead / lag
>> according to some (or a variety) of algorithims? not all series will
>> be of the same length, however i would like to constrain the matching
>> such that each element in the reference series matches to only one
>> element in the comparison series.
>>
>> I do not mind if some data extend forward (in the case of series that
>> lead the reference series), or back (in the case of laggers or series
>> that are longer than the reference series).
>>
>> As a toy example, say my data set is constructed as follows:
>>
>> index <- seq(0, 2*pi, length=100)
>>
>> With the reference series:
>>
>> S1 <- sin(index)
>>
>> And the series that i want to align to S1 are:
>>
>> S2 <- 2 * (cos(index) + runif(10)/100)
>> S3 <- 0.5 * (sin(index + pi/4) + runif(10)/100)
>> S4 <- sin(index + pi/3) + runif(10)/100
>> S4[1:25] <- NA
>>
>> In this case, I am looking for a function that tells me by how many
>> periods i ought to advance the three series (S2, S3 and S4) to
>> maximise their relationship with S1 - though it is not always the case
>> that the reference series will lead.
>
> Check out the dtw in the dtw package:
>
> library(dtw)
> d <- dtw(S1, S2)
> d$index1
> d$index2
>
> ?dtw
>
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com



Re: [R] Plot

2012-05-14 Thread Tyler Rinker


That is likely because ferm is a factor.  A scatterplot is two numeric 
variables.  To make it a scatterplot wrap ferm with as.numeric. 
Cheers,
Tyler
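
One caveat worth a quick illustration: as.numeric() on a factor returns the internal level codes, not the numbers printed in the labels. If ferm holds concentrations stored as a factor, converting through as.character() first recovers the actual values. The concentrations below are invented for the sketch.

```r
# Invented protein concentrations stored as a factor
ferm <- factor(c("0.5", "0.5", "2", "2", "10", "10"))

codes  <- as.numeric(ferm)                 # internal level codes
values <- as.numeric(as.character(ferm))   # the numbers in the labels

print(codes)
print(values)

# A scatterplot on the real concentration scale would then be, e.g.:
# plot(rate ~ as.numeric(as.character(ferm)), data = LDH)
```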

Date: Mon, 14 May 2012 12:21:12 -0700
From: kellycoo...@yahoo.com
To: r-help@r-project.org
Subject: [R] Plot


Hello,

I am trying to make a plot of the rates of an enzyme against three different 
protein concentrations (there are 45 rates in total and split up into 3 groups 
of 15, each receiving one of the 3 protein concentrations). When I enter the 
following code I instead get 3 separate boxplots for each of the three 
different protein concentrations ...


plot(rate ~ ferm, data=LDH, col=LDH$rate, 
pch=c(17,18,19)[(as.numeric(LDH$rate)%%3)+1])


but I want a scatterplot showing 3 different lines indicating each of the 
protein concentrations. I'm not sure if I need to tweak my data set in order to 
get what I want?

I was also wondering how to include the origin (0,0) in the plots. I'm not sure 
if I'm missing something on the plot help page?

Any help would be appreciated. Thanks so much.


Re: [R] Inf and lazy evaluation

2012-05-14 Thread J Toll
Thank you all for the replies.

On Mon, May 14, 2012 at 2:45 PM, R. Michael Weylandt
 wrote:
> R is lazy, but not quite that lazy ;-)

Oh, what is this world coming to when you can't count on laziness to
be lazy. ;)  I should probably stop reading about Haskell and their
lazy way of doing things.

As a relatively naive observation, in R, it seems like argument
recycling kind of breaks the power of lazy evaluation.

Thanks for the suggestion of list.files()


James
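
For readers arriving without the earlier messages, a tiny sketch of where R's laziness stops. Function arguments are promises and are only evaluated when used, but a vector has to be built in full, so there is no lazy infinite sequence as in Haskell. The function name `first_only` is invented for the example.

```r
# Arguments are evaluated lazily: `y` is never touched, so stop() never runs
first_only <- function(x, y) x
res <- first_only(1, stop("y was evaluated"))
print(res)

# Vectors are built eagerly: an infinite sequence is an error, not a stream
too_long <- tryCatch(1:Inf, error = function(e) e)
print(conditionMessage(too_long))
```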



Re: [R] range segment exclusion using range endpoints

2012-05-14 Thread William Dunlap
Yes, package:intervals uses the same idiom as my code.
Mine allows Date and POSIXct objects as interval endpoints
(which is why it represents the objects as data.frames instead
of matrices).  package:intervals has more than basic set
operations on the collections of intervals.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
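
For anyone who wants to see the idea without either package, here is a minimal base-R sketch of subtracting a set of intervals from one interval. It is not the Ranges code from this thread nor the intervals package implementation: `interval_setdiff` is a made-up name, endpoints are plain numbers, and open versus closed endpoints are ignored. The input is Ben's example quoted below.

```r
# Hypothetical helper: remove the intervals in `cuts` (two-column matrix of
# start, end) from the single interval [lo, hi]; returns what is left over.
interval_setdiff <- function(lo, hi, cuts) {
  cuts <- cuts[order(cuts[, 1]), , drop = FALSE]   # sweep left to right
  keep <- matrix(numeric(0), ncol = 2)
  cursor <- lo                                     # left edge not yet removed
  for (i in seq_len(nrow(cuts))) {
    if (cuts[i, 1] > cursor) {
      # gap between the cursor and the next cut survives
      keep <- rbind(keep, c(cursor, min(cuts[i, 1], hi)))
    }
    cursor <- max(cursor, cuts[i, 2])
    if (cursor >= hi) break
  }
  if (cursor < hi) keep <- rbind(keep, c(cursor, hi))
  keep
}

cuts <- rbind(c(-100.5, 30), c(0.77, 10), c(25, 35), c(70, 80.3), c(90, 95))
remaining <- interval_setdiff(-100, 100, cuts)
print(remaining)
```

Sorting by start point and sweeping once is what makes overlapping or nested cuts harmless: the cursor only ever moves right.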

From: Ben quant [mailto:ccqu...@gmail.com]
Sent: Monday, May 14, 2012 1:19 PM
To: Steve Lianoglou
Cc: William Dunlap; r-help@r-project.org
Subject: Re: [R] range segment exclusion using range endpoints

Thank you Steve!

This does everything I need (at this point):

(this excludes ranges y2 from range y1)

library('intervals')
y1 = Intervals(c(-100,100))
y2 = Intervals(rbind(
  c(-100.5,30),
  c(0.77,10),
  c(25,35),
  c(70,80.3),
  c(90,95)
  ))
interval_difference(y1,y2)
Object of class Intervals_full
3 intervals over R:
(35, 70)
(80.3, 90)
(95, 100]

PS - I'm pretty sure William's solution worked as well, but I opted for the
package solution, which is a bit more robust.

Thanks everyone!
Ben
On Mon, May 14, 2012 at 1:06 PM, Steve Lianoglou
<mailinglist.honey...@gmail.com> wrote:
Hi all,

Nice code samples presented all around.

Just wanted to point out that I think the stuff found in the
`intervals` package might also be helpful:

http://cran.at.r-project.org/web/packages/intervals/index.html

HTH,
-steve

On Mon, May 14, 2012 at 2:54 PM, Ben quant <ccqu...@gmail.com> wrote:
> Yes, it is. I'm looking into understanding this now...
>
> thanks!
> Ben
>
> On Mon, May 14, 2012 at 12:38 PM, William Dunlap <wdun...@tibco.com> wrote:
>
>> To the list of functions I sent, add another that converts a list of
>> intervals into a Ranges object:
>>  as.Ranges.list <- function (x, ...) {
>>      stopifnot(nargs() == 1, all(vapply(x, length, 0) == 2))
>>      # use c() instead of unlist() because c() doesn't mangle POSIXct and
>>      # Date objects
>>      x <- unname(do.call(c, x))
>>      odd <- seq(from = 1, to = length(x), by = 2)
>>      as.Ranges(bottoms = x[odd], tops = x[odd + 1])
>>  }
>> Then stop using get() and assign() all over the place and instead make
>> lists of related intervals and convert them to Ranges objects:
>>  > x <- as.Ranges(list(x_rng))
>>  > s <- as.Ranges(list(s1_rng, s2_rng, s3_rng, s4_rng, s5_rng))
>>  > x
>>    bottoms tops
>>  1    -100  100
>>  > s
>>    bottoms tops
>>  1 -250.50 30.0
>>  2    0.77 10.0
>>  3   25.00 35.0
>>  4   70.00 80.3
>>  5   90.00 95.0
>> and then compute the difference between the sets x and s (i.e., describe
>> the points in x but not s as a union of intervals):
>>  > setdiffRanges(x, s)
>>    bottoms tops
>>  1    35.0   70
>>  2    80.3   90
>>  3    95.0  100
>> and for a graphical check do
>>  > plot(x, s, setdiffRanges(x, s))
>> Are those the numbers you want?
>>
>> I find it easier to use standard functions and data structures for this
>> than to adapt the cumsum/order idiom to different situations.
>>
>> Bill Dunlap
>> Spotfire, TIBCO Software
>> wdunlap tibco.com
>>
>>
>> > -----Original Message-----
>> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
>> > On Behalf Of Ben quant
>> > Sent: Monday, May 14, 2012 11:07 AM
>> > To: jim holtman
>> > Cc: r-help@r-project.org
>> > Subject: Re: [R] range segment exclusion using range endpoints
>> >
>> > Turns out this solution doesn't work if the s range is outside the range
>> of
>> > the x range. I didn't include that in my examples, but it is something I
>> > have to deal with quite often.
>> >
>> > For example s1_rng below causes an issue:
>> >
>> > x_rng = c(-100,100)
>> > s1_rng = c(-250.5,30)
>> > s2_rng = c(0.77,10)
>> > s3_rng = c(25,35)
>> > s4_rng = c(70,80.3)
>> > s5_rng = c(90,95)
>> >
>> > sNames <- grep("s[0-9]+_rng", ls(), value = TRUE)
>> > queue <- rbind(c(x_rng[1], 1), c(x_rng[2], 1))
>> > for (i in sNames){
>> >   queue <- rbind(queue
>> >  , c(get(i)[1], 1)  # enter queue
>> >  , c(get(i)[2], -1)  # exit queue
>> >  )
>> > }
>> > queue <- queue[order(queue[, 1]), ]  # sort
>> > queue <- cbind(queue, cumsum(queue[, 2]))  # of people in the queue
>> > for (i in which(queue[, 3] == 1)){
>> >   cat("start:", queue[i, 1L], '  end:', queue[i + 1L, 1L], "\n")
>> > }
>> >
>> > Regards,
>> >
>> > ben
>> > On Sat, May 12, 2012 at 12:50 PM, jim holtman <jholt...@gmail.com> wrote:
>> >
>> > > Here is an example of how you might do it.  It uses a technique of
>> > > counting how many items are in a queue based on their arrival times;
>> > > it can be used to also find areas of overlap.
>> > >
>> > > Note that it would be best to use a list for the 's' end points
>> > >
>> > > 
>> > > > # note the next statement removes names of the format 's[0-9]+_rng'
>> > > > # it would be best to create a list 

Re: [R] Help with V function in igraph

2012-05-14 Thread bmccowan
Thanks for your response. Oddly, it is working now.

Thanks again,

Brenda



On 5/14/12 10:36 AM, "Gábor Csárdi-2 [via R]"
 wrote:

> Something weird must be going on in your s641_social object. Can you just
> simply check that the vertex names look OK with 'V(s641_social)$name'? If they
> look good, then can you send me the s641_social object in private? (Or part of
> it, assuming a part is enough to reproduce the problem.)
>
> Best,
> Gabor
>
> On Sat, May 12, 2012 at 8:07 PM, bmccowan <[hidden email]> wrote:
>> I am using the code below to output some network measures:
>>
>> central_social <- data.frame(V(s641_social)$name, indegree_social,
>> outdegree_social, incloseness_social, outcloseness_social,
>> betweenness_social, eigen_social)
>>
>> and I get the following error:
>>
>> Error in Re(z) : non-numeric argument to function
>>
>> I know this has to do with V but I cannot figure out what is wrong.
>> s641-social is a graph object and the vertices do have a name attribute.
>>
>> What can I do to fix?
>>
>> Thanks
>>
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/Help-with-V-function-in-igraph-tp4629767.html
>> Sent from the R help mailing list archive at Nabble.com.
>
> --
> Gabor Csardi <[hidden email]>     MTA KFKI RMKI



--
View this message in context: 
http://r.789695.n4.nabble.com/Help-with-V-function-in-igraph-tp4629767p4629989.html
Sent from the R help mailing list archive at Nabble.com.


[R] Error in names(x) <- value: 'names' attribute must be the same length as the vector

2012-05-14 Thread Priya Bhatt
Dear R-helpers,

I am stuck on an error in R:  When I run my code (below), I get this error
back:

Error in names(x) <- value :
  'names' attribute must be the same length as the vector


Then when I use traceback(), R gives me back this in return:

`colnames<-`(`*tmp*`, value = c("Item", "Color", "Number", "Size"))



I'm not exactly sure how to fix this problem.  Any advice would be greatly
appreciated!

Thanks,
Priya


MODIFIED CODE:
# Looping through a series of CSV files
for (c in csvfiles)
{
  # A DF (prevdf) was created based on an initial csv file,
  # so the condition below states that if there are rows with NAs or the
  # number of rows in prevdf is zero
  if( (apply(prevdf, 1, function(y) !sum(!is.na(y))==1) > 0) ||
      (nrow(prevdf) == 0) )
  {
    # Open a new file
    currentCSVFile <- read.csv(c, header=TRUE)
    # pick only the few columns we want from the file
    currentCSVFile <- data.frame(currentCSVFile$Item,
                                 currentCSVFile$Color..type,
                                 currentCSVFile$Number..owned,
                                 currentCSVFile$Size..shirt)
    # rename the column names
    colnames(currentCSVFile) <- c("Item", "Color", "Number", "Size")

    # find the rows in prevdf that do not have any values (sum should be 1
    # because the Item name is unique for every row)
    NArows <- prevdf[apply(prevdf, 1, function(y) sum(!is.na(y))==1),]

    # if NAs rows is not equal to zero
    if (nrow(NArows) != 0 )
    {
      # find the rows in the current CSV file where there is missing data in
      # prevdf (this info is in NArows)
      intersectItem <- intersect(currentCSVFile$Item, NArows$Item)

      # initiate another data frame to put the data in
      newdf.int <- data.frame(Item=c(), Color=c(), Number=c(), Size=c())

      print(nrow(currentCSVFile))
      for (i in 1:nrow(currentCSVFile))
      {
        print("In loop") # check for me
        row <- currentCSVFile[i,]

        if (row$Item %in% intersectItem){  # this is where the code stops
                                           # and throws back error
          .
          .
          .
          # do stuff to fill vectors named Item, Color, Number and Size
          .
          .
          .

          newdf.int <- rbind(newdf.int, c(Item, Color, Number, Size))
        }

        colnames(newdf.int) <- c("Item", "Color", "Number", "Size")
        prevdf <- merge(newdf.int, prevdf, by=c("Item", "Color", "Number",
                                                "Size"), all=TRUE)
        prevdf <- prevdf[apply(prevdf, 1, function(y) !sum(!is.na(y))==1),]
        print("after removing row = 1")

      } # end of for loop

    } # end of NA rows condition

  } # end of main if statement

  else
  {
    break
  }

}
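
One possible cause, offered as a sketch rather than a confirmed diagnosis:
data.frame(Item=c(), Color=c(), Number=c(), Size=c()) creates a data frame
with zero columns, because each c() is NULL. If no row has been appended yet
when colnames(newdf.int) <- ... runs, four names are assigned to zero columns,
which produces this kind of 'names' error. The names empty_df and typed_df
below are illustrative only:

```r
# data.frame(Item = c(), ...) drops the NULL columns, leaving 0 columns.
empty_df <- data.frame(Item = c(), Color = c(), Number = c(), Size = c())
ncol(empty_df)

# Assigning four names to zero columns fails inside `colnames<-`;
# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch({
  colnames(empty_df) <- c("Item", "Color", "Number", "Size")
  "no error"
}, error = function(e) conditionMessage(e))
msg

# One alternative: typed zero-length columns keep all four columns.
typed_df <- data.frame(Item = character(), Color = character(),
                       Number = numeric(), Size = character(),
                       stringsAsFactors = FALSE)
colnames(typed_df) <- c("Item", "Color", "Number", "Size")
dim(typed_df)
```

Whether this is the actual failure point depends on the elided part of the
loop, so checking ncol(newdf.int) just before the colnames() call would be a
reasonable first step.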



[R] Plot

2012-05-14 Thread Kelly Cool
Hello,

I am trying to make a plot of the rates of an enzyme against three different 
protein concentrations (there are 45 rates in total and split up into 3 groups 
of 15, each receiving one of the 3 protein concentrations). When I enter the 
following code I instead get 3 separate boxplots for each of the three 
different protein concentrations ...


plot(rate ~ ferm, data=LDH, col=LDH$rate, 
pch=c(17,18,19)[(as.numeric(LDH$rate)%%3)+1])


but I want a scatterplot showing 3 different lines indicating each of the 
protein concentrations. I'm not sure if I need to tweak my data set in order to 
get what I want? 

I was also wondering how to include the origin (0,0) in the plots. I'm not sure 
if I'm missing something on the plot help page? 

Any help would be appreciated. Thanks so much. 
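
A rough sketch of one way to get lines per group and an origin at (0, 0),
under stated assumptions. The real LDH data frame is not shown, so the data
below are made up: ferm is assumed to be a factor (plot() draws boxplots when
the x variable in a formula is a factor), and conc is a hypothetical numeric
x variable that does not appear in the post.

```r
# Made-up illustrative data; the real LDH data frame is not shown in the post.
# Assumptions: `ferm` is a factor and `conc` is a hypothetical numeric x.
set.seed(1)
LDH <- data.frame(
  ferm = factor(rep(c("low", "mid", "high"), each = 15),
                levels = c("low", "mid", "high")),
  conc = rep(1:15, times = 3)
)
LDH$rate <- 0.5 * LDH$conc * as.integer(LDH$ferm) + rnorm(45, sd = 0.3)

pdf(NULL)  # draw to a null device so the sketch runs without a display
# Empty frame whose axes start at 0, so the origin is included.
plot(NULL, xlim = c(0, max(LDH$conc)), ylim = c(0, max(LDH$rate)),
     xlab = "conc", ylab = "rate")
groups <- levels(LDH$ferm)
for (k in seq_along(groups)) {
  g <- LDH[LDH$ferm == groups[k], ]
  g <- g[order(g$conc), ]
  # one line with point markers per group
  lines(g$conc, g$rate, type = "b", col = k, pch = c(17, 18, 19)[k])
}
legend("topleft", legend = groups, col = seq_along(groups),
       pch = c(17, 18, 19))
invisible(dev.off())
```

The key points are that a numeric x gives a scatterplot rather than boxplots,
and that xlim and ylim starting at 0 bring the origin into view.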


Re: [R] range segment exclusion using range endpoints

2012-05-14 Thread Ben quant
Thank you Steve!

This does everything I need (at this point):

(this excludes ranges y2 from range y1)

library('intervals')
y1 = Intervals(c(-100,100))
y2 = Intervals(rbind(
  c(-100.5,30),
  c(0.77,10),
  c(25,35),
  c(70,80.3),
  c(90,95)
  ))
interval_difference(y1,y2)
Object of class Intervals_full
3 intervals over R:
(35, 70)
(80.3, 90)
(95, 100]

PS - I'm pretty sure William's solution worked as well, but I opted for the
package solution, which is a bit more robust.

Thanks everyone!
Ben

On Mon, May 14, 2012 at 1:06 PM, Steve Lianoglou <
mailinglist.honey...@gmail.com> wrote:

> Hi all,
>
> Nice code samples presented all around.
>
> Just wanted to point out that I think the stuff found in the
> `intervals` package might also be helpful:
>
> http://cran.at.r-project.org/web/packages/intervals/index.html
>
> HTH,
> -steve
>
> On Mon, May 14, 2012 at 2:54 PM, Ben quant  wrote:
> > Yes, it is. I'm looking into understanding this now...
> >
> > thanks!
> > Ben
> >
> > On Mon, May 14, 2012 at 12:38 PM, William Dunlap 
> wrote:
> >
> >> To the list of functions I sent, add another that converts a list of
> >> intervals into a Ranges object:
> >>  as.Ranges.list <- function (x, ...) {
> >>      stopifnot(nargs() == 1, all(vapply(x, length, 0) == 2))
> >>      # use c() instead of unlist() because c() doesn't mangle POSIXct
> >>      # and Date objects
> >>      x <- unname(do.call(c, x))
> >>      odd <- seq(from = 1, to = length(x), by = 2)
> >>      as.Ranges(bottoms = x[odd], tops = x[odd + 1])
> >>  }
> >> Then stop using get() and assign() all over the place and instead make
> >> lists of related intervals and convert them to Ranges objects:
> >>  > x <- as.Ranges(list(x_rng))
> >>  > s <- as.Ranges(list(s1_rng, s2_rng, s3_rng, s4_rng, s5_rng))
> >>  > x
> >>    bottoms tops
> >>  1    -100  100
> >>  > s
> >>    bottoms tops
> >>  1 -250.50 30.0
> >>  2    0.77 10.0
> >>  3   25.00 35.0
> >>  4   70.00 80.3
> >>  5   90.00 95.0
> >> and then compute the difference between the sets x and s (i.e., describe
> >> the points in x but not s as a union of intervals):
> >>  > setdiffRanges(x, s)
> >>    bottoms tops
> >>  1    35.0   70
> >>  2    80.3   90
> >>  3    95.0  100
> >> and for a graphical check do
> >>  > plot(x, s, setdiffRanges(x, s))
> >> Are those the numbers you want?
> >>
> >> I find it easier to use standard functions and data structures for this
> >> than to adapt the cumsum/order idiom to different situations.
> >>
> >> Bill Dunlap
> >> Spotfire, TIBCO Software
> >> wdunlap tibco.com
> >>
> >>
> >> > -----Original Message-----
> >> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> >> > On Behalf Of Ben quant
> >> > Sent: Monday, May 14, 2012 11:07 AM
> >> > To: jim holtman
> >> > Cc: r-help@r-project.org
> >> > Subject: Re: [R] range segment exclusion using range endpoints
> >> >
> >> > Turns out this solution doesn't work if the s range is outside the
> range
> >> of
> >> > the x range. I didn't include that in my examples, but it is
> something I
> >> > have to deal with quite often.
> >> >
> >> > For example s1_rng below causes an issue:
> >> >
> >> > x_rng = c(-100,100)
> >> > s1_rng = c(-250.5,30)
> >> > s2_rng = c(0.77,10)
> >> > s3_rng = c(25,35)
> >> > s4_rng = c(70,80.3)
> >> > s5_rng = c(90,95)
> >> >
> >> > sNames <- grep("s[0-9]+_rng", ls(), value = TRUE)
> >> > queue <- rbind(c(x_rng[1], 1), c(x_rng[2], 1))
> >> > for (i in sNames){
> >> >   queue <- rbind(queue
> >> >  , c(get(i)[1], 1)  # enter queue
> >> >  , c(get(i)[2], -1)  # exit queue
> >> >  )
> >> > }
> >> > queue <- queue[order(queue[, 1]), ]  # sort
> >> > queue <- cbind(queue, cumsum(queue[, 2]))  # of people in the queue
> >> > for (i in which(queue[, 3] == 1)){
> >> >   cat("start:", queue[i, 1L], '  end:', queue[i + 1L, 1L], "\n")
> >> > }
> >> >
> >> > Regards,
> >> >
> >> > ben
> >> > On Sat, May 12, 2012 at 12:50 PM, jim holtman 
> >> wrote:
> >> >
> >> > > Here is an example of how you might do it.  It uses a technique of
> >> > > counting how many items are in a queue based on their arrival times;
> >> > > it can be used to also find areas of overlap.
> >> > >
> >> > > Note that it would be best to use a list for the 's' end points
> >> > >
> >> > > 
> >> > > > # note the next statement removes names of the format
> 's[0-9]+_rng'
> >> > > > # it would be best to create a list with the 's' endpoints, but
> this
> >> is
> >> > > > # what the OP specified
> >> > > >
> >> > > > rm(list = grep('s[0-9]+_rng', ls(), value = TRUE))  # Danger Will
> >> > > Robinson!!
> >> > > >
> >> > > > # ex 1
> >> > > > x_rng = c(-100,100)
> >> > > >
> >> > > > s1_rng = c(-25.5,30)
> >> > > > s2_rng = c(0.77,10)
> >> > > > s3_rng = c(25,35)
> >> > > > s4_rng = c(70,80.3)
> >> > > > s5_rng = c(90,95)
> >> > > >
> >> > > > # ex 2
> >> > > > # x_rng = c(-50.5,100)
> >> > > >
> >> > > > # s1_rng = c(-75.3,30

Re: [R] Inf and lazy evaluation

2012-05-14 Thread R. Michael Weylandt
R is lazy, but not quite that lazy ;-)

It's likely much easier to do this with regexps

something like

list.files()[grepl(paste0(filename, "-[0123456789]+"), list.files())]

Michael
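
A small self-contained sketch of the same idea, using the pattern argument of
list.files() directly. The scratch directory and the file names (report-1,
report-2, report-10, notes.txt) are made up for illustration, and the sketch
assumes filename contains no regular-expression metacharacters:

```r
# Hypothetical setup: a scratch directory holding report-1, report-2,
# report-10 and one unrelated file.
dir_path <- file.path(tempdir(), "inf-demo")
dir.create(dir_path, showWarnings = FALSE)
invisible(file.create(file.path(dir_path,
                                c("report-1", "report-2", "report-10",
                                  "notes.txt"))))

filename <- "report"
# Anchor the pattern so only "<filename>-<digits>" matches.
hits <- list.files(dir_path, pattern = paste0("^", filename, "-[0-9]+$"))

# Recover the integers, which may contain gaps, and sort them numerically.
ids <- sort(as.integer(sub(paste0("^", filename, "-"), "", hits)))
ids
```

This finds every numbered file in one pass, with no upper bound to guess and
no trouble with gaps in the numbering.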


On Mon, May 14, 2012 at 3:34 PM, J Toll  wrote:
> Hi,
>
> I have a question involving Inf, lazy evaluation, and maybe argument
> recycling.  I have a directory where I am trying to check for the
> existence of files of a certain pattern, basically something like
> "filename-#", where # is an integer.  I can do something like this,
> which works.
>
> file.exists(paste(filename, "-", 1:100, sep = ""))
>
> But I don't like the fact that I am only checking the first 100
> possibilities.  What I would prefer is this:
>
> file.exists(paste(filename, "-", 1:Inf, sep = ""))
>
> But that doesn't work, I get the error:
>
> Error in 1:Inf : result would be too long a vector
>
> On one hand, with lazy evaluation, you would think that 1:Inf should
> work.  On the other hand, I'm not quite sure what the output would be
> if it was working, especially if there were large gaps in the
> integers.  Is there a way to get the behavior I seek (i.e. the lazy
> evaluation of 1:Inf).
>
> Thanks,
>
>
> James
>


Re: [R] Inf and lazy evaluation

2012-05-14 Thread Bert Gunter
?list.files


-- Bert

On Mon, May 14, 2012 at 12:34 PM, J Toll  wrote:
> Hi,
>
> I have a question involving Inf, lazy evaluation, and maybe argument
> recycling.  I have a directory where I am trying to check for the
> existence of files of a certain pattern, basically something like
> "filename-#", where # is an integer.  I can do something like this,
> which works.
>
> file.exists(paste(filename, "-", 1:100, sep = ""))
>
> But I don't like the fact that I am only checking the first 100
> possibilities.  What I would prefer is this:
>
> file.exists(paste(filename, "-", 1:Inf, sep = ""))
>
> But that doesn't work, I get the error:
>
> Error in 1:Inf : result would be too long a vector
>
> On one hand, with lazy evaluation, you would think that 1:Inf should
> work.  On the other hand, I'm not quite sure what the output would be
> if it was working, especially if there were large gaps in the
> integers.  Is there a way to get the behavior I seek (i.e. the lazy
> evaluation of 1:Inf).
>
> Thanks,
>
>
> James
>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



[R] Inf and lazy evaluation

2012-05-14 Thread J Toll
Hi,

I have a question involving Inf, lazy evaluation, and maybe argument
recycling.  I have a directory where I am trying to check for the
existence of files of a certain pattern, basically something like
"filename-#", where # is an integer.  I can do something like this,
which works.

file.exists(paste(filename, "-", 1:100, sep = ""))

But I don't like the fact that I am only checking the first 100
possibilities.  What I would prefer is this:

file.exists(paste(filename, "-", 1:Inf, sep = ""))

But that doesn't work, I get the error:

Error in 1:Inf : result would be too long a vector

On one hand, with lazy evaluation, you would think that 1:Inf should
work.  On the other hand, I'm not quite sure what the output would be
if it was working, especially if there were large gaps in the
integers.  Is there a way to get the behavior I seek (i.e. the lazy
evaluation of 1:Inf)?

Thanks,


James



Re: [R] range segment exclusion using range endpoints

2012-05-14 Thread Steve Lianoglou
Hi all,

Nice code samples presented all around.

Just wanted to point out that I think the stuff found in the
`intervals` package might also be helpful:

http://cran.at.r-project.org/web/packages/intervals/index.html

HTH,
-steve

On Mon, May 14, 2012 at 2:54 PM, Ben quant  wrote:
> Yes, it is. I'm looking into understanding this now...
>
> thanks!
> Ben
>
> On Mon, May 14, 2012 at 12:38 PM, William Dunlap  wrote:
>
>> To the list of functions I sent, add another that converts a list of
>> intervals into a Ranges object:
>>  as.Ranges.list <- function (x, ...) {
>>      stopifnot(nargs() == 1, all(vapply(x, length, 0) == 2))
>>      # use c() instead of unlist() because c() doesn't mangle POSIXct and
>>      # Date objects
>>      x <- unname(do.call(c, x))
>>      odd <- seq(from = 1, to = length(x), by = 2)
>>      as.Ranges(bottoms = x[odd], tops = x[odd + 1])
>>  }
>> Then stop using get() and assign() all over the place and instead make
>> lists of related intervals and convert them to Ranges objects:
>>  > x <- as.Ranges(list(x_rng))
>>  > s <- as.Ranges(list(s1_rng, s2_rng, s3_rng, s4_rng, s5_rng))
>>  > x
>>    bottoms tops
>>  1    -100  100
>>  > s
>>    bottoms tops
>>  1 -250.50 30.0
>>  2    0.77 10.0
>>  3   25.00 35.0
>>  4   70.00 80.3
>>  5   90.00 95.0
>> and then compute the difference between the sets x and s (i.e., describe
>> the points in x but not s as a union of intervals):
>>  > setdiffRanges(x, s)
>>    bottoms tops
>>  1    35.0   70
>>  2    80.3   90
>>  3    95.0  100
>> and for a graphical check do
>>  > plot(x, s, setdiffRanges(x, s))
>> Are those the numbers you want?
>>
>> I find it easier to use standard functions and data structures for this
>> than to adapt the cumsum/order idiom to different situations.
>>
>> Bill Dunlap
>> Spotfire, TIBCO Software
>> wdunlap tibco.com
>>
>>
>> > -----Original Message-----
>> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
>> > On Behalf Of Ben quant
>> > Sent: Monday, May 14, 2012 11:07 AM
>> > To: jim holtman
>> > Cc: r-help@r-project.org
>> > Subject: Re: [R] range segment exclusion using range endpoints
>> >
>> > Turns out this solution doesn't work if the s range is outside the range
>> of
>> > the x range. I didn't include that in my examples, but it is something I
>> > have to deal with quite often.
>> >
>> > For example s1_rng below causes an issue:
>> >
>> > x_rng = c(-100,100)
>> > s1_rng = c(-250.5,30)
>> > s2_rng = c(0.77,10)
>> > s3_rng = c(25,35)
>> > s4_rng = c(70,80.3)
>> > s5_rng = c(90,95)
>> >
>> > sNames <- grep("s[0-9]+_rng", ls(), value = TRUE)
>> > queue <- rbind(c(x_rng[1], 1), c(x_rng[2], 1))
>> > for (i in sNames){
>> >   queue <- rbind(queue
>> >                  , c(get(i)[1], 1)  # enter queue
>> >                  , c(get(i)[2], -1)  # exit queue
>> >                  )
>> > }
>> > queue <- queue[order(queue[, 1]), ]  # sort
>> > queue <- cbind(queue, cumsum(queue[, 2]))  # of people in the queue
>> > for (i in which(queue[, 3] == 1)){
>> >   cat("start:", queue[i, 1L], '  end:', queue[i + 1L, 1L], "\n")
>> > }
>> >
>> > Regards,
>> >
>> > ben
>> > On Sat, May 12, 2012 at 12:50 PM, jim holtman 
>> wrote:
>> >
>> > > Here is an example of how you might do it.  It uses a technique of
>> > > counting how many items are in a queue based on their arrival times;
>> > > it can be used to also find areas of overlap.
>> > >
>> > > Note that it would be best to use a list for the 's' end points
>> > >
>> > > 
>> > > > # note the next statement removes names of the format 's[0-9]+_rng'
>> > > > # it would be best to create a list with the 's' endpoints, but this
>> is
>> > > > # what the OP specified
>> > > >
>> > > > rm(list = grep('s[0-9]+_rng', ls(), value = TRUE))  # Danger Will
>> > > Robinson!!
>> > > >
>> > > > # ex 1
>> > > > x_rng = c(-100,100)
>> > > >
>> > > > s1_rng = c(-25.5,30)
>> > > > s2_rng = c(0.77,10)
>> > > > s3_rng = c(25,35)
>> > > > s4_rng = c(70,80.3)
>> > > > s5_rng = c(90,95)
>> > > >
>> > > > # ex 2
>> > > > # x_rng = c(-50.5,100)
>> > > >
>> > > > # s1_rng = c(-75.3,30)
>> > > >
>> > > > # ex 3
>> > > > # x_rng = c(-75.3,30)
>> > > >
>> > > > # s1_rng = c(-50.5,100)
>> > > >
>> > > > # ex 4
>> > > > # x_rng = c(-100,100)
>> > > >
>> > > > # s1_rng = c(-105,105)
>> > > >
>> > > > # find all the names -- USE A LIST NEXT TIME
>> > > > sNames <- grep("s[0-9]+_rng", ls(), value = TRUE)
>> > > >
>> > > > # initial matrix with the 'x' endpoints
>> > > > queue <- rbind(c(x_rng[1], 1), c(x_rng[2], 1))
>> > > >
>> > > > # add the 's' end points to the list
>> > > > # this will be used to determine how many things are in a queue (or
>> > > areas that
>> > > > # overlap)
>> > > > for (i in sNames){
>> > > +     queue <- rbind(queue
>> > > +                 , c(get(i)[1], 1)  # enter queue
>> > > +                 , c(get(i)[2], -1)  # exit queue
>> > > +                 )
>> > > + }
>> > > > queue <- queue[order(queue[, 1]), ]  # sort
>>

Re: [R] Random forests prediction

2012-05-14 Thread Liaw, Andy
That's not how RF works at all.  The setting of mtry is irrelevant to this.

Andy 

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of matt
Sent: Monday, May 14, 2012 10:22 AM
To: r-help@r-project.org
Subject: Re: [R] Random forests prediction

But shouldn't it be resolved when I set mtry to the maximum number of
variables? 
Then the model explores all the variables for the next step, so it will
still be able to find the better ones? And then in the later steps it could
use the (less important) variables.

Matthijs

--
View this message in context: 
http://r.789695.n4.nabble.com/Random-forests-prediction-tp4627409p4629944.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] range segment exclusion using range endpoints

2012-05-14 Thread Ben quant
Yes, it is. I'm looking into understanding this now...

thanks!
Ben

On Mon, May 14, 2012 at 12:38 PM, William Dunlap  wrote:

> To the list of functions I sent, add another that converts a list of
> intervals into a Ranges object:
>  as.Ranges.list <- function (x, ...) {
>      stopifnot(nargs() == 1, all(vapply(x, length, 0) == 2))
>      # use c() instead of unlist() because c() doesn't mangle POSIXct and
>      # Date objects
>      x <- unname(do.call(c, x))
>      odd <- seq(from = 1, to = length(x), by = 2)
>      as.Ranges(bottoms = x[odd], tops = x[odd + 1])
>  }
> Then stop using get() and assign() all over the place and instead make
> lists of related intervals and convert them to Ranges objects:
>  > x <- as.Ranges(list(x_rng))
>  > s <- as.Ranges(list(s1_rng, s2_rng, s3_rng, s4_rng, s5_rng))
>  > x
>    bottoms tops
>  1    -100  100
>  > s
>    bottoms tops
>  1 -250.50 30.0
>  2    0.77 10.0
>  3   25.00 35.0
>  4   70.00 80.3
>  5   90.00 95.0
> and then compute the difference between the sets x and s (i.e., describe
> the points in x but not s as a union of intervals):
>  > setdiffRanges(x, s)
>    bottoms tops
>  1    35.0   70
>  2    80.3   90
>  3    95.0  100
> and for a graphical check do
>  > plot(x, s, setdiffRanges(x, s))
> Are those the numbers you want?
>
> I find it easier to use standard functions and data structures for this
> than to adapt the cumsum/order idiom to different situations.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>
> > -----Original Message-----
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> > On Behalf Of Ben quant
> > Sent: Monday, May 14, 2012 11:07 AM
> > To: jim holtman
> > Cc: r-help@r-project.org
> > Subject: Re: [R] range segment exclusion using range endpoints
> >
> > Turns out this solution doesn't work if the s range is outside the range
> of
> > the x range. I didn't include that in my examples, but it is something I
> > have to deal with quite often.
> >
> > For example s1_rng below causes an issue:
> >
> > x_rng = c(-100,100)
> > s1_rng = c(-250.5,30)
> > s2_rng = c(0.77,10)
> > s3_rng = c(25,35)
> > s4_rng = c(70,80.3)
> > s5_rng = c(90,95)
> >
> > sNames <- grep("s[0-9]+_rng", ls(), value = TRUE)
> > queue <- rbind(c(x_rng[1], 1), c(x_rng[2], 1))
> > for (i in sNames){
> >   queue <- rbind(queue
> >  , c(get(i)[1], 1)  # enter queue
> >  , c(get(i)[2], -1)  # exit queue
> >  )
> > }
> > queue <- queue[order(queue[, 1]), ]  # sort
> > queue <- cbind(queue, cumsum(queue[, 2]))  # of people in the queue
> > for (i in which(queue[, 3] == 1)){
> >   cat("start:", queue[i, 1L], '  end:', queue[i + 1L, 1L], "\n")
> > }
> >
> > Regards,
> >
> > ben
> > On Sat, May 12, 2012 at 12:50 PM, jim holtman 
> wrote:
> >
> > > Here is an example of how you might do it.  It uses a technique of
> > > counting how many items are in a queue based on their arrival times;
> > > it can be used to also find areas of overlap.
> > >
> > > Note that it would be best to use a list for the 's' end points
> > >
> > > 
> > > > # note the next statement removes names of the format 's[0-9]+_rng'
> > > > # it would be best to create a list with the 's' endpoints, but this
> is
> > > > # what the OP specified
> > > >
> > > > rm(list = grep('s[0-9]+_rng', ls(), value = TRUE))  # Danger Will
> > > Robinson!!
> > > >
> > > > # ex 1
> > > > x_rng = c(-100,100)
> > > >
> > > > s1_rng = c(-25.5,30)
> > > > s2_rng = c(0.77,10)
> > > > s3_rng = c(25,35)
> > > > s4_rng = c(70,80.3)
> > > > s5_rng = c(90,95)
> > > >
> > > > # ex 2
> > > > # x_rng = c(-50.5,100)
> > > >
> > > > # s1_rng = c(-75.3,30)
> > > >
> > > > # ex 3
> > > > # x_rng = c(-75.3,30)
> > > >
> > > > # s1_rng = c(-50.5,100)
> > > >
> > > > # ex 4
> > > > # x_rng = c(-100,100)
> > > >
> > > > # s1_rng = c(-105,105)
> > > >
> > > > # find all the names -- USE A LIST NEXT TIME
> > > > sNames <- grep("s[0-9]+_rng", ls(), value = TRUE)
> > > >
> > > > # initial matrix with the 'x' endpoints
> > > > queue <- rbind(c(x_rng[1], 1), c(x_rng[2], 1))
> > > >
> > > > # add the 's' end points to the list
> > > > # this will be used to determine how many things are in a queue (or
> > > areas that
> > > > # overlap)
> > > > for (i in sNames){
> > > + queue <- rbind(queue
> > > + , c(get(i)[1], 1)  # enter queue
> > > + , c(get(i)[2], -1)  # exit queue
> > > + )
> > > + }
> > > > queue <- queue[order(queue[, 1]), ]  # sort
> > > > queue <- cbind(queue, cumsum(queue[, 2]))  # of people in the queue
> > > > print(queue)
> > >         [,1] [,2] [,3]
> > >  [1,] -100.00    1    1
> > >  [2,]  -25.50    1    2
> > >  [3,]    0.77    1    3
> > >  [4,]   10.00   -1    2
> > >  [5,]   25.00    1    3
> > >  [6,]   30.00   -1    2
> > >  [7,]   35.00   -1    1
> > >  [8,]   70.00    1    2
> > >  [9,]   80.30   -1    1
> > > [10,]   90.00    1    2
> > > [11,]   9

Re: [R] range segment exclusion using range endpoints

2012-05-14 Thread William Dunlap
To the list of function I sent, add another that converts a list of intervals
into a Ranges object:
  as.Ranges.list <- function (x, ...) {
  stopifnot(nargs() == 1, all(vapply(x, length, 0) == 2))
  # use c() instead of unlist() because c() doesn't mangle POSIXct and Date 
objects
  x <- unname(do.call(c, x))
  odd <- seq(from = 1, to = length(x), by = 2)
  as.Ranges(bottoms = x[odd], tops = x[odd + 1])
  }
Then stop using get() and assign() all over the place and instead make lists of
related intervals and convert them to Ranges objects:
  > x <- as.Ranges(list(x_rng))
  > s <- as.Ranges(list(s1_rng, s2_rng, s3_rng, s4_rng, s5_rng))
  > x
    bottoms tops
  1    -100  100
  > s
    bottoms tops
  1 -250.50 30.0
  2    0.77 10.0
  3   25.00 35.0
  4   70.00 80.3
  5   90.00 95.0
and then compute the difference between the sets x and s (i.e., describe
the points in x but not s as a union of intervals):
  > setdiffRanges(x, s)
    bottoms tops
  1    35.0   70
  2    80.3   90
  3    95.0  100
and for a graphical check do
  > plot(x, s, setdiffRanges(x, s))
Are those the numbers you want?

I find it easier to use standard functions and data structures for this than
to adapt the cumsum/order idiom to different situations.
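
For readers without the earlier message at hand, here is a minimal base-R sketch of the same idea. It does not reproduce as.Ranges or setdiffRanges (those were defined in the earlier post); the name setdiff_intervals is hypothetical, and the sketch assumes at least one s interval and closed intervals given as length-2 vectors. It walks the sorted s intervals and keeps the uncovered pieces of x, so an s interval that starts outside x causes no trouble.

```r
# Hypothetical helper (not from the thread): subtract a list of intervals 's'
# from a single interval 'x'. Returns a two-column matrix of what remains.
setdiff_intervals <- function(x, s) {
  s <- do.call(rbind, s)                   # one row per interval
  s <- s[order(s[, 1]), , drop = FALSE]    # sort by lower endpoint
  out <- matrix(numeric(0), ncol = 2)
  lo <- x[1]                               # left edge of the uncovered part
  for (i in seq_len(nrow(s))) {
    if (s[i, 1] > lo) {                    # gap before this s interval
      out <- rbind(out, c(lo, min(s[i, 1], x[2])))
    }
    lo <- max(lo, s[i, 2])                 # everything up to here is covered
    if (lo >= x[2]) break                  # x is exhausted
  }
  if (lo < x[2]) out <- rbind(out, c(lo, x[2]))
  colnames(out) <- c("bottoms", "tops")
  out
}

x_rng <- c(-100, 100)
s_rngs <- list(c(-250.5, 30), c(0.77, 10), c(25, 35), c(70, 80.3), c(90, 95))
setdiff_intervals(x_rng, s_rngs)
```

Keeping the s intervals in one list also removes the need for get() and name matching with grep().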

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf
> Of Ben quant
> Sent: Monday, May 14, 2012 11:07 AM
> To: jim holtman
> Cc: r-help@r-project.org
> Subject: Re: [R] range segment exclusion using range endpoints
> 
> Turns out this solution doesn't work if the s range is outside the range of
> the x range. I didn't include that in my examples, but it is something I
> have to deal with quite often.
> 
> For example s1_rng below causes an issue:
> 
> x_rng = c(-100,100)
> s1_rng = c(-250.5,30)
> s2_rng = c(0.77,10)
> s3_rng = c(25,35)
> s4_rng = c(70,80.3)
> s5_rng = c(90,95)
> 
> sNames <- grep("s[0-9]+_rng", ls(), value = TRUE)
> queue <- rbind(c(x_rng[1], 1), c(x_rng[2], 1))
> for (i in sNames){
>   queue <- rbind(queue
>  , c(get(i)[1], 1)  # enter queue
>  , c(get(i)[2], -1)  # exit queue
>  )
> }
> queue <- queue[order(queue[, 1]), ]  # sort
> queue <- cbind(queue, cumsum(queue[, 2]))  # of people in the queue
> for (i in which(queue[, 3] == 1)){
>   cat("start:", queue[i, 1L], '  end:', queue[i + 1L, 1L], "\n")
> }
> 
> Regards,
> 
> ben
> On Sat, May 12, 2012 at 12:50 PM, jim holtman  wrote:
> 
> > Here is an example of how you might do it.  It uses a technique of
> > counting how many items are in a queue based on their arrival times;
> > it can be used to also find areas of overlap.
> >
> > Note that it would be best to use a list for the 's' end points
> >
> > 
> > > # note the next statement removes names of the format 's[0-9]+_rng'
> > > # it would be best to create a list with the 's' endpoints, but this is
> > > # what the OP specified
> > >
> > > rm(list = grep('s[0-9]+_rng', ls(), value = TRUE))  # Danger Will Robinson!!
> > >
> > > # ex 1
> > > x_rng = c(-100,100)
> > >
> > > s1_rng = c(-25.5,30)
> > > s2_rng = c(0.77,10)
> > > s3_rng = c(25,35)
> > > s4_rng = c(70,80.3)
> > > s5_rng = c(90,95)
> > >
> > > # ex 2
> > > # x_rng = c(-50.5,100)
> > >
> > > # s1_rng = c(-75.3,30)
> > >
> > > # ex 3
> > > # x_rng = c(-75.3,30)
> > >
> > > # s1_rng = c(-50.5,100)
> > >
> > > # ex 4
> > > # x_rng = c(-100,100)
> > >
> > > # s1_rng = c(-105,105)
> > >
> > > # find all the names -- USE A LIST NEXT TIME
> > > sNames <- grep("s[0-9]+_rng", ls(), value = TRUE)
> > >
> > > # initial matrix with the 'x' endpoints
> > > queue <- rbind(c(x_rng[1], 1), c(x_rng[2], 1))
> > >
> > > # add the 's' end points to the list
> > > # this will be used to determine how many things are in a queue
> > > # (or areas that overlap)
> > > for (i in sNames){
> > + queue <- rbind(queue
> > + , c(get(i)[1], 1)  # enter queue
> > + , c(get(i)[2], -1)  # exit queue
> > + )
> > + }
> > > queue <- queue[order(queue[, 1]), ]  # sort
> > > queue <- cbind(queue, cumsum(queue[, 2]))  # of people in the queue
> > > print(queue)
> >          [,1] [,2] [,3]
> >  [1,] -100.00    1    1
> >  [2,]  -25.50    1    2
> >  [3,]    0.77    1    3
> >  [4,]   10.00   -1    2
> >  [5,]   25.00    1    3
> >  [6,]   30.00   -1    2
> >  [7,]   35.00   -1    1
> >  [8,]   70.00    1    2
> >  [9,]   80.30   -1    1
> > [10,]   90.00    1    2
> > [11,]   95.00   -1    1
> > [12,]  100.00    1    2
> > >
> > > # print out values where the last column is 1
> > > for (i in which(queue[, 3] == 1)){
> > + cat("start:", queue[i, 1L], '  end:', queue[i + 1L, 1L], "\n")
> > + }
> > start: -100   end: -25.5
> > start: 35   end: 70
> > start: 80.3   end: 90
> > start: 95   end: 100
> > >
> > >
> > =
> >
> > On Sat, May 

Re: [R] as.function parameters

2012-05-14 Thread David Winsemius


On May 14, 2012, at 1:56 PM, jackl wrote:


Hi,

~ well that seems to be a better solution.
Question is how much an environment for each node costs in terms of
storage space..


It seems unlikely that it would expand your space very much. Every function
you create will have an environment.




The example code is hard to present, because it is really based on that
problem. The frame of the problem is, that I have to write a program that gives
each node in a binomial tree a function with individual parameters.. in my
case the current stock price at that node.

I don't want to create these functions manually.. that would be too much of an
overload with a tree even of moderate size. But if I define these functions
via the as.function functionality it gives me the above problem.


You need to demonstrate what "tree" structure you propose to populate
and then access. Since trees are not an R data type, you should show
us _exactly_ how you will create one, presumably using "lists". Once
you have a small example, it is very easy to present by using the
output of the 'dput' function.
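
As an illustration of what such a small example might look like, here is a sketch under stated assumptions: the up/down factors, the payoff function and the names make_payoff and tree are all hypothetical, not taken from the poster's code. Each node is a list element holding its price and a closure created by a function factory, so every node's function carries its own stock price in its environment.

```r
# Function factory: each call returns a closure whose environment holds
# that node's own stock price S.
make_payoff <- function(S) {
  force(S)                       # evaluate S now, not lazily later
  function(K) max(S - K, 0)      # hypothetical payoff; K is a strike price
}

# Recombining binomial tree stored as a list of levels; level i + 1 has
# i + 1 nodes. S0, u and d are illustrative values.
S0 <- 100; u <- 1.1; d <- 1 / u; n_steps <- 2
tree <- lapply(0:n_steps, function(i) {
  prices <- S0 * u^(i:0) * d^(0:i)
  lapply(prices, function(S) list(price = S, f = make_payoff(S)))
})

tree[[3]][[1]]$f(100)            # payoff at the top node of the last level

# A compact, postable description of the prices in the tree
dput(lapply(tree, function(level) sapply(level, `[[`, "price")))
```

Because the functions are ordinary closures, none of them has to be written by hand or assembled with as.function.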




--
View this message in context: 
http://r.789695.n4.nabble.com/as-function-parameters-tp4620390p4629979.html
Sent from the R help mailing list archive at Nabble.com.


You should also realize that you are probably limiting your audience  
severely by not posting context. Most people do not read Rhelp on  
Nabble.


--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] range segment exclusion using range endpoints

2012-05-14 Thread Ben quant
Turns out this solution doesn't work if the s range is outside the range of
the x range. I didn't include that in my examples, but it is something I
have to deal with quite often.

For example s1_rng below causes an issue:

x_rng = c(-100,100)
s1_rng = c(-250.5,30)
s2_rng = c(0.77,10)
s3_rng = c(25,35)
s4_rng = c(70,80.3)
s5_rng = c(90,95)

sNames <- grep("s[0-9]+_rng", ls(), value = TRUE)
queue <- rbind(c(x_rng[1], 1), c(x_rng[2], 1))
for (i in sNames){
  queue <- rbind(queue
 , c(get(i)[1], 1)  # enter queue
 , c(get(i)[2], -1)  # exit queue
 )
}
queue <- queue[order(queue[, 1]), ]  # sort
queue <- cbind(queue, cumsum(queue[, 2]))  # of people in the queue
for (i in which(queue[, 3] == 1)){
  cat("start:", queue[i, 1L], '  end:', queue[i + 1L, 1L], "\n")
}

Regards,

ben
On Sat, May 12, 2012 at 12:50 PM, jim holtman  wrote:

> Here is an example of how you might do it.  It uses a technique of
> counting how many items are in a queue based on their arrival times;
> it can be used to also find areas of overlap.
>
> Note that it would be best to use a list for the 's' end points
>
> 
> > # note the next statement removes names of the format 's[0-9]+_rng'
> > # it would be best to create a list with the 's' endpoints, but this is
> > # what the OP specified
> >
> > rm(list = grep('s[0-9]+_rng', ls(), value = TRUE))  # Danger Will Robinson!!
> >
> > # ex 1
> > x_rng = c(-100,100)
> >
> > s1_rng = c(-25.5,30)
> > s2_rng = c(0.77,10)
> > s3_rng = c(25,35)
> > s4_rng = c(70,80.3)
> > s5_rng = c(90,95)
> >
> > # ex 2
> > # x_rng = c(-50.5,100)
> >
> > # s1_rng = c(-75.3,30)
> >
> > # ex 3
> > # x_rng = c(-75.3,30)
> >
> > # s1_rng = c(-50.5,100)
> >
> > # ex 4
> > # x_rng = c(-100,100)
> >
> > # s1_rng = c(-105,105)
> >
> > # find all the names -- USE A LIST NEXT TIME
> > sNames <- grep("s[0-9]+_rng", ls(), value = TRUE)
> >
> > # initial matrix with the 'x' endpoints
> > queue <- rbind(c(x_rng[1], 1), c(x_rng[2], 1))
> >
> > # add the 's' end points to the list
> > # this will be used to determine how many things are in a queue
> > # (or areas that overlap)
> > for (i in sNames){
> + queue <- rbind(queue
> + , c(get(i)[1], 1)  # enter queue
> + , c(get(i)[2], -1)  # exit queue
> + )
> + }
> > queue <- queue[order(queue[, 1]), ]  # sort
> > queue <- cbind(queue, cumsum(queue[, 2]))  # of people in the queue
> > print(queue)
>          [,1] [,2] [,3]
>  [1,] -100.00    1    1
>  [2,]  -25.50    1    2
>  [3,]    0.77    1    3
>  [4,]   10.00   -1    2
>  [5,]   25.00    1    3
>  [6,]   30.00   -1    2
>  [7,]   35.00   -1    1
>  [8,]   70.00    1    2
>  [9,]   80.30   -1    1
> [10,]   90.00    1    2
> [11,]   95.00   -1    1
> [12,]  100.00    1    2
> >
> > # print out values where the last column is 1
> > for (i in which(queue[, 3] == 1)){
> + cat("start:", queue[i, 1L], '  end:', queue[i + 1L, 1L], "\n")
> + }
> start: -100   end: -25.5
> start: 35   end: 70
> start: 80.3   end: 90
> start: 95   end: 100
> >
> >
> =
>
> On Sat, May 12, 2012 at 1:54 PM, Ben quant  wrote:
> > Hello,
> >
> > I'm posting this again (with some small edits). I didn't get any replies
> > last time...hoping for some this time. :)
> >
> > Currently I'm only coming up with brute force solutions to this issue
> > (loops). I'm wondering if anyone has a better way to do this. Thank you
> > for your help in advance!
> >
> > The problem: I have endpoints of one x range (x_rng) and an unknown
> > number of s ranges (s[#]_rng) also defined by the range endpoints. I'd
> > like to remove the parts of the x range that overlap with the s ranges.
> > The examples below demonstrate what I mean.
> >
> > What is the best way to do this?
> >
> > Ex 1.
> > For:
> > x_rng = c(-100,100)
> >
> > s1_rng = c(-25.5,30)
> > s2_rng = c(0.77,10)
> > s3_rng = c(25,35)
> > s4_rng = c(70,80.3)
> > s5_rng = c(90,95)
> >
> > I would get:
> > -100,-25.5
> > 35,70
> > 80.3,90
> > 95,100
> >
> > Ex 2.
> > For:
> > x_rng = c(-50.5,100)
> >
> > s1_rng = c(-75.3,30)
> >
> > I would get:
> > 30,100
> >
> > Ex 3.
> > For:
> > x_rng = c(-75.3,30)
> >
> > s1_rng = c(-50.5,100)
> >
> > I would get:
> > -75.3,-50.5
> >
> > Ex 4.
> > For:
> > x_rng = c(-100,100)
> >
> > s1_rng = c(-105,105)
> >
> > I would get something like:
> > NA,NA
> > or...
> > NA
> >
> > Ex 5.
> > For:
> > x_rng = c(-100,100)
> >
> > s1_rng = c(-100,100)
> >
> > I would get something like:
> > -100,-100
> > 100,100
> > or just...
> > -100
> >  100
> >
> > PS - You may have noticed that in all of the examples I am including the
> > s range endpoints in the desired results, which I can deal with later in
> > my program so it's not a problem...  I think leaving in the s range
> > endpoints simplifies the problem.
> >
> > Thanks!
> > Ben
> >
> >[[alternative HTML version deleted]]
> >
> > _

Re: [R] as.function parameters

2012-05-14 Thread jackl
Hi,

~ well, that seems to be a better solution.
The question is how much an environment for each node costs in terms of
storage space.

The example code is hard to present, because it is really based on that
problem.
The frame of the problem is that I have to write a program that gives
each node in a binomial tree a function with individual parameters, in my
case the current stock price at that node.

I don't want to create these functions manually; that would be too much
overhead with a tree of even moderate size. But if I define these functions
via the as.function functionality, it gives me the above problem.

Hope that clarifies the problem a bit

thanks again

--
View this message in context: 
http://r.789695.n4.nabble.com/as-function-parameters-tp4620390p4629979.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] rgl: cylinder3d() with elliptical cross-section

2012-05-14 Thread Duncan Murdoch

On 10/03/2012 9:44 AM, Duncan Murdoch wrote:

My first reply to this went privately, by accident.  I've done a little
editing to it, but mainly this is for the archives.

On 12-03-09 2:36 PM, Michael Friendly wrote:
>  For a paper dealing with generalized ellipsoids, I want to illustrate in
>  3D an ellipsoid that is unbounded
>  in one dimension, having the shape of an infinite cylinder along, say,
>  z, but whose cross-section in (x,y)
>  is an ellipse, say, given by the 2x2 matrix cov(x,y).
>
>  I've looked at rgl:::cylinder3d, but don't see any way to make it
>  accomplish this.  Does anyone have
>  any ideas?

rgl has no way to display curved surfaces that are unbounded.  (It has
lines and planes that adapt to the viewport.)  So you would need to make
a finite cylinder, and it will be up to you to choose how big to make it.

The cylinder3d() function can do that, but it's not very good at
cylinders that are straight. (This is a little embarrassing...) It sets
up a local coordinate system based on the curvature, but if there are no
curves, it fails, and you have to supply your own coordinates.


This got forgotten for a while, but is now fixed on R-forge.  If you get 
rev 881 or newer of rgl, it can draw straight cylinders.


Duncan Murdoch



So here's how I would do what you want:

center<- cbind(0, 0, 1:10)  # cylinder centered on points (0, 0, z)
e2<- cbind(1, 0, rep(0, 10)) # define the normal vectors
cyl<- cylinder3d(center, e2=e2)

# Now you have an octagonal cylinder.  Use the sides arg to cylinder3d
# if it doesn't end up smooth enough, but in most cases I've seen, 8
# is sufficient.

# Define a transformation to the x and y coordinates to give the
# elliptical shape; use it as the
# top left 2x2 matrix of a 3x3 matrix
xfrm<- matrix( c(2, 1, 0,
 1, 3, 0,
 0, 0, 1), 3,3, byrow=TRUE)
cyl<- transform3d(cyl, xfrm)
cyl<- addNormals(cyl)  # this makes it shade smoothly
shade3d(cyl, col="green")
decorate3d()  # show some axes for scale
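
If the cross-section should follow a given covariance matrix, as in the original question, one possible way to obtain the 2x2 block is a Cholesky factor. The sketch below uses base R only and a made-up matrix S; it is an illustration, not part of the rgl code above. With points stored as rows, multiplying the unit circle by R = chol(S) gives points satisfying x' S^{-1} x = 1. Whether R or t(R) belongs in the top-left block of xfrm depends on the multiplication convention of the transform being used, so that part should be checked against the rgl documentation.

```r
S <- matrix(c(4, 1,
              1, 2), 2, 2, byrow = TRUE)   # hypothetical cov(x, y)
R <- chol(S)                               # upper triangular, S = t(R) %*% R

theta <- seq(0, 2 * pi, length.out = 9)[-9]
circle <- cbind(cos(theta), sin(theta))    # eight points on the unit circle
ellipse <- circle %*% R                    # rows now lie on the ellipse

# Check: every transformed point has squared Mahalanobis distance 1
d2 <- mahalanobis(ellipse, center = c(0, 0), cov = S)
d2
```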

Duncan Murdoch




Re: [R] New to R

2012-05-14 Thread Sarah Goslee
Hi,

On Mon, May 14, 2012 at 12:44 PM, Bert Gunter  wrote:
> I usually try google searches first. While not always successful, I am
> frequently surprised by how well it does.

rseek.org is simply a custom Google search for R-related things. It
does an even better job pulling out only R-related material, and
organizing it in a useful format.


> I strongly second the use of CRAN task views. IMHO, some fine folks
> have volunteered their time and efforts to produce this very well
> written series of guides to what's in R. It and they deserve greater
> recognition.

Absolutely.

Sarah
-- 
Sarah Goslee
http://www.sarahgoslee.com



Re: [R] Discrete choice model maximum likelihood estimation

2012-05-14 Thread Berend Hasselman

See below.
On 14-05-2012, at 13:21, infinitehorizon wrote:

> Hello again,
> 
> I changed the name to tt. 
> and for a and tt actually I was getting them from data, I didn't put them
> here in the question. Now I restructured my code and below I copy the full
> code, I tried many things but still getting the same error, I don't
> understand where is the mistake.
> 
> I also defined one more variable to increase comprehension of the problem.
> Instead of data, I define three representative vectors in the beginning. If
> you run this code, you will see the error message.
> 
> # Variables, a: discrete choice variable-dependent, x and tt are
> # independent variables, tt is binary
> a  <- c(1,1,2,3,2,3,1,2,2,3,1,1)
> x  <- c(23,26,12,27,10,30,13,20,23,44,17,15)
> tt <- c(1,0,0,1,1,1,0,0,1,0,1,1)
> 
> # First, just to see, the linear model
> 
> LM<-lm(a ~ x+tt)
> coefLM<- coefficients(LM)
> summary(LM)
> 
> # Probabilities for discrete choices for a=3, a=2 and a=1 respectively 
> P3 <- function(bx,b3,b,tt) { 
> P <- exp(bx*x+b3+b*(tt==1))/(1-exp(bx*x+b3+b*(tt==1))) 
> return(P) 
> } 
> P2 <- function(bx,b2,b,tt) { 
> P <- exp(bx*x+b2+b*(tt==1))/(1-exp(bx*x+b2+b*(tt==1))) 
> return(P) 
> } 
> P1 <- function(bx,b1,b,tt) { 
> P <- exp(bx*x+b1+b*(tt==1))/(1-exp(bx*x+b1+b*(tt==1))) 
> return(P) 
> }
> 
> # Likelihood functions for discrete choices for a=3, a=2 and a=1
> # respectively
> 
> L3 <- function(bx,b1,b2,b3,b,tt) { 
> P11 <- P1(bx,b1,b,tt) 
> P22 <- P2(bx,b2,b,tt) 
> P33 <- P3(bx,b3,b,tt) 
> 
> L3l <- P11*P22*P33 
> return(L3l) 
> } 
> 
> L2 <- function(bx,b1,b2,b,tt) { 
> P11 <- P1(bx,b1,b,tt) 
> P22 <- P2(bx,b2,b,tt) 
> 
> L2l <- P11*P22 
> return(L2l) 
> } 
> 
> L1 <- function(bx,b1,b,tt) { 
> P11 <- P1(bx,b1,b,tt) 
> 
> L1l <- P11 
> return(L1l) 
> }
> 
> # Log-likelihood function 
> 
> llfn <- function(param) { 
> 
> bx <- param[1] 
> b1 <- param[2] 
> b2 <- param[3] 
> b3 <- param[4] 
> b <- param[5] 
> 
> lL1 <- log(L1(bx,b1,b2,b,tt)) 
> lL2 <- log(L2(bx,b1,b2,b3,b,tt)) 
> lL3 <- log(L3(bx,b1,b2,b3,b,tt)) 
> 
> llfn <- (a==1)*lL1+(a==2)*lL2+(a==3)*lL3 
> } 
> start.par <- c(0,0,0,0,0) 
> est <- optim(param=start.par,llfn,x=x, a=a, tt=tt, method =
> c("CG"),control=list(trace=2,maxit=2000), hessian=TRUE)
> 

Due to urgent matters, I can only answer briefly.

optim doesn't have an argument param. It does have an argument par so you 
should have written par=start.par.

If you do that you will get other error messages. You are calling L1 and L2
with more arguments than they are defined to accept.

And then you will find that the return value of llfn is a vector and not a 
scalar.

Finally, why are you passing x=x in the optim call? llfn does not accept an argument x.

You need to rethink your approach. And certainly read the help for optim.
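
To make the calling convention concrete, here is a minimal sketch. It is not the poster's model: it fits a plain binary logit, using tt from the thread as a stand-in response purely for illustration, and the names nll, xc and z are hypothetical. It shows the three points above: the starting values go in par, extra data are passed through named arguments that the objective function actually declares, and the objective returns one number (the negative log-likelihood summed over observations).

```r
# Illustrative data only: tt from the thread is used as a binary response.
x  <- c(23, 26, 12, 27, 10, 30, 13, 20, 23, 44, 17, 15)
tt <- c(1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1)
xc <- (x - mean(x)) / 10      # centred and rescaled to help the optimiser

# Negative log-likelihood of a binary logit; returns a single number.
nll <- function(par, y, z) {
  eta <- par[1] + par[2] * z
  sum(log1p(exp(eta)) - y * eta)
}

est <- optim(par = c(0, 0), fn = nll, y = tt, z = xc,
             method = "BFGS", hessian = TRUE)
est$par           # estimates
est$convergence   # 0 means optim reports successful convergence
```

The estimates can be compared with glm(tt ~ xc, family = binomial) as a sanity check.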

Berend



Re: [R] How to apply a function to a multidimensional array based on its indices

2012-05-14 Thread David Winsemius


On May 14, 2012, at 10:09 AM, math_daddy wrote:

Hello. I have a 4 dimensional array and I want to fill in the slots with
values which are a function of the inputs. Through searching the forums here
I found that the function "outer" is helpful for 2x2 matrices but cannot be
applied to general multidimensional arrays. Is there anything which can
achieve, more efficiently than the following code, the job I want?

K <- array(0,dim=c(2,2,2,2)) #dimensions will be much larger
for(x1 in 1:2)
{
  for(y1 in 1:2)
  {
    for(x2 in 1:2)
    {
      for(y2 in 1:2)
      {
        K[x1,y1,x2,y2] <- x1*y2 - sin(x2*y1) #this is just a dummy function.
      }
    }
  }
}



If you can create a data.frame or matrix that has the indices  
x1,x2,y1,y2 and the values you can use the:  K[cbind(index-vectors)]  
<- values construction:


mtx<- data.matrix( expand.grid(x1=1:2,x2=1:2,y1=1:2,y2=1:2) )
K[mtx] <- apply(mtx, 1, function(x) x["x1"]*x["y2"] -  
sin(x['x2']*x['y1']) )

#
> K
, , 1, 1

 [,1]   [,2]
[1,] 0.158529 0.09070257
[2,] 1.158529 1.09070257

, , 2, 1

   [,1] [,2]
[1,] 0.09070257 1.756802
[2,] 1.09070257 2.756802

, , 1, 2

 [,1] [,2]
[1,] 1.158529 1.090703
[2,] 3.158529 3.090703

, , 2, 2

 [,1] [,2]
[1,] 1.090703 2.756802
[2,] 3.090703 4.756802
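
When the function is built from vectorized operations, as the dummy one here is, the apply() call can be dropped as well. The following is a sketch, assuming the array is indexed as K[x1, y1, x2, y2]; the names dims, idx and K2 are made up for the illustration.

```r
dims <- c(2, 2, 2, 2)

# Columns are named in the array's dimension order: K[x1, y1, x2, y2]
idx <- expand.grid(x1 = seq_len(dims[1]), y1 = seq_len(dims[2]),
                   x2 = seq_len(dims[3]), y2 = seq_len(dims[4]))

# expand.grid varies its first column fastest, which matches R's
# column-major array storage, so the values can be reshaped directly.
K2 <- array(with(idx, x1 * y2 - sin(x2 * y1)), dim = dims)
K2[2, 1, 1, 2]   # x1 = 2, y1 = 1, x2 = 1, y2 = 2
```

The whole computation is then a single vectorized expression over all index combinations.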



Thank you in advance for any help.


--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-apply-a-function-to-a-multidimensional-array-based-on-its-indices-tp4629940.html
Sent from the R help mailing list archive at Nabble.com.



David Winsemius, MD
West Hartford, CT



Re: [R] select data

2012-05-14 Thread David L Carlson
This overwrites the data so you might want to create a copy first.

example <- data.frame(V1=c(3, -1), V2=c(-2, 4), V3=c(4, 1))
tf <- ifelse(example<0, TRUE, FALSE)
example[tf] <- NA
apply(example, 1, mean, na.rm=TRUE)
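
If the original data should stay untouched, a variant along these lines may be convenient. It is a sketch using base R's replace() and rowMeans(); the name masked is made up.

```r
example <- data.frame(V1 = c(3, -1), V2 = c(-2, 4), V3 = c(4, 1))

# replace() returns a modified copy, so 'example' itself is not overwritten
masked <- replace(example, example < 0, NA)
row_means <- rowMeans(masked, na.rm = TRUE)
row_means
```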

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of Andrea Sica
> Sent: Monday, May 14, 2012 11:32 AM
> To: r-help@r-project.org
> Subject: [R] select data
> 
> Dear all,
> 
> I am sure it won't be difficult for you!!
> I need to calculate the average among variables for the single units of
> my
> dataset.
> But, while doing it, I need to do not consider some values.
> To better explain, think like there are two units and three variables:
> 
>       V1  V2  V3
> [1]    3  -2   4
> [2]   -1   4   1
> 
> and you want to calculate the average by row, without considering those
> negative values:
> 
> => mean(1row) = (3+4)/2
> => mean(2row) = (4+1)/2
> 
> Could anyone please give me the commands to do that in R?
> 
> Thank you so much
> 
>   [[alternative HTML version deleted]]
> 



Re: [R] Help with V function in igraph

2012-05-14 Thread Gábor Csárdi
Something weird must be going on in your s641_social object. Can you
just simply check that the vertex names look OK with
'V(s641_social)$name'?

If they look good, then can you send me the s641_social object in
private? (Or part of it, assuming a part is enough to reproduce the
problem.)

Best,
Gabor

On Sat, May 12, 2012 at 8:07 PM, bmccowan  wrote:
> I am using the code below to output some network measures:
>
> central_social <- data.frame(V(s641_social)$name, indegree_social,
> outdegree_social, incloseness_social, outcloseness_social,
> betweenness_social, eigen_social)
>
> and I get the following error:
>
>
> Error in Re(z) : non-numeric argument to function
>
> I know this has to do with V but I cannot figure out what is wrong.
> s641-social is a graph object and the vertices do have a name attribute.
>
> What can I do to fix?
>
> Thanks
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Help-with-V-function-in-igraph-tp4629767.html
> Sent from the R help mailing list archive at Nabble.com.
>



-- 
Gabor Csardi      MTA KFKI RMKI



Re: [R] New to R

2012-05-14 Thread Bert Gunter
I usually try google searches first. While not always successful, I am
frequently surprised by how well it does.

I strongly second the use of CRAN task views. IMHO, some fine folks
have volunteered their time and efforts to produce this very well
written series of guides to what's in R. It and they deserve greater
recognition.

-- Bert

On Mon, May 14, 2012 at 9:32 AM, David L Carlson  wrote:
> You will find functions such as these in the thousands of packages that are
> available once you have installed R. You can use rseek.org to search for
> specific topics. Good overviews are found in the CRAN Task Views (from the
> main R webpage, click on CRAN, select a mirror host, and then select Task
> Views from the list on the left).
>
> --
> David L Carlson
> Associate Professor of Anthropology
> Texas A&M University
> College Station, TX 77843-4352
>
>
>> -Original Message-
>> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
>> project.org] On Behalf Of Ronald McDowell
>> Sent: Monday, May 14, 2012 6:51 AM
>> To: r-help@r-project.org
>> Subject: [R] New to R
>>
>> I am new to R and starting to explore its functionality. I wondered if
>> anyone could advise whether R supports non-linear canonical correlation
>> and/or the specification of models using alternating least squares?
>>
>> Thanks
>> Ron
>>
>>
>>       [[alternative HTML version deleted]]
>>
>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



[R] using rcom to control Power Point - problem to set a property

2012-05-14 Thread Mark Heckmann
Hi all,

I am trying to convert the VBA code below to run it from R using rcom and
PowerPoint. The VBA code creates a shape and moves the handle of the shape to
another position. This fails using rcom and I do not understand what I am
doing wrong...

### VBA ###

ActiveWindow.Selection.SlideRange.Shapes.AddShape(msoShapeRectangularCallout, 144, 144, 200, 200).Select
With ActiveWindow.Selection.ShapeRange
.Adjustments.Item(1) = 0.0942
.Adjustments.Item(2) = 1.7395
End With

### R ###

library(rcom)

## initializing power point program and slide
ppt <- comCreateObject("PowerPoint.Application")
ppt[["visible"]] <- TRUE
pres <- ppt[["Presentations"]]$add()
slide <- pres[["Slides"]]$add(1, 12)
slide$Select()

## adding the ppt autoform (corresponds to the above VBA code).
## The last lines fail for some reason.
doc <- ppt[["ActivePresentation"]][["Slides"]]$Item(1)
rect <- doc[["Shapes"]]$AddShape(Type=105, Top=144, Left=144, Width=200, 
Height=200)
rect$Select()
shp <- ppt[["ActiveWindow"]][["Selection"]][["ShapeRange"]]

shp[["Adjustments"]][["Count"]]                 # there are adjustment items!!
names(comGetObjectInfo(shp[["Adjustments"]]))   # there is a function called Item

# now the code below does not work
shp[["Adjustments"]][["Item"]]   # does not work
shp[["Adjustments"]]$Item(1) # does not work
shp[["Adjustments"]]$Item(1) <- .9   # does not work -> error

Does someone know why this is the case or what I am doing wrong?
Thanks !
Mark


Mark Heckmann
Blog: www.markheckmann.de
R-Blog: http://ryouready.wordpress.com



Re: [R] Discrete choice model maximum likelihood estimation

2012-05-14 Thread infinitehorizon
Of course, that was the trick! It works now.  Thank you very much Rui, I am
very grateful.
I hope this thread will help others as well.

Best,



Rui Barradas wrote
> 
> Once again, sorry.
> I had a different llfn in my R session and it messed with yours.
> 
> llfn <- function(param, a, tt) {
> 
> 
> llfn <- sum((a==1)*lL1+(a==2)*lL2+(a==3)*lL3)  # sum of logs, it's a
> # log-likelihood.
> return(-llfn)
> }
> 
> Rui Barradas
> 
> infinitehorizon wrote
>> 
>> Hello again,
>> 
>> You are absolutely right about probabilities.. Thanks for reminding me
>> about that.
>> 
>> I did exactly how you said but in the end I receive the error :
>> "objective function in optim evaluates to length 12 not 1".
>> I checked how llfn give a vector instead of scalar,  but couldn't figure
>> it out.
>> 
>> Can you please tell me how did you obtain those estimates?
>> Thanks again,
>> 
>> Best,
>> 
>> Marc
>> 
>> 
>> Rui Barradas wrote
>>> 
>>> Hello, again.
>>> 
>>> Bug report:
>>> 1. Your densities can return negative values, 1 - exp(...) < 0.
>>> Shouldn't those be 1 PLUS exp()?
>>> 
>>> P3 <- function(bx,b3,b,tt) {
>>> P <- exp(bx*x+b3+b*(tt == 1))/(1+exp(bx*x+b3+b*(tt == 1)))
>>> return(P)
>>> }
>>> 
>>> And the same for P2 and P1?
>>> 
>>> 2. Include 'a' and 'tt' as llfn parameters and call like the following.
>>> 
>>> llfn <- function(param, a, tt) {
>>> 
>>>[... etc ...]
>>>return(-llfn)
>>> }
>>> 
>>> start.par <- rep(0, 5)
>>> est <- optim(start.par, llfn, gr=NULL, a=a, tt=tt)
>>> est
>>> $par
>>> [1]  4.1776294 -0.9952026 -0.7667640 -0.1933693  0.7325221
>>> 
>>> $value
>>> [1] 0
>>> 
>>> $counts
>>> function gradient 
>>>   44   NA 
>>> 
>>> $convergence
>>> [1] 0
>>> 
>>> $message
>>> NULL
>>> 
>>> 
>>> Note the optimum value of zero, est$value == 0
>>> 
>>> Rui Barradas
>>> 
>>> infinitehorizon wrote
 
>>>> By the way, in my last post I forgot to return negative of llfn, hence
>>>> the llfn will be as follows:
>>>> 
>>>> llfn <- function(param) { 
>>>> 
>>>> bx <- param[1] 
>>>> b1 <- param[2] 
>>>> b2 <- param[3] 
>>>> b3 <- param[4] 
>>>> b <- param[5] 
>>>> 
>>>> lL1 <- log(L1(bx,b1,b2,b,tt)) 
>>>> lL2 <- log(L2(bx,b1,b2,b3,b,tt)) 
>>>> lL3 <- log(L3(bx,b1,b2,b3,b,tt)) 
>>>> 
>>>> llfn <- (a==1)*lL1+(a==2)*lL2+(a==3)*lL3 
>>>> return(-llfn)
>>>> } 
>>>> 
>>>> However, it does not fix the problem, I still receive the same error..
 
>>> 
>> 
> 


--
View this message in context: 
http://r.789695.n4.nabble.com/Discrete-choice-model-maximum-likelihood-estimation-tp4629877p4629962.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Data read as labels

2012-05-14 Thread David Winsemius


On May 14, 2012, at 11:23 AM, barb wrote:


Hey David,

thanks for your fast reply, I really appreciate that you answer so many
posts.

Unfortunately it's not that easy. Try to operate with the output:

e.g
file<-read.csv2(tmp,sep=";",skip="5")
a<-(relevant<-file[,2][1])
a*5
# or
as.numeric(relevant<-file[,2][1])

a is saved in the workspace as a factor and the values I actually need are
saved as the labels.
(therefore my subject)


Your subject line asked for "labels". That is not a word that
represents anything specific in R parlance except perhaps plotting
function arguments. If you want to prevent the conversion of
"character" values to factors then you should be using
stringsAsFactors=FALSE in the read functions.


If you want to convert from factor to character correctly, you could  
also refer to the FAQ. On my machine the section "7.10 How do I  
convert factors to numeric?" is located at:


http://127.0.0.1:13702/doc/manual/R-FAQ.html#How-do-I-convert-factors-to-numeric_003f

You should have a similar copy of the FAQ someplace on your machine.  
It's good to review the "miscellaneous" section a couple of times.
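
For reference, the conversions discussed in that FAQ entry look like this on a small made-up factor (the vector f is an illustration, not the poster's data):

```r
f <- factor(c("10.5", "3", "10.5", "7"))

codes <- as.numeric(f)                  # internal integer codes, not the values
vals1 <- as.numeric(as.character(f))    # the values themselves
vals2 <- as.numeric(levels(f))[f]       # same result, converts each level once

codes
vals1
```

Calling as.numeric() directly on the factor returns the position of each level rather than the number printed, which is usually the source of the surprise.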




Thank You!



David Winsemius, MD
West Hartford, CT



Re: [R] How to apply a function to a multidimensional array based on its indices

2012-05-14 Thread Rui Barradas
Hello,

Try

K <- array(0,dim=c(2,2,2,2)) #dimensions will be much larger
for(x1 in 1:2){
  for(y1 in 1:2){
    for(x2 in 1:2){
      for(y2 in 1:2){
        K[x1,y1,x2,y2] <- x1*y2 - sin(x2*y1) #this is just a dummy function.
      }
    }
  }
}


fun <- function(x){
    # x holds one row of expand.grid(), in the index order K[x1,y1,x2,y2]
    x1 <- x[1]  # these are
    y1 <- x[2]  # not
    x2 <- x[3]  # really
    y2 <- x[4]  # needed
    x1*y2 - sin(x2*y1) # could have used x[1], etc
}


res <- apply(expand.grid(1:2, 1:2, 1:2, 1:2), 1, fun)
dim(res) <- c(2, 2, 2, 2)
res

all.equal(K, res)
[1] TRUE

See the help pages
?expand.grid
?apply
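
For larger dimensions, a vectorized variant of the same idea may be worth trying. This is only a sketch: the sizes in dims are made up, and it assumes the function can be written with vectorized arithmetic, as the dummy one can.

```r
dims <- c(2, 3, 4, 5)   # sizes of the x1, y1, x2, y2 dimensions (made up)

# expand.grid() varies its first argument fastest, which is the same order
# in which array() fills its cells, so the columns line up with the indices
idx <- expand.grid(x1 = seq_len(dims[1]), y1 = seq_len(dims[2]),
                   x2 = seq_len(dims[3]), y2 = seq_len(dims[4]))

# Evaluate the dummy function on whole columns at once, then reshape
K2 <- array(with(idx, x1 * y2 - sin(x2 * y1)), dim = dims)

K2[2, 3, 1, 4]   # same cell as 2*4 - sin(1*3)
```

This avoids calling the function once per cell, which is where apply() spends most of its time.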


Hope this helps,

Rui Barradas

math_daddy wrote
> 
> Hello. I have a 4 dimensional array and I want to fill in the slots with
> values which are a function of the inputs. Through searching the forums
> here I found that the function "outer" is helpful for 2x2 matrices but
> cannot be applied to general multidimensional arrays. Is there anything
> which can achieve, more efficiently than the following code, the job I
> want?
> 
> K <- array(0,dim=c(2,2,2,2)) #dimensions will be much larger
> for(x1 in 1:2)
> {
>   for(y1 in 1:2)
>   {
> for(x2 in 1:2)
> {
>   for(y2 in 1:2)
>   {
> K[x1,y1,x2,y2] <- x1*y2 - sin(x2*y1) #this is just a dummy function.
>   }   
> }
>   }
> }
> 
> Thank you in advance for any help.
> 


--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-apply-a-function-to-a-multidimensional-array-based-on-its-indices-tp4629940p4629955.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] select data

2012-05-14 Thread Andrea Sica
Thank you all. Really!
I have used the following function:

apply(dfrm, 1, function(x) mean(x[x>=0]) )

Some of you even gave me a few interesting explanations
about why to use it.

Still thank you all.

Andrea

On Mon, May 14, 2012 at 6:52 PM, David Winsemius wrote:

>
> On May 14, 2012, at 12:32 PM, Andrea Sica wrote:
>
>  Dear all,
>>
>> I am sure it won't be difficult for you!!
>> I need to calculate the average among variables for the single units of my
>> dataset.
>> But, while doing it, I need to do not consider some values.
>> To better explain, think like there are two units and three variables:
>>
>>      V1  V2  V3
>> [1]   3  -2   4
>> [2]  -1   4   1
>>
>> and you want to calculate the average by row, without considering those
>> negative values:
>>
>> => mean(1row) = (3+4)/2
>> => mean(2row) = (4+1)/2
>>
>
> perhaps (untested in absence of reproducible example):
>
> apply(dfrm, 1, function(x) mean(x[x>=0]) ) # would also work for a matrix object.
>
>
>> Could anyone please give me the commands to do that in R?
>>
>> Thank you so much
>>
>>[[alternative HTML version deleted]]
>>
>>
>
> David Winsemius, MD
> West Hartford, CT
>
>

[[alternative HTML version deleted]]



Re: [R] Discrete choice model maximum likelihood estimation

2012-05-14 Thread Rui Barradas
Once again, sorry.
I had a different llfn in my R session and it messed with yours.

llfn <- function(param, a, tt) {


llfn <- sum((a==1)*lL1+(a==2)*lL2+(a==3)*lL3)  # sum of logs, it's a log-likelihood.
return(-llfn)
}
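
To make the "evaluates to length 12 not 1" message concrete, here is a toy sketch. It is a plain binary logit on six made-up observations, not the three-alternative model from this thread, and the names negll_vec and negll are only for illustration.

```r
# Made-up data: one regressor and a 0/1 outcome
x <- c(-2, -1, 0, 1, 2, 3)
y <- c( 0,  0, 1, 0, 1, 1)

# One negative log-likelihood contribution per observation (a vector)
negll_vec <- function(beta) {
  p <- plogis(beta[1] + beta[2] * x)
  -(y * log(p) + (1 - y) * log(1 - p))
}

# optim() needs a single number, so add the contributions up
negll <- function(beta) sum(negll_vec(beta))

# Passing the vector version reproduces the kind of error reported above
bad <- tryCatch(optim(c(0, 0), negll_vec),
                error = function(e) conditionMessage(e))

# The summed version runs
est <- optim(c(0, 0), negll)
```

Here bad holds the error text and est$par the two estimates; the only difference between the two objectives is the sum().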

Rui Barradas

infinitehorizon wrote
> 
> Hello again,
> 
> You are absolutely right about probabilities.. Thanks for reminding me
> about that.
> 
> I did exactly how you said but in the end I receive the error : "objective
> function in optim evaluates to length 12 not 1".
> I checked how llfn give a vector instead of scalar,  but couldn't figure
> it out.
> 
> Can you please tell me how did you obtain those estimates?
> Thanks again,
> 
> Best,
> 
> Marc
> 
> 
> Rui Barradas wrote
>> 
>> Hello, again.
>> 
>> Bug report:
>> 1. Your densities can return negative values, 1 - exp(...) < 0.
>> Shouldn't those be 1 PLUS exp()?
>> 
>> P3 <- function(bx,b3,b,tt) {
>>  P <- exp(bx*x+b3+b*(tt == 1))/(1+exp(bx*x+b3+b*(tt == 1)))
>>  return(P)
>> }
>> 
>> And the same for P2 and P1?
>> 
>> 2. Include 'a' and 'tt' as llfn parameters and call like the following.
>> 
>> llfn <- function(param, a, tt) {
>> 
>>[... etc ...]
>>return(-llfn)
>> }
>> 
>> start.par <- rep(0, 5)
>> est <- optim(start.par, llfn, gr=NULL, a=a, tt=tt)
>> est
>> $par
>> [1]  4.1776294 -0.9952026 -0.7667640 -0.1933693  0.7325221
>> 
>> $value
>> [1] 0
>> 
>> $counts
>> function gradient 
>>   44   NA 
>> 
>> $convergence
>> [1] 0
>> 
>> $message
>> NULL
>> 
>> 
>> Note the optimum value of zero, est$value == 0
>> 
>> Rui Barradas
>> 
>> infinitehorizon wrote
>>> 
>>> By the way, in my last post I forgot to return negative of llfn, hence
>>> the llfn will be as follows:
>>> 
>>> llfn <- function(param) { 
>>> 
>>> bx <- param[1] 
>>> b1 <- param[2] 
>>> b2 <- param[3] 
>>> b3 <- param[4] 
>>> b <- param[5] 
>>> 
>>> lL1 <- log(L1(bx,b1,b2,b,tt)) 
>>> lL2 <- log(L2(bx,b1,b2,b3,b,tt)) 
>>> lL3 <- log(L3(bx,b1,b2,b3,b,tt)) 
>>> 
>>> llfn <- (a==1)*lL1+(a==2)*lL2+(a==3)*lL3 
>>> return(-llfn)
>>> } 
>>> 
>>> However, it does not fix the problem, I still receive the same error..
>>> 
>> 
> 


--
View this message in context: 
http://r.789695.n4.nabble.com/Discrete-choice-model-maximum-likelihood-estimation-tp4629877p4629954.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] select data

2012-05-14 Thread David Winsemius


On May 14, 2012, at 12:32 PM, Andrea Sica wrote:


Dear all,

I am sure it won't be difficult for you!!
I need to calculate the average among variables for the single units of my
dataset.
But, while doing it, I need to do not consider some values.
To better explain, think like there are two units and three variables:

     V1  V2  V3
[1]   3  -2   4
[2]  -1   4   1

and you want to calculate the average by row, without considering those
negative values:

=> mean(1row) = (3+4)/2
=> mean(2row) = (4+1)/2


perhaps (untested in absence of reproducible example):

apply(dfrm, 1, function(x) mean(x[x>=0]) ) # would also work for a matrix object.




Could anyone please give me the commands to do that in R?

Thank you so much

[[alternative HTML version deleted]]



David Winsemius, MD
West Hartford, CT



Re: [R] Data read as labels

2012-05-14 Thread Krzysztof Mitko


On Mon, May 14, 2012, at 02:33, barb wrote:
> Hey guys,
> 
> i have a strange problem reading a .csv file. 
> Seems not to be covered by the usual read.csv techniques. 
> 
> The relevant data i want to use, seems to be saved as the label of the
> data
> point. 
> Therefore i can not really use it
> 
> 
> spec<-"EU2001"
> part1<-"http://www.bundesbank.de/statistik/statistik_zeitreihen_download.php?func=directcsv&from=&until=&filename=bbk_";
> part2<-"&csvformat=de&euro=mixed&tr="
> tmp<-tempfile()
> load<-paste(part1,spec,part2,spec,sep="")
> download.file(load,tmp)
> file<-read.csv(tmp,sep=";",dec=",", skip="5")
> (relevant<-file[,2][1])

It seems to me that there is a problem with conversion from data to
known type - the last two lines contain comments instead of data and
first column type is not recognized. You can suppress all conversions,
remove problematic lines and then make conversion manually or import
only relevant lines and specify types. For example:

file<-read.csv(tmp, sep=";",
dec=",",skip=5,header=FALSE,nrows=495,colClasses=c("character","numeric","NULL","NULL"))
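
A small self-contained illustration of the same kind of call. The five lines in raw are invented to mimic a file with header rows, decimal commas and a trailing comment line; they are not the real download, so skip and nrows differ from the values above.

```r
# Invented stand-in for the downloaded file: 2 header lines, 2 data lines,
# and a comment line at the end that should not be read as data
raw <- c("Some title;;",
         "Unit: percent;;",
         "2001-01;1,5;",
         "2001-02;2,25;",
         "General comment at the end;;")

# Skip the header lines, read only the data rows, and fix the column types:
# character for the date, numeric for the value, "NULL" drops the last column
dat <- read.csv(text = raw, sep = ";", dec = ",", skip = 2, header = FALSE,
                nrows = 2, colClasses = c("character", "numeric", "NULL"))

str(dat)
```

With the types given explicitly, the value column should come back as numeric rather than as a factor.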

-- 
Z pozdrowieniami,
Krzysztof Mitko



[R] How to apply a function to a multidimensional array based on its indices

2012-05-14 Thread math_daddy
Hello. I have a 4 dimensional array and I want to fill in the slots with
values which are a function of the inputs. Through searching the forums here
I found that the function "outer" is helpful for 2x2 matrices but cannot be
applied to general multidimensional arrays. Is there anything which can
achieve, more efficiently than the following code, the job I want?

K <- array(0,dim=c(2,2,2,2)) #dimensions will be much larger
for(x1 in 1:2)
{
  for(y1 in 1:2)
  {
for(x2 in 1:2)
{
  for(y2 in 1:2)
  {
K[x1,y1,x2,y2] <- x1*y2 - sin(x2*y1) #this is just a dummy function.
  }   
}
  }
}

Thank you in advance for any help.


--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-apply-a-function-to-a-multidimensional-array-based-on-its-indices-tp4629940.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] select data

2012-05-14 Thread R. Michael Weylandt
This was actually discussed about a week and a half ago with many good
solutions offered, but I think the most idiomatic would be something
like this:

apply(dataset, 1, function(x) mean(x[x>0]))

The reasons I like it:

i) It uses the apply function to do the same operation row-wise
(that's what the "1" does) to all elements of your data set -- since
this is side-effect free (as a good functional language should be) it
makes for easy parallelization if you move to "big data"
ii) It uses an anonymous function (the "function ... " bit) which are
first class objects in R and can be passed as arguments to other
functions (here apply())
iii) It uses logical subscripting to pick out the values greater than
zero -- I think the subscripting behavior is the very best bit of R
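
As a quick check on the two-row example from the original question, here is a sketch. The data frame is rebuilt by hand, and I use x >= 0 so that zeros are kept and only negative values are dropped, which is my reading of the question. The rowMeans() variant is an alternative, not something proposed earlier in the thread.

```r
# The example from the question, typed in by hand
dataset <- data.frame(V1 = c(3, -1), V2 = c(-2, 4), V3 = c(4, 1))

# Row-wise mean of the non-negative entries
res <- apply(dataset, 1, function(x) mean(x[x >= 0]))

# Alternative: blank out the negatives and let rowMeans() skip them
m <- as.matrix(dataset)
m[m < 0] <- NA
res2 <- rowMeans(m, na.rm = TRUE)
```

Both give (3+4)/2 for the first row and (4+1)/2 for the second.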

Best,

Michael

On Mon, May 14, 2012 at 12:32 PM, Andrea Sica  wrote:
> Dear all,
>
> I am sure it won't be difficult for you!!
> I need to calculate the average among variables for the single units of my
> dataset.
> But, while doing it, I need to do not consider some values.
> To better explain, think like there are two units and three variables:
>
>      V1    V2     V3
> [1]   3     -2      4
> [2]  -1      4      1
>
> and you want to calculate the average by row, without considering those
> negative values:
>
> => mean(1row) = (3+4)/2
> => mean(2row) = (4+1)/2
>
> Could anyone please give me the commands to do that in R?
>
> Thank you so much
>
>        [[alternative HTML version deleted]]
>



Re: [R] Discrete choice model maximum likelihood estimation

2012-05-14 Thread Rui Barradas
Hello, again.

Bug report:
1. Your densities can return negative values, 1 - exp(...) < 0.
Shouldn't those be 1 PLUS exp()?

P3 <- function(bx,b3,b,tt) {
P <- exp(bx*x+b3+b*(tt == 1))/(1+exp(bx*x+b3+b*(tt == 1)))
return(P)
}

And the same for P2 and P1?

2. Include 'a' and 'tt' as llfn parameters and call like the following.

llfn <- function(param, a, tt) {

   [... etc ...]
   return(-llfn)
}

start.par <- rep(0, 5)
est <- optim(start.par, llfn, gr=NULL, a=a, tt=tt)
est
$par
[1]  4.1776294 -0.9952026 -0.7667640 -0.1933693  0.7325221

$value
[1] 0

$counts
function gradient 
  44   NA 

$convergence
[1] 0

$message
NULL


Note the optimum value of zero, est$value == 0

Rui Barradas

infinitehorizon wrote
> 
> By the way, in my last post I forgot to return negative of llfn, hence the
> llfn will be as follows:
> 
> llfn <- function(param) { 
> 
> bx <- param[1] 
> b1 <- param[2] 
> b2 <- param[3] 
> b3 <- param[4] 
> b <- param[5] 
> 
> lL1 <- log(L1(bx,b1,b2,b,tt)) 
> lL2 <- log(L2(bx,b1,b2,b3,b,tt)) 
> lL3 <- log(L3(bx,b1,b2,b3,b,tt)) 
> 
> llfn <- (a==1)*lL1+(a==2)*lL2+(a==3)*lL3 
> return(-llfn)
> } 
> 
> However, it does not fix the problem, I still receive the same error..
> 


--
View this message in context: 
http://r.789695.n4.nabble.com/Discrete-choice-model-maximum-likelihood-estimation-tp4629877p4629935.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Discrete choice model maximum likelihood estimation

2012-05-14 Thread infinitehorizon
Hello again,

You are absolutely right about probabilities.. Thanks for reminding me about
that.

I did exactly how you said but in the end I receive the error : "objective
function in optim evaluates to length 12 not 1".
I checked how llfn give a vector instead of scalar,  but couldn't figure it
out.

Can you please tell me how did you obtain those estimates?
Thanks again,

Best,

Marc


Rui Barradas wrote
> 
> Hello, again.
> 
> Bug report:
> 1. Your densities can return negative values, 1 - exp(...) < 0.
> Shouldn't those be 1 PLUS exp()?
> 
> P3 <- function(bx,b3,b,tt) {
>   P <- exp(bx*x+b3+b*(tt == 1))/(1+exp(bx*x+b3+b*(tt == 1)))
>   return(P)
> }
> 
> And the same for P2 and P1?
> 
> 2. Include 'a' and 'tt' as llfn parameters and call like the following.
> 
> llfn <- function(param, a, tt) {
> 
>[... etc ...]
>return(-llfn)
> }
> 
> start.par <- rep(0, 5)
> est <- optim(start.par, llfn, gr=NULL, a=a, tt=tt)
> est
> $par
> [1]  4.1776294 -0.9952026 -0.7667640 -0.1933693  0.7325221
> 
> $value
> [1] 0
> 
> $counts
> function gradient 
>   44   NA 
> 
> $convergence
> [1] 0
> 
> $message
> NULL
> 
> 
> Note the optimum value of zero, est$value == 0
> 
> Rui Barradas
> 
> infinitehorizon wrote
>> 
>> By the way, in my last post I forgot to return negative of llfn, hence
>> the llfn will be as follows:
>> 
>> llfn <- function(param) { 
>> 
>> bx <- param[1] 
>> b1 <- param[2] 
>> b2 <- param[3] 
>> b3 <- param[4] 
>> b <- param[5] 
>> 
>> lL1 <- log(L1(bx,b1,b2,b,tt)) 
>> lL2 <- log(L2(bx,b1,b2,b3,b,tt)) 
>> lL3 <- log(L3(bx,b1,b2,b3,b,tt)) 
>> 
>> llfn <- (a==1)*lL1+(a==2)*lL2+(a==3)*lL3 
>> return(-llfn)
>> } 
>> 
>> However, it does not fix the problem, I still receive the same error..
>> 
> 

--
View this message in context: 
http://r.789695.n4.nabble.com/Discrete-choice-model-maximum-likelihood-estimation-tp4629877p4629952.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Extract Variance Components

2012-05-14 Thread robgriffin247
Well, I'm going to reply to my own thread with a solution  here, turns out
one attempt we made last week nearly had it, slight adjustment made it work.
For anyone that is interested / in the future wants to achieve the same
thing >

varcomp <- matrix(nrow=0, ncol=3)

for (i in 1:nlevels(narrow$gene)) {
  x<-lme4::VarCorr(rg.lmer[[i]])
  varcomp <- rbind(varcomp, c(as.numeric(x$"line:sex"),
                              as.numeric(x$"line"), attr(x, "sc")^2))
}

varcomp<-data.frame(varcomp, row.names=levels(narrow$gene))

colnames(varcomp)<-c("sex.line", "line", "residual")

--
View this message in context: 
http://r.789695.n4.nabble.com/Extract-Variance-Components-tp4629895p4629932.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Data read as labels

2012-05-14 Thread barb
Hey David,

thanks for your fast reply, i really appreciate that you answer so many
posts.

Unfortunately it's not that easy. Try to operate with the output:

e.g
file<-read.csv2(tmp,sep=";",skip="5") 
a<-(relevant<-file[,2][1]) 
a*5
# or 
as.numeric(relevant<-file[,2][1])

a is saved in the workspace as a factor and the values i actually need are
saved as the labels. 
(therefore my subject)

Thank You!

--
View this message in context: 
http://r.789695.n4.nabble.com/Data-read-as-labels-tp4629901p4629951.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Random forests prediction

2012-05-14 Thread matt
But shouldn't it be resolved when I set mtry to the maximum number of
variables? 
Then the model explores all the variables for the next step, so it will
still be able to find the better ones? And then in the later steps it could
use the (less important) variables.

Matthijs

--
View this message in context: 
http://r.789695.n4.nabble.com/Random-forests-prediction-tp4627409p4629944.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Discrete choice model maximum likelihood estimation

2012-05-14 Thread infinitehorizon
By the way, in my last post I forgot to return negative of llfn, hence the
llfn will be as follows:

llfn <- function(param) { 

bx <- param[1] 
b1 <- param[2] 
b2 <- param[3] 
b3 <- param[4] 
b <- param[5] 

lL1 <- log(L1(bx,b1,b2,b,tt)) 
lL2 <- log(L2(bx,b1,b2,b3,b,tt)) 
lL3 <- log(L3(bx,b1,b2,b3,b,tt)) 

llfn <- (a==1)*lL1+(a==2)*lL2+(a==3)*lL3 
return(-llfn)
} 

However, it does not fix the problem, I still receive the same error..

--
View this message in context: 
http://r.789695.n4.nabble.com/Discrete-choice-model-maximum-likelihood-estimation-tp4629877p4629930.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Post stratification weights in survey package in R

2012-05-14 Thread Thomas Lumley
On Sun, May 13, 2012 at 7:10 PM, Ruijie  wrote:
> Hi all,
>
> I have data collected from a survey administered on a subset of the
> population. I also have the population proportions of variables such as
> gender, race and housing type. I would like to combine the weights from
> each separate cross tab (of gender, race and housing type) such that the
> weighted proportions of my survey data matches that of the population.
>
> I have tried the following:
>
> library(survey)
>
>
> gender.population <-
> read.table("http://dl.dropbox.com/u/822467/Gender.csv", header = TRUE,
> sep = ",")
>
> housing.population <-
> read.table("http://dl.dropbox.com/u/822467/Housing.csv", header =
> TRUE, sep = ",")
>
> race.population <-
> read.table("http://dl.dropbox.com/u/822467/Race.csv", header = TRUE,
> sep = ",")
>
> survey.sample <-
> read.table("http://dl.dropbox.com/u/822467/survey.sample.csv", header
> = TRUE, sep = ",")
>
> survey.object.sample <- svydesign(id = ~1, data = survey.sample)
>
> survey.object.sample.weighted <- rake(survey.object.sample,
> list(~gender, ~housing, ~race), list(gender.population,
> housing.population, race.population))
>
> str(survey.object.sample.weighted$postStrata)
>
> I see from survey.object.sample.weighted$postStrata that weights have been
> assigned separately for each of the variable. My question is: Is it
> possible to get 1 weight for each subject instead of 3 weights as shown in
> the package?

There *is* only one weight for each subject.

You are misinterpreting the internal structures of the package: if you
want to see the weights, use the weights() function.

The components of $postStrata are used in standard error computations.

   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland



Re: [R] New to R

2012-05-14 Thread David L Carlson
You will find functions such as these in the thousands of packages that are
available once you have installed R. You can use rseek.org to search for
specific topics. Good overviews are found in the CRAN Task Views (from the
main R webpage, click on CRAN, select a mirror host, and then select Task
Views from the list on the left).

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of Ronald McDowell
> Sent: Monday, May 14, 2012 6:51 AM
> To: r-help@r-project.org
> Subject: [R] New to R
> 
> I am new to R and starting to explore its functionality. I wondered if
> anyone could advise whether R supports non-linear canonical correlation
> and/or the specification of models using alternating least squares?
> 
> Thanks
> Ron
> 
> 
>   [[alternative HTML version deleted]]
> 



[R] select data

2012-05-14 Thread Andrea Sica
Dear all,

I am sure it won't be difficult for you!!
I need to calculate the average among variables for the single units of my
dataset.
But, while doing it, I need to do not consider some values.
To better explain, think like there are two units and three variables:

     V1  V2  V3
[1]   3  -2   4
[2]  -1   4   1

and you want to calculate the average by row, without considering those
negative values:

=> mean(1row) = (3+4)/2
=> mean(2row) = (4+1)/2

Could anyone please give me the commands to do that in R?

Thank you so much

[[alternative HTML version deleted]]



Re: [R] How do I do group wise clustering in R?

2012-05-14 Thread David L Carlson
Look at the aggregate function to create a new data.frame in which you have
M rows that have the means of the K variables for each group. Then use
cluster analysis to cluster the M groups.
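
A rough sketch of those two steps, using the built-in iris data as a stand-in for the N x K matrix (Species plays the role of the M groups). The choice of scale(), dist() and hclust() is just one reasonable option, not the only way to do the clustering.

```r
# Step 1: one row per group, holding the mean of each variable
grp_means <- aggregate(. ~ Species, data = iris, FUN = mean)
rownames(grp_means) <- grp_means$Species

# Step 2: cluster the M group profiles (standardized so that no single
# variable dominates the distances)
hc <- hclust(dist(scale(grp_means[, -1])))

# Cut the tree into, say, two clusters of groups
groups <- cutree(hc, k = 2)
```

plot(hc) then shows the dendrogram of groups rather than of individual observations.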

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of Luna
> Sent: Monday, May 14, 2012 8:51 AM
> To: r-help
> Subject: [R] How do I do group wise clustering in R?
> 
> How do I do group wise clustering in R?
> 
> Hi all,
> 
> I have N x K data matrix, where N is the number of observations, K is
> the
> number of variables.
> 
> The N observations fall into M categories or groups.
> 
> Now I want to cluster the groups, instead of the observations, how do I
> do
> that?
> 
> i.e. the clustering would be at the group level...
> 
> Thanks a lot for your help!
> 
>   [[alternative HTML version deleted]]
> 



Re: [R] how to write data using xlsReadWrite

2012-05-14 Thread David L Carlson
You didn't tell us what your problem is, but it probably relates to the fact
that mydata is never defined.
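
One way mydata could be built, as a sketch only. The matrices in bands are made-up stand-ins for y.modwt$LH2 and friends, band_var is a hypothetical helper, and I write with base R's write.csv to a temporary directory because I cannot assume xlsReadWrite is installed; write.xls should in principle take the same data frame, though I have not checked that.

```r
# Made-up matrices standing in for the four wavelet sub-bands
bands <- setNames(lapply(1:4, function(k) matrix(sin(k * 1:20), 4, 5)),
                  c("LH2", "HH2", "LL2", "HL2"))

# Hypothetical helper: variance of the row variances and of the column
# variances, which is what the four loops in the post compute by hand
band_var <- function(m) {
  c(varx = var(apply(m, 1, var)), vary = var(apply(m, 2, var)))
}

# Collect everything in one data frame, one row per sub-band
mydata <- data.frame(band = names(bands), t(sapply(bands, band_var)))

# Write it out; the path is a temporary file, not the D:\\FYP directory
out <- file.path(tempdir(), "mydata.csv")
write.csv(mydata, out, row.names = FALSE)
```

The point is only that mydata has to exist as a data frame before it is handed to the writing function.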

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of Nurdiyanah Jambari
> Sent: Sunday, May 13, 2012 12:04 AM
> To: r-help@r-project.org
> Subject: [R] how to write data using xlsReadWrite
> 
> Hai I'm Dee. I'm trying to write var data from these codes inside excel
> file. My directory to store the data is D:\FYP\image .
> these are my codes, can you help give an advice or idea with my
> problem:
>
> library("biOps")
> library("waveslim")
> library("xlsReadWrite")
> 
> x <- readTiff("D:\\FYP\\image\\SignatureImage\\user186g1.tif")
> y <- imgBlockMedianFilter(x, 5)
> #Plot image
> #plot(y)
> 
> y.modwt <- modwt.2d(y, "la8", 2)
> ## Level 2 decomposition
> par(mfrow=c(2,2), pty="s")
> 
> ##Plot wavelets
> image(y.modwt$LH2, col=rainbow(128), axes=FALSE, main="LH2")
> image(y.modwt$HH2, col=rainbow(128), axes=FALSE, main="HH2")
> image(y.modwt$LL2, col=rainbow(128), axes=FALSE, main="LL2")
> image(y.modwt$HL2, col=rainbow(128), axes=FALSE, main="HL2")
> 
> #---#
> ##Get the dimension
> ##LH2
> dimLH2 <- dim(y.modwt$LH2)
> dimLH2x <- dimLH2[1]
> dimLH2y <- dimLH2[2]
> varLH2xlist <- c(rep(0, dimLH2x))
> varLH2ylist <- c(rep(0, dimLH2y))
> 
> 
> ##Loop to get variance from x axis
> for(i in seq(dimLH2x)){
> varLH2xlist[i] <- var(y.modwt$LH2[i,])
> }
> 
> ##Get the variance from the overall x variance
> varLH2x <- var(varLH2xlist)
> 
> ##Loop to get variance from y axis
> for(i in seq(dimLH2y)){
> varLH2ylist[i] <- var(y.modwt$LH2[,i])
> }
> 
> ##Get the variance from the overall y variance
> varLH2y <- var(varLH2ylist)
> 
> #-#
> ##Get the dimension
> ##HH2
> dimHH2 <- dim(y.modwt$HH2)
> dimHH2x <- dimHH2[1]
> dimHH2y <- dimHH2[2]
> varHH2xlist <- c(rep(0, dimHH2x))
> varHH2ylist <- c(rep(0, dimHH2y))
> 
> 
> ##Loop to get variance from x axis
> for(i in seq(dimHH2x)){
> varHH2xlist[i] <- var(y.modwt$HH2[i,])
> }
> 
> ##Get the variance from the overall x variance
> varHH2x <- var(varHH2xlist)
> 
> ##Loop to get variance from y axis
> for(i in seq(dimHH2y)){
> varHH2ylist[i] <- var(y.modwt$HH2[,i])
> }
> 
> ##Get the variance from the overall y variance
> varHH2y <- var(varHH2ylist)
> 
> #-#
> ##Get the dimension
> ##LL2
> dimLL2 <- dim(y.modwt$LL2)
> dimLL2x <- dimLL2[1]
> dimLL2y <- dimLL2[2]
> varLL2xlist <- c(rep(0, dimLL2x))
> varLL2ylist <- c(rep(0, dimLL2y))
> 
> 
> ##Loop to get variance from x axis
> for(i in seq(dimLL2x)){
> varLL2xlist[i] <- var(y.modwt$LL2[i,])
> }
> 
> ##Get the variance from the overall x variance
> varLL2x <- var(varLL2xlist)
> 
> ##Loop to get variance from y axis
> for(i in seq(dimLL2y)){
> varLL2ylist[i] <- var(y.modwt$LL2[,i])
> }
> 
> ##Get the variance from the overall y variance
> varLL2y <- var(varLL2ylist)
> 
> #-#
> ##Get the dimension
> ##HL2
> dimHL2 <- dim(y.modwt$HL2)
> dimHL2x <- dimHL2[1]
> dimHL2y <- dimHL2[2]
> varHL2xlist <- c(rep(0, dimHL2x))
> varHL2ylist <- c(rep(0, dimHL2y))
> 
> 
> ##Loop to get variance from x axis
> for(i in seq(dimHL2x)){
> varHL2xlist[i] <- var(y.modwt$HL2[i,])
> }
> 
> ##Get the variance from the overall x variance
> varHL2x <- var(varHL2xlist)
> 
> ##Loop to get variance from y axis
> for(i in seq(dimHL2y)){
> varHL2ylist[i] <- var(y.modwt$HL2[,i])
> }
> 
> ##Get the variance from the overall y variance
> varHL2y <- var(varHL2ylist)
> 
> #-#
> ##write excel file
> 
> write.xls(mydata, "D:\\FYP\\image.mydata.xls")
> 
> --
> Nurdiyanah Bt Hj Jambari
> Student
> Faculty of Engineering
> Bachelor Engineering of Computer & Communication System Engineering
> University Putra Malaysia
> 
>   [[alternative HTML version deleted]]
> 



Re: [R] Help with writing data to csv

2012-05-14 Thread David L Carlson
If you have a data.frame and you want a table in Microsoft Word, the
quickest path (without additional packages and assuming you are using
Windows) is

write.table(DataFrameName, file="clipboard", sep="\t", row.names=FALSE)
# If the file is large, you may need "clipboard-128" instead of "clipboard."

Now open Microsoft Excel and select Paste. You now have the data.frame in
Excel. Select the data and copy it. Now open Microsoft Word and select
Paste. You now have the data.frame in Word as a table.

If you need something more flexible, look at packages xlsx, xlsReadWrite,
and R2wd.
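
On the 'sep' warning in the original post: write.csv deliberately fixes the separator and ignores attempts to change it, so write.table is the function to use for other delimiters. A short sketch with a made-up data frame (practice here is invented, and the files go to temporary paths). The write.csv2 line is a guess at the cause of the "everything in the first cell" symptom, which can happen when the spreadsheet expects semicolons.

```r
# Made-up stand-in for the poster's data
practice <- data.frame(speaker = c("a", "b"), score = c(1.5, 2))

f_csv <- tempfile(fileext = ".csv")
f_tab <- tempfile(fileext = ".txt")
f_csv2 <- tempfile(fileext = ".csv")

# Comma-delimited: write.csv always uses sep = ","
write.csv(practice, f_csv, quote = FALSE, row.names = FALSE)

# Tab-delimited: use write.table, where sep can be set freely
write.table(practice, f_tab, sep = "\t", quote = FALSE, row.names = FALSE)

# Semicolon-delimited with decimal commas, for locales that expect that
write.csv2(practice, f_csv2, quote = FALSE, row.names = FALSE)

readLines(f_csv)
```

Looking at the file with readLines() or a text editor shows whether the delimiters are really there, independent of how the spreadsheet chooses to split the columns.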

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352



> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of DL
> Sent: Saturday, May 12, 2012 3:33 PM
> To: r-help@r-project.org
> Subject: [R] Help with writing data to csv
> 
> I am trying to write data to csv but I am having issues with the
> separations.
> 
> Basically I have some results that I get with R that I copied and pasted
> into word and then saved as .txt
> 
> I want to write the results to csv because it's easier to make tables in
> word (all I would have to do is copy and paste into a table, instead of
> typing everything out).
> 
> I am able to write the data to the .csv file but the data is not
> comma-delimited when I open the file. Everything is written into the
> first cell.
> 
> These are the various commands that I have been inputting:
> 
> write.csv(practice, file.choose(new=T), quote=F, row.names=F)
> 
> write.csv(practice, file.choose(new=T), sep=",", quote=F, row.names=F)
> Warning message:
> In write.csv(practice, file.choose(new = T), sep = ",", quote = F,  :
>   attempt to set 'sep' ignored
> 
> write.table(practice, file =
> "C:/Users/User/Documents/Documents/Proyectodeafroantillanos/Laterals/practice.csv",
> sep="\t", quote=F, row.names=F)
> 
> write.csv(practice, file =
> "C:/Users/User/Documents/Documents/Proyectodeafroantillanos/Laterals/practice.csv")
> 
> write.csv(practice, file =
> "C:/Users/User/Documents/Documents/Proyectodeafroantillanos/Laterals/practice.csv",
> row.names=FALSE)
> 
> write.csv(practice, file =
> "C:/Users/User/Documents/Documents/Proyectodeafroantillanos/Laterals/practice.csv",
> sep = "\t", row.names=FALSE)
> Warning message:
> In write.csv(practice, file =
> "C:/Users/User/Documents/Documents/Proyectodeafroantillanos/Laterals/practice.csv",  :
>   attempt to set 'sep' ignored
> 
> 
> As you can see, for the commands in which I write to csv and set "sep" I
> get a message saying that the attempt to set "sep" was ignored. All the
> other commands work but I don't get a comma-delimited nor a tab-delimited
> file.
> 
> Am I doing something wrong?
> 
> Thank you.
> 
> --
> View this message in context: http://r.789695.n4.nabble.com/Help-with-
> writing-data-to-csv-tp4629436.html
> Sent from the R help mailing list archive at Nabble.com.
> 


[R] New version of cloudRmpi

2012-05-14 Thread Barnet Wagman

cloudRmpi v 1.2 is now available on CRAN.

cloudRmpi is a means for doing parallel processing in R, using MPI on a
cloud-based network.  It currently supports the use of Amazon's EC2 cloud
computer service.

Changes in v 1.2:

Support for RStudio. RStudio Server is available on new AMIs.  cloudRmpi 
now has a function for securely connecting to an RStudio session running 
on the master node of an EC2-MPI network, via ssh port forwarding.  
(RStudio Server is a browser based, IDE-like interface to R).


The network specification dialog shows more information about AMIs.

The network manager has a command to repeat Open MPI network
configuration.



Regards,

Barnet Wagman



[R] glmnet speed

2012-05-14 Thread yan
I'm using glmnet for logistic regression. I have a fairly sparse dataset:
20,000 samples (very imbalanced too, 5% from one group) and 1500 variables.

The code has been running for 2 hours and I am still waiting for the result.
I am doing lasso here (alpha=1). My computer has a Core 2 Duo CPU @ 3 GHz and
4 GB RAM. Why is it so much slower than the speed reported in the paper by
Tibshirani et al.?

Has anyone had the same problem?

Many thanks

yan

--
View this message in context: 
http://r.789695.n4.nabble.com/glmnet-speed-tp4629953.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] No Data in randomForest predict

2012-05-14 Thread Liaw, Andy
It doesn't:  You just get an error if there are NAs in the data; e.g.,

R> rf1 = randomForest(iris[1:4], iris[[5]])
R> predict(rf1, newdata=data.frame(Sepal.Length=1, Sepal.Width=2, 
Petal.Length=3, Petal.Width=NA))
Error in predict.randomForest(rf1, newdata = data.frame(Sepal.Length = 1,  : 
  missing values in newdata
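
If the goal is an NA prediction for any row with a missing input, one option is a small wrapper that predicts only on the complete rows. This is a sketch, not part of randomForest: the helper name predict_with_na is made up, it assumes predict() returns a vector or factor (not a matrix of class probabilities), and it treats a row as incomplete if any column of newdata is NA. It is shown with lm() so it runs without extra packages.

```r
# Hypothetical helper: predict on complete rows only, return NA elsewhere
predict_with_na <- function(model, newdata, ...) {
  ok <- complete.cases(newdata)
  if (!any(ok)) return(rep(NA, nrow(newdata)))
  pred <- unname(predict(model, newdata[ok, , drop = FALSE], ...))
  idx <- rep(NA_integer_, nrow(newdata))
  idx[ok] <- seq_len(sum(ok))
  pred[idx]   # indexing by NA gives NA and keeps factor levels intact
}

# Demonstration with a linear model on the iris data
fit <- lm(Sepal.Length ~ Sepal.Width + Petal.Length, data = iris)
nd  <- data.frame(Sepal.Width = c(3, NA, 2.5), Petal.Length = c(1.5, 4, NA))
out <- predict_with_na(fit, nd)
```

The same wrapper should apply to a randomForest fit, since it only relies on predict() accepting a model and a newdata argument.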
 
Andy

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Jennifer Corcoran
Sent: Saturday, May 05, 2012 5:17 PM
To: r-help@r-project.org
Subject: [R] No Data in randomForest predict

I would like to ask a general question about the randomForest predict
function and how it handles No Data values.  I understand that you can omit
No Data values while developing the randomForest object, but how does it
handle No Data in the prediction phase?  I would like the output to be NA
if any (not just all) of the input data have an NA value. It is not clear
to me if this is the default or if I need to add an argument in the predict
function.



Re: [R] range segment exclusion using range endpoints

2012-05-14 Thread Ben quant
Great solution! Thanks!

Ben

On Sat, May 12, 2012 at 12:50 PM, jim holtman  wrote:

> Here is an example of how you might do it.  It uses a technique of
> counting how many items are in a queue based on their arrival times;
> it can be used to also find areas of overlap.
>
> Note that it would be best to use a list for the 's' end points
>
> 
> > # note the next statement removes names of the format 's[0-9]+_rng'
> > # it would be best to create a list with the 's' endpoints, but this is
> > # what the OP specified
> >
> > rm(list = grep('s[0-9]+_rng', ls(), value = TRUE))  # Danger Will Robinson!!
> >
> > # ex 1
> > x_rng = c(-100,100)
> >
> > s1_rng = c(-25.5,30)
> > s2_rng = c(0.77,10)
> > s3_rng = c(25,35)
> > s4_rng = c(70,80.3)
> > s5_rng = c(90,95)
> >
> > # ex 2
> > # x_rng = c(-50.5,100)
> >
> > # s1_rng = c(-75.3,30)
> >
> > # ex 3
> > # x_rng = c(-75.3,30)
> >
> > # s1_rng = c(-50.5,100)
> >
> > # ex 4
> > # x_rng = c(-100,100)
> >
> > # s1_rng = c(-105,105)
> >
> > # find all the names -- USE A LIST NEXT TIME
> > sNames <- grep("s[0-9]+_rng", ls(), value = TRUE)
> >
> > # initial matrix with the 'x' endpoints
> > queue <- rbind(c(x_rng[1], 1), c(x_rng[2], 1))
> >
> > # add the 's' end points to the list
> > # this will be used to determine how many things are in a queue
> > # (or areas that overlap)
> > for (i in sNames){
> + queue <- rbind(queue
> + , c(get(i)[1], 1)  # enter queue
> + , c(get(i)[2], -1)  # exit queue
> + )
> + }
> > queue <- queue[order(queue[, 1]), ]  # sort
> > queue <- cbind(queue, cumsum(queue[, 2]))  # of people in the queue
> > print(queue)
>          [,1] [,2] [,3]
>  [1,] -100.00    1    1
>  [2,]  -25.50    1    2
>  [3,]    0.77    1    3
>  [4,]   10.00   -1    2
>  [5,]   25.00    1    3
>  [6,]   30.00   -1    2
>  [7,]   35.00   -1    1
>  [8,]   70.00    1    2
>  [9,]   80.30   -1    1
> [10,]   90.00    1    2
> [11,]   95.00   -1    1
> [12,]  100.00    1    2
> >
> > # print out values where the last column is 1
> > for (i in which(queue[, 3] == 1)){
> + cat("start:", queue[i, 1L], '  end:', queue[i + 1L, 1L], "\n")
> + }
> start: -100   end: -25.5
> start: 35   end: 70
> start: 80.3   end: 90
> start: 95   end: 100
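
As suggested above, the 's' endpoints are easier to handle in a list. The following is a sketch of a list-based variant, not the code from this thread: the function name exclude_ranges is made up, and it sweeps the sorted ranges with a cursor instead of counting queue entries. The counting version as posted appears to assume that every s range lies inside x_rng (for Ex 2 it also reports the segment from -75.3 to -50.5), so this variant clips against x_rng explicitly.

```r
# Hypothetical helper: parts of x_rng not covered by any range in s_list.
# Ranges are c(start, end) pairs; zero-length leftovers are dropped.
exclude_ranges <- function(x_rng, s_list) {
  if (length(s_list) == 0) return(matrix(x_rng, ncol = 2))
  s <- do.call(rbind, s_list)
  s <- s[order(s[, 1]), , drop = FALSE]   # sweep from left to right
  out <- list()
  cursor <- x_rng[1]                      # left edge of what is still uncovered
  for (i in seq_len(nrow(s))) {
    if (s[i, 1] > cursor && cursor < x_rng[2]) {
      out[[length(out) + 1]] <- c(cursor, min(s[i, 1], x_rng[2]))
    }
    cursor <- max(cursor, s[i, 2])
  }
  if (cursor < x_rng[2]) out[[length(out) + 1]] <- c(cursor, x_rng[2])
  if (length(out) == 0) return(matrix(numeric(0), ncol = 2))
  do.call(rbind, out)
}

ex1 <- exclude_ranges(c(-100, 100),
                      list(c(-25.5, 30), c(0.77, 10), c(25, 35),
                           c(70, 80.3), c(90, 95)))
ex2 <- exclude_ranges(c(-50.5, 100), list(c(-75.3, 30)))
ex4 <- exclude_ranges(c(-100, 100), list(c(-105, 105)))
```

Each row of the result is one remaining segment. Where the original post asks for NA (Ex 4), this sketch returns a matrix with zero rows instead.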
> >
> >
> =
>
> On Sat, May 12, 2012 at 1:54 PM, Ben quant  wrote:
> > Hello,
> >
> > I'm posting this again (with some small edits). I didn't get any replies
> > last time...hoping for some this time. :)
> >
> > Currently I'm only coming up with brute force solutions to this issue
> > (loops). I'm wondering if anyone has a better way to do this. Thank you
> > for your help in advance!
> >
> > The problem: I have endpoints of one x range (x_rng) and an unknown
> > number of s ranges (s[#]_rng) also defined by the range endpoints. I'd
> > like to remove the x ranges that overlap with the s ranges. The examples
> > below demonstrate what I mean.
> >
> > What is the best way to do this?
> >
> > Ex 1.
> > For:
> > x_rng = c(-100,100)
> >
> > s1_rng = c(-25.5,30)
> > s2_rng = c(0.77,10)
> > s3_rng = c(25,35)
> > s4_rng = c(70,80.3)
> > s5_rng = c(90,95)
> >
> > I would get:
> > -100,-25.5
> > 35,70
> > 80.3,90
> > 95,100
> >
> > Ex 2.
> > For:
> > x_rng = c(-50.5,100)
> >
> > s1_rng = c(-75.3,30)
> >
> > I would get:
> > 30,100
> >
> > Ex 3.
> > For:
> > x_rng = c(-75.3,30)
> >
> > s1_rng = c(-50.5,100)
> >
> > I would get:
> > -75.3,-50.5
> >
> > Ex 4.
> > For:
> > x_rng = c(-100,100)
> >
> > s1_rng = c(-105,105)
> >
> > I would get something like:
> > NA,NA
> > or...
> > NA
> >
> > Ex 5.
> > For:
> > x_rng = c(-100,100)
> >
> > s1_rng = c(-100,100)
> >
> > I would get something like:
> > -100,-100
> > 100,100
> > or just...
> > -100
> >  100
> >
> > PS - You may have noticed that in all of the examples I am including the
> > s range endpoints in the desired results, which I can deal with later in
> > my program so it's not a problem...  I think leaving in the s range
> > endpoints simplifies the problem.
> >
> > Thanks!
> > Ben
> >
>
>
>
> --
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
>



Re: [R] Data read as labels

2012-05-14 Thread David Winsemius


On May 14, 2012, at 5:33 AM, barb wrote:


Hey guys,

I have a strange problem reading a .csv file.
It does not seem to be covered by the usual read.csv techniques.

The relevant data I want to use seems to be saved as the label of the data
point, so I cannot really use it.


spec<-"EU2001"
part1<-"http://www.bundesbank.de/statistik/statistik_zeitreihen_download.php?func=directcsv&from=&until=&filename=bbk_"

part2<-"&csvformat=de&euro=mixed&tr="
tmp<-tempfile()
load<-paste(part1,spec,part2,spec,sep="")
download.file(load,tmp)
file<-read.csv(tmp,sep=";",dec=",", skip="5")
(relevant<-file[,2][1])



If dec="," then you probably need read.csv2()

(Since dec="," is the default for read.csv2, I would remove that argument
from the call. It seemed to succeed.)


file<-read.csv2(tmp,sep=";", skip="5")
(relevant<-file[,2][1])
[1] 10716,05
496 Levels: 10323,52 10391,38 10716,05 10929,62 11051,23 11329,50  
11380,11 ... Methodik: Ab Januar 1993 einschl. der Zuschätzungen für  
nichtmelde- pflichtigen Außenhandel, die bis Dezember 1992 in den  
Ergänzungen zum Außenhandel enthalten sind.
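
The factor with a "Methodik" level suggests that footnote text in the file ends up in the value column, which keeps R from parsing the column as numeric. Below is a sketch of one way around that, using made-up text in place of the download; the column names and the layout of the footnote row are assumptions.

```r
# Made-up text mimicking the download: a title line, a header, data rows,
# and a trailing footnote that lands in the value column
txt <- paste(
  "Some title line",
  "Datum;Wert",
  "2001-01;10716,05",
  "2001-02;10323,52",
  "Anmerkung;Methodik: footnote text",
  sep = "\n"
)

# skip is a count of lines, so pass a number rather than the string "5"
dat <- read.csv2(text = txt, skip = 1, stringsAsFactors = FALSE)

# The footnote keeps the column as text, so convert explicitly;
# entries that are not numbers become NA
wert <- suppressWarnings(
  as.numeric(sub(",", ".", as.character(dat$Wert), fixed = TRUE))
)
```

The as.character() step also covers the case where the column was read as a factor, as in the output above.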



--
David Winsemius, MD
West Hartford, CT



Re: [R] range segment exclusion using range endpoints

2012-05-14 Thread Ben quant
Wow, I'll have to study this one for a bit. Thanks!

Ben

On Sat, May 12, 2012 at 3:09 PM, William Dunlap  wrote:

> Here is some code that I've been fiddling with for years
> (since I wanted to provide evidence that our main office
> needed more modems and wanted to show how often
> both of them were busy).  It does set operations and a
> bit more on collections of half-open intervals.  (Hence
> it drops zero-length intervals).
>
> Several of the functions could be defined as methods
> of standard set operators.
>
> To see what it does try
>
>   r1 <- as.Ranges(bottoms=c(1,3,5,7), tops=c(2, 4, 9, 8))
>   r2 <- as.Ranges(bottoms=c(1.5,4,6,7), tops=c(1.7,5,7,9))
>   setdiffRanges( as.Ranges(1, 5), as.Ranges(c(2, 3.5), c(3, 4.5)) )
>   plot(r1, r2, setdiffRanges(r1,r2), intersectRanges(r1,r2),
>unionRanges(r1,r2), c(r1,r2), inNIntervals(c(r1,r2), n=2))
>
> You can use Date and POSIXct objects for the endpoints of
> the intervals as well.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
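
The core of the listing that follows is the counting step in inNIntervals: sort all endpoints, add 1 at each start and subtract 1 at each end, and read off where the running count reaches n. Here is a standalone sketch of that step on plain vectors; the function name in_n_intervals and the example numbers are made up, and it returns a plain data frame rather than a Ranges object.

```r
# Hypothetical standalone version of the counting idea:
# half-open intervals (bottoms[i], tops[i]] covered by at least n inputs
in_n_intervals <- function(bottoms, tops, n) {
  stopifnot(length(bottoms) == length(tops), all(bottoms <= tops), n > 0)
  u <- c(bottoms, tops)
  o <- order(u)                 # ties keep input order: a start sorts before an equal end
  u <- u[o]
  jumps <- rep(c(1L, -1L), each = length(bottoms))[o]
  val <- cumsum(jumps)          # number of open intervals after each endpoint
  starts <- u[val == n & jumps == 1L]        # count rises to n
  ends   <- u[val == n - 1 & jumps == -1L]   # count falls below n
  keep <- starts < ends         # drop zero-width results
  data.frame(bottoms = starts[keep], tops = ends[keep])
}

# When are at least two of three "modems" busy?
busy <- in_n_intervals(c(1, 3, 7), c(5, 8, 10), n = 2)
```

With these inputs the overlap is (3, 5] and (7, 8]. Setting n = 1 gives the union of the intervals.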
>
> # An object of S3-class "Ranges" is a 2-column
> # data.frame(bottoms, tops), describing a
> # set of half-open intervals, (bottoms[i], tops[i]].
> # inRanges is the only function that cares about
> # the direction of the half-openness of those intervals,
> # but the others rely on half-openness (so 0-width intervals
> # are not allowed).
>
> # Use as.Ranges to create a Ranges object from
> #   * a matrix whose rows are intervals
> #   * a data.frame whose rows are intervals
> #   * a vector of interval starts and a vector of interval ends
> # The endpoints must be of a class which supports the comparison (<,<=)
> # operators and which can be concatenated with the c() function.
> # That class must also be able to be in a data.frame and be subscriptable.
> # That covers at least numeric, Date, and POSIXct.
> # (The plot method only works for numeric endpoints).
> # You may input a zero-width interval (with bottoms[i]==tops[i]),
> # but the constructors will silently remove it.
> as.Ranges <- function(x, ...) UseMethod("as.Ranges")
>
> as.Ranges.matrix <- function(x, ...) {
>    # each row of x is an interval
>    stopifnot(ncol(x)==2, all(x[,1] <= x[,2]))
>    x <- x[x[,1] < x[,2], , drop=FALSE]
>    Ranges <- data.frame(bottoms = x[,1], tops = x[,2])
>    class(Ranges) <- c("Ranges", class(Ranges))
>    Ranges
> }
>
> as.Ranges.data.frame <- function(x, ...) {
>    # each row of x is an interval
>    stopifnot(ncol(x)==2, all(x[,1] <= x[,2]))
>    x <- x[x[,1] < x[,2], , drop=FALSE]
>    Ranges <- data.frame(bottoms = x[,1], tops = x[,2])
>    class(Ranges) <- c("Ranges", class(Ranges))
>    Ranges
> }
>
> as.Ranges.default <- function(bottoms, tops, ...) {
>    # vectors of bottoms and tops of intervals
>    stopifnot(all(bottoms <= tops))
>    Ranges <- data.frame(bottoms=bottoms, tops=tops)[bottoms < tops, , drop=FALSE]
>    class(Ranges) <- c("Ranges", class(Ranges))
>    Ranges
> }
>
> c.Ranges <- function(x, ...) {
>    # combine several Ranges objects into one which lists all the intervals.
>    RangesList <- list(x=x, ...)
>    Ranges <- x
>    for (r in list(...)) {
>        Ranges <- rbind(Ranges, r)
>    }
>    class(Ranges) <- unique(c("Ranges", class(Ranges)))
>    Ranges
> }
>
> inNIntervals <- function(Ranges, n)
> {
>    # return Ranges object that describes points that are
>    # in at least n intervals in the input Ranges object
>    stopifnot(n>0)
>    u <- c(Ranges[,1], Ranges[,2])
>    o <- order(u)
>    u <- u[o]
>    jumps <- rep(c(+1L,-1L), each=nrow(Ranges))[o]
>    val <- cumsum(jumps)
>    as.Ranges(u[val==n & jumps==1], u[val==n-1 & jumps==-1])
> }
>
> unionIntervals <- function(Ranges) {
>    # combine overlapping and adjacent intervals to create a
>    # possibly smaller and simpler, but equivalent, Ranges object
>    inNIntervals(Ranges, 1)
> }
>
> intersectIntervals <- function(Ranges) {
>    # return 0- or 1-row Ranges object describing points
>    # that are in all the intervals in input Ranges object.
>    u <- unname(c(Ranges[,1], Ranges[,2]))
>    o <- order(u)
>    u <- u[o]
>    jumps <- rep(c(+1L,-1L), each=nrow(Ranges))[o]
>    val <- cumsum(jumps)
>    as.Ranges(u[val==nrow(Ranges) & jumps==1],
>              u[val==nrow(Ranges)-1 & jumps==-1])
> }
>
> unionRanges <- function(x, ...) {
>    unionIntervals(rbind(x, ...))
> }
>
> setdiffRanges <- function (x, y)
> {
>    # set difference: return Ranges object describing points that are in x
>    # but not y
>    x <- unionIntervals(x)
>    y <- unionIntervals(y)
>    nx <- nrow(x)
>    ny <- nrow(y)
>    u <- c(x[, 1], y[, 1], x[, 2], y[, 2])
>    o <- order(u)
>    u <- u[o]
>    vx <- cumsum(jx <- rep(c(1, 0, -1, 0), c(nx, ny, nx, ny))[o])
>    vy <- cumsum(jy <- rep(c(0, -1, 0, 1), c(nx, ny, nx, ny))[o])
>    as.Ranges(u[vx == 1 & vy == 0],
>              u[(vx == 1 & jy == -1) | (jx == -1 & vy == 0)])
> }
>
> intersectRanges <- function(x, y)
> {
># return Ranges object describing points that are in both x and y
>x

[R] phyloclim could not be installed in linux - problems on tkrplot dependence

2012-05-14 Thread Mao Jianfeng
Dear R-helpers, Christoph (author of phyloclim) and Luke (author of
tkrplot),

I would like your help with installing phyloclim on Ubuntu Linux. It seems
that a package named 'tkrplot' fails to install first, and then the packages
that depend on it cannot be installed either.

As far as I have tested, installation of phyloclim works smoothly on a Mac. I
attempted to install these packages on an Ubuntu server without root access,
and I wonder whether installing tkrplot requires root privileges.

tkrplot does not seem necessary for the main functionality of phyloclim, so
how can I work around the tkrplot dependency when installing phyloclim?

Could you please give me any directions on my problem? Thanks in advance.

Please see the details below.

Best wishes,
Jian-Feng,

#
# (1) my R version

R version 2.15.0 (2012-03-30)
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-unknown-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

#
# (2) my session info
> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.15.0

#
# (3) my systems info
> Sys.info()
       sysname  "Linux"
       release  "2.6.37.6"
       version  "#16 SMP Fri Jun 3 15:50:09 MEST 2011"
      nodename  "upa"
       machine  "x86_64"
         login  "jmao"
          user  "jmao"
effective_user  "jmao"


#
# (4) prompt from phyloclim installation

> install.packages("phyloclim")
Installing package(s) into ‘/ebio/abt6/jmao/R/Rpacks’
(as ‘lib’ is unspecified)
also installing the dependencies ‘gee’, ‘tkrplot’, ‘ape’, ‘adehabitat’

trying URL 'http://mirrors.softliste.de/cran/src/contrib/gee_4.13-18.tar.gz'
Content type 'application/x-gzip' length 57162 bytes (55 Kb)
opened URL
==
downloaded 55 Kb

trying URL 'http://mirrors.softliste.de/cran/src/contrib/tkrplot_0.0-23.tar.gz'
Content type 'application/x-gzip' length 39037 bytes (38 Kb)
opened URL
==
downloaded 38 Kb

trying URL 'http://mirrors.softliste.de/cran/src/contrib/ape_3.0-3.tar.gz'
Content type 'application/x-gzip' length 710889 bytes (694 Kb)
opened URL
==
downloaded 694 Kb

trying URL 'http://mirrors.softliste.de/cran/src/contrib/adehabitat_1.8.10.tar.gz'
Content type 'application/x-gzip' length 2067806 bytes (2.0 Mb)
opened URL
==
downloaded 2.0 Mb

trying URL 'http://mirrors.softliste.de/cran/src/contrib/phyloclim_0.8.1.tar.gz'
Content type 'application/x-gzip' length 68133 bytes (66 Kb)
opened URL
==
downloaded 66 Kb

* installing *source* package ‘gee’ ...
** package ‘gee’ successfully unpacked and MD5 sums checked
** libs
gfortran   -fpic  -g -O2  -c dgedi.f -o dgedi.o
gfortran   -fpic  -g -O2  -c dgefa.f -o dgefa.o
gcc -std=gnu99 -I/ebio/abt6/jmao/R/R-2.15.0/include -DNDEBUG
-I/usr/local/include-fpic  -g -O2  -c ugee.c -o ugee.o
gcc -std=gnu99 -shared -L/usr/local/lib64 -o gee.so dgedi.o dgefa.o ugee.o
-L/ebio/abt6/jmao/R/R-2.15.0/lib -lRblas -lgfortran -lm -lgfortran -lm
installing to /ebio/abt6/jmao/R/Rpacks/gee/libs
** R
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded

* DONE (gee)
* installing *source* package ‘tkrplot’ ...
** package ‘tkrplot’ successfully unpacked and MD5 sums checked
checking for gcc... gcc
checking for C compiler default output... a.out
checking whether the C compiler works... yes

Re: [R] file path

2012-05-14 Thread David Winsemius


On May 14, 2012, at 6:35 AM, Berend Hasselman wrote:



On 14-05-2012, at 12:07, Wincent wrote:


Emm, my bad.
I meant str <- "abc\d".
Any ideas?


gsub("", "", str)



#1:  One cannot execute:  str <- "abc\d" , at least on my machine,  
since that throws an error because "\d" is an "unrecognized escape".


#2: If the string has a backslash as its fourth character then it  
would need to be created with:


str <- "abc\\d"

(Then Berend's gsub would succeed.)

#3: If the string contains an ASCII cntrl-d. then the needed gsub  
command would be:


str <- "abc\004"
gsub("\\\004", "new", str)
[1] "abcnew"
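
The three cases above can be checked side by side. This is a small base R sketch; the variable names are chosen for illustration only.

```r
s <- "abc\\d"                 # five characters: a, b, c, backslash, d
n_chars <- nchar(s)

# With fixed = TRUE the pattern is a literal backslash, written "\\" in R source
no_slash_fixed <- gsub("\\", "", s, fixed = TRUE)

# As a regular expression the backslash must be escaped once more: "\\\\"
no_slash_regex <- gsub("\\\\", "", s)

# A control character (here Ctrl-D, octal 004) can be matched literally
ctrl <- "abc\004"
replaced <- gsub("\004", "new", ctrl, fixed = TRUE)
```

Using fixed = TRUE sidesteps the second layer of escaping that the regular expression engine would otherwise require.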

--

David Winsemius, MD
West Hartford, CT



[R] ggplot2: Dendrogram text position

2012-05-14 Thread Brian Smith
Hi,

I was trying to create a dendrogram using ggplot2. Everything seems to be
looking ok except that the text labels are too close to the dendrogram (in
the example below, 'a','b', ..). Is there a way that I can put a little gap
between where the dendrogram ends and the label begins?

thanks!!

= code ==

library(ggplot2)
library(ggdendro)  # provides dendro_data(), segment(), label(), theme_dendro()

mat <- matrix(sample(1:1000,180),12,15)
rownames(mat) <- letters[1:12]
colnames(mat) <- paste('c',1:15,sep='')

dates <- sample(1:3,12,replace=T)

hc <- hclust(dist(mat))
order <- hc$order
batch <- as.factor(as.numeric(dates[order]))  ## batchorder
roworder <- rownames(mat)[order]

dd.row <- as.dendrogram(hc)
ddata_x <- dendro_data(dd.row)
labs <- label(ddata_x)
labs2 <- cbind(labs,batch,roworder)

p2 <- ggplot(segment(ddata_x)) +
geom_segment(aes(x=x, y=y, xend=xend, yend=yend)) + coord_flip() +
scale_y_reverse(expand=c(0.2, 0)) +
theme_dendro()

p2 + geom_text(data=labs2,
aes(label=roworder, x=x, y=0, colour=batch),size=4)





Re: [R] To summary data

2012-05-14 Thread John Kane
One way is to use the reshape2 package
===
library(reshape2)
dcast(xx,  A ~ . , sum)
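
The same sums can also be had from base R without extra packages. A minimal sketch, rebuilding the data from the original post with its column names 'clss' and 'number' (the object names dat, totals and by_tapply are made up):

```r
# Data as shown in the original post
dat <- data.frame(clss = rep(c("A", "B", "C"), times = 3), number = 1:9)

# One row per class, with the numbers summed within each class
totals <- aggregate(number ~ clss, data = dat, FUN = sum)

# The same sums as a named vector
by_tapply <- tapply(dat$number, dat$clss, sum)
```

aggregate() returns a data frame in the requested layout, while tapply() is handy when only the sums themselves are needed.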


John Kane
Kingston ON Canada


> -Original Message-
> From: mrzun...@gmail.com
> Sent: Sun, 13 May 2012 22:14:02 -0700 (PDT)
> To: r-help@r-project.org
> Subject: [R] To summary data
> 
> hi all,
> 
> my data is here,
> --
> data
> 
>   clss number
> 1    A      1
> 2    B      2
> 3    C      3
> 4    A      4
> 5    B      5
> 6    C      6
> 7    A      7
> 8    B      8
> 9    C      9
> -
> I want to this format
> 
> clss number
> A  12
> B  15
> C  18
> 
> is there a way to solve this problem?
> 
> --
> View this message in context:
> http://r.789695.n4.nabble.com/To-summary-data-tp4629887.html
> Sent from the R help mailing list archive at Nabble.com.
> 


Re: [R] Add column from other columns data.

2012-05-14 Thread Yellow
That worked. 
Thanks. :) 

--
View this message in context: 
http://r.789695.n4.nabble.com/Add-column-from-other-columns-data-tp4629921p4629937.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Random forests prediction

2012-05-14 Thread Liaw, Andy
I don't think this is so hard to explain.  If you evaluate AUC using either OOB 
prediction or on a test set (or something like CV or bootstrap), that would be 
what I expect for most data.  When you add more variables (that are, say, less 
informative) to a model, the model has to look harder to find the informative 
ones, and thus you pay a penalty.  One exception to that is if some of the 
"new" variables happen to have very strong interaction with some of the "old" 
variables, then you may see improved performance.

I've said it several times before, but it seems to be worth repeating:  Don't 
use the training set for evaluating models:  that almost never makes sense.

Andy


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of matt
Sent: Friday, May 11, 2012 3:43 PM
To: r-help@r-project.org
Subject: [R] Random forests prediction

Hi all,

I have a strange problem when applying RF in R. 
I have a set of variables with which I obtain an AUC of 0.67.

I do have a second set of variables that have an AUC of 0.57. 

When I merge the first and second set of variables, the AUC becomes 0.64. 

I would expect the prediction to become better as I add variables that do
have some predictive power?
This is even more strange as the AUC on the training set increased when I
added more variables (while the AUC of the validation set thus decreased).

Is there anyone who has experienced the same and/or who know what could be
the reason?

Thanks,

Matthijs

--
View this message in context: 
http://r.789695.n4.nabble.com/Random-forests-prediction-tp4627409.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Add column from other columns data.

2012-05-14 Thread John Kane
Something along the lines of 

dat2  <-  ifelse( dat1==1 , "yes", "no")

should do it.
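
Spelled out on a data frame, that could look like the sketch below. The data are the values from the original post; the object names and the temporary file path are made up for illustration.

```r
# Column from the original post
dat <- data.frame(Fulfilled = c(1, 1, 0, 1, 1, 1, 1, 0, 0, 1))

# Note the comparison uses ==, not a single =
dat$Finished <- ifelse(dat$Fulfilled == 1, "Yes", "No")

# Write both columns back to a .csv file and read it in again
out_path <- tempfile(fileext = ".csv")   # hypothetical output location
write.csv(dat, out_path, row.names = FALSE)
back <- read.csv(out_path, stringsAsFactors = FALSE)
```

ifelse() works element by element, so every row gets its own "Yes" or "No" without any subsetting.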

John Kane
Kingston ON Canada


> -Original Message-
> From: s1010...@student.hsleiden.nl
> Sent: Mon, 14 May 2012 05:45:38 -0700 (PDT)
> To: r-help@r-project.org
> Subject: [R] Add column from other columns data.
> 
> Hi everyone,
> 
> I am having some problems with making a new column with data in it.
> I have this one column named: Fulfilled
> 
> Fulfilled
> 1
> 1
> 0
> 1
> 1
> 1
> 1
> 0
> 0
> 1
> 
> And now I would like to add another column to my .csv file ("Finished")
> 
> In this "Finished" column I would like to have "Yes" or "No".
> Where column "Fullfilled" is a 1, "Finished" should have a "Yes".
> Like this:
> 
> Fullfilled Finished
> 1 Yes
> 1 Yes
> 0 No
> etc
> 
> Now I know how to grab the data out of a column, and also know how to
> save data inside a .csv file.
> That is no problem.
> But how do I get the right Yes or No in the right place in the other
> column?
> 
> # Get all values: 1
> Fullfilled_1 = Fullfilled[Fullfilled = 1]
> 
> I was thinking about subset.
> But I don't really know if that would really be it
> Maybe somebody here can push me a little in the right direction?
> 
> 
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Add-column-from-other-columns-data-tp4629921.html
> Sent from the R help mailing list archive at Nabble.com.
> 


[R] How do I do group wise clustering in R?

2012-05-14 Thread Luna
How do I do group wise clustering in R?

Hi all,

I have N x K data matrix, where N is the number of observations, K is the
number of variables.

The N observations fall into M categories or groups.

Now I want to cluster the groups, instead of the observations, how do I do
that?

i.e. the clustering would be at the group level...

Thanks a lot for your help!



Re: [R] Why can we combine design matrix and data-frame in R?

2012-05-14 Thread Michael
Oh, so we can always combine model matrices and formulas in regression in R?

Thanks!

On Mon, May 14, 2012 at 2:41 AM, peter dalgaard  wrote:

>
> On May 14, 2012, at 02:24 , Luna wrote:
>
> > Thanks!
> >
> > Do you think the correctness of such results could be generalized to
> > other future cases?
>
>
> If correctly generalized, yes
>
> (Apologies for being slightly facetious; the point is that the properties
> you build on are part of the software design for model formulas and model
> matrices. They are not fortuitous buglets, so they are not going to go away
> unless the actual design is changed.)
>
> >
> >
> >
> >
> > On Sun, May 13, 2012 at 7:10 PM, S Ellison 
> wrote:
> >
> >>> But the line you cited was about "response" being a matrix, which is
> >>> not our case.
> >> Yes, you're right; I picked the wrong thing to cite.
> >> The only documentation I found about lm accepting a matrix in the
> >> predictors is a one-line statement in "Introduction to R" which says
> >> "term_i is either
> >>
> >>   a vector or matrix expression, or 1,
> >>   a factor, or
> >>   a formula expression consisting of factors, vectors or matrices
> >> connected by formula operators. "
> >>
> >> Not the most informative documentation. But Peter Dalgaard is a most
> >> authoritative source!
> >>
> >>> And also I have checked:
> >>>
> >>> Any more thoughts?
> >>
> >> Data frames are odd things; a column need not contain only a vector if
> >> the number of rows is OK. I am half surprised that including a matrix in
> >> one works. But the gods of R are powerful and their magic is strong.
> >> Here, names(tmp) is showing that the data frame has one element called X
> >> (in effect, the whole matrix is regarded as one element of the data
> >> frame), but on display the magic has expanded X to show all the columns
> >> of X.
> >>
> >> This is the main reason I generally keep to simple things in data
> >> frames; complicated things make it less easy to predict behaviour.
> >>
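
The behaviour described in the quoted paragraph can be reproduced in a few lines. This is a sketch with simulated data; the object names X, y, tmp and fit are chosen here and only loosely follow the thread.

```r
set.seed(1)
X <- matrix(rnorm(20), ncol = 2, dimnames = list(NULL, c("a", "b")))
y <- rnorm(10)

tmp <- data.frame(y = y)
tmp$X <- X        # the whole matrix is stored as a single element named X

# names() reports two elements, although printing tmp shows three columns
element_names <- names(tmp)

# The formula interface expands the matrix into one coefficient per column
fit <- lm(y ~ X, data = tmp)
coef_names <- names(coef(fit))
```

By contrast, data.frame(y = y, X = X) would split the matrix into separate columns named X.a and X.b, so the assignment with $ is what keeps it as one element.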
> >>
> >>
> >> ***
> >> This email and any attachments are confidential. Any u...{{dropped:13}}
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> --
> Peter Dalgaard, Professor
>  Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>
>
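
The behaviour described in the quoted text (a matrix stored as a single data frame element, and lm accepting that matrix as a predictor term) can be sketched as follows. This is only an illustration under assumptions: the names tmp and X come from the thread, but the simulated data, the response y, and the column names "a" and "b" are made up here.

```r
# Illustrative data only: the values are simulated, not taken from the thread.
set.seed(1)
X <- matrix(rnorm(20), nrow = 10, ncol = 2,
            dimnames = list(NULL, c("a", "b")))
y <- rnorm(10)

# Store the whole matrix as ONE element of the data frame.
tmp <- data.frame(y = y)
tmp$X <- X

names(tmp)       # the matrix counts as a single element called "X"
is.matrix(tmp$X) # TRUE: the column is still a matrix
head(tmp, 3)     # on display, X is expanded into its columns

# A matrix term in the formula contributes one coefficient per column.
fit <- lm(y ~ X, data = tmp)
names(coef(fit))
```

The coefficient names are built from the term name plus the matrix column names, which is part of the model-formula design Peter Dalgaard refers to.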



[R] Rjava on Ubuntu quantal

2012-05-14 Thread Hasan Diwan
I just upgraded to Ubuntu Quantal from Precise and rJava stopped working;
the log follows:
% /usr/bin/find $HOME/workspace/FinanceOCR/visualizations/ -name '*R' -print | /usr/bin/xargs -n 1 -i% /usr/bin/Rscript % $1
Loading required package: RJDBC
Loading required package: methods
Loading required package: DBI
Loading required package: rJava
Error : .onLoad failed in loadNamespace() for 'rJava', details:
call: dyn.load(file, DLLpath = DLLpath, ...)
error: unable to load shared object
'/usr/lib/R/site-library/rJava/libs/rJava.so':
libjvm.so: cannot open shared object file: No such file or directory
Failed with error: ‘package ‘rJava’ could not be loaded’
Error: could not find function "JDBC"
Execution halted
Loading required package: RJDBC
... and my sessionInfo:
% R -e 'sessionInfo()'

R version 2.15.0 (2012-03-30)
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: i686-pc-linux-gnu (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: i686-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

Thanks for the kind assistance... Cheers -- H




Re: [R] Add column from other columns data.

2012-05-14 Thread Sarah Goslee
Assuming you actually have a data frame or matrix, and not a csv file, ifelse() 
is the general solution to your problem.
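
A minimal sketch of the ifelse() approach, assuming the data are already in a data frame. The object name dat and the output file name are hypothetical; the Fulfilled values are the ones quoted in the question below.

```r
# Hypothetical data frame holding the poster's "Fulfilled" column.
dat <- data.frame(Fulfilled = c(1, 1, 0, 1, 1, 1, 1, 0, 0, 1))

# ifelse() is vectorised: it tests every row and picks "Yes" or "No".
dat$Finished <- ifelse(dat$Fulfilled == 1, "Yes", "No")

print(dat)

# To save the result, something like this would work (file name is made up):
# write.csv(dat, "finished.csv", row.names = FALSE)
```

Note the comparison uses `==`; a single `=` inside the brackets would not test for equality.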

Sarah

On May 14, 2012, at 8:45 AM, Yellow  wrote:

> Hi everyone, 
> 
> I am having some problems making a new column with data in it.
> I have this one column named: Fulfilled
> 
> Fulfilled 
> 1
> 1
> 0
> 1
> 1
> 1
> 1
> 0
> 0
> 1
> 
> And now I would like to add another column to my .csv file ("Finished")
> 
> In this "Finished" column I would like to have "Yes" or "No".
> Where column "Fulfilled" is 1, "Finished" should have a "Yes".
> Like this:
> 
> Fulfilled Finished
> 1 Yes 
> 1 Yes 
> 0 No 
> etc 
> 
> Now I know how to grab the data out of a column, and also know how to save
> data inside a .csv file. 
> That is no problem. 
> But how do I get the right Yes or No in the right place in the other column?
> 
> # Get all values: 1
> Fulfilled_1 = Fulfilled[Fulfilled == 1]
> 
> I was thinking about subset.
> But I don't really know if that would really be it.
> 
> Maybe somebody here can push me a little in the right direction? 
> 
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Add-column-from-other-columns-data-tp4629921.html
> Sent from the R help mailing list archive at Nabble.com.
> 


Re: [R] rollmax.zoo : column names NULL

2012-05-14 Thread Achim Zeileis

Giles,

thanks for the bug report:


> I am comparing the output of rollmax in two versions of R.  In the
> current version, the column names are 'lost' ie NULL in the output; in
> the earlier version they were retained.


Yes, this was an error. I just fixed it in the devel version on R-Forge.

Thanks,
Z



Re: [R] Calculating all possible ratios

2012-05-14 Thread genome1976

Hi Rui,

Thanks once again for all the help. I need to ask for your help once more. I
have two matrices, with probesets as rows and samples as columns.
The samples in the two matrices are matched (from the same animal but two
different tissues). I want to create a sample-by-sample correlation matrix
using the probeset expression values. Could you please suggest a way to do that?

Thanks,
Som.


Date: Sat, 12 May 2012 15:20:52 -0700
From: ml-node+s789695n4629656...@n4.nabble.com
To: genome1...@hotmail.com
Subject: RE: Calculating all possible ratios



Hello,


Nothing wrong on my end; maybe your R session has some conflicting objects.

Running the function in the previous post on the first 4 rows and first 6
columns of your dataset the result was (copy and paste into your session)


result <- structure(c(8.74714923153198, 1.83094400392095, 9.92065138471113,
1.77145415014708, 1.01515180575001, 0.167175438316099, 0.222321656865252,
0.155576771874649, 3.09417748158541, 0.469647988505747, 1.29398633565582,
0.524043736521509, 3.75969597954255, 0.422694576901317, 9.75471698113208,
0.290397651827521, 4.9035575319622, 1.00105273231888, 1.01093964697178,
0.26895145631068, 0.114322960947685, 0.546166347992352, 0.100799832714726,
0.564507977763338, 0.11605516024473, 0.0913055986191245, 0.0224099858208782,
0.0878243288779063, 0.353735531392494, 0.256505926724138, 0.130433606169248,
0.295826869963301, 0.42981957664441, 0.230861553382365, 0.983273839877614,
0.163931791180376, 0.56058921623124, 0.546741314958369, 0.10190254729944,
0.151825242718447, 0.9850743448771, 5.98173996175908, 4.49798734905118,
6.4276947512815, 8.61659229879359, 10.9522309159971, 44.622964422,
11.3863665430362, 3.04799485560622, 2.8093121408046, 5.82033416762497,
3.36839317468124, 3.70358005398494, 2.52844904226946, 43.8765935747068,
1.86658746243623, 4.83036872336483, 5.98803713273998, 4.5471937427,
1.72873786407767, 0.323187666496628, 2.12925430210325, 0.772805687699305,
1.90823767237023, 2.82697074863659, 3.89854539725884, 7.66673581578674,
3.38035554418724, 0.328084543240185, 0.35595902124055, 0.1718114409242,
0.296877457036954, 1.21508737036511, 0.900024246342843, 7.53850076491586,
0.554147739185128, 1.58476931628683, 2.13149583692219, 0.781259909100518,
0.513223300970874, 0.265978952936953, 2.36577437858509, 0.102514506769826,
3.44355401535389, 2.32655759378615, 4.33160041310018, 1.01701068353905,
6.10009805175427, 0.270009014365446, 0.395499368696959, 0.0227911949977918,
0.535737017484743, 0.822986086753186, 1.11108117816092, 0.132652370966651,
1.8045729131197, 1.30424309801742, 2.36826490573261, 0.103635979283374,
0.926148867313916, 0.203933571388086, 0.998948374760994, 0.989178733859585,
3.71814309436142, 1.78383738225087, 1.82901853699522, 9.81329737579089,
6.58652001534723, 0.207023533247665, 0.166999632405824, 0.219915855047535,
0.578456699988768, 0.631006664328306, 0.469154094827586, 1.27998376513563,
1.9484696000908, 0.76672822844154, 0.422250060615857, 9.64915859255482,
1.07974002376127), .Dim = c(4L, 30L), .Dimnames = list(c("S1",
"S2", "S3", "S4"), c("P1:P2", "P1:P3", "P1:P4", "P1:P5", "P1:P6",
"P2:P1", "P2:P3", "P2:P4", "P2:P5", "P2:P6", "P3:P1", "P3:P2",
"P3:P4", "P3:P5", "P3:P6", "P4:P1", "P4:P2", "P4:P3", "P4:P5",
"P4:P6", "P5:P1", "P5:P2", "P5:P3", "P5:P4", "P5:P6", "P6:P1",
"P6:P2", "P6:P3", "P6:P4", "P6:P5")))
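
For readers without the earlier post, here is a rough sketch of how a matrix with this shape and these column names could be produced. It is not the function from the previous post (which is not shown here): the helper name all_ratios is hypothetical, and the assumption that column "Pi:Pj" holds Pi divided by Pj is inferred from the values and dimnames above. The input is the first 4 rows and 6 columns of the dataset quoted below.

```r
# First 4 samples and first 6 probesets of the quoted dataset.
x <- matrix(c(5292.9, 605.1,  5213.9, 1710.6, 1407.8, 1079.4,
              104.6,  57.129, 625.69, 222.72, 247.46, 104.49,
              191.29, 19.282, 860.42, 147.83, 19.61,  189.22,
              41.553, 23.457, 267.09, 79.293, 143.09, 154.5),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("S", 1:4), paste0("P", 1:6)))

# Hypothetical helper: every ordered pair of distinct columns, numerator first.
all_ratios <- function(x) {
  # expand.grid varies its first argument fastest, so the pairs come out
  # as P1:P2, P1:P3, ..., P1:P6, P2:P1, ... once the i == j pairs are dropped.
  idx <- expand.grid(den = seq_len(ncol(x)), num = seq_len(ncol(x)))
  idx <- idx[idx$num != idx$den, ]
  out <- x[, idx$num, drop = FALSE] / x[, idx$den, drop = FALSE]
  colnames(out) <- paste(colnames(x)[idx$num], colnames(x)[idx$den], sep = ":")
  out
}

r <- all_ratios(x)
dim(r)            # 4 rows, 30 ordered pairs
r[, "P1:P2"]      # compare with the first four values of result
```

With 6 columns there are 6 * 5 = 30 ordered pairs, which matches the `.Dim = c(4L, 30L)` of the posted result.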


Rui Barradas



genome1976 wrote
Hi Rui,

Thanks once again. I really appreciate it.

I tried using the code with the following dataset:

Sample  P1      P2      P3      P4      P5      P6      P7      P8      P9      P10
S1      5292.9  605.1   5213.9  1710.6  1407.8  1079.4  1379.6  9321.4  6951    1205.8
S2      104.6   57.129  625.69  222.72  247.46  104.49  330.29  1863.7  389.67  216.29
S3      191.29  19.282  860.42  147.83  19.61   189.22  203.27  1799    369.9   175.73
S4      41.553  23.457  267.09  79.293  143.09  154.5   52.567  613.54  408.86  61.715
S5      671.33  19.076  1040.9  319.04  50.766  57.445  50.005  1615.5  1149.1  163.99
S6      125.9   22.296  563.83  236.36  112.38  81.581  48.406  2073.6  388.4   62.575
S7      78.485  18.152  248.18  156.19  322.4   162.01  38.379  2786.8  630.63  71.163
S8      1355.6  51.534  422.51  134.89  202.34  48.368  69.45   231.11  1875.9  153.18
S9      2167.6  45.244  430.73  262.19  365.71  116.49  65.663  151.04  3071.5  210.55
S10     575.7   24.699  170.09  128.64  42.58   31.034  55.256  294.67  448.05  226.19
S11     234.22  22.594  944.54  118.91  16.994  102.67  199.32  2300    192.38  108.3
S12     193.38  25.374  829.88  74.872  108.1   116.49  175.49  1248    340.33  65.022

  S
