date:20100226

On Fri, Feb 26, 2010 at 1:28 PM, Saeed Abu Nimeh  wrote:
> Pat,
> Off the bat, beginners and advanced. In addition, splitting by domain
> would be very helpful -- something along the lines of:
> http://cran.r-project.org/web/views/. But we should be careful, we do
> not want to create 20 other mailing lists :) We have to group things.

Note that there are already 24 mailing lists here:
http://www.r-project.org/mail.html

> This will help splitting the volume of the list and will help in
> targeting lists by expertise.
> Thanks,
> Saeed
>
> On Fri, Feb 26, 2010 at 2:08 AM, Patrick Burns  
> wrote:
>> Saeed,
>>
>> If the R-help list were split, what do you
>> see as the pieces?
>>
>> Pat
>>
>> On 26/02/2010 01:53, Saeed Abu Nimeh wrote:
>>>
>>> On Thu, Feb 25, 2010 at 9:31 AM, Patrick Burns
>>>  wrote:

>>>> * What were your biggest misconceptions or
>>>> stumbling blocks to getting up and running
>>>> with R?
>>>
>>> 1- Compared to other programming languages, it is hard to learn R by
>>> example, because it is hard to find code on the web that does the
>>> exact thing you are looking for, although sometimes you get lucky.
>>> Perl, by contrast, is an easy language to learn by example.
>>>
>>> 2- The R mailing list. Beginners get frustrated after they struggle
>>> for a long time to solve a problem and the easiest thing then is to
>>> send an email to the R mailing list. I did this in the past. The best
>>> thing that happened was that my request was neglected and I had to
>>> spend more time on the problem and find a solution by myself
>>> eventually. Do not get me wrong, I am not saying that the mailing list
>>> is bad, but it should be more organized, maybe broken down into a couple
>>> of other mailing lists. This might bring up a good discussion thread.
>>>

>>>> * What documents helped you the most in this
>>>> initial phase?
>>>
>>> An Introduction to R by Venables
>>> simpleR – Using R for Introductory Statistics by Verzani
>>>
>>
>> --
>> Patrick Burns
>> pbu...@pburns.seanet.com
>> http://www.burns-stat.com
>> (home of 'The R Inferno' and 'A Guide for the Unwilling S User')
>>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] two questions for R beginners

2010-02-26 Thread Saeed Abu Nimeh


Sorry, I meant community, not committee.


Re: [R] two questions for R beginners

2010-02-26 Thread Saeed Abu Nimeh


Hi Ivan,

On 2/26/10 6:30 AM, Ivan Calandra wrote:

You are definitely right...
What to do with bad beginner's questions is not a simple issue.

If a "beginner's mailing list" is created, who will answer to such
questions?


If I subscribe to the beginners mailing list, then I have to expect 
novice questions and I should be willing to help. Otherwise, I should 
not be there.


And moreover, the beginners won't take advantage of the other
questions (I've personally learned a lot trying to understand the
questions and answers to other's problems).


They can still subscribe to the advanced, but they will know that they 
are here to observe and learn, not to ask novice questions. You want to 
ask basic stuff, go to the beginners list :)


Not sure if you guys have been on some of the linux mailing lists out 
there, but man let me tell you, some of these lists have a RTFM attitude 
and they will fry you if you ask novice questions. Frankly, that is 
understandable, as most of the members are geeks and they have higher 
expectations. This mailing list is different, I have seen posts from 
different disciplines; biology, biostats, stats, computer science, 
oceanography, etc. So, IMO, there should be a beginners list to cope 
with such broad committee.


Thanks,
Saeed

And also, as you said, the problems might persist.
The beginner's mailing list might be good in one aspect though: the
"experts" who subscribe to it would be willing to help the beginners to
get started with R, knowing that the questions might not be clearly stated.

As you pointed out, the mailing list is not the best for basic stuff
(the question is of course "what is basic?"). Not everybody knows some
colleagues who work with R (I'm personally the 1st one to use R in my lab).
I think, somehow and I have no idea how, documentation and guidance to
search for help should be more accessible as soon as you start with R.
Maybe a _*clear*_ section on the R homepage or in the "introduction to
R" manual like "where to find help", including all of the most common
and useful resources available (from "?" and RSiteSearch() to R Wiki and
Crantastic).
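
A minimal sketch of the lookup tools mentioned above, using only functions that ship with base R and utils. The search terms are arbitrary examples, and the interactive calls are guarded because they open help pages or a web browser:

```r
# Names of visible objects containing "apply" (works without a network).
hits <- apropos("apply")
print(head(hits))

if (interactive()) {
  help("sapply")                # same as typing ?sapply
  help.search("regression")     # same as ??regression; searches installed packages
  RSiteSearch("vectorization")  # opens a web search in the browser
}
```

apropos() is handy when you half-remember a function name; help.search() is the one to try when you only know the topic.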

I hope that this whole discussion might help to make the R world better.
Thank you Patrick for initiating it!
Regards,
Ivan

On 2/26/2010 15:09, Paul Hiemstra wrote:

Ivan Calandra wrote:

Since you want input from beginners, here are some thoughts

I had and still have two big problems with R:
- this vectorization thing. I've read many manuals (including R
inferno), but I'm still not completely clear about it. In simple
examples, it's fine. But when it gets a bit more complex, then...
Related to it, the *apply functions are still a bit difficult to
understand. When I have to use them, I just try one and see what
happens. I don't understand them well enough to know which one I need.
- the second problem is where to find the functions/packages I need.
There are many options, and that's actually the problem. R Wiki,
Rseek, RSiteSearch, Crantastic, etc... When you start with R, you
discover that the capabilities of R are almost unlimited and you
don't really know where to start, where to find what you need.
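
For the vectorization and *apply point above, a small sketch may help. It is only an illustration with made-up data (x and m are arbitrary), not a full guide to choosing between the functions:

```r
x <- c(1.5, 2.5, 10)

# Vectorized: arithmetic and comparisons act on every element at once.
y <- x^2 + 1
big <- x > 2

# lapply always returns a list, one element per input element.
as_list <- lapply(x, function(v) v^2 + 1)

# sapply tries to simplify the result to a vector.
as_vec <- sapply(x, function(v) v^2 + 1)

# vapply does the same, but checks the declared type and length of each result.
checked <- vapply(x, function(v) v^2 + 1, numeric(1))

# apply works over the rows (1) or columns (2) of a matrix.
m <- matrix(1:6, nrow = 2)
row_sums <- apply(m, 1, sum)
```

A rough rule of thumb: if plain arithmetic on the whole vector works, no *apply call is needed; lapply is for lists, sapply/vapply when a vector is wanted back, and apply for matrices.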

As noted in earlier posts, the mailing list is really great, but some
people are really hard with beginners. It was noted in a discussion a
few days ago, but it looks like some don't realize how difficult it
is at the beginning to formulate a good question, clear, with
self-contained example and so on. Moreover, not everybody speaks
English natively. I don't mean that you must help, even when the
question is really vague and not clear and whatever. I'm just saying
that if you don't want to help (whatever the reason), you don't have
to say it badly. But in any cases, the mailing list is still really
helpful. As someone noted (sorry I erased the email so I don't
remember who), it might be a good idea to split it.

Hi everyone,

My 2ct about the mailing list :). I understand that beginners have a
hard time formulating a good question. But the problem is that we
can't answer the question when it is unclear. So either I:

- Don't bother answering
- Try to discuss with the author of the question, taking lots of time
to find out what exactly the question is.
- Send a "read the posting guide" answer

I mostly do the first, as I have to get things done during my PhD :).
So this leaves us with kind of a problem, the person mailing the list
doesn't have the knowledge to ask the right question, the list can't
answer properly and consequently, the person mailing the list still
doesn't get the information he/she needs. We could start an R-beginner
mailing list, but this would also suffer from this problem. What do
you guys think?

Maybe the mailing list is not the right medium for really basic stuff.
For that I would recommend a good R-book or (better) a course in R or
(even better) some colleagues who work with R that you can ask
questions to.

cheers,
Paul


Hope that's what you wanted
Ivan


Le 2/26/2010 08:39, Di

Re: [R] how to fast extract values from different list elements

2010-02-26 Thread kMan

Dear Peter,

What data types does your list contain? Have you tried treating the list as
a data frame or matrix? 

KeithC.
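
A minimal sketch of the matrix idea, assuming every element of the list has the same length (the original post says only "about length 1200", so that is an assumption). L, A and B here are small made-up stand-ins for the real data:

```r
set.seed(1)
L <- replicate(20, runif(100), simplify = FALSE)  # stand-in for the real list
A <- c(4, 7, 9, 3, 18, 12)                        # which list element
B <- c(50, 67, 87, 36, 45, 89)                    # which position inside it

M <- do.call(rbind, L)   # one row per list element
vals <- M[cbind(A, B)]   # matrix indexing: row A[i], column B[i]
```

Indexing a matrix with a two-column matrix picks one cell per row of the index, so no explicit loop is needed. Building M costs memory, so it pays off mainly when the same list is queried many times.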

-----Original Message-----
From: Heym, Peter-Paul [mailto:ph...@ipb-halle.de] 
Sent: Thursday, February 25, 2010 2:11 AM
To: r-help@R-project.org
Subject: [R] how to fast extract values from different list elements

hi,

I have a list L having more than 14000 Elements, each of these contains an
array of about length 1200.

> L[[1]][26:30] # e.g. print 5 entries of first element of L
[1]   0.000   6.7982652 114.4737184  89.7328239   3.2001664

Furthermore I get two arrays A and B of same length as input.

A<-c(4,7,9,34,463,788)
B<-c(50,67,87,361,45,89)

I would like to extract (or print or save) certain values of L which I do in
the following (inefficient) way at the moment:

for (i in seq_along(A)) {
  print( L[[A[i]]][B[i]] ) }

this works fine but it is very slow (since A and B can be very large and I
have to repeat this about 5000 times). I would like to make this faster
using e.g. apply or lapply but I didn't get it work using these methods.
Does anybody know an EFFICIENT or FAST way extract the values from L using
the values from A and B?

thanks for your answers.
Peter
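
One way to write the loop above without printing inside it is mapply(), which walks A and B in parallel. This is a sketch on made-up data (the list here deliberately has elements of unequal length); whether it is fast enough for the real sizes has not been timed:

```r
set.seed(1)
L <- lapply(1:20, function(i) runif(100 + i))  # elements of unequal length
A <- c(4, 7, 9, 3, 18, 12)
B <- c(50, 67, 87, 36, 45, 89)

# Pair A[i] with B[i] and pull out one value per pair.
vals <- mapply(function(a, b) L[[a]][b], A, B)
```

Collecting the values in a vector and printing or saving them once at the end avoids the cost of calling print() for every element.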

[[alternative HTML version deleted]]


Re: [R] New Variable from Several Existing Variables

2010-02-26 Thread David Winsemius

And if your data is in a dataframe (... please include an example of
the results of str() next time...) :

> dfrm <- rd.txt("Column1, Column2, Column3
+ Yes,Yes,Yes
+ Yes,No,Yes
+ No,No,No
+ No,Yes,No
+ Yes,Yes,No", sep=",")  # rd.txt is just a wrapper I use for
                         # read.table(textConnection( ), header=TRUE, ... )

> dfrm$newvar <- apply(subset(dfrm, select=c(Column1, Column2, Column3)), 1,
+                      function(x) { if (all(x=="Yes")) {"Yes"} else {"No"} } )

> dfrm
  Column1 Column2 Column3 newvar
1     Yes     Yes     Yes    Yes
2     Yes      No     Yes     No
3      No      No      No     No
4      No     Yes      No     No
5     Yes     Yes      No     No

Notice that I created this variable in a manner that did not require  
the use of every column of the dataframe.
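
For readers without the rd.txt wrapper, here is a self-contained sketch of the same idea. It swaps in read.table(text = ...), which is available in current versions of R, and uses rowSums() instead of apply(); both are substitutions made for illustration, not what was posted:

```r
dfrm <- read.table(text = "Column1,Column2,Column3
Yes,Yes,Yes
Yes,No,Yes
No,No,No
No,Yes,No
Yes,Yes,No", header = TRUE, sep = ",", stringsAsFactors = FALSE)

cols <- c("Column1", "Column2", "Column3")  # only these columns are tested

# Count the "Yes" entries per row; the row qualifies when all of them match.
n_yes <- rowSums(dfrm[cols] == "Yes")
dfrm$newvar <- ifelse(n_yes == length(cols), "Yes", "No")

kept <- dfrm[dfrm$newvar == "Yes", ]  # drop the rows flagged "No"
```

As in the apply() version, other columns in the data frame are left alone because only the names in cols are used.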

--
David

On Feb 26, 2010, at 7:57 PM, Don MacQueen wrote:

If your data is in a matrix named "orgdata" :

newvar <- apply(orgdata , 1, function(arow, if (all(arow=='Yes')) 'Yes' else 'No'

Yes, at least 2 missing parens and an unneeded comma, perhaps:

newvar <- apply(orgdata , 1, function(arow) if (all(arow=='Yes')) 'Yes' else 'No' )

newdata <- cbind(orgdata, newvar)

finaloutcome <- newdata[ newvar=='Yes',]

The key to this is the apply() function.

I might have missed some parentheses...

There are other ways; this is just one. I might think of a simpler
one if I gave it more time...

-Don

At 4:40 PM -0800 2/26/10, wookie1976 wrote:
I am new to R, but have been using SAS for years.  In this transition period,
I am finding myself pulling my hair out to do some of the simplest things.
An example of this is that I need to generate a new variable based on the
outcome of several existing variables in a data row.  In other words, if the
variables in all three existing columns are "Yes", then the new variable
should also be "Yes"; however, if any one of the three existing variables is
a "No", then the new variable should be a "No".  I would then use that new
variable as an exclusion for data in a new or existing dataset (i.e., if
NewVariable = "No" then delete):

Take this:
Column1, Column2, Column3
Yes, Yes, Yes
Yes, No, Yes
No, No, No
No, Yes, No
Yes, Yes, No

Generate this:
Column1, Column2, Column3, NewVariable1
Yes, Yes, Yes, Yes
Yes, No, Yes, No
No, No, No, No
No, Yes, No, No
Yes, Yes, No, No

And end up with this:
Column1, Column2, Column3, NewVariable1
Yes, Yes, Yes, Yes

Any suggestions on how to efficiently do this in either the existing or a
new dataset?

You might have simplified this a bit if you let the columns be logical
rather than character.

> dfrm$newvar <- apply(subset(dfrm, select=c(Column1, Column2, Column3)), 1,
+                      function(x) {  (all(x=="Yes"))  } )
> dfrm
  Column1 Column2 Column3 newvar
1     Yes     Yes     Yes   TRUE
2     Yes      No     Yes  FALSE
3      No      No      No  FALSE
4      No     Yes      No  FALSE
5     Yes     Yes      No  FALSE

You would then be able to apply more simple tests with operators and  
functions that accept the logical data type.
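
To illustrate the point about logical columns, here is a short sketch. The data frame is rebuilt by hand and the conversion step is an assumption about how one might get from "Yes"/"No" strings to TRUE/FALSE:

```r
dfrm <- data.frame(
  Column1 = c("Yes", "Yes", "No", "No", "Yes"),
  Column2 = c("Yes", "No", "No", "Yes", "Yes"),
  Column3 = c("Yes", "Yes", "No", "No", "No"),
  stringsAsFactors = FALSE
)
cols <- c("Column1", "Column2", "Column3")

# Turn each "Yes"/"No" column into TRUE/FALSE.
dfrm[cols] <- lapply(dfrm[cols], function(v) v == "Yes")

# With logical columns, the test is just a vectorized AND.
dfrm$newvar <- dfrm$Column1 & dfrm$Column2 & dfrm$Column3

kept <- dfrm[dfrm$newvar, ]  # a logical column can index rows directly
```

Once the columns are logical, related questions become one-liners too, for example rowSums(dfrm[cols]) for the number of "Yes" answers per row.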

--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] split function

2010-02-26 Thread rusers.sh

Thanks.
I think I mistook sampling for split().
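
If the goal really was sampling, i.e. dealing the x values out at random into groups whose sizes match table(g), something like the following might do it. This is a guess at the intent, with made-up data, and not code from the thread:

```r
set.seed(123)
g <- factor(sample(0:3, 30, replace = TRUE))  # made-up grouping factor
x <- rnorm(30)                                # made-up values

sizes <- table(g)        # how many values each level should receive
shuffled <- sample(x)    # random permutation of x

# Label the shuffled values so that each level gets exactly its own count.
labels <- rep(names(sizes), times = as.vector(sizes))
xs <- split(shuffled, factor(labels, levels = names(sizes)))
```

Each element of xs then has the same length as the corresponding level of g, but the values inside it are a random draw from all of x.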

2010/2/26 David Winsemius 

>
> On Feb 26, 2010, at 2:40 PM, rusers.sh wrote:
>
>  Your method seems to only re-express the data "data.frame(x, g)" using
>> another format.
>>
>
> In all fairness to the first respondent to your question, that _was_ what
> it appeared you were requesting. My other thoughts would be:
>
> > cbind(x[order(g)], sort(as.numeric(as.character(g))))  # g is a factor
> # so sort(g) or g[order(g)] returns the internal index.
>             [,1] [,2]
>  [1,] -0.0678237    0
>  [2,]  2.2538149    0
>  [3,]  1.8951257    0
>  [4,]  2.2079620    0
>  [5,]  3.2011267    1
>  [6,] -0.5524036    1
>  [7,]  0.7891743    1
>  [8,]  2.2520006    1
>  [9,]  1.1191421    1
> [10,]  2.2923470    1
> [11,]  3.5831695    1
> [12,]  2.2299013    2
> [13,]  1.5140759    2
>
> or:
>
> > split(data.frame(x=x,g=g), g)
> $`0`
>             x g
> 6  -0.0678237 0
> 15  2.2538149 0
> 18  1.8951257 0
> 30  2.2079620 0
>
> $`1`
>             x g
> 1   3.2011267 1
> 3  -0.5524036 1
> 10  0.7891743 1
> 12  2.2520006 1
> 17  1.1191421 1
> 19  2.2923470 1
> 29  3.5831695 1
>
> $`2`
>            x g
> 2  2.2299013 2
> 
>
> Which also re-express it. But if that is not what you want then offer a
> better explanation  and a different example of desired output.
>
>
>
>  The results are really from the generated data frame. Maybe
>> be not good.
>>
>>> table(g)
>>>
>> g
>> 0 1 2 3
>> 7 9 8 6
>>  I hope to randomly split the value 'x' according to the different sample
>> sizes of different levels, displayed above. That is, 7 for level 0, 9 for
>> level 1, et al.
>>  Thanks.
>>
>
> Or maybe you don't want the value of x but the number of elements?
>
> > tapply(x, g, length)  # another way to get a table, and with different
> >                       # numbers since you did not use set.seed(123)
>  0  1  2  3
>  4  7 11  8
>
> Please do clarify.
>
>
>
>> 2010/2/26 Henrique Dallazuanna 
>>
>>  Try this:
>>>
>>> split(data.frame(x, g), g)
>>>
>>> On Fri, Feb 26, 2010 at 3:55 PM, rusers.sh  wrote:
>>>
>>>> Hi,
>>>> I am using the split function and wonder how to add the factor to the
>>>> split results.
>>>> #Example
>>>> n <- 3; nn <- 10
>>>> g <- factor(round(n * stats::runif(n * nn)))   #factor
>>>> x <- rnorm(n * nn) + sqrt(as.numeric(g))       #value
>>>> xg <- split(x, g)
>>>> xg
>>>> $`0`
>>>> [1]  0.82513702 -0.03911584  2.32955347  0.36745335  1.75572642  2.65461438  0.41675829
>>>> $`1`
>>>> [1]  0.8583493  2.4264804 -0.3622378  3.1770015  0.5162129
>>>> $`2`
>>>>  [1] 1.7914651 1.1440121 0.8097543 1.2064742 1.6411988 1.3743778 1.7094387 2.1204501 1.9330132 2.0731997
>>>> [11] 2.8931865 2.5825309 0.6978723
>>>> $`3`
>>>> [1] 3.0246214 1.6870782 0.9685926 1.6449350 0.9378751
>>>>
>>>> > g
>>>>  [1] 2 2 3 2 1 3 2 3 3 1 2 2 2 2 0 0 3 0 2 2 1 1 2 2 0 1 2 0 0 0
>>>> Levels: 0 1 2 3
>>>>
>>>> Can anybody tell me how to add the corresponding values of factor "g" to
>>>> the split results 'xg' to get a data frame?
>>>> Something like,
>>>>
>>>> Splitted/xg   factor/g
>>>> 0.82513702    0
>>>> -0.03911584   0
>>>> 2.32955347    0
>>>>  ...
>>>> I know I can use "xg$'0',xg$'1',xg$'2',xg$'3'" to get the values of each
>>>> class and then add a new variable to indicate the factor.
>>>> But I hope to find a method that does those things automatically. Any ideas?
>>>> Thanks.

 --

>>>
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
>
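
For the question as originally asked (turning the split list back into a data frame that carries the factor level), stack() from utils is one option. A minimal sketch, reusing the example from the thread with a seed added so it is reproducible; the column names x and g are my choice:

```r
set.seed(123)
n <- 3; nn <- 10
g <- factor(round(n * runif(n * nn)))
x <- rnorm(n * nn) + sqrt(as.numeric(g))

xg <- split(x, g)

# stack() turns a named list of vectors into two columns: values and ind.
long <- stack(xg)
names(long) <- c("x", "g")

# The same table without going through split() at all.
direct <- data.frame(x = x, g = g)[order(g), ]
```

Both long and direct list the values level by level; direct also keeps the original row numbers as row names, which stack() does not.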


-- 
-
Jane Chang
Queen's


Re: [R] New Variable from Several Existing Variables

2010-02-26 Thread Don MacQueen


If your data is in a matrix named "orgdata" :

newvar <- apply(orgdata, 1, function(arow) if (all(arow=='Yes')) 'Yes' else 'No')


newdata <- cbind(orgdata, newvar)

finaloutcome <- newdata[ newvar=='Yes',]


The key to this is the apply() function.

I might have missed some parentheses...

There are other ways; this is just one. I might think of a simpler 
one if I gave it more time...
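
A self-contained sketch of the matrix approach on the data from the question. The matrix is typed in by hand, and drop = FALSE is an addition: without it, a result with a single matching row collapses from a matrix to a plain character vector:

```r
orgdata <- matrix(
  c("Yes", "Yes", "Yes",
    "Yes", "No",  "Yes",
    "No",  "No",  "No",
    "No",  "Yes", "No",
    "Yes", "Yes", "No"),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("Column1", "Column2", "Column3"))
)

# One value per row: "Yes" only when every entry in the row is "Yes".
newvar <- apply(orgdata, 1, function(arow) if (all(arow == "Yes")) "Yes" else "No")

newdata <- cbind(orgdata, newvar)

# Keep the qualifying rows; drop = FALSE preserves the matrix shape.
finaloutcome <- newdata[newvar == "Yes", , drop = FALSE]
```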


-Don

At 4:40 PM -0800 2/26/10, wookie1976 wrote:

I am new to R, but have been using SAS for years.  In this transition period,
I am finding myself pulling my hair out to do some of the simplest things.
An example of this is that I need to generate a new variable based on the
outcome of several existing variables in a data row.  In other words, if the
variables in all three existing columns are "Yes", then the new variable
should also be "Yes"; however, if any one of the three existing variables is
a "No", then the new variable should be a "No".  I would then use that new
variable as an exclusion for data in a new or existing dataset (i.e., if
NewVariable = "No" then delete):

Take this:
Column1, Column2, Column3
Yes, Yes, Yes
Yes, No, Yes
No, No, No
No, Yes, No
Yes, Yes, No

Generate this:
Column1, Column2, Column3, NewVariable1
Yes, Yes, Yes, Yes
Yes, No, Yes, No
No, No, No, No
No, Yes, No, No
Yes, Yes, No, No

And end up with this:
Column1, Column2, Column3, NewVariable1
Yes, Yes, Yes, Yes

Any suggestions on how to efficiently do this in either the existing or a
new dataset?

Thanks,



--
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062


Re: [R] Preserving lists in a function

2010-02-26 Thread Don MacQueen

Barry explained your first puzzle, but let  me add some explanation 
and examples.




> tmpfun <- function( a =3 ) {a}
> tmpfun()
[1] 3

> tmpfun(a='x')
[1] "x"

Inside the function, the value of the argument is whatever the user 
supplied. The default is replaced by what the user supplies. There is 
no mechanism for retaining the default structure and filling in any 
missing parts. R never preserves the defaults when the user supplies 
something other than the default.


For example, and using your function,


> myfunction(list1='x')

$list1
[1] "x"

$list2
$list2$variable1
[1] "variable1"

$list2$variable2
[1] "variable2"

$list2$variable3
[1] "variable3"


$list3
$list3$variable1
[1] "character"

$list3$variable2
[1] 24

$list3$variable3
[1] 0.1 0.1 0.1 0.1

$list3$variable4
[1] TRUE



> myfunction(list1=data.frame(a=1:2, b=c('x','y')))

$list1
  a b
1 1 x
2 2 y

$list2
$list2$variable1
[1] "variable1"

$list2$variable2
[1] "variable2"

$list2$variable3
[1] "variable3"


$list3
$list3$variable1
[1] "character"

$list3$variable2
[1] 24

$list3$variable3
[1] 0.1 0.1 0.1 0.1

$list3$variable4
[1] TRUE

What you put in is what you get out.

I don't know that I would deal with this the way Barry did. I would 
probably write code to examine the structure of what the user 
supplies, compare it to the required structure, and then fill in.


myf <- function(l1, l2, l3) {
  if (missing(l1)) {
    ## user did not supply l1, so set it = to the default
    l1 <- list(v1=1, v2=2, v3=3)
  } else if (!is.list(l1)) {
    ## user must supply a list, if not, it's an error
    stop('l1 must be a list')
  } else {
    ## user has at least supplied a list
    ## now write code to check the names of the list that the user supplied
    ## make sure the names that the user supplied are valid, if not, stop()
    ## if the user supplied too few elements, fill in the missing ones
    ## if the user supplied too many elements stop()
    ## if the user supplied all the correct elements, with all the
    ## correct names, use what the user supplied
  }
  ## (l2 and l3 would be handled the same way)
  l1
}

Looks complicated; maybe Barry's way is better...
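
It may be less complicated than it looks. Here is a minimal sketch of the "check, then fill in" idea using modifyList() from utils; fill_defaults is a hypothetical helper name, and the defaults are the toy ones from above:

```r
# Hypothetical helper: validate a user-supplied list against the defaults,
# then fill in whatever the user left out.
fill_defaults <- function(user, defaults) {
  if (!is.list(user)) stop("argument must be a list")
  unknown <- setdiff(names(user), names(defaults))
  if (length(unknown) > 0) {
    stop("unknown element(s): ", paste(unknown, collapse = ", "))
  }
  modifyList(defaults, user)  # user values replace defaults of the same name
}

myf <- function(l1 = list()) {
  l1 <- fill_defaults(l1, list(v1 = 1, v2 = 2, v3 = 3))
  l1
}

partial <- myf(list(v2 = 99))  # v1 and v3 keep their defaults
```

One thing to watch: modifyList() treats a NULL value as "remove this element", so a user passing v2 = NULL would delete v2 rather than reset it.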

-Don

At 5:56 PM -0500 2/26/10, Shang Gao wrote:

Dear R users,

A co-worker and I are writing a function to facilitate graph 
plotting in R. The function makes use of a lot of lists in its 
defaults.


However, we discovered that R does not necessarily preserve the 
defaults if we were to input them in the form of list() when 
initializing the function. For example, if you feed the function 
codes below into R:


myfunction=function(
list1=list  (variable1=1,
variable2=2,
variable3=3),

list2=list  (variable1="variable1",
variable2="variable2",
variable3="variable3"),

list3=list  (variable1="character",
variable2=24,
variable3=c(0.1,0.1,0.1,0.1),
variable4=TRUE))

{return(list(list1=list1,list2=list2,list3=list3))}

By definition, the values associated with each variable in the lists
would be the defaults unless the user inputs a different value when
calling the function. But a problem arises when a variable in the
list is left out completely (not supplied at all). An example is
shown below:


myfunction( list1=list  (variable1=1,
variable2=2), #variable 3 deliberately left out

list2=list  (variable1="variable1",
variable3="position changed",
variable2="variable2"),

list3=list  (variable1="character",
variable2=24,
variable4=FALSE)) #variable 3 deliberately left out

#The outcome of the above execution is shown below:

$list1
$list1$variable1
[1] 1

$list1$variable2
[1] 2
#list1$variable3 is missing. Defaults in function not assigned in 
this execution


$list2
$list2$variable1
[1] "variable1"

$list2$variable3
[1] "position changed"

$list2$variable2
[1] "variable2"


$list3
$list3$variable1
[1] "character"

$list3$variable2
[1] 24

$list3$variable4
[1] FALSE
#list3$variable3 is missing. Defaults in function not assigned in 
this execution


We later realized that the problem lies in list() commands. Hence, 
we tried to enforce the defaults on the list using these codes in 
the function definition:


myfunction.alternative=function(
list1=list  (variable1=1,
variable2=2,
variable3=3),

list2=list  (variable1="variable1",
variable2="variable2",
variable3="variable3"),

list3=list  (variable1="character",
variable2=24,
variable3=c(0.1,0.1,0.1,0.1),
variable4=TRUE))
{
defaults=vector("list", 3)
names(defaults)=c("list1","list2","list3")
defaults$list1=list(variable1=1,
variable2=2,
variable3=3)
defaults$list2=list(variable1="variable1",
variable2="variable2",
variable3="variable3")
defaults$list3=l

[R] New Variable from Several Existing Variables

2010-02-26 Thread wookie1976


I am new to R, but have been using SAS for years.  In this transition period,
I am finding myself pulling my hair out to do some of the simplest things.
An example of this is that I need to generate a new variable based on the
outcome of several existing variables in a data row.  In other words, if the
variables in all three existing columns are "Yes", then the new variable
should also be "Yes"; however, if any one of the three existing variables is
a "No", then the new variable should be a "No".  I would then use that new
variable as an exclusion for data in a new or existing dataset (i.e., if
NewVariable = "No" then delete):

Take this:
Column1, Column2, Column3
Yes, Yes, Yes
Yes, No, Yes
No, No, No
No, Yes, No
Yes, Yes, No

Generate this:
Column1, Column2, Column3, NewVariable1
Yes, Yes, Yes, Yes
Yes, No, Yes, No
No, No, No, No
No, Yes, No, No
Yes, Yes, No, No

And end up with this:
Column1, Column2, Column3, NewVariable1
Yes, Yes, Yes, Yes

Any suggestions on how to efficiently do this in either the existing or a
new dataset?

Thanks,

Re: [R] Preserving lists in a function

2010-02-26 Thread Barry Rowlingson

On Fri, Feb 26, 2010 at 10:56 PM, Shang Gao  wrote:
> Dear R users,
>
> A co-worker and I are writing a function to facilitate graph plotting in R. 
> The function makes use of a lot of lists in its defaults.
>
> However, we discovered that R does not necessarily preserve the defaults if 
> we were to input them in the form of list() when initializing the function.

 It does preserve the defaults, it's just that the default is a single
object. If you assign anything to that argument it becomes the full
value of the argument, as I think you discovered!

>For example, if you feed the function codes below into R:
>
> myfunction=function(
>    list1=list  (variable1=1,
>                variable2=2,
>                variable3=3),
>
>    list2=list  (variable1="variable1",
>                variable2="variable2",
>                variable3="variable3"),
>
>    list3=list  (variable1="character",
>                variable2=24,
>                variable3=c(0.1,0.1,0.1,0.1),
>                variable4=TRUE))
>
> {return(list(list1=list1,list2=list2,list3=list3))}

 What I think you need to do is to replace your lists with functions.
So you'd do:

 > arg1f
function(v1=1,v2=2,v3=3){list(v1,v2,v3)}

 for your first argument. That gives you a way of overriding some of
those arguments:

> arg1f(v2=99)
[[1]]
[1] 1

[[2]]
[1] 99

[[3]]
[1] 3

 do the same for the second argument with a new function with
different defaults:

> arg2f
function(v1=99,v2=99,v3=99){list(v1,v2,v3)}

 then define your main function to get its args from the defaults of
these functions:

> myfunction=function(a1=arg1f(),a2=arg2f())
+ {list(a1=a1,a2=a2)}

so that

 myfunction() gives:

$a1
$a1[[1]]
[1] 1

$a1[[2]]
[1] 2

$a1[[3]]
[1] 3

$a2
$a2[[1]]
[1] 99

$a2[[2]]
[1] 99

$a2[[3]]
[1] 99

 - and then you can override bits thus:

> myfunction(a1=arg1f(v2=pi))
$a1
$a1[[1]]
[1] 1

$a1[[2]]
[1] 3.141593

$a1[[3]]
[1] 3

$a2
$a2[[1]]
[1] 99

$a2[[2]]
[1] 99

$a2[[3]]
[1] 99

 - which only overrides the second part of the first argument, keeping
the defaults for everything else.

What you are saying by doing:

myfunction=function(a1=arg1f(),a2=arg2f())

 is that a1 is by default the default value of arg1f(), and similarly
for a2. It's almost like constructing an object of some type. When you
call the function you have to do a1=arg1f(v1=whatever), but that has
advantages over a plain list() call - for example your arg1f
function can check that the inputs are valid, or it can return an
object of some class you can work with.
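
As a rough sketch of that last idea: the checks, the element names and the class name "arg1" below are assumptions made for illustration, and unlike the unnamed list returned above this version names its elements.

```r
# Hypothetical constructor: validates its inputs and tags the result with a class.
arg1f <- function(v1 = 1, v2 = 2, v3 = 3) {
  stopifnot(is.numeric(v1), is.numeric(v2), is.numeric(v3))
  structure(list(v1 = v1, v2 = v2, v3 = v3), class = "arg1")
}

# The main function can then insist on getting an object built by arg1f().
myfunction <- function(a1 = arg1f()) {
  stopifnot(inherits(a1, "arg1"))
  a1
}

res <- myfunction(a1 = arg1f(v2 = pi))   # v1 and v3 keep their defaults
bad <- tryCatch(arg1f(v2 = "oops"),      # a non-numeric value is refused
                error = function(e) "rejected")
```

The class tag is optional; it just makes it possible to detect a plain list() passed by mistake.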

 Hope this helps.

 Barry

-- 
blog: http://geospaced.blogspot.com/
web: http://www.maths.lancs.ac.uk/~rowlings
web: http://www.rowlingson.com/
twitter: http://twitter.com/geospacedman
pics: http://www.flickr.com/photos/spacedman

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] wrap long lines in table using "latex" in Hmisc

2010-02-26 Thread Marc Schwartz

On Feb 26, 2010, at 5:18 PM, Sharpie wrote:

> 
> 
> Ista Zahn wrote:
>> 
>> Hi Tao,
>> Just set the appropriate *.just argument, e.g.:
>> 
>> Dat <- data.frame(x1 = rep("this value consists of a long string of
>> text", 5), x2 = rep("this value consists of an even longer string of
>> text", 5))
>> 
>> library(Hmisc)
>> latex(Dat, col.just = rep("p{1in}",  2))
>> 
>> You can also set justification for column headings, column group
>> headings etc. See ?latex for details.
>> 
>> Best,
>> Ista
>> 
>> 
> 
> As Ista said, you can use the p{}, m{} and b{} LaTeX column specifications
> to create a table column that enforces a line wrap on it's contents.  See: 
> 
>  http://en.wikibooks.org/wiki/LaTeX/Tables#The_tabular_environment
> 
> for full details.
> 
> However, one problem with using say, p{2in}, is that the text is set *fully
> justified*.  This means that the inter-word spacing in each line is expanded
> so that the line fully occupies the allotted 2 inches of space.  For some
> tables the results are a typographical travesty.
> 
> The solution is to prepend a ">{justificationCommand}" to your column
> specification, such as:
> 
>   >{\centering}p{2in}
> 
> The justification commands you can use are :
> 
>  \centering   -> Centers wrapped text
>  \raggedright -> *left* aligns wrapped text
>  \raggedleft  -> *right* aligns wrapped text
> 
> Remember to double the backslash if you are passing this command as an
> argument in R.
> 
> This trick will cause a LaTeX compilation error if used to specify the
> right-most column in a table, unless the hmisc latex() command produces
> tables that use "\tabularnewline" to invoke table row breaks instead of
> "\\".
> 
> Hope this helps.
> 
> -Charlie

One other option that you can use is to create a \newcommand that wraps text in 
a tabular, which you can then actually use within an existing table cell. This 
enables multiple lines of text within the cell, with line breaks that you 
specify. So you in effect end up with nested tables. Of course, the entire row 
height is adjusted accordingly, but this way, you don't need to specify a fixed 
column width.

For example, put the following in your .tex file (or .Rnw file) after the 
\begin{document} directive:

  \newcommand{\multilineR}[1]{\begin{tabular}[...@{}r@{}}#1\end{tabular}} 
  \newcommand{\multilineL}[1]{\begin{tabular}[...@{}l@{}}#1\end{tabular}} 
  \newcommand{\multilineC}[1]{\begin{tabular}[...@{}c@{}}#1\end{tabular}} 

Each of the above provides for Right, Left and Centered justification, 
respectively, within the table cell.

Then, you can create a cell entry that results in the following TeX markup:

   \multilineC{Line 1 \\ Line 2 \\ ...}

If you are cat()ing the output from R, you need to double the backslashes, so 
that you begin with something like:

   \\multilineC{Line 1 \\\\ Line 2 \\\\ ...}

I typically do this with headers for tables that would otherwise be too wide 
for the column.

So you would start with a long line of text:

LongLine <- "This is a really long line that needs to wrap in a table row"

Break it into chunks around 15 characters in length using strwrap():

> strwrap(LongLine, 15)
[1] "This is a"      "really long"    "line that"      "needs to wrap" 
[5] "in a table row"

Use paste() to begin to create the proper LaTeX markup for \multilineC:

TMP1 <- paste(strwrap(LongLine, 15), collapse = "\\\\")

> TMP1
[1] "This is a\\\\really long\\\\line that\\\\needs to wrap\\\\in a table row"

Now create the full line:

TMP2 <- paste("\\multilineC{", TMP1, "}") 

> TMP2
[1] "\\multilineC{ This is a\\\\really long\\\\line that\\\\needs to wrap\\\\in a table row }"

When you cat() the output, you get:

> cat(TMP2)
\multilineC{ This is a\\really long\\line that\\needs to wrap\\in a table row } 

"TMP2" can now be used in place of the original long line of text and when 
processed by 'latex', will be rendered properly.

Of course, rather than using strwrap(), you can hard code the line breaks into 
your character vector as you may otherwise require.
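
The steps above could be bundled into a small helper. This is only a sketch: the function name wrap_cell and its arguments are made up here, and it assumes a LaTeX macro such as \multilineC has been defined in the document.

```r
# Hypothetical helper: wrap a long string and emit the macro markup for one cell.
wrap_cell <- function(text, width = 15, macro = "multilineC") {
  pieces <- strwrap(text, width)              # break the text into short chunks
  body   <- paste(pieces, collapse = "\\\\")  # join chunks with a literal \\ (LaTeX row break)
  paste0("\\", macro, "{", body, "}")         # wrap in \macro{...}
}

cell <- wrap_cell("This is a really long line that needs to wrap in a table row")
cat(cell, "\n")
```

Using paste0() avoids the extra spaces that paste() puts inside the braces, although those spaces are harmless in LaTeX.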

HTH,

Marc Schwartz


Re: [R] wrap long lines in table using "latex" in Hmisc

2010-02-26 Thread Tao Shi


Hi Ista,

Thanks!  I missed that.

...Tao


> From: istaz...@gmail.com
> Date: Fri, 26 Feb 2010 17:48:32 -0500
> Subject: Re: [R] wrap long lines in table using "latex" in Hmisc
> To: shi...@hotmail.com
> CC: r-help@r-project.org
>
> Hi Tao,
> Just set the appropriate *.just argument, e.g.:
>
> Dat <- data.frame(x1 = rep("this value consists of a long string of
> text", 5), x2 = rep("this value consists of an even longer string of
> text", 5))
>
> library(Hmisc)
> latex(Dat, col.just = rep("p{1in}", 2))
>
> You can also set justification for column headings, column group
> headings etc. See ?latex for details.
>
> Best,
> Ista
>
>
> On Fri, Feb 26, 2010 at 3:37 PM, Tao Shi  wrote:
>>
>> Hi list,
>>
>> Is there a way to control long-line wrapping in a table using "latex" 
>> function in Hmisc or any other functions?  It seems I can't find any 
>> examples.
>>
>> Thank you very much!
>>
>>
>> ...Tao
>>
>>
>
>
>
> --
> Ista Zahn
> Graduate student
> University of Rochester
> Department of Clinical and Social Psychology
> http://yourpsyche.org
  

Re: [R] wrap long lines in table using "latex" in Hmisc

2010-02-26 Thread Sharpie

Ista Zahn wrote:
> 
> Hi Tao,
> Just set the appropriate *.just argument, e.g.:
> 
> Dat <- data.frame(x1 = rep("this value consists of a long string of
> text", 5), x2 = rep("this value consists of an even longer string of
> text", 5))
> 
> library(Hmisc)
> latex(Dat, col.just = rep("p{1in}",  2))
> 
> You can also set justification for column headings, column group
> headings etc. See ?latex for details.
> 
> Best,
> Ista
> 
> 

As Ista said, you can use the p{}, m{} and b{} LaTeX column specifications
to create a table column that enforces a line wrap on its contents.  See: 

  http://en.wikibooks.org/wiki/LaTeX/Tables#The_tabular_environment

for full details.

However, one problem with using say, p{2in}, is that the text is set *fully
justified*.  This means that the inter-word spacing in each line is expanded
so that the line fully occupies the allotted 2 inches of space.  For some
tables the results are a typographical travesty.

The solution is to prepend a ">{justificationCommand}" to your column
specification, such as:

  >{\centering}p{2in}

The justification commands you can use are :

  \centering   -> Centers wrapped text
  \raggedright -> *left* aligns wrapped text
  \raggedleft  -> *right* aligns wrapped text

Remember to double the backslash if you are passing this command as an
argument in R.
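
A small sketch of building such a specification on the R side. The helper name col_spec is made up for illustration; the ">{...}" syntax itself needs the LaTeX array package to be loaded in the document.

```r
# Hypothetical helper: build a ">{\command}p{width}" column specification.
# The backslash is doubled because this is an R string literal.
col_spec <- function(width, align = "\\raggedright") {
  paste0(">{", align, "}p{", width, "}")
}

specs <- rep(col_spec("1in"), 2)
cat(specs[1], "\n")   # shows the single-backslash form that LaTeX will see

# The vector could then be passed on, e.g. (not run here, needs Hmisc):
# latex(Dat, col.just = specs)
```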

This trick will cause a LaTeX compilation error if used to specify the
right-most column in a table, unless the hmisc latex() command produces
tables that use "\tabularnewline" to invoke table row breaks instead of
"\\".

Hope this helps.

-Charlie
-- 
View this message in context: 
http://n4.nabble.com/wrap-long-lines-in-table-using-latex-in-Hmisc-tp1571298p1571496.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] using grep

2010-02-26 Thread William Dunlap

> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of kayj
> Sent: Friday, February 26, 2010 12:02 PM
> To: r-help@r-project.org
> Subject: Re: [R] using grep
> 
> 
> Hi ,
> 
> I have tried
> 
> gsub(".*York(\\d+).*", "\\1", grep("New York", x, value = TRUE)) 
> 
> and outputs 
> 
>  "P New York722AZ" "K New York20" 
> but that is not what i want, I want the output to be 
> 
> 722,20

Does it work if you replace the "\\d" in
the pattern with "[0-9]"?  If so, your
regular expression code doesn't recognize
the "\\d" to mean "a decimal digit".
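
For instance, a sketch with a made-up sample vector x (the original poster's data were not shown):

```r
# Sample data assumed for illustration only.
x <- c("P Los Angeles44AZ", "P New York722AZ", "K New York20")

hits <- grep("New York", x, value = TRUE)       # keep only the "New York" entries
nums <- gsub(".*York([0-9]+).*", "\\1", hits)   # [0-9] instead of \\d
nums
```

The bracket form [0-9] is understood by every regular expression flavour that R offers, so it sidesteps any doubt about "\\d".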

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> 
> 
> -- 
> View this message in context: 
> http://n4.nabble.com/using-grep-tp1571102p1571251.html
> Sent from the R help mailing list archive at Nabble.com.
> 


[R] Preserving lists in a function

2010-02-26 Thread Shang Gao

Dear R users,

A co-worker and I are writing a function to facilitate graph plotting in R. The 
function makes use of a lot of lists in its defaults.

However, we discovered that R does not necessarily preserve the defaults if we 
specify them in the form of list() when defining the function. For 
example, if you feed the function code below into R:

myfunction=function(
    list1=list  (variable1=1,
                variable2=2,
                variable3=3),

    list2=list  (variable1="variable1",
                variable2="variable2",
                variable3="variable3"),

    list3=list  (variable1="character",
                variable2=24,
                variable3=c(0.1,0.1,0.1,0.1),
                variable4=TRUE))

{return(list(list1=list1,list2=list2,list3=list3))}

By definition, the values associated with each variable in the lists would be 
the default unless the user inputs a different value while executing the 
function. But a problem arises when a variable in the list is left out 
completely (not supplied at all). An example is shown below:

myfunction( list1=list  (variable1=1,
                        variable2=2), #variable 3 deliberately left out

            list2=list  (variable1="variable1",
                        variable3="position changed",
                        variable2="variable2"),

            list3=list  (variable1="character",
                        variable2=24,
                        variable4=FALSE)) #variable 3 deliberately left out

#The outcome of the above execution is shown below:

$list1
$list1$variable1
[1] 1

$list1$variable2
[1] 2
#list1$variable3 is missing. Defaults in function not assigned in this execution

$list2
$list2$variable1
[1] "variable1"

$list2$variable3
[1] "position changed"

$list2$variable2
[1] "variable2"


$list3
$list3$variable1
[1] "character"

$list3$variable2
[1] 24

$list3$variable4
[1] FALSE
#list3$variable3 is missing. Defaults in function not assigned in this execution

We later realized that the problem lies in the list() calls. Hence, we tried to 
enforce the defaults on the lists using this code in the function definition:

myfunction.alternative=function(
    list1=list  (variable1=1,
                variable2=2,
                variable3=3),

    list2=list  (variable1="variable1",
                variable2="variable2",
                variable3="variable3"),

    list3=list  (variable1="character",
                variable2=24,
                variable3=c(0.1,0.1,0.1,0.1),
                variable4=TRUE))
{
    defaults=vector("list", 3)
    names(defaults)=c("list1","list2","list3")
    defaults$list1=list(variable1=1,
                        variable2=2,
                        variable3=3)
    defaults$list2=list(variable1="variable1",
                        variable2="variable2",
                        variable3="variable3")
    defaults$list3=list(variable1="character",
                        variable2=24,
                        variable3=c(0.1,0.1,0.1,0.1),
                        variable4=TRUE)
    if(length(list1$variable1)==0){list1$variable1=defaults$list1$variable1}
    if(length(list1$variable2)==0){list1$variable2=defaults$list1$variable2}
    if(length(list1$variable3)==0){list1$variable3=defaults$list1$variable3}

    if(length(list2$variable1)==0){list2$variable1=defaults$list2$variable1}
    if(length(list2$variable2)==0){list2$variable2=defaults$list2$variable2}
    if(length(list2$variable3)==0){list2$variable3=defaults$list2$variable3}

    if(length(list3$variable1)==0){list3$variable1=defaults$list3$variable1}
    if(length(list3$variable2)==0){list3$variable2=defaults$list3$variable2}
    if(length(list3$variable3)==0){list3$variable3=defaults$list3$variable3}
    if(length(list3$variable4)==0){list3$variable4=defaults$list3$variable4}

    return(list(list1=list1,list2=list2,list3=list3))
}

Executing the above function with the same call produces the 
results that we wanted:
> myfunction.alternative( list1=list  (variable1=1,
+ variable2=2), #variable 3 deliberately left out
+
+ list2=list  (variable1="variable1",
+ variable3="position changed",
+ variable2="variable2"),
+
+ list3=list  (variable1="character",
+ variable2=24,
+ variable4=FALSE)) #variable 3 deliberately left out
$list1
$list1$variable1
[1] 1

$list1$variable2
[1] 2

$list1$variable3
[1] 3
 #list1$variable3 is assigned the default despite being left out in the execution command


$list2
$list2$variable1
[1] "variable1"

$list2$variable3
[1] "position changed"

$list2$variable2
[1] "variable2"


$list3
$list3$variable1
[1] "character"

$list3$variable2
[1] 24

$list3$variable4
[1] FALSE

$list3$variable3
[1] 0.1 0.1 0.1 0.1
 #list3$variable3 is assigned the default despite being left out in the execution command
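
The same default-filling could probably be written more compactly. The sketch below is not the posters' code: it swaps the element-by-element checks for modifyList() from the utils package, and the names merge_defaults and myfunction.short are invented for illustration.

```r
# Hypothetical helper: start from the defaults and overwrite whatever the user supplied.
merge_defaults <- function(user, defaults) {
  utils::modifyList(defaults, user)
}

myfunction.short <- function(list1 = list(), list2 = list(), list3 = list()) {
  defaults <- list(
    list1 = list(variable1 = 1, variable2 = 2, variable3 = 3),
    list2 = list(variable1 = "variable1", variable2 = "variable2",
                 variable3 = "variable3"),
    list3 = list(variable1 = "character", variable2 = 24,
                 variable3 = c(0.1, 0.1, 0.1, 0.1), variable4 = TRUE)
  )
  list(list1 = merge_defaults(list1, defaults$list1),
       list2 = merge_defaults(list2, defaults$list2),
       list3 = merge_defaults(list3, defaults$list3))
}

res <- myfunction.short(
  list1 = list(variable1 = 1, variable2 = 2),          # variable3 left out
  list2 = list(variable3 = "position changed"),
  list3 = list(variable2 = 24, variable4 = FALSE))     # variable3 left out
```

One difference from the code above: the result keeps the order of the defaults rather than the order the user typed, and a user-supplied NULL removes that element.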

Even though the function works, as you can see, the code that enforces the 
defaults is very long and bulky. Such lengthy code won't be efficient if we 
have to write a fun

Re: [R] ODBC with Filemaker

2010-02-26 Thread Marc Schwartz

On Feb 26, 2010, at 12:20 PM, Daniel wrote:

> Hi all,
> anybody get connection with Filemarker 10 for mac?
> How do that?
> I suppose did right, but it is not working.

You need to be sure that you have the current version of R (2.10.1), the 
current version of the RODBC package (1.3-0), an ODBC driver for Filemaker 
installed and of course, appropriate environment variables set and 
configuration files created.

I am not familiar with Filemaker, so you may have to contact their support 
folks to be sure that you at least have their end of things configured 
correctly. You want to be sure that you make it clear that you want to connect 
TO Filemaker using an ODBC connection from an external application. Not that 
you want to connect FROM Filemaker to an external database application/server 
(eg. Oracle).

A quick search of the R list archives comes up with at least one indication of 
success:

  http://tolstoy.newcastle.edu.au/R/help/06/01/19111.html

though it is not clear what OS Sean was using.

Another specifically on OSX is here:

  https://stat.ethz.ch/pipermail/r-sig-mac/2008-July/005169.html

If you need to post follow ups, you are best to post to the R-SIG-DB list. More 
info here:

  https://stat.ethz.ch/mailman/listinfo/r-sig-db

HTH,

Marc Schwartz


Re: [R] wrap long lines in table using "latex" in Hmisc

2010-02-26 Thread Ista Zahn

Hi Tao,
Just set the appropriate *.just argument, e.g.:

Dat <- data.frame(x1 = rep("this value consists of a long string of
text", 5), x2 = rep("this value consists of an even longer string of
text", 5))

library(Hmisc)
latex(Dat, col.just = rep("p{1in}",  2))

You can also set justification for column headings, column group
headings etc. See ?latex for details.

Best,
Ista


On Fri, Feb 26, 2010 at 3:37 PM, Tao Shi  wrote:
>
> Hi list,
>
> Is there a way to control long-line wrapping in a table using "latex" 
> function in Hmisc or any other functions?  It seems I can't find any examples.
>
> Thank you very much!
>
>
> ...Tao
>
>



-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org


Re: [R] need help to resolve RODBC error

2010-02-26 Thread Marc Schwartz

On Feb 26, 2010, at 1:04 PM, Yan Zhang wrote:

> I've installed R-2.9.2 (64 bit), unixODBC-2.2.14-p2 (64 bit) and RODBC_1.2-5 
> (64 bit) on a 64 bit Redhat Linux server (Red Hat Enterprise Linux Server 
> release 5.4 (Tikanga), x86_64) release 2.6.18-164.2.1.el5.  I've tested the 
> ODBC drive via isql and the test was success:
> 
> [yzh...@roracletest ~]$ isql -v DRTST yzhang  test
> +---+
> | Connected!|
> |   |
> | sql-statement |
> | help [tablename]  |
> | quit  |
> |   |
> +---+
> SQL> select sysdate from dual;
> ++
> | SYSDATE|
> ++
> | 2010-02-26 13:57:00|
> ++
> SQLRowCount returns -1
> 1 rows fetched
> SQL> quit;
> 
> However, when I started up R console on the machine and test the RODBC 
> connectivity to oracle, I got following error:
> 
> [ora...@roracletest R]$ R
> 
> R version 2.9.2 (2009-08-24)
> Copyright (C) 2009 The R Foundation for Statistical Computing
> ISBN 3-900051-07-0
> 
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
> 
> Natural language support but running in an English locale
> 
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
> 
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
> 
>> library(RODBC)
>> channel <- odbcConnect("DRTST", uid="yzhang", pwd="test")
> 
> *** caught segfault ***
> address (nil), cause 'unknown'
> 
> Traceback:
> 1: .Call(C_RODBCDriverConnect, as.character(connection), id, 
> as.integer(believeNRows))
> 2: odbcDriverConnect(st, ...)
> 3: odbcConnect("DRTST", uid = "yzhang", pwd = "test")
> 
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
> 
> I searched around and haven't found any resolution.  Please help.
> 
> Thanks.
> 
> Yan Zhang

Three quick comments:

1. R 2.9.2 is dated and 2.10.1 is available for RHEL 5 via the EPEL:

  http://fedoraproject.org/wiki/EPEL

as you can see here:

  http://download.fedora.redhat.com/pub/epel/5/x86_64/repoview/r.html

I presume that you installed R via RPMs.

2. RODBC version 1.2-5 is a year out of date and has been updated twice since 
then to version 1.3-0.

3. I don't see any indication above that you installed the Oracle Linux 64 bit 
ODBC drivers, which are available from:

http://www.oracle.com/technology/software/tech/oci/instantclient/htdocs/linuxx86_64soft.html

The Oracle instant client does not require or use ODBC, but native drivers. So 
the success of the iSQL connection only serves to confirm that important 
environment variables and config files are probably ok.

At least to start, you need to install the Oracle 64 bit ODBC driver (if you 
have not already) and update both R and RODBC and see if you still get the 
errors.

Once you get those installed/updated, be sure to read the vignette for the 
RODBC package, which will be available within R using:

  vignette("RODBC")

Also, just to be sure that you have 64 bit R installed, check:

  .Machine$sizeof.pointer

and be sure that it returns 8, not 4.
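
A short sketch that gathers these checks in one place. The variable names are arbitrary, and the RODBC version is only looked up if the package is installed:

```r
# Is this a 64-bit build of R? (pointer size 8 bytes)
is_64bit <- .Machine$sizeof.pointer == 8

# Version of the running R, e.g. to compare against the current release
r_version <- paste(R.version$major, R.version$minor, sep = ".")

# Installed RODBC version, or NA if the package is not available
rodbc_version <- if (requireNamespace("RODBC", quietly = TRUE)) {
  as.character(utils::packageVersion("RODBC"))
} else {
  NA_character_
}

cat("64-bit R:", is_64bit, "\nR version:", r_version,
    "\nRODBC version:", rodbc_version, "\n")
```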

Lastly, there is a R-SIG-DB e-mail list, where this topic is better discussed. 
More information here:

  https://stat.ethz.ch/mailman/listinfo/r-sig-db

HTH,

Marc Schwartz


Re: [R] using grep

2010-02-26 Thread David Winsemius

On Feb 26, 2010, at 3:02 PM, kayj wrote:

Hi ,

I have tried

gsub(".*York(\\d+).*", "\\1", grep("New York", x, value = TRUE))

and outputs

"P New York722AZ" "K New York20"

Strange:

> x<-c("P Los Angeles44AZ", "P New York722AZ", "K New York20")
>
> gsub(".*York(\\d+).*", "\\1", grep("New York", x, value = TRUE))
[1] "722" "20"

but that is not what i want, I want the output to be

722,20

Aside from being a character vector without commas, it seemed pretty  
close.
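
If a single comma-separated string or real numbers are wanted, something like the following should do. It starts from the character vector shown above:

```r
nums <- c("722", "20")                  # the character result from gsub() above

joined     <- paste(nums, collapse = ",")   # one string with a comma between values
as_numbers <- as.numeric(nums)              # numeric vector instead of character

joined
as_numbers
```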

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


[R] decomposing an irregularly spaced time series

2010-02-26 Thread ravi

Hi, 
I am interested in decomposing an irregularly spaced time series and getting 
results similar to those obtained with the stl command for a regularly spaced 
time series. I would like to know if any of the time series packages like zoo 
can be used for this. From my search, I was only able to find some help with 
the tseries and pastecs packages, but I had difficulty in going the full length 
even with these packages. Let me explain with my sample code:

# Attempts with the functions irts, regul and tsd
n<-20
t1 <- runif(n)
n1<-(1:n)*1e6
t2<-t1*1e3
t3<-n1+t2
u <- rnorm(n)
n2<-rep(1:5,times=4); u<-u+n2
library(tseries)
x <- irts(t3, u) # could not find a decomposition method for irts
y<-data.frame(day=x$time,val=x$value)
y$day<-as.Date(y$day)
y$nday<-as.numeric(y$day)
y
with(y,plot(nday,val,type="b"))
library(pastecs)
y1<-y$day[1]
yf<-"y-m-d"
reg.y<-regul(x=as.numeric(y$day),y=y$val,units="days",methods=c("l"),
  datemin=y1,dateformat=yf,deltat=5)

I get the following error message :

Error in approx(x, y, xout, method = "linear", rule = rule) :
  need at least two non-NA values to interpolate

I would like to get help on the following points:
1. The actual decomposition is supposed to work with the tsd command. Is it 
possible to use it without first using regul?
2. Can I succeed with the regul command by a better choice of argument values? 
My attempts to set rule=2 did not help.
3. Would it be better to first get a regularly spaced time series by 
interpolation, and then try decomposition with stl?
 I would appreciate some practical help here.
4. Is it hopeless to attempt decomposition when the irregularity level is high? 
The series that I am working on is 
fairly regular in periods interspersed with either breaks or irregular data. I 
would like to see the trends and seasonal effects there.
Is there an alternative method of approaching this task?
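
To make point 3 concrete, here is a rough sketch of the interpolate-then-stl route on simulated data. Everything in it (the simulated series, the grid step of 0.5 and the seasonal period of 10 time units) is an assumption chosen for illustration, not a recommendation for the real data:

```r
set.seed(1)

# Simulated irregular series: trend + seasonal cycle of period 10 + noise
t_irreg <- sort(runif(200, 0, 100))
y <- 0.05 * t_irreg + sin(2 * pi * t_irreg / 10) + rnorm(200, sd = 0.2)

# Regular grid kept inside the observed time range, so approx() returns no NA
grid  <- seq(ceiling(min(t_irreg)), floor(max(t_irreg)), by = 0.5)
y_reg <- approx(t_irreg, y, xout = grid)$y     # linear interpolation

# One cycle = 10 time units / 0.5 step = 20 observations
y_ts <- ts(y_reg, frequency = 20)
fit  <- stl(y_ts, s.window = "periodic")

head(fit$time.series)   # columns: seasonal, trend, remainder
```

Linear interpolation across long gaps invents data, so the decomposition is only as trustworthy as the interpolated stretches are short.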

Thanking you,
Ravi


Re: [R] using grep

What is your R version?

Install the most recent R version (2.10.1).

On Fri, Feb 26, 2010 at 5:02 PM, kayj  wrote:
>
> Hi ,
>
> I have tried
>
> gsub(".*York(\\d+).*", "\\1", grep("New York", x, value = TRUE))
>
> and outputs
>
>  "P New York722AZ" "K New York20"
> but that is not what i want, I want the output to be
>
> 722,20
>
>
> --
> View this message in context: 
> http://n4.nabble.com/using-grep-tp1571102p1571251.html
> Sent from the R help mailing list archive at Nabble.com.
>
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


Re: [R] using grep

2010-02-26 Thread kayj


Hi ,

I have tried

gsub(".*York(\\d+).*", "\\1", grep("New York", x, value = TRUE)) 

and outputs 

 "P New York722AZ" "K New York20" 
but that is not what i want, I want the output to be 

722,20


-- 
View this message in context: 
http://n4.nabble.com/using-grep-tp1571102p1571251.html
Sent from the R help mailing list archive at Nabble.com.


[R] factorial block design with missing data

2010-02-26 Thread Guillaume Théroux Rancourt

Hello!

I have read somewhere (somehow, I can't seem to find it again, it's been a 
couple of months) that when analyzing factorial block design, the position 
where you put the block factor is important, even more when there are missing 
values.

I understand that when using anova.lm, the order is sequential, so that if I 
want to check for a treatment effect, I should put my blocking factor before in 
order to . It's just that I got confused with all the answers from previous 
posts and books, and I don't know if the missing values are being handled 
properly. 
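
A small simulated sketch of why the order matters once the design is unbalanced. The data, factor levels and dropped rows are all invented for illustration. anova() on an lm fit gives sequential sums of squares, and lm() by default drops rows with missing values:

```r
set.seed(42)

# Balanced 3 blocks x 2 treatments x 4 replicates, then made unbalanced
d <- expand.grid(Bloc = factor(1:3), Trt = factor(c("a", "b")), rep = 1:4)
d$y <- rnorm(nrow(d)) + as.numeric(d$Bloc) + (d$Trt == "b")
d <- d[-c(1, 2, 5, 9, 13), ]          # drop some rows to mimic missing data

a1 <- anova(lm(y ~ Bloc + Trt, data = d))   # Trt adjusted for Bloc
a2 <- anova(lm(y ~ Trt + Bloc, data = d))   # Trt entered first, unadjusted

a1["Trt", "Sum Sq"]
a2["Trt", "Sum Sq"]
```

The residual line is the same in both tables; only the split between Bloc and Trt changes with the order.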

My code is:
P.biom = lm(biomass ~ Bloc + Trt*Clone, data=mydata)
P.aov = anova.lm(P.biom, test="F")

> anova.lm(P.ar.2, test="F")
Analysis of Variance Table

Response: M_aerien
          Df  Sum Sq Mean Sq  F value    Pr(>F)
Bloc       2   139.7    69.9   0.4054    0.6710
Trt        1 31069.5 31069.5 180.2905 6.227e-13 ***
Clone      7  1206.2   172.3   0.        0.4544
Trt:Clone  7   570.3    81.5   0.4728    0.8450
Residuals 25  4308.2   172.3
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 



Thank you very much. 

Guillaume Théroux Rancourt
Ph.D. candidate --- Plant Biology
Université Laval, Québec, QC, Canada
guillaume.theroux-rancourt.1 at ulaval.ca


[R] need help to resolve RODBC error

2010-02-26 Thread Yan Zhang

I've installed R-2.9.2 (64 bit), unixODBC-2.2.14-p2 (64 bit) and RODBC_1.2-5 
(64 bit) on a 64 bit Redhat Linux server (Red Hat Enterprise Linux Server 
release 5.4 (Tikanga), x86_64) release 2.6.18-164.2.1.el5.  I've tested the 
ODBC driver via isql and the test was successful:

[yzh...@roracletest ~]$ isql -v DRTST yzhang  test
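+---------------------------------------+
| Connected!                            |
|                                       |
| sql-statement                         |
| help [tablename]                      |
| quit                                  |
|                                       |
+---------------------------------------+
SQL> select sysdate from dual;
+--------------------+
| SYSDATE            |
+--------------------+
| 2010-02-26 13:57:00|
+--------------------+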
SQLRowCount returns -1
1 rows fetched
SQL> quit;

However, when I started up the R console on the machine and tested the RODBC 
connectivity to Oracle, I got the following error:

[ora...@roracletest R]$ R

R version 2.9.2 (2009-08-24)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

 Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(RODBC)
> channel <- odbcConnect("DRTST", uid="yzhang", pwd="test")

*** caught segfault ***
address (nil), cause 'unknown'

Traceback:
1: .Call(C_RODBCDriverConnect, as.character(connection), id, 
as.integer(believeNRows))
2: odbcDriverConnect(st, ...)
3: odbcConnect("DRTST", uid = "yzhang", pwd = "test")

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

I searched around and haven't found any resolution.  Please help.

Thanks.

Yan Zhang

Senior DBA
SDI Health, L.L.C.| www.sdihealth.com
484-362-2022




Re: [R] help system works under IE but not Firefox

2010-02-26 Thread Raynor, Bill

Thanks Duncan, that did the trick.
I added a new proxy to the FoxyProxy list, white listing 127.0.0.1 and moved 
that to the top of the list.
help.start() works fine now.
Bill

> -Original Message-
> From: Duncan Murdoch [mailto:murd...@stats.uwo.ca]
> Sent: Friday, February 26, 2010 1:00 PM
> To: Raynor, Bill
> Cc: r-help@R-project.org
> Subject: Re: [R] help system works under IE but not Firefox
> 
> On 26/02/2010 11:31 AM, Raynor, Bill wrote:
> > I just upgraded to 2.10.1 on a WinXPSP2 machine. When I type
> help.start() R attempts to open a browser session at
> http://127.0.0.1:27594/doc/html/index.html using my default browser
> (Firefox 3.6) and is unable to connect. If I then open the same page
> using IE 8, it works just fine. How do I fix/change R so that it works
> with Firefox?
> >
> 
> I think you want to fix Firefox so it works with R.  The most likely
> problem is that it's using a proxy/firewall somewhere; you want to
> enter
> 127.0.0.1 as an exception to which it should connect directly.
> 
> Duncan Murdoch
> 
> > Thanks
> > Bill
> > William J. Raynor, Jr. Ph.D.
> > Technical Leader III
> > Innovation Design & Testing
> > Kimberly-Clark Corp.
> > 2100 Winchester Road
> > Neenah, Wi. 54956
> > (920) 721-5973
> > Email: bill.ray...@kcc.com
> >
> >
> >
> >
> > This e-mail is intended for the use of the addressee(s) only and may
> contain privileged, confidential, or proprietary information that is
> exempt from disclosure under law.  If you have received this message in
> error, please inform us promptly by reply e-mail, then delete the e-
> mail and destroy any printed copy.   Thank you.
> >
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 
> 



 

Re: [R] R Experts




Ryan Kinzer wrote:
> Erik
>
> Thanks for helping.  Both of them are factors.

That's the problem: they need to be of class Date.  See the R News
article about Date classes in Volume 4/1.


http://cran.r-project.org/doc/Rnews/

I don't see how they could be factors though, since you shouldn't be 
able to subtract two factors from each other without a warning at least?


e.g., when I make up factors f1 and f2:

> f1 - f2
Warning message:
In Ops.factor(f1, f2) : '-' not meaningful for factors

We would have to have a small, reproducible example to know for sure 
what's going on...
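
In the meantime, here is a minimal sketch of the conversion, assuming the two columns hold ISO-style date strings that were read in as factors. The names start and end and the values are made up for illustration; they are not taken from Ryan's data.

```r
# Hypothetical stand-ins for the two factor columns.
start <- factor(c("2010-01-15", "2010-02-01"))
end   <- factor(c("2010-02-26", "2010-02-26"))

# Go through as.character() first: as.numeric() on a factor would
# return the internal level codes, not the dates.
start_date <- as.Date(as.character(start), format = "%Y-%m-%d")
end_date   <- as.Date(as.character(end),   format = "%Y-%m-%d")

# Subtracting two Date vectors gives a difftime object.
elapsed <- end_date - start_date
elapsed_days <- as.numeric(elapsed, units = "days")
print(elapsed_days)
```

If the real strings use another layout, the format argument would need to change accordingly.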


Best Regards,
Erik


Re: [R] Automate generation of multiple reports using odfWeave

2010-02-26 Thread Aleksey Naumov

Sarah,

Thank you very much, this is an easy solution that works very well!

On a more complicated note, is there a way to embed the station name in a
header or footer of the document? It seems there is no way to evaluate a
chunk or an inline \Sexpr{...} in a header or footer?
This would put station ID on every report page, making reading & comparing
multiple reports much easier. Right now, if reports are converted to PDF,
they all have title "\Sexpr{listString(letters[1:5])}" making navigation
between them very cumbersome. I could adjust the title in ODT, but again,
cannot embed any variable into it. Is there a way to set the title from
odfWeave?

Thank you,
Aleksey

On Fri, Feb 26, 2010 at 3:48 PM, Sarah Goslee wrote:

> I tend to do it the other way around.
>
> Hard-code the station into the ODT file as "thisstation".
>
> Then, in R, do something like this:
>
> allstations <- c("station1", "station2", "station3")
>
> for (i in allstations) {
>   thisstation <- i
>   odfWeave("inputfile.odt", paste("output-", i, ".odt", sep=""))
> }
> rm(thisstation)
>
> That way you don't have to have a bunch of files.
>
> Sarah
>
> On Fri, Feb 26, 2010 at 3:43 PM, Aleksey Naumov  wrote:
> > Dear R and odfWeave users,
> >
> > I am looking for a way to automate generation of many reports using
> > odfWeave. All reports would use the same input ODT file, the only
> > difference would be in the name of the dataset which will be analyzed in
> > any particular report. Right now, the name of the dataset is hardcoded in
> > the first code chunk in the input file:
> >
> > <<01 get data, echo=TRUE>>=
> > station = '123'           # name of the station dataset to be analyzed in this report
> > data = get_data(station)  # get data, e.g. from a database
> > @
> >
> > This is far from ideal, as it requires a separate input file (ODT) for
> > every input station, a huge duplication.
> >
> > Are there ways to streamline this? I am looking for a way to have only
> > one input ODT file (much easier to maintain one file than many). I am
> > thinking that the input file could be pre-processed to include the station
> > parameter and saved as an intermediate ODT, which would then be put through
> > odfWeave()? Does anyone know of a good way to edit the OO.org ODT file to
> > put "station = '...'" into the first code chunk?
> > Is there any other way to do that?
> >
> > Thank you,
> > Aleksey
>
> --
> Sarah Goslee
> http://www.functionaldiversity.org
>


Re: [R] dramatic speed difference in lapply

2010-02-26 Thread Rob Forler

I'm trying to do data grouping like you said. I will look into data.table
package and I will also consider using a matrix instead of a data frame.

Thank you for your responses.

Thanks,
Rob

On Fri, Feb 26, 2010 at 3:21 PM, Tom Short  wrote:

> I'm sorry, Rob, but that code is dense enough and formatted badly
> enough that it's hard to dig through.
>
> You may want to try the data.table package. The development version on
> R-forge is pretty fast for grouping operations like this. I'm not sure
> if this is what you're really after. It's hard to tell from your
> example.
>
> Compare some speeds:
>
> > dat <- data.frame(D=sample(32000:33000, 666000,T),
> +   Fid=sample(1:10,666000,T),
> +   A=sample(1:5,666000,T))
> >
> > ### one of your examples
> > system.time(ret <- fedb.ddplyWrapper2(dat, c("D", "Fid"),
> +     function(x) c(sum(x[,"A"], na.rm=T),
> +                   sum(x[,"A"], na.rm=T))))
>   user  system elapsed
>  21.78   14.42   36.35
> >
> >
> > ### data.table
> > install.packages("data.table",repos="http://R-Forge.R-project.org")
> > library(data.table)
> > dt <- as.data.table(dat)
> > system.time(ret2 <- dt[, sum(A, na.rm=T), by = "D,Fid"])
>   user  system elapsed
>   0.27    0.00    0.28
> >
> >
> > ### plyr for comparison, too
> > library(plyr)
> > system.time(ret3 <- ddply(dat, .(D,Fid), function(x) sum(x$A, na.rm=T)))
>   user  system elapsed
>  28.94   12.16   41.23
>
> > head(ret)
>  [,1] [,2]
> 1  175  175
> 2  222  222
> 3  221  221
> 4  134  134
> 5  253  253
> 6  194  194
>
> > head(ret2)
> D Fid  V1
> [1,] 32000   1 228
> [2,] 32000   2 209
> [3,] 32000   3 182
> [4,] 32000   4 180
> [5,] 32000   5 181
> [6,] 32000   6 222
>
> > head(ret3)
>  D Fid  V1
> 1 32000   1 175
> 2 32000   2 222
> 3 32000   3 221
> 4 32000   4 134
> 5 32000   5 253
> 6 32000   6 194
>
>
> - Tom
>
>
> On Fri, Feb 26, 2010 at 2:58 PM, Rob Forler  wrote:
> > So I have a function that does lapply's for me based on dimension.
> Currently
> > only works for length(pivotColumns)=2 because I haven't fixed the rbinds.
> I
> > have two versions. One runs WAYYY faster than the other. And I'm not sure
> > why.
> >
> > Fast Version:
> >
> > fedb.ddplyWrapper2Fast <- function(data, pivotColumns, listNameFunctions,
> > ...){
> >lapplyFunctionRecurse <- function(cdata, level=1, ...){
> >if(level==1){
> >
> > return(lapply(split(seq(nrow(cdata)),cdata[,pivotColumns[level]],
> drop=T),
> > function(x) lapplyFunctionRecurse(x, level+1, ...)))
> >} else if (level==length(pivotColumns)) {
> >#
> > return(lapply(split(cdata,data[cdata,pivotColumns[level]], drop=T),
> > function(x, ...) listNameFunctions(data[x,], ...)))
> >return(lapply(split(cdata,data[cdata,pivotColumns[level]],
> > drop=T), function(x, ...) c(data[cdata[1],pivotColumns[2]],
> > data[cdata[1],pivotColumns[1]], sum(data[cdata,"A"], na.rm=T),
> > sum(data[cdata,"A"], na.rm=T))))
> >} else {
> >return(lapply(split(cdata,data[cdata,pivotColumns[level]],
> > drop=T), function(x) lapplyFunctionRecurse(x, level+1, ...)))
> >}
> >}
> >result = lapplyFunctionRecurse(data, ...)
> >matrix2 <- do.call('rbind', lapply(result, function(x)
> > do.call('rbind',x)))
> >return(matrix2)
> > }
> >
> >
> > dat <- data.frame(D=sample(32000:33000, 666000,
> > T),Fid=sample(1:10,666000,T), A=sample(1:5,666000,T))
> >> temp = proc.time(); ret = fedb.ddplyWrapper2(dat, c("D", "Fid"),
> > function(x) c(sum(x[,"A"], na.rm=T), sum(x[,"A"], na.rm=T)));
> > proc.time()-temp
> >   user  system elapsed
> >  4.616   0.006   4.630
> > #note in thie case the anonymous function I pass in isn't used because I
> > hardcode the function into the lapply.
> >
> > approx 4 seconds
> >
> > This runs very fast. This runs very slow:
> >
> > fedb.ddplyWrapper2 <- function(data, pivotColumns, listNameFunctions,
> ...){
> >lapplyFunctionRecurse <- function(cdata, level=1, ...){
> >if(level==1){
> >
> > return(lapply(split(seq(nrow(cdata)),cdata[,pivotColumns[level]],
> drop=T),
> > function(x) lapplyFunctionRecurse(x, level+1, ...)))
> >} else if (level==length(pivotColumns)) {
> >#this line is different. it essentially calls the function you
> > pass in
> >return(lapply(split(cdata,data[cdata,pivotColumns[level]],
> > drop=T), function(x, ...) listNameFunctions(data[x,], ...)))
> >} else {
> >return(lapply(split(cdata,data[cdata,pivotColumns[level]],
> > drop=T), function(x) lapplyFunctionRecurse(x, level+1, ...)))
> >}
> >}
> >result = lapplyFunctionRecurse(data, ...)
> >matrix2 <- do.call('rbind', lapply(result, function(x)
> > do.call('rbind',x)))
> >return(matrix2)
> > }
> >
> > dat <- data.frame(D=sample(32000:33000, 666000,
> > T),Fid=sample(1:10,666000,T), A=sample(1:5,666000,T))
> >> temp = proc.time(); ret = fedb.ddplyWrapper2(dat, c("D", "Fid"),
> > function(x) c(sum(x[,

Re: [R] dramatic speed difference in lapply

2010-02-26 Thread Tom Short

I'm sorry, Rob, but that code is dense enough and formatted badly
enough that it's hard to dig through.

You may want to try the data.table package. The development version on
R-forge is pretty fast for grouping operations like this. I'm not sure
if this is what you're really after. It's hard to tell from your
example.

Compare some speeds:

> dat <- data.frame(D=sample(32000:33000, 666000,T),
+   Fid=sample(1:10,666000,T),
+   A=sample(1:5,666000,T))
>
> ### one of your examples
> system.time(ret <- fedb.ddplyWrapper2(dat, c("D", "Fid"),
+     function(x) c(sum(x[,"A"], na.rm=T),
+                   sum(x[,"A"], na.rm=T))))
   user  system elapsed
  21.78   14.42   36.35
>
>
> ### data.table
> install.packages("data.table",repos="http://R-Forge.R-project.org")
> library(data.table)
> dt <- as.data.table(dat)
> system.time(ret2 <- dt[, sum(A, na.rm=T), by = "D,Fid"])
   user  system elapsed
   0.27    0.00    0.28
>
>
> ### plyr for comparison, too
> library(plyr)
> system.time(ret3 <- ddply(dat, .(D,Fid), function(x) sum(x$A, na.rm=T)))
   user  system elapsed
  28.94   12.16   41.23

> head(ret)
  [,1] [,2]
1  175  175
2  222  222
3  221  221
4  134  134
5  253  253
6  194  194

> head(ret2)
 D Fid  V1
[1,] 32000   1 228
[2,] 32000   2 209
[3,] 32000   3 182
[4,] 32000   4 180
[5,] 32000   5 181
[6,] 32000   6 222

> head(ret3)
  D Fid  V1
1 32000   1 175
2 32000   2 222
3 32000   3 221
4 32000   4 134
5 32000   5 253
6 32000   6 194
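
As a cross-check that needs no extra packages, the same grouped sum can be sketched in base R with aggregate(). This is only an illustration on a smaller made-up data set (the seed, the sizes and the name dat_small are assumptions), and it is not expected to come close to the data.table timing above.

```r
set.seed(123)                      # assumed seed, for reproducibility only
n <- 10000                         # much smaller than the 666000 rows above
dat_small <- data.frame(D   = sample(32000:32010, n, TRUE),
                        Fid = sample(1:10, n, TRUE),
                        A   = sample(1:5, n, TRUE))

# One row per (D, Fid) combination, with the sum of A in that group.
sums <- aggregate(A ~ D + Fid, data = dat_small, FUN = sum, na.rm = TRUE)
head(sums)
```

The result is a plain data frame with columns D, Fid and A, so it can be compared against the data.table or plyr output group by group.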


- Tom


On Fri, Feb 26, 2010 at 2:58 PM, Rob Forler  wrote:
> So I have a function that does lapply's for me based on dimension. Currently
> only works for length(pivotColumns)=2 because I haven't fixed the rbinds. I
> have two versions. One runs WAYYY faster than the other. And I'm not sure
> why.
>
> Fast Version:
>
> fedb.ddplyWrapper2Fast <- function(data, pivotColumns, listNameFunctions,
> ...){
>    lapplyFunctionRecurse <- function(cdata, level=1, ...){
>        if(level==1){
>
> return(lapply(split(seq(nrow(cdata)),cdata[,pivotColumns[level]], drop=T),
> function(x) lapplyFunctionRecurse(x, level+1, ...)))
>        } else if (level==length(pivotColumns)) {
>            #
> return(lapply(split(cdata,data[cdata,pivotColumns[level]], drop=T),
> function(x, ...) listNameFunctions(data[x,], ...)))
>            return(lapply(split(cdata,data[cdata,pivotColumns[level]],
> drop=T), function(x, ...) c(data[cdata[1],pivotColumns[2]],
> data[cdata[1],pivotColumns[1]], sum(data[cdata,"A"], na.rm=T),
> sum(data[cdata,"A"], na.rm=T))))
>        } else {
>            return(lapply(split(cdata,data[cdata,pivotColumns[level]],
> drop=T), function(x) lapplyFunctionRecurse(x, level+1, ...)))
>        }
>    }
>    result = lapplyFunctionRecurse(data, ...)
>    matrix2 <- do.call('rbind', lapply(result, function(x)
> do.call('rbind',x)))
>    return(matrix2)
> }
>
>
> dat <- data.frame(D=sample(32000:33000, 666000,
> T),Fid=sample(1:10,666000,T), A=sample(1:5,666000,T))
>> temp = proc.time(); ret = fedb.ddplyWrapper2(dat, c("D", "Fid"),
> function(x) c(sum(x[,"A"], na.rm=T), sum(x[,"A"], na.rm=T)));
> proc.time()-temp
>   user  system elapsed
>  4.616   0.006   4.630
> #note in thie case the anonymous function I pass in isn't used because I
> hardcode the function into the lapply.
>
> approx 4 seconds
>
> This runs very fast. This runs very slow:
>
> fedb.ddplyWrapper2 <- function(data, pivotColumns, listNameFunctions, ...){
>    lapplyFunctionRecurse <- function(cdata, level=1, ...){
>        if(level==1){
>
> return(lapply(split(seq(nrow(cdata)),cdata[,pivotColumns[level]], drop=T),
> function(x) lapplyFunctionRecurse(x, level+1, ...)))
>        } else if (level==length(pivotColumns)) {
>            #this line is different. it essentially calls the function you
> pass in
>            return(lapply(split(cdata,data[cdata,pivotColumns[level]],
> drop=T), function(x, ...) listNameFunctions(data[x,], ...)))
>        } else {
>            return(lapply(split(cdata,data[cdata,pivotColumns[level]],
> drop=T), function(x) lapplyFunctionRecurse(x, level+1, ...)))
>        }
>    }
>    result = lapplyFunctionRecurse(data, ...)
>    matrix2 <- do.call('rbind', lapply(result, function(x)
> do.call('rbind',x)))
>    return(matrix2)
> }
>
> dat <- data.frame(D=sample(32000:33000, 666000,
> T),Fid=sample(1:10,666000,T), A=sample(1:5,666000,T))
>> temp = proc.time(); ret = fedb.ddplyWrapper2(dat, c("D", "Fid"),
> function(x) c(sum(x[,"A"], na.rm=T), sum(x[,"A"], na.rm=T)));
> proc.time()-temp
>   user  system elapsed
>  16.346  65.059  81.680
>
>
>
> Can anyone explain to me why there is a 4x time difference? I don't want to
> have to hardcore into the recursion function, but if I have to I will.
>
> Thanks,
> Rob
>

Re: [R] dramatic speed difference in lapply

2010-02-26 Thread jim holtman

On my computer your two examples seem to execute about the same:
> fedb.ddplyWrapper2Fast <- function(data, pivotColumns, listNameFunctions,
+ ...){
+ lapplyFunctionRecurse <- function(cdata, level=1, ...){
+if(level==1){
+
+
return(lapply(split(seq(nrow(cdata)),cdata[,pivotColumns[level]],
drop=T),
+ function(x) lapplyFunctionRecurse(x, level+1, ...)))
+} else if (level==length(pivotColumns)) {
+#
+
return(lapply(split(cdata,data[cdata,pivotColumns[level]], drop=T),
+ function(x, ...) listNameFunctions(data[x,], ...)))
+
return(lapply(split(cdata,data[cdata,pivotColumns[level]],
+ drop=T), function(x, ...) c(data[cdata[1],pivotColumns[2]],
+ data[cdata[1],pivotColumns[1]], sum(data[cdata,"A"], na.rm=T),
+ sum(data[cdata,"A"], na.rm=T))))
+} else {
+return(lapply(split(cdata,data[cdata,pivotColumns[level]],
+ drop=T), function(x) lapplyFunctionRecurse(x, level+1, ...)))
+}
+ }
+ result = lapplyFunctionRecurse(data, ...)
+ matrix2 <- do.call('rbind', lapply(result, function(x)
+ do.call('rbind',x)))
+ return(matrix2)
+ }
>
> Rprof()
> dat <- data.frame(D=sample(32000:33000, 666000,
+ T),Fid=sample(1:10,666000,T), A=sample(1:5,666000,T))
> temp = proc.time(); ret = fedb.ddplyWrapper2Fast(dat, c("D", "Fid"),
+ function(x) c(sum(x[,"A"], na.rm=T), sum(x[,"A"], na.rm=T)));
> proc.time()-temp
   user  system elapsed
  23.44    7.37   30.86


> fedb.ddplyWrapper2 <- function(data, pivotColumns, listNameFunctions, ...){
+lapplyFunctionRecurse <- function(cdata, level=1, ...){
+if(level==1){
+
+ return(lapply(split(seq(nrow(cdata)),cdata[,pivotColumns[level]], drop=T),
+ function(x) lapplyFunctionRecurse(x, level+1, ...)))
+} else if (level==length(pivotColumns)) {
+#this line is different. it essentially calls the
function you pass in
+return(lapply(split(cdata,data[cdata,pivotColumns[level]],
+ drop=T), function(x, ...) listNameFunctions(data[x,], ...)))
+} else {
+return(lapply(split(cdata,data[cdata,pivotColumns[level]],
+ drop=T), function(x) lapplyFunctionRecurse(x, level+1, ...)))
+}
+}
+result = lapplyFunctionRecurse(data, ...)
+matrix2 <- do.call('rbind', lapply(result, function(x)
+ do.call('rbind',x)))
+return(matrix2)
+ }
>
> dat <- data.frame(D=sample(32000:33000, 666000,
+ T),Fid=sample(1:10,666000,T), A=sample(1:5,666000,T))
> temp = proc.time(); ret = fedb.ddplyWrapper2(dat, c("D", "Fid"),
+ function(x) c(sum(x[,"A"], na.rm=T), sum(x[,"A"], na.rm=T)));
> proc.time()-temp
   user  system elapsed
  24.06    7.38   31.50


If you run Rprof, most of the time is being spent accessing the
dataframe.  I would suggest that you convert the dataframe to a matrix
to get better performance.  Here is what I saw in the Rprof of the
first example:

  0   19.9 root
  1.   19.7 fedb.ddplyWrapper2Fast
  2. .   19.7 lapplyFunctionRecurse
  3. . .   19.7 lapply
  4. . . .   19.4 FUN
  5. . . . .   19.4 lapplyFunctionRecurse
  6. . . . . .   19.3 lapply
  7. . . . . . .   18.6 FUN
  8. . . . . . . .   18.6 listNameFunctions
  9. . . . . . . . .   18.5 [
 10. . . . . . . . . .   18.3 [.data.frame  <<- most of the time in accessing the data within a data frame.
 11. . . . . . . . . . .   14.6 attr
 11. . . . . . . . . . .    0.5 %in%
 12. . . . . . . . . . . .    0.4 match
 13. . . . . . . . . . . . .    0.4 is.factor
 14. . . . . . . . . . . . . .    0.3 inherits
 11. . . . . . . . . . .    0.5 [[
 12. . . . . . . . . . . .    0.5 [[.data.frame
 13. . . . . . . . . . . . .    0.2 %in%
 14. . . . . . . . . . . . . .    0.2 match
 15. . . . . . . . . . . . . . .    0.1 is.factor
 16. . . . . . . . . . . . . . . .    0.1 inherits
 11. . . . . . . . . . .    0.4 anyDuplicated
 12. . . . . . . . . . . .    0.2 anyDuplicated.default
 11. . . . . . . . . . .    0.2 names
 12. . . . . . . . . . . .    0.2 names
 11. . . . . . . . . . .    0.1 vector
 12. . . . . . . . . . . .    0.1 length
 13. . . . . . . . . . . . .    0.1 length
  7. . . . . . .    0.7 is.vector
  8. . . . . . . .    0.7 split
  9. . . . . . . . .    0.6 split.default
 10. . . . . . . . . .    0.5 factor
 11. . . . . . . . . . .    0.2 as.character
 11. . . . . . . . . . .    0.1 unique
 12. . . . . . . . . . . .    0.1 unique.default
 10. . . . . . . . . .    0.2 [
 11. . . . . . . . . . .    0.1 [.data.frame
  4. . . .    0.4 is.vector
  5. . . . .    0.4 split
  6. . . . . .    0.4 split.default
  7. . . . . . .    0.4 factor
  8. . . . . . . .    0.3 as.character
  1.    0.1 data.frame
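
To make the matrix suggestion concrete, here is a rough sketch on a smaller made-up data set (the seed, the sizes and the names dat_small, m, idx and group_sums are assumptions for illustration). It converts the all-integer data frame to a matrix once, splits the row indices by both grouping columns, and sums column "A" per group without ever calling [.data.frame inside the loop.

```r
set.seed(42)                       # assumed seed, for reproducibility only
n <- 10000                         # much smaller than the 666000 rows in the thread
dat_small <- data.frame(D   = sample(32000:32010, n, TRUE),
                        Fid = sample(1:10, n, TRUE),
                        A   = sample(1:5, n, TRUE))

m <- as.matrix(dat_small)          # every column is integer, so this is an integer matrix

# Split the row indices by the two grouping columns, dropping empty combinations.
idx <- split(seq_len(nrow(m)), list(m[, "D"], m[, "Fid"]), drop = TRUE)

# Sum column "A" within each group using plain matrix indexing.
group_sums <- vapply(idx, function(i) sum(m[i, "A"], na.rm = TRUE), numeric(1))
head(group_sums)
```

The names of group_sums have the form "D.Fid" (for example "32000.1"), which is how split() labels combinations of several grouping vectors.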



On Fri, Feb 26, 2010 at 2:58 PM, Rob Forler  wrote:
> So I have a function that does lapply's for me based on dimension. Currently
> only works for length(pivotColumns)=2 because I haven't fixed the rbinds. I
> have two versions. One runs WAYYY f

Re: [R] text editors

2010-02-26 Thread Tao Shi


If you do everything in Windows, Tinn-R is one of the best and also the one I 
use.  I also tried WinEdt. It's very good, but it is not free.  If you want a 
cross-platform editor, Emacs+ESS is the one.  Like others said, the learning 
curve is steep, but worth it.

...Tao

===
>Dear all,
>
>Do you use a text editor ? What would you recommend for Windows users ? What
>about Tinn-R ?
>
>Thank you very much,
>Dwayne

  

Re: [R] Automate generation of multiple reports using odfWeave

2010-02-26 Thread Sarah Goslee

I tend to do it the other way around.

Hard-code the station into the ODT file as "thisstation".

Then, in R, do something like this:

allstations <- c("station1", "station2", "station3")

for (i in allstations) {
   thisstation <- i
   odfWeave("inputfile.odt", paste("output-", i, ".odt", sep=""))
}
rm(thisstation)

That way you don't have to have a bunch of files.
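
A slightly more defensive variant of the same loop, as a sketch only: it builds the output names up front and skips the weave when the odfWeave package or the template file is missing. The file name inputfile.odt and the station names are the placeholders from above; the can_weave guard is an addition for illustration.

```r
allstations <- c("station1", "station2", "station3")
outputs <- paste("output-", allstations, ".odt", sep = "")

# Only attempt the weave when the package and the template are both available.
can_weave <- requireNamespace("odfWeave", quietly = TRUE) &&
  file.exists("inputfile.odt")

for (k in seq_along(allstations)) {
  thisstation <- allstations[k]    # read by the code chunks in the ODT template
  if (can_weave) {
    odfWeave::odfWeave("inputfile.odt", outputs[k])
  } else {
    message("skipping ", outputs[k], " (odfWeave or inputfile.odt not found)")
  }
}
rm(thisstation)
```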

Sarah

On Fri, Feb 26, 2010 at 3:43 PM, Aleksey Naumov  wrote:
> Dear R and odfWeave users,
>
> I am looking for a way to automate generation of many reports using
> odfWeave. All reports would use the same input ODT file, the only difference
> would be in the name of the dataset which will be analyzed in any particular
> report. Right now, the name of the dataset is hardcoded in the first code
> chunk in the input file:
>
> <<01 get data, echo=TRUE>>=
> station = '123'           # name of the station dataset to be analyzed in this report
> data = get_data(station)  # get data, e.g. from a database
> @
>
> This is far from ideal, as it requires a separate input file (ODT) for every
> input station, a huge duplication.
>
> Are there ways to streamline this? I am looking for a way to have only one
> input ODT file (much easier to maintain one file than many). I am thinking
> that the input file could be pre-processed to include the station parameter
> and saved as an intermediate ODT, which would then be put through
> odfWeave()? Does anyone know of a good way to edit the OO.org ODT file to
> put "station = '...'" into the first code chunk?
> Is there any other way to do that?
>
> Thank you,
> Aleksey

-- 
Sarah Goslee
http://www.functionaldiversity.org


[R] Automate generation of multiple reports using odfWeave

2010-02-26 Thread Aleksey Naumov

Dear R and odfWeave users,

I am looking for a way to automate generation of many reports using
odfWeave. All reports would use the same input ODT file, the only difference
would be in the name of the dataset which will be analyzed in any particular
report. Right now, the name of the dataset is hardcoded in the first code
chunk in the input file:

<<01 get data, echo=TRUE>>=
station = '123'           # name of the station dataset to be analyzed in this report
data = get_data(station)  # get data, e.g. from a database
@

This is far from ideal, as it requires a separate input file (ODT) for every
input station, a huge duplication.

Are there ways to streamline this? I am looking for a way to have only one
input ODT file (much easier to maintain one file than many). I am thinking
that the input file could be pre-processed to include the station parameter
and saved as an intermediate ODT, which would then be put through
odfWeave()? Does anyone know of a good way to edit the OO.org ODT file to
put "station = '...'" into the first code chunk?
Is there any other way to do that?

Thank you,
Aleksey


Re: [R] match.call to obtain the name of a function

Try this:

foo <- function()
sprintf("The name of this function is %s", gettext(match.call()))
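
Another option, sketched here under the assumption that the function is called directly by its name, is to take the first element of match.call(), which is the function part of the call, and deparse it. The function name plot_residuals is made up for illustration.

```r
# Hypothetical function name, used only for illustration.
plot_residuals <- function(x, y) {
  # match.call()[[1]] is the function part of the call;
  # deparse() turns it into a character string without parentheses.
  function_name <- deparse(match.call()[[1]])
  paste(function_name, ".pdf", sep = "")   # e.g. a filename for pdf()
}

plot_residuals(55, pi^2)
plot_residuals()
```

If the function is called as pkg::plot_residuals(...), the deparsed string includes the "pkg::" prefix, so it may need cleaning before use as a filename.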

On Fri, Feb 26, 2010 at 5:22 PM, Jacob Wegelin  wrote:
>
> Within a function I'd often like to obtain a text string equal to the name
> of the function.
>
> One use for this: To generate a filename for use in pdf(). This enables me
> to keep track of which function generated a particular graphic.
>
> match.call() puts parentheses at the end of the name. I don't want
> parentheses in a filename.
>
> The following kludgey function gives the desired result.
>
>> JANK
>
> function(x, y) {
>   one<-deparse(match.call())
>   functionName<-gsub("\\(.*", "", one, perl=T)
>   cat("The name of this function is ", functionName, "\n")
> }
>>
>> JANK(55, pi^2)
>
> The name of this function is  JANK
>>
>> JANK()
>
> The name of this function is  JANK
>
> Is there not a more direct way? To paraphrase Douglas Bates, the above
> approach is like the diner scene in the movie "Five Easy Pieces". You get an
> order of toast by first ordering a chicken sandwich and then telling the
> waitress to hold (that is, to subtract) the meat, lettuce, and mayonnaise.
>
> Thanks for any insights
>
> Jacob Wegelin
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


[R] wrap long lines in table using "latex" in Hmisc

2010-02-26 Thread Tao Shi


Hi list,

Is there a way to control long-line wrapping in a table using "latex" function 
in Hmisc or any other functions?  It seems I can't find any examples.

Thank you very much!


...Tao
  

Re: [R] Defective help pages

2010-02-26 Thread Peter Danenberg

> This seems to be plain text help, right?

It is.

> Does the html version give the same result?

Interestingly, the html seems to be whole; but it's less convenient to
access from ESS, though.

Do you know what program generates the plain text; and are there any
options that govern where R looks for the plain text help files?


[R] match.call to obtain the name of a function

2010-02-26 Thread Jacob Wegelin



Within a function I'd often like to obtain a text string equal to the name of 
the function.

One use for this: To generate a filename for use in pdf(). This enables me to
keep track of which function generated a particular graphic.

match.call() puts parentheses at the end of the name. I don't want parentheses 
in a filename.

The following kludgey function gives the desired result.


> JANK
function(x, y) {
   one<-deparse(match.call())
   functionName<-gsub("\\(.*", "", one, perl=T)
   cat("The name of this function is ", functionName, "\n")
}
> JANK(55, pi^2)
The name of this function is  JANK
> JANK()
The name of this function is  JANK

Is there not a more direct way? To paraphrase Douglas Bates, the above approach is like 
the diner scene in the movie "Five Easy Pieces". You get an order of toast by 
first ordering a chicken sandwich and then telling the waitress to hold (that is, to 
subtract) the meat, lettuce, and mayonnaise.

Thanks for any insights

Jacob Wegelin


Re: [R] split function

2010-02-26 Thread David Winsemius

On Feb 26, 2010, at 2:40 PM, rusers.sh wrote:

> Your method seems to only re-express the data "data.frame(x, g)" using
> another format.

In all fairness to the first respondent to your question, that _was_  
what it appeared you were requesting. My other thoughts would be:

> cbind(x[order(g)], sort(as.numeric(as.character(g))))  # g is a factor
# so sort(g) or g[order(g)] returns the internal index.
            [,1] [,2]
 [1,] -0.0678237    0
 [2,]  2.2538149    0
 [3,]  1.8951257    0
 [4,]  2.2079620    0
 [5,]  3.2011267    1
 [6,] -0.5524036    1
 [7,]  0.7891743    1
 [8,]  2.2520006    1
 [9,]  1.1191421    1
[10,]  2.2923470    1
[11,]  3.5831695    1
[12,]  2.2299013    2
[13,]  1.5140759    2

or:

> split(data.frame(x=x,g=g), g)
$`0`
            x g
6  -0.0678237 0
15  2.2538149 0
18  1.8951257 0
30  2.2079620 0

$`1`
            x g
1   3.2011267 1
3  -0.5524036 1
10  0.7891743 1
12  2.2520006 1
17  1.1191421 1
19  2.2923470 1
29  3.5831695 1

$`2`
           x g
2  2.2299013 2

Which also re-express it. But if that is not what you want then offer  
a better explanation  and a different example of desired output.
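
If a two-column data frame built from the split list is all that is wanted, stack() may be worth a look. The sketch below reuses the example from the original question; the seed and the name long are assumptions added so that it is reproducible.

```r
set.seed(123)                               # assumed seed, so the sketch is reproducible
n <- 3; nn <- 10
g <- factor(round(n * runif(n * nn)))       # grouping factor, as in the original example
x <- rnorm(n * nn) + sqrt(as.numeric(g))    # values

xg <- split(x, g)                           # named list, one numeric vector per level
long <- stack(xg)                           # data frame with columns "values" and "ind"
names(long) <- c("x", "g")
head(long)
```

Each row of long pairs one value with the level it came from, ordered by level.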

> The results are really from the generated data frame. Maybe not good.
>
> table(g)
> g
> 0 1 2 3
> 7 9 8 6
> I hope to randomly split the value 'x' according to the different sample
> sizes of different levels, displayed above. That is, 7 for level 0, 9 for
> level 1, et al.
> Thanks.

Or maybe you don't want the value of x but the number of elements?

> tapply(x, g, length)  # another way to get a table
 0  1  2  3    # and with different numbers since you did not use set.seed(123)
 4  7 11  8

Please do clarify.

> 2010/2/26 Henrique Dallazuanna
>
>> Try this:
>>
>> split(data.frame(x, g), g)
>>
>> On Fri, Feb 26, 2010 at 3:55 PM, rusers.sh wrote:
>>
>>> Hi,
>>> I am using split function and wonder how to add the factor to the
>>> splitted results.
>>> #Example
>>> n <- 3; nn <- 10
>>> g <- factor(round(n * stats::runif(n * nn)))   #factor
>>> x <- rnorm(n * nn) + sqrt(as.numeric(g))       #value
>>> xg <- split(x, g)
>>> xg
>>> $`0`
>>> [1]  0.82513702 -0.03911584  2.32955347  0.36745335  1.75572642  2.65461438  0.41675829
>>> $`1`
>>> [1]  0.8583493  2.4264804 -0.3622378  3.1770015  0.5162129
>>> $`2`
>>> [1] 1.7914651 1.1440121 0.8097543 1.2064742 1.6411988 1.3743778 1.7094387 2.1204501 1.9330132 2.0731997
>>> [11] 2.8931865 2.5825309 0.6978723
>>> $`3`
>>> [1] 3.0246214 1.6870782 0.9685926 1.6449350 0.9378751
>>>
>>> g
>>> [1] 2 2 3 2 1 3 2 3 3 1 2 2 2 2 0 0 3 0 2 2 1 1 2 2 0 1 2 0 0 0
>>> Levels: 0 1 2 3
>>>
>>> Anybody can tell me how to add the corresponding values of factor "g" to
>>> the splitted results 'xg' to get a data frame?
>>> Something like,
>>>
>>> Splitted/xg    factor/g
>>>  0.82513702    0
>>> -0.03911584    0
>>>  2.32955347    0
>>>   ...
>>> I know i can use "xg$'0',xg$'1',xg$'2',xg$'3'" to get the values of each
>>> class and then add a new variable to indicate the factor.
>>> But i hope to get a method to automatic do those things. Any ideas?
>>> Thanks.

--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


[R] dramatic speed difference in lapply

2010-02-26 Thread Rob Forler

So I have a function that does lapply's for me based on dimension. Currently
only works for length(pivotColumns)=2 because I haven't fixed the rbinds. I
have two versions. One runs WAYYY faster than the other. And I'm not sure
why.

Fast Version:

fedb.ddplyWrapper2Fast <- function(data, pivotColumns, listNameFunctions, ...){
    lapplyFunctionRecurse <- function(cdata, level=1, ...){
        if(level==1){
            # top level: split the row indices by the first pivot column
            return(lapply(split(seq(nrow(cdata)), cdata[,pivotColumns[level]], drop=T),
                          function(x) lapplyFunctionRecurse(x, level+1, ...)))
        } else if (level==length(pivotColumns)) {
            # generic version, commented out here:
            # return(lapply(split(cdata, data[cdata,pivotColumns[level]], drop=T),
            #               function(x, ...) listNameFunctions(data[x,], ...)))
            # hardcoded version of the function that would otherwise be passed in
            return(lapply(split(cdata, data[cdata,pivotColumns[level]], drop=T),
                          function(x, ...) c(data[cdata[1],pivotColumns[2]],
                                             data[cdata[1],pivotColumns[1]],
                                             sum(data[cdata,"A"], na.rm=T),
                                             sum(data[cdata,"A"], na.rm=T))))
        } else {
            # intermediate levels: keep recursing on the index subsets
            return(lapply(split(cdata, data[cdata,pivotColumns[level]], drop=T),
                          function(x) lapplyFunctionRecurse(x, level+1, ...)))
        }
    }
    result = lapplyFunctionRecurse(data, ...)
    # the two levels of lists are bound row-wise into one matrix
    matrix2 <- do.call('rbind', lapply(result, function(x) do.call('rbind', x)))
    return(matrix2)
}


dat <- data.frame(D=sample(32000:33000, 666000, T),
                  Fid=sample(1:10,666000,T), A=sample(1:5,666000,T))
> temp = proc.time(); ret = fedb.ddplyWrapper2Fast(dat, c("D", "Fid"),
      function(x) c(sum(x[,"A"], na.rm=T), sum(x[,"A"], na.rm=T)));
      proc.time()-temp
   user  system elapsed
  4.616   0.006   4.630
# note: in this case the anonymous function I pass in isn't used because I
# hardcode the function into the lapply.

approx 4 seconds

This runs very fast. This runs very slow:

fedb.ddplyWrapper2 <- function(data, pivotColumns, listNameFunctions, ...){
    lapplyFunctionRecurse <- function(cdata, level=1, ...){
        if(level==1){
            # top level: split the row indices by the first pivot column
            return(lapply(split(seq(nrow(cdata)), cdata[,pivotColumns[level]], drop=T),
                          function(x) lapplyFunctionRecurse(x, level+1, ...)))
        } else if (level==length(pivotColumns)) {
            # this line is different. it essentially calls the function you pass in
            return(lapply(split(cdata, data[cdata,pivotColumns[level]], drop=T),
                          function(x, ...) listNameFunctions(data[x,], ...)))
        } else {
            # intermediate levels: keep recursing on the index subsets
            return(lapply(split(cdata, data[cdata,pivotColumns[level]], drop=T),
                          function(x) lapplyFunctionRecurse(x, level+1, ...)))
        }
    }
    result = lapplyFunctionRecurse(data, ...)
    # the two levels of lists are bound row-wise into one matrix
    matrix2 <- do.call('rbind', lapply(result, function(x) do.call('rbind', x)))
    return(matrix2)
}

dat <- data.frame(D=sample(32000:33000, 666000, T),
                  Fid=sample(1:10,666000,T), A=sample(1:5,666000,T))
> temp = proc.time(); ret = fedb.ddplyWrapper2(dat, c("D", "Fid"),
    function(x) c(sum(x[,"A"], na.rm=T), sum(x[,"A"], na.rm=T)));
    proc.time()-temp
   user  system elapsed
 16.346  65.059  81.680



Can anyone explain to me why there is such a large time difference (roughly
4x in user time, about 18x elapsed)? I don't want to have to hard-code the
function into the recursion, but if I have to I will.

Thanks,
Rob
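
One plausible explanation, not confirmed anywhere in this thread: the slow version builds a new data frame with data[x,] for every group, while the fast version only indexes a single column. Row-subsetting a data frame is far more expensive than indexing a plain vector, and with roughly 10,000 groups that cost adds up. The sketch below is a minimal illustration under that assumption; the names idx, sums_vec and sums_df are made up for the example, and the data set is much smaller than the one in the post.

```r
set.seed(1)
n <- 5000  # far fewer rows than the 666000 in the post, to keep the sketch quick
dat <- data.frame(D   = sample(32000:33000, n, TRUE),
                  Fid = sample(1:10, n, TRUE),
                  A   = sample(1:5, n, TRUE))

# Split the row indices once by both pivot columns.
idx <- split(seq_len(nrow(dat)), list(dat$D, dat$Fid), drop = TRUE)

# Cheap: index the plain column vector inside the loop.
sums_vec <- vapply(idx, function(i) sum(dat$A[i], na.rm = TRUE), numeric(1))

# Expensive: build a per-group data frame first, as data[x,] does.
sums_df <- vapply(idx, function(i) {
  sub <- dat[i, ]
  sum(sub[, "A"], na.rm = TRUE)
}, numeric(1))
```

Both loops give the same sums; wrapping each in system.time() should show where the time goes. If this is the cause, passing a function that takes column vectors rather than a data frame would avoid hard-coding anything into the recursion.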

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] two questions for R beginners

2010-02-26 Thread Jack Siegrist


My biggest impediment, as a scientist without previous programming
experience, is that the R help is not beginner-friendly. I think it is
probably great for experienced programmers and for the people who helped to
create the software, to help them  remember what they did, but I think it is
very difficult for a newcomer without a strong programming background to
learn about a new function or to discover the name of a function that you
are pretty sure should already exist. Maybe this wouldn’t matter for most
programming languages, but as free statistics software R is obviously going
to attract many scientists who want to get an analysis done and have varying
levels of experience with programming. 

I found it much easier to learn how to use Mathematica, using only the
online help. With R I had to buy several books to get a handle on it, which
is fine, but even the books that I have found to be most useful tend to be
didactically lacking—either too cursory or mired in unexplained programming
jargon. They are OK just not great.

What I think would be very helpful is an introduction to programming using
R, preferably a big thick college textbook that takes at least a semester to
go through, which should be a prerequisite for going through the
Introduction to R available on CRAN.

Also to do any analysis on real data you have to use the apply family of
functions to perform different functions by groups. A long introduction to
these functions, with lots of comparisons and contrasts between them would
be very helpful.

A few random examples concerning the R help: 

In my version of R (2.7.0 on Windows XP) typing
> ?+
doesn’t do anything, but then if you type in the next line
+ ?sum
you get the “Arithmetic Operators” help page.
If you had just typed
> ?sum
in the first place you get the “Sum of Vector Elements” help page. 
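
What happens above is that R treats ?+ as an incomplete expression and shows the continuation prompt while it waits for the rest. A minimal sketch of how to ask for help on an operator, assuming a standard R installation with its help files: quote the operator. The names h_plus and h_sum are only for illustration.

```r
# Quote operators so the parser sees a complete expression.
h_plus <- help("+")    # the "Arithmetic Operators" page; ?"+" is equivalent
h_sum  <- help("sum")  # the "Sum of Vector Elements" page

# At an interactive prompt, typing ?"+" or print(h_plus) opens the page.
```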

Most examples in the R help pages use way too many other functions to be
useful to a beginner. If an example uses 10 other functions besides the one
being described, chances are a beginner won’t know what one of them does,
which can set off a chain of having to look up other irrelevant functions.

Some function names in the base package are goofy, such as “rowsum” which is
used to “compute column sums across rows”, not to be confused with “rowSums”
which computes row sums.
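
To make the difference concrete, here is a small sketch on a made-up 3 x 2 matrix; the grouping labels "a" and "b" are arbitrary.

```r
m <- matrix(1:6, nrow = 3)   # rows are (1,4), (2,5), (3,6)

# rowSums: one total per row.
per_row <- rowSums(m)

# rowsum: rows that share a group label are added together, column by column.
per_group <- rowsum(m, group = c("a", "a", "b"))
```

Here per_row has one entry per row of m, while per_group has one row per group label and keeps both columns.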

-- 
View this message in context: 
http://n4.nabble.com/two-questions-for-R-beginners-tp1569384p1571243.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] How to add a variable to a dataframe whose values are conditional upon the values of an existing variable

You can work with factor also:

week <- c('SAT', 'SUN', 'MON', 'FRI')
factor(week, levels = week, labels = 1:4)
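
Note that factor() returns a factor whose labels are "1" to "4", not a numeric vector. A minimal sketch of getting plain integers for all seven days, assuming the data frame is called my.df (a hypothetical name) and the week runs SAT to FRI as in the question:

```r
days <- c("SAT", "SUN", "MON", "TUE", "WED", "THU", "FRI")
my.df <- data.frame(DOW = c("MON", "FRI", "SAT"), stringsAsFactors = FALSE)

# match() gives the position of each DOW value in 'days' as an integer.
my.df$DOW1 <- match(my.df$DOW, days)

# Equivalent route through a factor with explicit level order.
via_factor <- as.integer(factor(my.df$DOW, levels = days))
```

Values of DOW that are not in days come out as NA in both versions.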

On Fri, Feb 26, 2010 at 4:31 PM, Steve Matco  wrote:
> Hi everyone,
>
> I am at my wits end with what I believe would be considered simple by a more 
> experienced R user. I want to know how to add a variable to a dataframe whose 
> values are conditional on the values of an existing variable. I can't seem to 
> make an ifelse statement work for my situation. The existing variable in my 
> dataframe is a character variable named DOW which contains abbreviated day 
> names (SAT, SUN, MON...FRI). I want to add a numerical variable named DOW1 
> to my dataframe that will take on the value 1 if DOW equals "SAT", 2 if DOW 
> equals "SUN", 3 if DOW equals "MON",...,7 if DOW equals "FRI".
> I  know this must be a simple problem but I have searched everywhere and 
> tried everything I could think of. Any help would be greatly appreciated.
>
> Thank you,
>
> Mike
>
>
>
>
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


Re: [R] How to add a variable to a dataframe whose values are conditional upon the values of an existing variable

2010-02-26 Thread Daniel Malter

Hi, two approaches at least.

a. a nested ifelse statement
b. merging the original data frame with DOW in it with a data frame that
holds the day "MON" to "SUN" and their indicators.

Both approaches illustrated below:

DOW=rep(c("MON","TUE"),100)

#nested ifelse approach
DOW.ind=ifelse(DOW=="MON",1,ifelse(DOW=="TUE",2,0)) 
#continue to nest ifelse statements for more days

DOW.ind

#merging approach
day=c("MON","TUE") 
ind=c(1,2)
ind.frame=data.frame(day,ind)
merge(data.frame(DOW),ind.frame,by.x="DOW",by.y="day",all.x=T,all.y=F)

HTH,
Daniel

-
cuncta stricte discussurus
-
-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Steve Matco
Sent: Friday, February 26, 2010 2:32 PM
To: r-help@r-project.org
Subject: [R] How to add a variable to a dataframe whose values are
conditional upon the values of an existing variable

Hi everyone,

I am at my wits end with what I believe would be considered simple by a more
experienced R user. I want to know how to add a variable to a dataframe
whose values are conditional on the values of an existing variable. I can't
seem to make an ifelse statement work for my situation. The existing
variable in my dataframe is a character variable named DOW which contains
abbreviated day names (SAT, SUN, MON...FRI). I want to add a numerical
variable named DOW1 to my dataframe that will take on the value 1 if DOW
equals "SAT", 2 if DOW equals "SUN", 3 if DOW equals "MON",...,7 if DOW
equals "FRI". 
I  know this must be a simple problem but I have searched everywhere and
tried everything I could think of. Any help would be greatly appreciated.

Thank you,

Mike





Re: [R] split function

I don't quite understand what you want. Do you want to add the factor to the
split result?

#Example
set.seed(1234)
n <- 3; nn <- 10
g <- factor(round(n * stats::runif(n * nn)))   #factor
x <- rnorm(n * nn) + sqrt(as.numeric(g))#value
xg <- split(x, g)
xg

xgD <- split(data.frame(x, g), g)

sapply(xgD, nrow) == table(g)
#
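
If the aim is a single two-column data frame (value plus factor level) built from the list that split() returns, one possible sketch uses stack() from the utils package. This is an assumption about what was being asked; the name long is made up for the example.

```r
set.seed(1234)
n <- 3; nn <- 10
g <- factor(round(n * stats::runif(n * nn)))   # factor
x <- rnorm(n * nn) + sqrt(as.numeric(g))       # value
xg <- split(x, g)

# stack() turns a named list of vectors into a data frame with the
# columns 'values' and 'ind' (the list names, as a factor).
long <- stack(xg)
names(long) <- c("x", "g")
```

The rows of long are ordered by level, so the values for level 0 come first, then level 1, and so on.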


On Fri, Feb 26, 2010 at 4:40 PM, rusers.sh  wrote:
> Your method seems to only re-express the data "data.frame(x, g)" using
> another format. The results are really from the generated data frame. Maybe
> be not good.
>> table(g)
> g
> 0 1 2 3
> 7 9 8 6
>   I hope to randomly split the value 'x' according to the different sample
> sizes of different levels, displayed above. That is, 7 for level 0, 9 for
> level 1, et al.
>   Thanks.
> 2010/2/26 Henrique Dallazuanna 
>>
>> Try this:
>>
>> split(data.frame(x, g), g)
>>
>> On Fri, Feb 26, 2010 at 3:55 PM, rusers.sh  wrote:
>> > Hi,
>> >  I am using split function and wonder how to add the factor to the
>> > splitted
>> > results.
>> > #Example
>> > n <- 3; nn <- 10
>> > g <- factor(round(n * stats::runif(n * nn)))   #factor
>> > x <- rnorm(n * nn) + sqrt(as.numeric(g))    #value
>> > xg <- split(x, g)
>> > xg
>> > $`0`
>> > [1]  0.82513702 -0.03911584  2.32955347  0.36745335  1.75572642
>> >  2.65461438
>> >  0.41675829
>> > $`1`
>> > [1]  0.8583493  2.4264804 -0.3622378  3.1770015  0.5162129
>> > $`2`
>> >  [1] 1.7914651 1.1440121 0.8097543 1.2064742 1.6411988 1.3743778
>> > 1.7094387
>> > 2.1204501 1.9330132 2.0731997
>> > [11] 2.8931865 2.5825309 0.6978723
>> > $`3`
>> > [1] 3.0246214 1.6870782 0.9685926 1.6449350 0.9378751
>> >> g
>> >  [1] 2 2 3 2 1 3 2 3 3 1 2 2 2 2 0 0 3 0 2 2 1 1 2 2 0 1 2 0 0 0
>> > Levels: 0 1 2 3
>> >
>> >  Anybody can tell me how to add the corresponding values of factor "g"
>> > to
>> > the splitted results 'xg' to get a data frame?
>> > Something like,
>> >
>> > Splitted/xg     factor/g
>> > 0.82513702        0
>> > -0.03911584       0
>> > 2.32955347        0
>> >    ...
>> >  I know i can use "xg$'0',xg$'1',xg$'2',xg$'3'" to get the values of
>> > each
>> > class and then add a new variable to indicate the factor.
>> > But i hope to get a method to automatic do those things. Any ideas?
>> >  Thanks.
>> >
>> >
>> > --
>> > -
>> > Jane Chang
>> > Queen's
>> >
>> >        [[alternative HTML version deleted]]
>> >
>> >
>>
>>
>>
>> --
>> Henrique Dallazuanna
>> Curitiba-Paraná-Brasil
>> 25° 25' 40" S 49° 16' 22" O
>
>
>
> --
> -
> Jane Chang
> Queen's
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


Re: [R] How to add a variable to a dataframe whose values are conditional upon the values of an existing variable

2010-02-26 Thread Andrew Miles

You could also try a series of simple ifelse statements.  I just tried  
the following and got it to work, though I am sure there is a faster  
way.

 t=c("cow", "dog", "chick")
 y=c(1,3,4)
mat=cbind(t,y)
mat=as.data.frame(mat)

> mat
  t y
1   cow 1
2   dog 3
3 chick 4

mat$g=ifelse(mat$t=="cow", 1, 6)
mat$g=ifelse(mat$t=="dog", 2, mat$g)
mat$g=ifelse(mat$t=="chick", 3, mat$g)

> mat
  t y g
1   cow 1 1
2   dog 3 2
3 chick 4 3

Doing this for the days of the week would take only 7 statements.
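
One caveat with the setup above: cbind() on a character vector and a numeric vector first builds a character matrix, so y does not stay numeric after as.data.frame(). A minimal sketch of the same idea that builds the data frame directly:

```r
# data.frame() keeps each column's own type.
mat <- data.frame(t = c("cow", "dog", "chick"), y = c(1, 3, 4),
                  stringsAsFactors = FALSE)

# Same chain of simple ifelse() calls; 6 is the fallback for unmatched values.
mat$g <- ifelse(mat$t == "cow", 1, 6)
mat$g <- ifelse(mat$t == "dog", 2, mat$g)
mat$g <- ifelse(mat$t == "chick", 3, mat$g)
```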

Andrew Miles
Department of Sociology
Duke University

On Feb 26, 2010, at 2:31 PM, Steve Matco wrote:

Hi everyone,

I am at my wits end with what I believe would be considered simple  
by a more experienced R user. I want to know how to add a variable  
to a dataframe whose values are conditional on the values of an  
existing variable. I can't seem to make an ifelse statement work for  
my situation. The existing variable in my dataframe is a character  
variable named DOW which contains abbreviated day names (SAT, SUN,  
MON...FRI). I want to add a numerical variable named DOW1 to my  
dataframe that will take on the value 1 if DOW equals "SAT", 2 if  
DOW equals "SUN", 3 if DOW equals "MON",...,7 if DOW equals "FRI".
I  know this must be a simple problem but I have searched everywhere  
and tried everything I could think of. Any help would be greatly  
appreciated.

Thank you,

Mike


Re: [R] split function

2010-02-26 Thread rusers.sh

Your method seems only to re-express the data "data.frame(x, g)" in
another format. The results come directly from the generated data frame,
which may not be what I need.
> table(g)
g
0 1 2 3
7 9 8 6
  I hope to randomly split the values of 'x' according to the sample
sizes of the different levels displayed above: that is, 7 for level 0,
9 for level 1, etc.
  Thanks.

2010/2/26 Henrique Dallazuanna 

> Try this:
>
> split(data.frame(x, g), g)
>
> On Fri, Feb 26, 2010 at 3:55 PM, rusers.sh  wrote:
> > Hi,
> >  I am using split function and wonder how to add the factor to the
> splitted
> > results.
> > #Example
> > n <- 3; nn <- 10
> > g <- factor(round(n * stats::runif(n * nn)))   #factor
> > x <- rnorm(n * nn) + sqrt(as.numeric(g))#value
> > xg <- split(x, g)
> > xg
> > $`0`
> > [1]  0.82513702 -0.03911584  2.32955347  0.36745335  1.75572642
>  2.65461438
> >  0.41675829
> > $`1`
> > [1]  0.8583493  2.4264804 -0.3622378  3.1770015  0.5162129
> > $`2`
> >  [1] 1.7914651 1.1440121 0.8097543 1.2064742 1.6411988 1.3743778
> 1.7094387
> > 2.1204501 1.9330132 2.0731997
> > [11] 2.8931865 2.5825309 0.6978723
> > $`3`
> > [1] 3.0246214 1.6870782 0.9685926 1.6449350 0.9378751
> >> g
> >  [1] 2 2 3 2 1 3 2 3 3 1 2 2 2 2 0 0 3 0 2 2 1 1 2 2 0 1 2 0 0 0
> > Levels: 0 1 2 3
> >
> >  Anybody can tell me how to add the corresponding values of factor "g" to
> > the splitted results 'xg' to get a data frame?
> > Something like,
> >
> > Splitted/xg     factor/g
> > 0.82513702        0
> > -0.03911584       0
> > 2.32955347        0
> >    ...
> >  I know i can use "xg$'0',xg$'1',xg$'2',xg$'3'" to get the values of each
> > class and then add a new variable to indicate the factor.
> > But i hope to get a method to automatic do those things. Any ideas?
> >  Thanks.
> >
> >
> > --
> > -
> > Jane Chang
> > Queen's
> >
> >[[alternative HTML version deleted]]
> >
> >
>
>
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
>



-- 
-
Jane Chang
Queen's


Re: [R] How to add a variable to a dataframe whose values are conditional upon the values of an existing variable

Try this:

sapply(c('SAT', 'SUN', 'MON', 'FRI'), switch, SAT = 1, SUN = 2, MON =
3, FRI = 4)
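
A small variation on the same idea, offered as a sketch: switch() returns NULL for a value it does not know, which makes sapply() fall back to a list. Adding an unnamed default and using vapply() keeps the result a numeric vector. The value "???" is a made-up unmatched entry.

```r
days <- c("SAT", "SUN", "MON", "FRI", "???")

# The last, unnamed argument to switch() is the default for unmatched values.
codes <- vapply(days,
                function(d) switch(d, SAT = 1, SUN = 2, MON = 3, FRI = 4, NA_real_),
                numeric(1))
```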


On Fri, Feb 26, 2010 at 4:31 PM, Steve Matco  wrote:
> Hi everyone,
>
> I am at my wits end with what I believe would be considered simple by a more 
> experienced R user. I want to know how to add a variable to a dataframe whose 
> values are conditional on the values of an existing variable. I can't seem to 
> make an ifelse statement work for my situation. The existing variable in my 
> dataframe is a character variable named DOW which contains abbreviated day 
> names (SAT, SUN, MON...FRI). I want to add a numerical variable named DOW1 
> to my dataframe that will take on the value 1 if DOW equals "SAT", 2 if DOW 
> equals "SUN", 3 if DOW equals "MON",...,7 if DOW equals "FRI".
> I  know this must be a simple problem but I have searched everywhere and 
> tried everything I could think of. Any help would be greatly appreciated.
>
> Thank you,
>
> Mike
>
>
>
>
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


Re: [R] How to add a variable to a dataframe whose values are conditional upon the values of an existing variable

You mention ifelse, so for completeness, I will show you a solution that
should work with that. There are plenty of other possibilities
though, I am sure. The following is not tested.

Assume 'my.df' is your data.frame, containing a variable "DOW".

my.df$DOW1 <- ifelse(my.df$DOW == "SAT", 1,
              ifelse(my.df$DOW == "SUN", 2,
              ifelse(my.df$DOW == "MON", 3,
              ifelse(my.df$DOW == "TUE", 4,
              ifelse(my.df$DOW == "WED", 5,
              ifelse(my.df$DOW == "THU", 6,
              7))))))

(don't know if the number of closing ")" is right, but you get the idea...

Erik

Steve Matco wrote:

Hi everyone,

I am at my wits end with what I believe would be considered simple by a more experienced R user. I want to know how to add a variable to a dataframe whose values are conditional on the values of an existing variable. I can't seem to make an ifelse statement work for my situation. The existing variable in my dataframe is a character variable named DOW which contains abbreviated day names (SAT, SUN, MON...FRI). I want to add a numerical variable named DOW1 to my dataframe that will take on the value 1 if DOW equals "SAT", 2 if DOW equals "SUN", 3 if DOW equals "MON",...,7 if DOW equals "FRI".
I know this must be a simple problem but I have searched everywhere and tried everything I could think of. Any help would be greatly appreciated.

Thank you,

Mike


[R] How to add a variable to a dataframe whose values are conditional upon the values of an existing variable

2010-02-26 Thread Steve Matco

Hi everyone,

I am at my wits end with what I believe would be considered simple by a more 
experienced R user. I want to know how to add a variable to a dataframe whose 
values are conditional on the values of an existing variable. I can't seem to 
make an ifelse statement work for my situation. The existing variable in my 
dataframe is a character variable named DOW which contains abbreviated day 
names (SAT, SUN, MON...FRI). I want to add a numerical variable named DOW1 to 
my dataframe that will take on the value 1 if DOW equals "SAT", 2 if DOW equals 
"SUN", 3 if DOW equals "MON",...,7 if DOW equals "FRI". 
I  know this must be a simple problem but I have searched everywhere and tried 
everything I could think of. Any help would be greatly appreciated.

Thank you,

Mike





Re: [R] two questions for R beginners

2010-02-26 Thread Seeliger . Curt

Patrick Burns 
> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?

I came into R from SAS, with its powerful data step language and very 
simplified data types.  Most of my work is data manipulation prior to a 
variety of univariate statistical calculations.  The vector-based nature 
of R, and thus the variety of indexing schemes used, was a big conceptual 
hurdle. 

The often unhelpful attitude of several list respondents, while not unique 
to this list, was and continues to be another block to advancement.  This 
does not occur on the list for SAS, in which asking 'dumb' questions is 
generally supported as an inevitable part of learning.  Having aggregate() 
pointed out to me by one kind soul, hidden amidst the assortment of 
by()/apply() functions, became the basis for much success.

I am currently trying to wrap my mind around how missing values are 
handled; the defaults are quite different from SAS, and mostly in a good 
way.  However, the handling of NA values in slicing statements does not 
seem quite proper, even if it is addressed in the R documents.
    aa <- data.frame('id'=letters[1:5], 'x'=1:5, stringsAsFactors=FALSE)
    aa[aa$x == 3,]$x <- NA
    aa[aa$x == '4',]    # 2 rows instead of 1.
    aa[aa$x %in% '4',]  # 1 row as expected.
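
The extra row comes from the comparison itself: NA == 4 is NA rather than FALSE, and an NA in a logical row index produces a row of NAs. A minimal sketch of the behaviour and of one common workaround, which(); the names keep, with_na and only_true are made up for the example.

```r
aa <- data.frame(id = letters[1:5], x = 1:5, stringsAsFactors = FALSE)
aa$x[aa$x == 3] <- NA

keep <- aa$x == 4                # FALSE FALSE NA TRUE FALSE
with_na   <- aa[keep, ]          # the NA index adds an all-NA row
only_true <- aa[which(keep), ]   # which() keeps only the TRUE positions
```

subset(aa, x == 4) also drops the NA case, as does %in% in the example above.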

I am also looking for concise methods for building up dataframes for our 
unit tests.  While there are several ways to accomplish this, depending on 
what is needed, none are elegant though expand.grid() comes close.

next: The R inferno.  I *will* understand more than the first few pages. 
And all those apply()-ish functions, as I'm already good friends with 
aggregate().

> * What documents helped you the most in this
> initial phase?

RSeek.org was and continues to be a big source of help. I've looked at 
several texts aimed at beginners, and all provided simple examples that 
were useful.  The most consistent source of instruction has been to make 
up my own small projects that were either fun or slightly relevant to my 
job.  The ability to make up toy problems, or simplify a complex process 
have been unexpectedly important skills.  Developing unit tests for 
functions, initially seen as an irritant by some, has become an important 
tool for honing our advances.

> I especially want to hear from people who are
> lazy and impatient.

And, I hope, incompetent.  I've found incompetence to be as professionally 
important as hubris.  I wouldn't want one without the other.

cur

-- 
Curt Seeliger, Data Ranger
Raytheon Information Services - Contractor to ORD
seeliger.c...@epa.gov
541/754-4638



Re: [R] help system works under IE but not Firefox

2010-02-26 Thread Duncan Murdoch

On 26/02/2010 11:31 AM, Raynor, Bill wrote:

I just upgraded to 2.10.1 on a WinXPSP2 machine. When I type help.start() R
attempts to open a browser session at
http://127.0.0.1:27594/doc/html/index.html using my default browser (Firefox
3.6) and is unable to connect. If I then open the same page using IE 8, it
works just fine. How do I fix/change R so that it works with Firefox?

I think you want to fix Firefox so it works with R. The most likely
problem is that it's using a proxy/firewall somewhere; you want to enter
127.0.0.1 as an exception to which it should connect directly.

Duncan Murdoch

Thanks
Bill
William J. Raynor, Jr. Ph.D.
Technical Leader III
Innovation Design & Testing
Kimberly-Clark Corp.
2100 Winchester Road
Neenah, Wi. 54956
(920) 721-5973
Email: bill.ray...@kcc.com


Re: [R] split function

Try this:

split(data.frame(x, g), g)

On Fri, Feb 26, 2010 at 3:55 PM, rusers.sh  wrote:
> Hi,
>  I am using split function and wonder how to add the factor to the splitted
> results.
> #Example
> n <- 3; nn <- 10
> g <- factor(round(n * stats::runif(n * nn)))   #factor
> x <- rnorm(n * nn) + sqrt(as.numeric(g))    #value
> xg <- split(x, g)
> xg
> $`0`
> [1]  0.82513702 -0.03911584  2.32955347  0.36745335  1.75572642  2.65461438
>  0.41675829
> $`1`
> [1]  0.8583493  2.4264804 -0.3622378  3.1770015  0.5162129
> $`2`
>  [1] 1.7914651 1.1440121 0.8097543 1.2064742 1.6411988 1.3743778 1.7094387
> 2.1204501 1.9330132 2.0731997
> [11] 2.8931865 2.5825309 0.6978723
> $`3`
> [1] 3.0246214 1.6870782 0.9685926 1.6449350 0.9378751
>> g
>  [1] 2 2 3 2 1 3 2 3 3 1 2 2 2 2 0 0 3 0 2 2 1 1 2 2 0 1 2 0 0 0
> Levels: 0 1 2 3
>
>  Anybody can tell me how to add the corresponding values of factor "g" to
> the splitted results 'xg' to get a data frame?
> Something like,
>
> Splitted/xg     factor/g
> 0.82513702        0
> -0.03911584       0
> 2.32955347        0
>    ...
>  I know i can use "xg$'0',xg$'1',xg$'2',xg$'3'" to get the values of each
> class and then add a new variable to indicate the factor.
> But i hope to get a method to automatic do those things. Any ideas?
>  Thanks.
>
>
> --
> -
> Jane Chang
> Queen's
>
>        [[alternative HTML version deleted]]
>
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


Re: [R] text editors

2010-02-26 Thread David Reinke

It depends on your needs, your budget, and your personal tastes. I program in a 
number of languages, so I want a text editor that I can configure easily. I 
used CodeWright for years, but it's in limbo now since Borland acquired it and 
no longer supports it.

I've switched over to SlickEdit. One can configure it for different languages in 
addition to the templates that already come with it (I've built one for R). I 
bought it also for its other capabilities such as editing large files and 
binary files. Emacs and Vim are also fine. You might also look up "text 
editors" in Wikipedia, which gives some general advice; it also has links to 
comparisons of text editors and a link called 'editor war' (Emacs vs other 
editors).

A cardinal rule among developers is never to try to convince someone else 
that your text editor is better, so take your pick.

David Reinke

Senior Transportation Engineer/Economist
Dowling Associates, Inc.
180 Grand Avenue, Suite 250
Oakland, California 94612-3774
510.839.1742 x104 (voice)
510.839.0871 (fax)
www.dowlinginc.com


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Tobias Verbeke
Sent: Friday, February 26, 2010 10:22 AM
To: Sharpie
Cc: r-help@r-project.org
Subject: Re: [R] text editors

Sharpie wrote:
> 
> Dwayne Blind wrote:
>> Dear all,
>>
>> Do you use a text editor ? What would you recommend for Windows users ?
>> What
>> about Tinn-R ?
>>
>> Thank you very much,
>> Dwayne
>>
>>
> 
> Learning a text editor is a significant and very valuable investment of your
> time.  In order to maximize the return from this investment you will want to
> choose an editor that works well with all the languages and operating
> systems you currently use, as well as the ones you may use in the future.
> 
> For example, I spend an equal amount of time working on Windows, OS X and
> Linux.  There are a ton of great Windows-only editors out there, but they
> aren't a good option for me because I only use windows 1/3 of the time I'm
> at a computer.
> 
> Some good editors I know of that fall into this category are Emacs, Vim and
> Eclipse.  For integrating with R, Emacs has the ESS plug-in and Eclipse has
> an extension called StatET.
> 
> Eclipse is rather large in terms of file size compared to Emacs or Vim--
> also I know Emacs and Vim can be used through a ssh connection.  I'm not
> sure about Eclipse as I haven't used it much.  

For the record: it is perfectly possible to configure an 'R Remote
Console' in StatET which will launch R over an SSH connection. You can
then disconnect from /reconnect to an R session on server etc.

Best,
Tobias

> This is important because if
> you happen to be stuck on a computer that is locked down and doesn't have
> your editor of choice installed there is still a chance that you will be
> able to use ssh to reach a computer that does.
> 
> Personally, I use Vim and have found it just fine for my needs.
> 
> 
> ESS: http://ess.r-project.org/
> StatET:
> http://www.walware.de/?page=/;jsessionid=b4d82261e53bd419d41609155e9e
> 
> -Charlie


[R] split function

2010-02-26 Thread rusers.sh

Hi,
  I am using the split function and wonder how to add the factor to the split
results.
#Example
n <- 3; nn <- 10
g <- factor(round(n * stats::runif(n * nn)))   #factor
x <- rnorm(n * nn) + sqrt(as.numeric(g))#value
xg <- split(x, g)
xg
$`0`
[1]  0.82513702 -0.03911584  2.32955347  0.36745335  1.75572642  2.65461438
 0.41675829
$`1`
[1]  0.8583493  2.4264804 -0.3622378  3.1770015  0.5162129
$`2`
 [1] 1.7914651 1.1440121 0.8097543 1.2064742 1.6411988 1.3743778 1.7094387
2.1204501 1.9330132 2.0731997
[11] 2.8931865 2.5825309 0.6978723
$`3`
[1] 3.0246214 1.6870782 0.9685926 1.6449350 0.9378751
> g
 [1] 2 2 3 2 1 3 2 3 3 1 2 2 2 2 0 0 3 0 2 2 1 1 2 2 0 1 2 0 0 0
Levels: 0 1 2 3

  Can anybody tell me how to add the corresponding values of factor "g" to
the split results 'xg' to get a data frame?
Something like,

Splitted/xg     factor/g
0.82513702        0
-0.03911584       0
2.32955347        0
   ...
  I know I can use "xg$'0',xg$'1',xg$'2',xg$'3'" to get the values of each
class and then add a new variable to indicate the factor.
But I hope to find a way to do this automatically. Any ideas?
  Thanks.


-- 
-
Jane Chang
Queen's


Re: [R] New methods for generic functions show and print : some visible with ls(), some not

2010-02-26 Thread Duncan Murdoch


On 26/02/2010 1:04 PM, Joris Meys wrote:

Dear all,

I'm trying to understand the S4 way of object-oriented programming, but I
still can't completely grasp what R is doing. I have a class definition for
a class called PM10Meteo, and I set an initializer function. Next, I include
a show method and a print method as shown below.

setClass( Class="PM10Meteo",...) # end setClass

setMethod ("initialize",signature="PM10Meteo",...) # end setMethod


# Show Method
#
setMethod("show","PM10Meteo",
  function(object){
...
  } # end function
) # end show method

###
# Print Method
#

setMethod("print","PM10Meteo",
  function(x,n=400,station=F,values=T,meteo=T,...){
...
  }
) # end print method

Now when I run the complete class definition file, neither the method
"initialize" nor the method "show" occur in the list given by ls(). On the
other hand, the method "print" does! So when I use rm(list=ls()) I still
have the "initialize" and "show" method, but the "print" method is gone. Am
I forgetting something somewhere? It's rather inconvenient to have to run
the definition files every time I want to clear the memory.


You aren't seeing the print method, you are seeing a newly created print 
generic function.  As Uwe mentioned, print() is not an S4 generic, so 
when you create your print method, a new S4 generic also gets created.  
You should be using show(), which will be called by print() when necessary.


When you say "clear the memory", I'm not sure what you have in mind, but 
S4 methods are not stored in your workspace, so rm(list=ls()) won't 
delete them.  You need removeMethod() to get rid of a method.
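
To make the points above concrete, here is a minimal sketch with a toy class. The class name "Demo" and its slot are hypothetical and stand in for PM10Meteo, whose definition was elided in the question.

```r
library(methods)

# Hypothetical toy class standing in for PM10Meteo.
setClass("Demo", representation(value = "numeric"))

# Define only a show() method; print() falls back on it for S4 objects.
setMethod("show", "Demo", function(object) {
  cat("<Demo with value", object@value, ">\n")
})

obj <- new("Demo", value = 42)
printed <- capture.output(print(obj))     # uses the show() method

show_in_ls <- "show" %in% ls()            # the method is not an ordinary workspace object
had_method <- existsMethod("show", "Demo")

removeMethod("show", "Demo")              # this, not rm(), removes the method
has_method <- existsMethod("show", "Demo")
```

Here rm(list=ls()) would remove obj but leave the method in place; only removeMethod() takes it away.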


Duncan Murdoch


Re: [R] R Experts


Hello,

Ryan Kinzer wrote:

I am trying to understand why R is working in a particular way.  I have a
data set with two variables; mark date (markd) and recap date (recapd).  I
would like to know the number of days between capture dates.  But if I
subtract recap date from mark date I often get the wrong results. 



Well, what are the classes of recapd and markd in your case?
Erik
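
The class matters because date arithmetic only works on Date objects. A small sketch, assuming the columns hold text in month/day/year form as in the example (the format string is my assumption, the two dates are the ones from the question):

```r
# Convert the text to Date objects; the format "%m/%d/%Y" is an assumption
# based on the example values.
markd  <- as.Date("8/28/1991",  format = "%m/%d/%Y")
recapd <- as.Date("12/24/1994", format = "%m/%d/%Y")

timeoutd <- recapd - markd                    # a difftime object
days <- as.numeric(timeoutd, units = "days")  # plain number of days
days
```

If class(markd) reports "factor" or "character" for the real data, converting with as.Date(as.character(markd), format = ...) before subtracting would be the first thing to try.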


Re: [R] sort columns

2010-02-26 Thread Ista Zahn

Hi Frederik,
Not exactly clear how you want them sorted, but one of these two is
probably what you want:

mat <- matrix(100:1, ncol=10)
mat.1 <- apply(mat, 2, sort)   # sort each column independently
mat.2 <- mat[order(10:1),]     # reorder whole rows (here: reverse the row order)

Best,
Ista


On Fri, Feb 26, 2010 at 12:41 PM, frederik vanhaelst
 wrote:
> Hello,
>
> I have a 50*100 matrix of real numbers. How do I sort each column?
> Right now I sort it with a for-loop, but this takes a lot of time...
>
> Thank you,
>
> Frederik
>



-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org


Re: [R] sort columns

Try this:

apply(m, 2, sort)
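
A self-contained version of the suggestion, on a made-up 50 x 100 matrix (the data and the seed are assumptions for illustration). The second form, which orders the whole matrix at once by column index and value, is an alternative I am adding, not something from the thread.

```r
set.seed(42)                                    # assumption: arbitrary seed
m <- matrix(rnorm(50 * 100), nrow = 50, ncol = 100)

# Sort every column independently; the result has the same dimensions.
m_sorted <- apply(m, 2, sort)

# Alternative without apply(): order by column index, then by value,
# and refill the matrix column by column.
m_sorted2 <- matrix(m[order(col(m), m)], nrow = nrow(m))
```

Both forms avoid the explicit for-loop; for a matrix of this size either should be effectively instantaneous.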

On Fri, Feb 26, 2010 at 2:41 PM, frederik vanhaelst
 wrote:
> Hello,
>
> I have a 50*100 matrix of real numbers. How do I sort each column?
> Right now I sort it with a for-loop, but this takes a lot of time...
>
> Thank you,
>
> Frederik
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


Re: [R] using grep

Try this:

gsub(".*York(\\d+).*", "\\1", grep("New York", x, value = TRUE))
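
Spelled out step by step on the example vector from the question, the same idea looks roughly like this. I use sub() and the class [0-9] here, which is a small variation of my own on the one-liner above.

```r
x <- c("P Los Angeles44AZ", "P New York722AZ", "K New York20")

# Step 1: keep only the elements that mention New York.
ny <- grep("New York", x, value = TRUE)

# Step 2: replace each whole string by the digits captured right after "New York".
nums <- as.numeric(sub(".*New York([0-9]+).*", "\\1", ny))
nums
```

grep() alone only selects the matching elements; the extraction itself is done by the back-reference "\\1" in sub().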

On Fri, Feb 26, 2010 at 3:27 PM, kayj  wrote:
>
> Hi All,
>
> I have a character vector with names of cities in the US. I need to extract
> the number that appears after the words "New York", for example,
>
> x<-c("P Los Angeles44AZ", "P New York722AZ", "K New York20")
>
> I want the results to be
>
> 722, 20
>
>
> Can I use the grep function, and if so, how?
> I appreciate your help, thanks,
>
> --
> View this message in context: 
> http://n4.nabble.com/using-grep-tp1571102p1571102.html
> Sent from the R help mailing list archive at Nabble.com.
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


Re: [R] New methods for generic functions show and print : some visible with ls(), some not




On 26.02.2010 19:04, Joris Meys wrote:

Dear all,

I'm trying to understand the S4 way of object-oriented programming, but I
still can't grasp completely what R is doing. I have a class definition for
a class called PM10Meteo, and I set an initializer function. Next, I include
a show method and a print method as shown below.

setClass( Class="PM10Meteo",...) # end setClass

setMethod ("initialize",signature="PM10Meteo",...) # end setMethod


# Show Method
#
setMethod("show","PM10Meteo",
   function(object){
...
   } # end function
) # end show method

###
# Print Method
#

setMethod("print","PM10Meteo",
   function(x,n=400,station=F,values=T,meteo=T,...){
...
   }
) # end print method



print is not a S4 generic. show methods are mapped to print for 
convenience, though.


Uwe Ligges




Now when I run the complete class definition file, neither the method
"initialize" nor the method "show" occur in the list given by ls(). On the
other hand, the method "print" does! So when I use rm(list=ls()) I still
have the "initialize" and "show" method, but the "print" method is gone. Am
I forgetting something somewhere? It's rather inconvenient to have to run
the definition files every time I want to clear the memory.

Cheers
Joris



[R] R Experts

2010-02-26 Thread Ryan Kinzer

I am trying to understand why R is working in a particular way.  I have a
data set with two variables; mark date (markd) and recap date (recapd).  I
would like to know the number of days between capture dates.  But if I
subtract recap date from mark date I often get the wrong results. 

 

Example dataset:

markd        recapd
8/28/1991    12/24/1994

Timeoutd <- recapd - markd

Timeoutd = 1945

But the number of days between should be around 1214.

Why is this happening, and how do I fix it?

 

Thank you for all the help.

 

RKinzer

 



Re: [R] counting the number of ones in a vector

2010-02-26 Thread David Reinke

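length(x[x == 1]) does count the ones, but only as long as x contains no NA:
the subscript keeps NA entries, so they are counted too. I suggest this, which
also avoids building the intermediate vector (add na.rm = TRUE if x may
contain NA):

sum(x == 1)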
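
A quick check of the two expressions on a made-up vector that contains an NA (the vector is an assumption, chosen only to show where they can differ):

```r
x <- c(1, 0, 1, NA, 2, 1)            # hypothetical data with one missing value

# Subsetting with a logical vector that contains NA keeps an NA element,
# so the NA is counted along with the three ones.
n_len <- length(x[x == 1])

# Summing the logical vector counts TRUE values; na.rm = TRUE skips the NA.
n_sum <- sum(x == 1, na.rm = TRUE)

c(n_len, n_sum)
```

Without missing values the two expressions agree, so the difference only matters when x may contain NA.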

David Reinke

Senior Transportation Engineer/Economist
Dowling Associates, Inc.
180 Grand Avenue, Suite 250
Oakland, California 94612-3774
510.839.1742 x104 (voice)
510.839.0871 (fax)
www.dowlinginc.com




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Randall Wrong
Sent: Friday, February 26, 2010 6:44 AM
To: r-help@r-project.org
Subject: [R] counting the number of ones in a vector

 Dear R users,

I want to count the number of ones in a vector x.

That's what I did : length( x[x==1] )

Is that a good solution ?

Thank you very much,
Randall


[R] ODBC with Filemaker

2010-02-26 Thread Daniel

Hi all,
Has anybody managed to connect to FileMaker 10 for Mac?
How is it done?
I believe I set it up correctly, but it is not working.


-- 
Daniel Marcelino
Phone: (647) 8910939


[R] using grep

2010-02-26 Thread kayj


Hi All,

I have a character vector with names of cities in the US. I need to extract
the number that appears after the words "New York", for example,

x<-c("P Los Angeles44AZ", "P New York722AZ", "K New York20")

I want the results to be

722, 20


Can I use the grep function, and if so, how?
I appreciate your help, thanks,

-- 
View this message in context: 
http://n4.nabble.com/using-grep-tp1571102p1571102.html
Sent from the R help mailing list archive at Nabble.com.


[R] sort columns

2010-02-26 Thread frederik vanhaelst

Hello,

I have a 50*100 matrix of real numbers. How do I sort each column?
Right now I sort it with a for-loop, but this takes a lot of time...

Thank you,

Frederik


[R] help system works under IE but not Firefox

2010-02-26 Thread Raynor, Bill

I just upgraded to 2.10.1 on a WinXPSP2 machine. When I type help.start() R 
attempts to open a browser session at 
http://127.0.0.1:27594/doc/html/index.html using my default browser (Firefox 
3.6) and is unable to connect. If I then open the same page using IE 8, it 
works just fine. How do I fix/change R so that it works with Firefox?

Thanks
Bill
William J. Raynor, Jr. Ph.D.
Technical Leader III
Innovation Design & Testing
Kimberly-Clark Corp.
2100 Winchester Road
Neenah, Wi. 54956
(920) 721-5973
Email: bill.ray...@kcc.com



 

Re: [R] two questions for R beginners

2010-02-26 Thread Saeed Abu Nimeh

Pat,
Off the bat, beginners and advanced. In addition, splitting by domain
would be very helpful -- something along the lines of:
http://cran.r-project.org/web/views/. But we should be careful, we do
not want to create 20 other mailing lists :) We have to group things.
This will help splitting the volume of the list and will help in
targeting lists by expertise.
Thanks,
Saeed

On Fri, Feb 26, 2010 at 2:08 AM, Patrick Burns  wrote:
> Saeed,
>
> If the R-help list were split, what do you
> see as the pieces?
>
> Pat
>
> On 26/02/2010 01:53, Saeed Abu Nimeh wrote:
>>
>> On Thu, Feb 25, 2010 at 9:31 AM, Patrick Burns
>>  wrote:
>>>
>>> * What were your biggest misconceptions or
>>> stumbling blocks to getting up and running
>>> with R?
>>
>> 1- Compared to other programming languages it is hard to learn R by
>> example, because it is hard to find code on the web that will do the
>> exact thing you are looking for, sometimes you might get lucky though.
>> By contrast, take Perl for example, it is an easy language to learn by
>> example.
>>
>> 2- The R mailing list. Beginners get frustrated after they struggle
>> for a long time to solve a problem and the easiest thing then is to
>> send an email to the R mailing list. I did this in the past. The best
>> thing that happened was that my request was neglected and I had to
>> spend more time on the problem and find a solution by myself
>> eventually. Do not get me wrong, I am not saying that the mailing list
>> is bad, but it should be more organized. Maybe broken down into a couple
>> of other mailing lists. This might bring up a good discussion thread.
>>
>>>
>>> * What documents helped you the most in this
>>> initial phase?
>>
>> An Introduction to R by Venables
>> simpleR – Using R for Introductory Statistics by Verzani
>>
>
> --
> Patrick Burns
> pbu...@pburns.seanet.com
> http://www.burns-stat.com
> (home of 'The R Inferno' and 'A Guide for the Unwilling S User')
>


Re: [R] text editors

2010-02-26 Thread Tobias Verbeke


Sharpie wrote:


Dwayne Blind wrote:

Dear all,

Do you use a text editor ? What would you recommend for Windows users ?
What
about Tinn-R ?

Thank you very much,
Dwayne




Learning a text editor is a significant and very valuable investment of your
time.  In order to maximize the return from this investment you will want to
choose an editor that works well with all the languages and operating
systems you currently use, as well as the ones you may use in the future.

For example, I spend an equal amount of time working on Windows, OS X and
Linux.  There are a ton of great Windows-only editors out there, but they
aren't a good option for me because I only use Windows 1/3 of the time I'm
at a computer.

Some good editors I know of that fall into this category are Emacs, Vim and
Eclipse.  For integrating with R, Emacs has the ESS plug-in and Eclipse has
an extension called StatET.

Eclipse is rather large in terms of file size compared to Emacs or Vim--
also I know Emacs and Vim can be used through an ssh connection.  I'm not
sure about Eclipse as I haven't used it much.  


For the record: it is perfectly possible to configure an 'R Remote
Console' in StatET which will launch R over an SSH connection. You can
then disconnect from / reconnect to an R session on the server, etc.

Best,
Tobias


This is important because if
you happen to be stuck on a computer that is locked down and doesn't have
your editor of choice installed there is still a chance that you will be
able to use ssh to reach a computer that does.

Personally, I use Vim and have found it just fine for my needs.


ESS: http://ess.r-project.org/
StatET:
http://www.walware.de/?page=/;jsessionid=b4d82261e53bd419d41609155e9e

-Charlie



Re: [R] Working directories... Again

2010-02-26 Thread Duncan Murdoch


On 26/02/2010 12:55 PM, Allen L wrote:

Dear R forum,
I've looked many places for this and figure there must be an easy way to
implement.

I want to set the working directory in my script to the place where the R
code is located.
Something like: 
>setwd(directory where this script is found).


If you wrote the script, then you know where you put it, so the simplest 
way is just to hard code the directory at the top of the script.


If you want to be able to move the script around and have it still work, 
that doesn't work.  See the thread in this list entitled "Re: [R] 
relative file path" from about 4 days ago for some dirty methods to 
handle that case.
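
For what it is worth, one such method looks roughly like the sketch below. It relies on the assumption that the script is started with Rscript, which passes the script path as a "--file=" argument; the helper name script_dir is hypothetical, and the sketch falls back to the current directory in any other situation (for example when the code is pasted into a console).

```r
# Hypothetical helper: directory of the running script when started via Rscript.
script_dir <- function() {
  args <- commandArgs(trailingOnly = FALSE)
  file_arg <- grep("^--file=", args, value = TRUE)
  if (length(file_arg) == 0) {
    return(getwd())                 # not run via Rscript: keep the current directory
  }
  dirname(normalizePath(sub("^--file=", "", file_arg[1])))
}

d <- script_dir()
# setwd(d)   # uncomment to actually change the working directory
```

This does not cover a script that is read in with source(), so hard coding the directory remains the most robust option.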


Duncan Murdoch


Re: [R] question to make a vector without loop

2010-02-26 Thread Phil Spector


Khazaei -
   I think

mapply(function(x,ind)x * a / (b + ind),c(NA,w),c(NA,1:length(w)))

does what you want, but since you didn't include a reproducible
example, I can't tell for sure.
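
If the intent is the recurrence itself, i.e. each w[k+1] computed from the freshly computed w[k] starting from a given w[1], then another loop-free option is cumprod(), since w[k+1] = w[1] * prod(a / (b + 1:k)). The sketch below is my own reading of the question, and the values of a, b, N and w1 are made up.

```r
a <- 2; b <- 3; N <- 6; w1 <- 1          # hypothetical values for illustration
k <- seq_len(N - 1)

# Closed form of the recurrence: cumulative product of the ratios a / (b + k).
w <- w1 * cumprod(c(1, a / (b + k)))

# Reference version with an explicit loop, only to check the result.
w_loop <- numeric(N)
w_loop[1] <- w1
for (i in k) w_loop[i + 1] <- w_loop[i] * a / (b + i)
```

The vectorized form assumes a and b are constants; if they vary with k, the same cumprod() idea works with vectors a[k] and b[k].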

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Fri, 26 Feb 2010, khaz...@ceremade.dauphine.fr wrote:


Hello all,

I want to define a vector like w[k+1]=w[k]*a/(b+k) for k=1,...,N-1 without
using a loop. Is it possible to do this in R?

Regards

khazaei


[R] New methods for generic functions show and print : some visible with ls(), some not

2010-02-26 Thread Joris Meys

Dear all,

I'm trying to understand the S4 way of object-oriented programming, but I
still can't grasp completely what R is doing. I have a class definition for
a class called PM10Meteo, and I set an initializer function. Next, I include
a show method and a print method as shown below.

setClass( Class="PM10Meteo",...) # end setClass

setMethod ("initialize",signature="PM10Meteo",...) # end setMethod


# Show Method
#
setMethod("show","PM10Meteo",
  function(object){
...
  } # end function
) # end show method

###
# Print Method
#

setMethod("print","PM10Meteo",
  function(x,n=400,station=F,values=T,meteo=T,...){
...
  }
) # end print method

Now when I run the complete class definition file, neither the method
"initialize" nor the method "show" occur in the list given by ls(). On the
other hand, the method "print" does! So when I use rm(list=ls()) I still
have the "initialize" and "show" method, but the "print" method is gone. Am
I forgetting something somewhere? It's rather inconvenient to have to run
the definition files every time I want to clear the memory.

Cheers
Joris
-- 
Joris Meys
Statistical Consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

Coupure Links 653
B-9000 Gent

tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php


[R] Working directories... Again

2010-02-26 Thread Allen L


Dear R forum,
I've looked many places for this and figure there must be an easy way to
implement.

I want to set the working directory in my script to the place where the R
code is located.
Something like: 
>setwd(directory where this script is found).

Thanks!
-Allen

-- 
View this message in context: 
http://n4.nabble.com/Working-directories-Again-tp1571058p1571058.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] biclust package

2010-02-26 Thread Martin Maechler

> "UweL" == Uwe Ligges 
> on Fri, 26 Feb 2010 17:24:43 +0100 writes:

UweL> On 26.02.2010 17:04, linda garcia wrote:
>> Dear all,
>> I am using biclust package for biclustering. I wanted to
>> know how can I extract my clusters from the object?
>> 
>> 
>> library(biclust)
>> test<- matrix(rnorm(5000), 100, 50)
>> 
>> test[11:20,11:20]<- rnorm(100, 3, 0.1)
>> 
>> loma<- binarize(test,2)
>> 
>> res<- biclust(x=loma, method=BCBimax(), minr=4, minc=4, number=10)
>> 
>> res
>> 
>> 
>> Thanks for your help
>> 
>> 

UweL> According to ?biclust which links to the Biclust class, there are slots
UweL> that indicate cluster assignments in:

UweL> res@RowxNumber
UweL> res@NumberxCol

Yes, indeed.  Reading the help page carefully *is*  hmm, at
least recommended.

Note that

str(res)

would also show you the components you could use.
Here's a small script snippet,
which *would* reveal more if there weren't a bug in biclust's
summary method :

str(res)## -> 'low-level' content.

## Better recommended:
class(res)# "Biclust"
showMethods(class = "Biclust")
## which reveals that there's a  summary() method.
##
## However,
sres <- summary(res)
## --> starts *printing* things [not according to "good R examples"]
## --> ends in an **ERROR**
##   .
##   'x' must be an array of at least two dimensions
## (because it assumes things that are wrong)
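
As for actually pulling a cluster out of such an object, the sketch below uses a hypothetical toy class, so that it runs without the biclust package. It assumes, as the slot names suggest, that RowxNumber is a logical rows-by-clusters matrix and NumberxCol a logical clusters-by-columns matrix; the class name ToyBiclust and the data are made up.

```r
library(methods)

# Hypothetical stand-in with the two slots mentioned above.
setClass("ToyBiclust",
         representation(RowxNumber = "matrix", NumberxCol = "matrix"))

dat <- matrix(1:20, nrow = 4, ncol = 5)              # made-up data matrix
res <- new("ToyBiclust",
  RowxNumber = cbind(c(TRUE, TRUE, FALSE, FALSE),
                     c(FALSE, FALSE, TRUE, TRUE)),
  NumberxCol = rbind(c(TRUE, TRUE, FALSE, FALSE, FALSE),
                     c(FALSE, FALSE, FALSE, TRUE, TRUE)))

slotNames(res)                                       # which slots exist

i <- 1                                               # cluster to extract
rows_i <- which(res@RowxNumber[, i])                 # rows in cluster i
cols_i <- which(res@NumberxCol[i, ])                 # columns in cluster i
cluster_i <- dat[rows_i, cols_i, drop = FALSE]       # the submatrix
```

With a real Biclust object the same indexing would be applied to the original data matrix, provided the slots have the layout assumed here.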


Re: [R] text editors

2010-02-26 Thread Sharpie



Sharpie wrote:
> 
> For example, I spend an equal amount of time working on Windows, OS X and
> Linux.  There are a ton of great Windows-only editors out there, but they
> aren't a good option for me because I only use Windows 1/3 of the time I'm
> at a computer.
> 
> Some good editors I know of that fall into this category are Emacs, Vim
> and Eclipse.  For integrating with R, Emacs has the ESS plug-in and
> Eclipse has an extension called StatET.
> 

Sigh,  I really need to remember-- coffee *before* mailing list.

What I meant to say is "Some good editors I know of that fall into the
category of being platform-neutral are Emacs, Vim and Eclipse"

Sorry for any confusion.

-Charlie
-- 
View this message in context: 
http://n4.nabble.com/text-editors-tp1570848p1571038.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] text editors

2010-02-26 Thread Sharpie

Dwayne Blind wrote:
> 
> Dear all,
> 
> Do you use a text editor ? What would you recommend for Windows users ?
> What
> about Tinn-R ?
> 
> Thank you very much,
> Dwayne
> 
> 

Learning a text editor is a significant and very valuable investment of your
time.  In order to maximize the return from this investment you will want to
choose an editor that works well with all the languages and operating
systems you currently use, as well as the ones you may use in the future.

For example, I spend an equal amount of time working on Windows, OS X and
Linux.  There are a ton of great Windows-only editors out there, but they
aren't a good option for me because I only use Windows 1/3 of the time I'm
at a computer.

Some good editors I know of that fall into this category are Emacs, Vim and
Eclipse.  For integrating with R, Emacs has the ESS plug-in and Eclipse has
an extension called StatET.

Eclipse is rather large in terms of file size compared to Emacs or Vim--
also I know Emacs and Vim can be used through an ssh connection.  I'm not
sure about Eclipse as I haven't used it much.  This is important because if
you happen to be stuck on a computer that is locked down and doesn't have
your editor of choice installed there is still a chance that you will be
able to use ssh to reach a computer that does.

Personally, I use Vim and have found it just fine for my needs.

ESS: http://ess.r-project.org/
StatET:
http://www.walware.de/?page=/;jsessionid=b4d82261e53bd419d41609155e9e

-Charlie
-- 
View this message in context: 
http://n4.nabble.com/text-editors-tp1570848p1571021.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] two questions for R beginners

2010-02-26 Thread Claudia Beleites


Dear Patrick (and all)

I have now been working with R for a couple of years; before that I worked
mostly in Matlab. Lazy & impatient are both true of me :-)


> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?


> * What documents helped you the most in this
> initial phase?
>
> I especially want to hear from people who are
> lazy and impatient.
>
> Feel free to write to me off-list.  Definitely
> write off-list if you are just confirming what
> has been said on-list.
>

Stumbling:

* It took me a long time to remember
getwd () and setwd () (instead of pwd and cd / chdir or the like)

* I still discover very useful functions that I could have used for a long
time. Latest discoveries: mapply and ave.
I knew aggregate, and was always a little angry that it needs a grouping list.
I even decided that the aggregate method for my hyperSpec class should work
with factors as well as with lists. Some day I read on this mailing list that
ave does what I need...
I like the cross-links in the help (see also) very much. Maybe I rely too much
on them. So: not lazy today, I attach a patch for aggregate.Rd that adds ave
to the seealso section.


Reading this mailing list once in a while gives me nice new ideas. However,
more than 50 emails a day is somewhat scary for me, so I read only occasionally.


* Vectorization: I like the *apply functions,
but I'd really appreciate a comprehensive page/vignette here.
I remember that it took me a while to realize that the rule for MARGIN in sweep
is "use the same number as in the apply that created the STATS"


* I never found the pdf manuals helpful (help pages are easier to access, and
there is nothing in the pdf that the help doesn't have).

At the beginning I expected the pdf manual to be something like what the
vignettes are.

* I did not arrive at a comfortable debugging cycle for a long time. But now
there's the debug package and setBreakpoint, and I'm happy.


* As I now start teaching I notice that many students react to error messages
with "uhh! an error!" (panic). Few realize that the error message actually
gives information on what went wrong.
A list of common causes of different error messages would be helpful here, I
think.
In case someone agrees: I started one at the Wiki: 
http://rwiki.sciviews.org/doku.php?id=tips:errormessages
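
Two of the points in the list above, ave with a plain factor and the MARGIN rule for sweep, can be illustrated with a short sketch. The data are made up for the purpose.

```r
# ave(): group-wise means returned in the shape of the input,
# with a plain factor as grouping (no list needed).
x <- c(1, 2, 3, 10, 20, 30)                 # hypothetical values
g <- factor(c("a", "a", "a", "b", "b", "b"))
group_mean <- ave(x, g)                     # default FUN is mean

# sweep(): MARGIN is the same number as in the apply() that created STATS.
m <- matrix(1:6, nrow = 2)
col_means <- apply(m, 2, mean)              # MARGIN = 2 here ...
centered <- sweep(m, 2, col_means)          # ... so MARGIN = 2 here as well
```

The default FUN of sweep is "-", so the last line subtracts each column's mean from that column.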



Cheers,

Claudia




Re: [R] text editors

2010-02-26 Thread gerald . jean

I also agree, Emacs without question.  The learning curve is a bit steep,
but once you know it you can use it
for just about anything except cleaning the kitchen sink!

Gérald Jean
Senior Statistical Advisor,
VP Market Planning and Development,
Desjardins Groupe d'Assurances Générales
telephone: (418) 835-4900 ext. 7639
fax: (418) 835-6657
e-mail: gerald.j...@dgag.ca

``You don't realize how much you want
   something until you can't have it...
   then, if you can't have it for long enough,
   you realize you never really wanted it.''   --- Unknown!

r-help-boun...@r-project.org wrote on 2010/02/26 11:20:23:

> As has been said before: Emacs! It's not as scary as it used to be.
> For Windows I recommend
> http://vgoulet.act.ulaval.ca/en/ressources/emacs/
>
> -Ista
>
> On Fri, Feb 26, 2010 at 11:10 AM, Dwayne Blind wrote:
> > Dear all,
> >
> > Do you use a text editor ? What would you recommend for Windows users ?
> > What about Tinn-R ?
> >
> > Thank you very much,
> > Dwayne
> >
>
>
>
> --
> Ista Zahn
> Graduate student
> University of Rochester
> Department of Clinical and Social Psychology
> http://yourpsyche.org
>

Re: [R] Problem accessing sub-methods of functions stored in a vector

On Fri, Feb 26, 2010 at 11:57 AM, Matt Asher  wrote:
> Hi,
>
> This:
>
> agents <- list(agent(1), agent(2))
>
> worked perfectly. Thanks Uwe!
>
> I'm not fully sure what you mean by this:
>
> "
> Anyway, I hope you know that lexical scoping will yield in the
> environments attached to all those functions they have been generated in
> and you know about possible consequences. If not, you really should not
> be doing this ... (nor using <<- ) ...
> "
>
> I used the <<- assignment because I am treating these variables in the
> sub-function as "belonging" to the parent function (like using "self" or
> "this" in other languages). I am basically treating my agent function like a
> class that can have instances and class vars, without having to use R's
> backwards (IMO) way of doing OOP.
>

In that case you might want to look at the proto package

http://r-proto.googlecode.com

which can do this a bit more cleanly and also allows for delegation
(i.e. the counterpart to inheritance in the prototype model):

> library(proto)
> agent <- proto(id = NA,
+                get_id = function(self) self$id,
+                set_id = function(self, id) self$id <- id)
> agent$set_id(2)
> agent$get_id()
[1] 2
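
For readers without proto installed, the prototype idea can be roughly sketched in base R, because environments already look names up in their parent. This is only an illustration of delegation, not a description of how proto is implemented; `base_agent`, `child_agent` and `call_method` are names made up for the example.

```r
# Prototype-style objects built from plain environments (illustrative only).
base_agent <- new.env()
base_agent$id <- NA
base_agent$get_id <- function(self) self$id
base_agent$set_id <- function(self, id) self$id <- id

# A child whose parent environment is base_agent: any method it does not
# define itself is found in the parent (delegation).
child_agent <- new.env(parent = base_agent)

# Hypothetical helper: find the method with inheritance, pass the object as self.
call_method <- function(obj, name, ...) {
  get(name, envir = obj, inherits = TRUE)(obj, ...)
}

call_method(child_agent, "set_id", 2)   # stores id in child_agent only
call_method(child_agent, "get_id")      # 2
is.na(base_agent$id)                    # TRUE: the prototype is untouched
```

Environments are modified by reference, which is why `set_id` can change the object without `<<-`.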


Re: [R] text editors

2010-02-26 Thread Ista Zahn

As has been said before: Emacs! It's not as scary as it used to be.
For Windows I recommend
http://vgoulet.act.ulaval.ca/en/ressources/emacs/

-Ista

On Fri, Feb 26, 2010 at 11:10 AM, Dwayne Blind  wrote:
> Dear all,
>
> Do you use a text editor ? What would you recommend for Windows users ? What
> about Tinn-R ?
>
> Thank you very much,
> Dwayne

-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org


Re: [R] Problem accessing sub-methods of functions stored in a vector

2010-02-26 Thread Matt Asher


Hi,

This:

agents <- list(agent(1), agent(2))

worked perfectly. Thanks Uwe!

I'm not fully sure what you mean by this:

"
Anyway, I hope you know that lexical scoping will yield in the
environments attached to all those functions they have been generated in
and you know about possible consequences. If not, you really should not
be doing this ... (nor using <<- ) ...
"

I used the <<- assignment because I am treating these variables in the 
sub-function as "belonging" to the parent function (like using "self" or 
"this" in other languages). I am basically treating my agent function 
like a class that can have instances and class vars, without having to 
use R's backwards (IMO) way of doing OOP.
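
A cut-down sketch of this pattern may help show what lexical scoping means here: every call to the generator creates a new environment, and `<<-` changes `id` in that environment only. The `agent` below is a simplified stand-in for the real function, with just two methods.

```r
# Each call to agent() creates a fresh environment that holds id.
agent <- function(id) {
  list(
    set_id = function(newid) id <<- newid,  # modifies id in the enclosing call
    get_id = function() id
  )
}

a1 <- agent(1)
a2 <- agent(2)
a1$set_id(10)

a1$get_id()   # 10
a2$get_id()   # 2: a2 has its own environment and is unaffected

# Both methods of one agent share the same environment.
identical(environment(a1$set_id), environment(a1$get_id))   # TRUE
```

One consequence worth knowing: if the name on the left of `<<-` does not exist in any enclosing function environment, the assignment ends up in the global environment instead.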


Cheers.



On 26.02.2010 16:33, Matt Asher wrote:

Hi folks,

I am having trouble accessing sub-functions when the main function is
stored in an array. For example, the following test code works fine:

fcns = c(abs, sqrt)
fcns[[1]](-2)
fcns[[2]](2)

However, when I try to access sub-functions declared within list() in a
function, this only works directly. When I try to access these within an
array only the first declared sub-function is run. For example I have
the function:

agent <- function(id) {

# MANY VARIABLES DECLARED

list( set_id = function(newid) {
id <<- newid
},
get_id = function(newid) {
return(id)
},

# LOTS MORE SUB FUNCTIONS
)
}

If I create a variable to hold this function, I can then access all the
subfunctions without problem Example:

myAgent = agent(1)
myAgent$get_id() # Works fine

However, once this function is stored in a vector, I can no longer
access the subfunctions.

agents = c(agent(1), agent(2))



agents is still a list (or in other words a vector of mode "list"), but
since you c()'ed, it has one hierarchy level less than you expect.

In order to make your code below work, you rather need:

agents <- list(agent(1), agent(2))

Anyway, I hope you know that lexical scoping will yield in the
environments attached to all those functions they have been generated in
and you know about possible consequences. If not, you really should not
be doing this ... (nor using <<- ) ...



agents[[1]] # This shows the set_id function only, unnamed

agents[[1]]$get_id() # Leads to error below:

Error in agents[[1]]$get_id : object of type 'closure' is not subsettable

How can I access these sub methods within the vector?

I am using R version 2.8.1



... and upgrade to some recent version of R.


Uwe Ligges



TIA for the help!


Re: [R] t-distribution values

2010-02-26 Thread Peter Ehlers


How about taking the unusual step of reading 'An Introduction to R',
where, if you peruse the table of contents, you will quickly be led
to Chapter 8: Probability Distributions.
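
For reference, a short sketch of the functions that chapter describes for the t distribution; the degrees of freedom and quantiles used here are arbitrary example values.

```r
df <- 10
dt(0, df)                      # density at 0
pt(2.228, df)                  # P(T <= 2.228), about 0.975
qt(0.975, df)                  # upper 2.5% critical value, about 2.228
qt(0.975, df = c(5, 10, 30))   # vectorised over degrees of freedom
set.seed(1)
rt(3, df)                      # three random draws
```

The same d/p/q/r prefix convention applies to the other distributions in base R (for example `dnorm`, `pnorm`, `qnorm`, `rnorm`).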

 -Peter Ehlers

On 2010-02-26 7:23, Антон Морковин wrote:


Dear all,


How can I calculate values of the t-distribution for given degrees of
freedom (d.f.) using R functions?




Anton




--
Peter Ehlers
University of Calgary


Re: [R] question to make a vector without loop

2010-02-26 Thread Joshua Wiley

My apologies, I misread your formula.  Here is a clearer example anyways:

w <- 1:10
N <- length(w)
a <- 1
b <- 1
k <- 1:(N-1)
w[k+1] <- w[k]*(a/(b+k))
w
 [1] 1.0000000 0.5000000 0.6666667 0.7500000 0.8000000 0.8333333 0.8571429
 [8] 0.8750000 0.8888889 0.9000000
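
One caveat: in a vectorised assignment the whole right-hand side is evaluated before anything is stored, so each w[k] used above is the original value, not the freshly updated one. If the formula is meant as a recurrence, a sketch with cumprod() gives the recursive result; the starting value `w1` is an assumption made for the example.

```r
a <- 1; b <- 1; N <- 10
w1 <- 1                                  # starting value w[1] (assumed)
k  <- 1:(N - 1)

# w[k+1] = w[k] * a / (b + k)  is  w[1] times a running product of a / (b + k)
w <- w1 * cumprod(c(1, a / (b + k)))

# Check against the explicit loop
w_loop <- numeric(N)
w_loop[1] <- w1
for (i in k) w_loop[i + 1] <- w_loop[i] * a / (b + i)
all.equal(w, w_loop)                     # TRUE
```

With a = b = 1 this gives 1, 1/2, 1/6, 1/24, ..., that is 1/k!.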

Best,

Josh
On Fri, Feb 26, 2010 at 7:23 AM,  wrote:

> Hello all,
>
> I want to define a vector like w[k+1]=w[k]*a/(b+k) for k=1,...,N-1 without
> using a loop. Is it possible to do this in R?
>
> Regards
>
> khazaei
>
>

-- 
Joshua Wiley
Senior in Psychology
University of California, Riverside
http://www.joshuawiley.com/


Re: [R] Boot R

2010-02-26 Thread Paul Hiemstra


Cassiano wrote:

I think I have 'libgfortran'.
After I typed 'dpkg -l | grep libgfortran' in the terminal, I got this
output:

ii  libgfortran2   4.2.4-5ubuntu1
Runtime library for GNU Fortran applications
ii  libgfortran2-dbg   4.2.4-5ubuntu1
Runtime library for GNU Fortran applications
ii  libgfortran3   4.4.1-4ubuntu9
Runtime library for GNU Fortran applications
ii  libgfortran3-dbg   4.4.1-4ubuntu9
Runtime library for GNU Fortran applications


And the error persists:

/usr/lib/R/bin/exec/R: error while loading shared libraries:
libgfortran.so.3: cannot open shared object file: No such file or directory

  

Cassiano wrote:
After sudo updatedb - nothing

after locate libgfortran | grep so

/usr/lib/libgfortran.so.2
/usr/lib/libgfortran.so.2.0.0
/usr/lib/libgfortran.so.3.0.0
/usr/lib/debug/usr/lib/libgfortran.so.2.0.0
/usr/lib/debug/usr/lib/libgfortran.so.3.0.0
/usr/lib/gcc/i486-linux-gnu/4.4/libgfortran.so

My reply:
The point is that R is expecting /usr/lib/libgfortran.so.3, but your 
computer only has /usr/lib/libgfortran.so.3.0.0. A trick is to make a 
symbolic link named /usr/lib/libgfortran.so.3 that points to 
/usr/lib/libgfortran.so.3.0.0, so that the name R asks for resolves to 
the library you already have:


sudo ln -s /usr/lib/libgfortran.so.3.0.0 /usr/lib/libgfortran.so.3

This should fix the problem.

cheers,
Paul

--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone:  +3130 274 3113 Mon-Tue
Phone:  +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul


Re: [R] Boot R

2010-02-26 Thread Paul Hiemstra


Cassiano wrote:

I think I have 'libgfortran'.
After I typed 'dpkg -l | grep libgfortran' in the terminal, I got this
output:

ii  libgfortran2   4.2.4-5ubuntu1
Runtime library for GNU Fortran applications
ii  libgfortran2-dbg   4.2.4-5ubuntu1
Runtime library for GNU Fortran applications
ii  libgfortran3   4.4.1-4ubuntu9
Runtime library for GNU Fortran applications
ii  libgfortran3-dbg   4.4.1-4ubuntu9
Runtime library for GNU Fortran applications


And the error persists:

/usr/lib/R/bin/exec/R: error while loading shared libraries:
libgfortran.so.3: cannot open shared object file: No such file or directory

  

if you do:

sudo updatedb
locate libgfortran | grep so

does it find the file? And in which path?

cheers,
Paul

--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone:  +3130 274 3113 Mon-Tue
Phone:  +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul


Re: [R] biclust package




On 26.02.2010 17:04, linda garcia wrote:

Dear all,
  I am using the biclust package for biclustering. I would like to
know how I can extract my clusters from the result object.


library(biclust)
test<- matrix(rnorm(5000), 100, 50)

test[11:20,11:20]<- rnorm(100, 3, 0.1)

loma<- binarize(test,2)

res<- biclust(x=loma, method=BCBimax(), minr=4, minc=4, number=10)

res


Thanks for your help





According to ?biclust, which links to the Biclust class, there are slots 
that indicate cluster assignments:


res@RowxNumber
res@NumberxCol
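
As a rough sketch of how such membership matrices can be used, assuming (as I read the Biclust class documentation) that RowxNumber is a logical rows-by-clusters matrix and NumberxCol a logical clusters-by-columns matrix. The toy matrices below are hand-made stand-ins so the example runs without the package; `get_bicluster` is a made-up helper name.

```r
# Toy stand-ins for res@RowxNumber (rows x clusters) and res@NumberxCol
# (clusters x columns); in real use these would come from the biclust result.
set.seed(1)
x <- matrix(rnorm(30), nrow = 6, ncol = 5)
row_x_number <- cbind(c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE),
                      c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE))
number_x_col <- rbind(c(TRUE, TRUE, FALSE, FALSE, FALSE),
                      c(FALSE, FALSE, TRUE, TRUE, TRUE))

# Submatrix of x that belongs to bicluster i (hypothetical helper)
get_bicluster <- function(x, rxn, nxc, i) x[rxn[, i], nxc[i, ], drop = FALSE]

dim(get_bicluster(x, row_x_number, number_x_col, 1))   # 2 2
dim(get_bicluster(x, row_x_number, number_x_col, 2))   # 3 3
which(row_x_number[, 2])                               # 4 5 6
```

With a real result one would pass res@RowxNumber and res@NumberxCol in place of the toy matrices.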

Uwe Ligges


Re: [R] two questions for R beginners

2010-02-26 Thread Paul Hiemstra


Thomas Adams wrote:

Paul,

I think your point "you need [to] spend at least a few hours a week on 
it" is key. Since I am not doing statistics daily, more in fits & 
starts as my latest project -may- require, my approach has been more 
task oriented. A less-than-ideal approach. So, I think your suggestion 
is on-the-mark.


Tom
I also see co-workers who would like to work with R and see its benefits, 
but don't have the time to learn R and keep their skills up. I'm not 
really sure how to fix this; it seems impossible to be easy and 
intuitive to use while also offering power and flexibility.


cheers,
Paul



Paul Hiemstra wrote:

Ivan Calandra wrote:

You are definitely right...
What to do with bad beginner's questions is not a simple issue.

If a "beginner's mailing list" is created, who will answer to such 
questions? And moreover, the beginners won't take advantage of the 
other questions (I've personally learned a lot trying to understand 
the questions and answers to other's problems). And also, as you 
said, the problems might persist.
The beginner's mailing list might be good in one aspect though: the 
"experts" who subscribe to it would be willing to help the beginners 
to get started with R, knowing that the questions might not be 
clearly stated.


As you pointed out, the mailing list is not the best for basic stuff 
(the question is of course "what is basic?"). Not everybody knows 
some colleagues who work with R (I'm personally the 1st one to use R 
in my lab).
I think, somehow and I have no idea how, documentation and guidance 
to search for help should be more accessible as soon as you start 
with R. Maybe a _*clear*_ section on the R homepage or in the 
"introduction to R" manual like "where to find help", including all 
of the most common and useful resources available (from "?" and 
RSiteSearch() to R Wiki and Crantastic).
  

Hi Ivan (and list),

I think the main problem is not as much that there isn't structure in 
the way R provides documentation / tutorials, but that people have a 
hard time finding the structure. There are task views for certain 
specific fields, but I think a lot of beginners do not know that they 
exist. There are separate mailing lists for specific fields, but I 
often see geographical (my field of expertise) oriented questions on 
R-help that would fit much better on R-sig-geo.


So I think a "O my God, I've downloaded R and what now" tutorial 
might be a good idea to put very close to the download button of R on 
CRAN. This tutorial would focus not on how to do things in R, but 
would provide guidance to the most obvious sources of information 
such as Task views, specific mailing lists, ways to search list 
archives, information for beginners how to write a good e-mail etc. I 
think for a lot of beginners it is not as much the answer to a 
specific question that they need, but more guidance how to look for 
answers themselves.


But at the end of the day, R is still not very easy to learn when 
coming from GUI-oriented stats programs. In addition, to become 
reasonably fluent in R, you need to spend at least a few hours a week on 
it. So I think we can ease the pain for beginners, but we cannot change 
the fact that it takes quite some time to become fluent in R.


cheers,
Paul
I hope that this whole discussion might help to make the R world 
better.

Thank you Patrick for initiating it!
Regards,
Ivan

Le 2/26/2010 15:09, Paul Hiemstra a écrit :
 

Ivan Calandra wrote:
  

Since you want input from beginners, here are some thoughts

I had and still have two big problems with R:
- this vectorization thing. I've read many manuals (including R 
inferno), but I'm still not completely clear about it. In simple 
examples, it's fine. But when it gets a bit more complex, then...
Related to it, the *apply functions are still a bit difficult to 
understand. When I have to use them, I just try one and see what 
happens. I don't understand them well enough to know which one I 
need.
- the second problem is where to find the functions/packages I 
need. There are many options, and that's actually the problem. R 
Wiki, Rseek, RSiteSearch, Crantastic, etc... When you start with 
R, you discover that the capabilities of R are almost unlimited 
and you don't really know where to start, where to find what you 
need.
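
A small sketch of the two ideas mentioned in the first point, using made-up toy data: a vectorised call works on a whole vector at once, while the *apply functions differ mainly in what they iterate over and what they return.

```r
x <- c(1, 4, 9, 16)
sqrt(x)                      # vectorised: one call, no loop

lst <- list(a = 1:3, b = 4:10)
lapply(lst, mean)            # always returns a list
sapply(lst, mean)            # simplifies to a named vector when it can

m <- matrix(1:6, nrow = 2)
apply(m, 1, sum)             # over rows: 9 12
apply(m, 2, sum)             # over columns: 3 7 11
```

As a rule of thumb, lapply/sapply go over the elements of a list or vector, and apply goes over the rows or columns of a matrix.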


As noted in earlier posts, the mailing list is really great, but 
some people are really hard with beginners. It was noted in a 
discussion a few days ago, but it looks like some don't realize 
how difficult it is at the beginning to formulate a good question, 
clear, with self-contained example and so on. Moreover, not 
everybody speaks English natively. I don't mean that you must 
help, even when the question is really vague and not clear and 
whatever. I'm just saying that if you don't want to help (whatever 
the reason), you don't have to say it badly. But in any cases, the 
mailing list is still really helpful. As someone noted (sorry I 
erased the email so I don't remember who), it mig

Re: [R] text editors

There is a list here:
http://www.sciviews.org/_rgui/projects/Editors.html

On Fri, Feb 26, 2010 at 11:10 AM, Dwayne Blind  wrote:
> Dear all,
>
> Do you use a text editor ? What would you recommend for Windows users ? What
> about Tinn-R ?
>
> Thank you very much,
> Dwayne


Re: [R] question to make a vector without loop

A general facility for this is Reduce:

f <- function(w, k, a = 2, b = 1) w*a / (b+k)
c(7, Reduce(f, 2:9, 7, accumulate = TRUE))

the result of which is:
c(7, Reduce(f, 2:9, 7, accumulate = TRUE))
 [1] 7.0000000000 7.0000000000 4.6666666667 2.3333333333 0.9333333333
 [6] 0.3111111111 0.0888888889 0.0222222222 0.0049382716 0.0009876543
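
A slightly more direct variant, offered only as a sketch: letting k run from 1 to N-1 and passing the starting value as `init` yields the whole vector without prepending the first element by hand. The starting value 7 and N = 10 are simply taken from the example above.

```r
f <- function(w, k, a = 2, b = 1) w * a / (b + k)
N <- 10

# init = 7 is w[1]; each step applies w[k+1] = w[k] * a / (b + k)
w <- Reduce(f, 1:(N - 1), 7, accumulate = TRUE)

length(w)                                                      # 10
all.equal(w, c(7, Reduce(f, 2:9, 7, accumulate = TRUE)))       # TRUE
```

The two forms agree here because with a = 2 and b = 1 the first step (k = 1) multiplies by 2/2 = 1; for other values of a and b the 1:(N - 1) form is the one that follows the recurrence from k = 1.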


On Fri, Feb 26, 2010 at 10:23 AM,   wrote:
> Hello all,
>
> I want to define a vector like w[k+1]=w[k]*a/(b+k) for k=1,...,N-1 without
> using a loop. Is it possible to do this in R?
>
> Regards
>
> khazaei
>
>


[R] possible arrangements of across sample ties for runs test

2010-02-26 Thread Dale Steele

I'm trying to implement the two-sample Wald-Wolfowitz runs test.  Daniel
(1990) suggests a method to deal with ties across samples.  His suggestion
is to prepare ordered arrangements, one resulting in the fewest number of
runs, and one resulting in the largest number of runs.  Then take the mean
of these.  The code below counts 9 runs for my example data where '60' is
tied across samples.

X <-  c(58, 62, 55, 60, 60, 67)
n1 <- length(X)
Y <- c(60, 59, 72, 73, 56, 53, 50, 50)
n2 <- length(Y)
data <- c(X, Y)
names(data) <- c(rep("X", n1), rep("Y", n2))
data <- sort(data)
runs <- rle(names(data))
r <- length(runs$lengths)
r

Y  Y  Y  X  Y  X  Y  X  X  Y  X  X  Y  Y
50 50 53 55 56 58 59 60 60 60 62 67 72 73 --> r = 9 runs

The other possible orderings are:

Y  Y  Y  X  Y  X  Y  X  Y  X  X  X  Y  Y  --> 9 runs
50 50 53 55 56 58 59 60 60 60 62 67 72 73

Y  Y  Y  X  Y  X  Y  Y  X  X  X  X  Y  Y  --> 7 runs
50 50 53 55 56 58 59 60 60 60 62 67 72 73

How do I generate the other possible orderings?  Thus far, I've found a way
to identify cross-sample duplicates...

# find the ties across samples
dd <- data[duplicated(data)]  #find all duplicates
idd <- dd  %in% X & dd  %in% Y #determine found in both X and Y
duplicates <- dd[idd]
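
One possible brute-force approach, sketched under the assumption that only values tied across both samples need rearranging: enumerate every distinct X/Y ordering within each tied block and count the runs for each combination. The helper names (`count_runs`, `block_arrangements`, `runs_range`) are invented for this sketch.

```r
X <- c(58, 62, 55, 60, 60, 67)
Y <- c(60, 59, 72, 73, 56, 53, 50, 50)

count_runs <- function(labels) length(rle(labels)$lengths)

# All distinct orderings of nx "X" labels and ny "Y" labels in one tied block
block_arrangements <- function(nx, ny) {
  m <- nx + ny
  lapply(combn(m, nx, simplify = FALSE), function(pos) {
    lab <- rep("Y", m)
    lab[pos] <- "X"
    lab
  })
}

runs_range <- function(X, Y) {
  vals <- c(X, Y)
  labs <- rep(c("X", "Y"), times = c(length(X), length(Y)))
  ord  <- order(vals)
  vals <- vals[ord]
  labs <- labs[ord]

  # positions grouped by value; keep only blocks containing both labels
  blocks <- split(seq_along(vals), vals)
  tied   <- Filter(function(i) length(unique(labs[i])) == 2, blocks)
  if (length(tied) == 0) {
    r <- count_runs(labs)
    return(c(min = r, max = r, mean = r))
  }

  choices <- lapply(tied, function(i)
    block_arrangements(sum(labs[i] == "X"), sum(labs[i] == "Y")))
  grid <- expand.grid(lapply(choices, seq_along))

  # count runs for every combination of block orderings
  r <- apply(grid, 1, function(pick) {
    l <- labs
    for (b in seq_along(tied)) l[tied[[b]]] <- choices[[b]][[pick[[b]]]]
    count_runs(l)
  })
  c(min = min(r), max = max(r), mean = (min(r) + max(r)) / 2)
}

runs_range(X, Y)   # min 7, max 9, mean 8 for the example data
```

The number of combinations grows quickly with the number and size of tied blocks, so this is only practical for small samples.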

Thanks!  --Dale


Re: [R] Adjust lattice graph axis label on final page

2010-02-26 Thread Deepayan Sarkar

On Fri, Feb 26, 2010 at 6:14 AM, Sebastien Bihorel
 wrote:
> Thanks Deepayan,
>
> This confirms what I thought I should do... One follow-up question about
> your suggested code: is it possible to create a lattice graph object myplot
> and modify the layout just for panel 7 and 8, rather than creating two
> graphs with different layouts?

Sure:

p <- xyplot(y~x|id,as.table=T,data=mydata)
update(p[1:6], layout = c(2, 3))
update(p[7:8], layout = c(2, 1))
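
A self-contained version of the above, with a made-up `mydata` of 8 groups standing in for the original data (which is not shown in this thread):

```r
library(lattice)

# Toy data: 8 ids with 10 points each (stand-in for the original mydata)
set.seed(1)
mydata <- data.frame(id = factor(rep(1:8, each = 10)),
                     x  = rep(1:10, times = 8),
                     y  = rnorm(80))

p <- xyplot(y ~ x | id, data = mydata, as.table = TRUE)

page1 <- update(p[1:6], layout = c(2, 3))   # panels 1-6: 2 columns x 3 rows
page2 <- update(p[7:8], layout = c(2, 1))   # panels 7-8 fill the last page

# print(page1); print(page2)   # draw each page on the active device
```

Storing the updated objects and printing them explicitly is needed inside functions or scripts, where lattice objects are not auto-printed.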

-Deepayan

>
> Sebastien
>
> Deepayan Sarkar wrote:
>>
>> On Thu, Feb 25, 2010 at 3:45 AM, Sebastien Bihorel
>>  wrote:
>>
>>>
>>> Dear R-users,
>>>
>>> I was wondering if there was a way to adjust the placement of the axis
>>> titles for the last page of a multi-page lattice plot (see example
>>> below).
>>> Depending on the total number of panels, the placement of these titles
>>> might
>>> look strange on the last page, if the layout is not adjusted (e.g. in
>>> some
>>> template code).
>>>
>>
>> It's not possible to adjust the labels on a per-page basis.
>>
>> It _is_ possible to have the two plots fill up the last page, but that
>> may not be what you want.
>>
>> xyplot(y~x|id,as.table=T,data=mydata,layout=c(2,3))[1:6]
>> xyplot(y~x|id,as.table=T,data=mydata,layout=c(2,1))[7:8]
>>
>> -Deepayan
>>
>


Re: [R] Restructure some data

2010-02-26 Thread Doran, Harold

Thank you both for your replies; both are very useful. The larger issue at hand 
is that the data will actually be huge, thus the end result will be a very 
large, sparse data frame.

So, I decided to put all three possible solutions to a timing test and see what 
they yield. I simulated 15000 possible students and created an item pool of 300 
total items that could be selected. I fixed the total number of items each 
student sees at 3, although this will be on the order of 50 in the 
real-world problem.

So, first the new data for testing all three solutions.

item.pool <- paste("item", 1:300, sep = "")
N <- 15000
set.seed(54321)
dat <- data.frame(id = c(1:N),
    first.item = sample(item.pool, N, replace=TRUE),
    second.item = sample(item.pool, N, replace=TRUE),
    third.item = sample(item.pool, N, replace=TRUE),
    score1 = sample(c(0,1), N, replace=TRUE),
    score2 = sample(c(0,1), N, replace=TRUE),
    score3 = sample(c(0,1), N, replace=TRUE))

Now, my original loop is in the function 'harold', I created a new function 
"bill" and "phil". I modified Bill's code only to reflect my original naming 
conventions. Timing results for each solution are below.

> system.time(result <- harold(dat))
   user  system elapsed
1347.85  441.92 1799.75

> system.time(result <- bill(dat))
   user  system elapsed
   0.04    0.04    0.09

> system.time(result <- phil(dat))
   user  system elapsed
   4.42    0.00    4.42

The loop timing is laughable; so it is out. Clearly, Phil wins from the "golf" 
viewpoint, but Bill's solution is quite fast. Phil, it is actually quite 
irrelevant that the original ordering of the columns is not preserved since 
that can be easily remedied in a post-hoc reordering of columns.

Again, thank you both.
Harold

harold <- function(dat){
    Nstu <- nrow(dat)
    df <- matrix(NA, ncol = length(item.pool), nrow = Nstu)
    colnames(df) <- item.pool
    for(i in 1:Nstu){
        for(j in 2:4){
            rr <- which(dat[i,j] == colnames(df))
            df[i,rr] <- dat[i, (j+3)]
        }
    }
    df
}
system.time(result <- harold(dat))

bill <- function(dat) {
    L <- length(item.pool)
    items <- as.matrix(dat[2:4])
    scores <- as.matrix(dat[, 5:7])
    retval <- matrix(NA_real_, nrow = nrow(dat), ncol = L,
        dimnames = list(character(), item.pool))
    retval[cbind(dat$id, match(items, item.pool))] <- scores
    retval
}
system.time(result <- bill(dat))

phil <- function(dat){
    df <- tapply(as.vector(as.matrix(dat[5:7])),
        list(rep(dat$id,3),as.vector(as.matrix(dat[2:4]))),I)
    df
}
system.time(result <- phil(dat))
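
The speed of the matrix version comes from indexing with a two-column (row, column) matrix, which fills all cells in a single assignment. A small sketch on toy data, checked against an explicit loop; the sizes and variable names here are made up, and items are drawn without repeats within a student to keep the comparison unambiguous.

```r
item.pool <- paste0("item", 1:10)
set.seed(1)
N <- 6

# N x 3 matrices: items seen (no repeats within a student) and their scores
items  <- t(replicate(N, sample(item.pool, 3)))
scores <- matrix(sample(0:1, N * 3, replace = TRUE), nrow = N)

# One assignment: each row of idx is a (row, column) pair in the wide matrix
wide <- matrix(NA_real_, nrow = N, ncol = length(item.pool),
               dimnames = list(NULL, item.pool))
idx <- cbind(row = rep(seq_len(N), times = 3),
             col = match(items, item.pool))
wide[idx] <- scores

# Same result with the double loop
wide_loop <- wide
wide_loop[] <- NA
for (i in seq_len(N)) {
  for (j in 1:3) wide_loop[i, items[i, j]] <- scores[i, j]
}
identical(wide, wide_loop)   # TRUE
```

Because `match(items, item.pool)` runs down the columns of `items`, the row index has to be repeated once per item column, which is what `rep(seq_len(N), times = 3)` does.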

-Original Message-
From: Phil Spector [mailto:spec...@stat.berkeley.edu] 
Sent: Thursday, February 25, 2010 5:38 PM
To: Doran, Harold
Cc: r-help@r-project.org
Subject: Re: [R] Restructure some data

Harold -
Here's what I came up with:

>  tapply(as.vector(as.matrix(dat[5:7])),
+ list(rep(dat$id,3),as.vector(as.matrix(dat[2:4]))),I)
  item1 item10 item2 item3 item4 item5 item7 item9
1    NA     NA     1    NA    NA     1    NA     0
2     0     NA    NA    NA    NA     1     1    NA
3     1     NA     0     1    NA    NA    NA    NA
4    NA     NA    NA     1     0    NA     0    NA
5    NA      1    NA     0     1    NA    NA    NA

I thought there would be a way to use xtabs, but I had
trouble preserving the NAs.

The columns aren't in the right order, and the item6 column is
missing, but it's pretty close.
Thanks for the easily reproducible example, and the interesting
puzzle.

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu

On Thu, 25 Feb 2010, Doran, Harold wrote:

> Suppose I have a data frame like "dat" below. For some context, this is the 
> format that represents students taking a computer adaptive test. first.item 
> is the first item that the student was administered, score1 is the 
> student's response to that item, and so forth.
>
> item.pool <- paste("item", 1:10, sep = "")
> set.seed(54321)
> dat <- data.frame(id = c(1,2,3,4,5), first.item = sample(item.pool, 5, 
> replace=TRUE),
>second.item = sample(item.pool, 5,replace=TRUE), third.item = 
> sample(item.pool, 5,replace=TRUE),
>score1 = sample(c(0,1), 5,replace=TRUE), score2 = 
> sample(c(0,1), 5,replace=TRUE), score3 = sample(c(0,1), 5,replace=TRUE))
>
> I need to restructure this into a new format. The new matrix df (after the 
> loop) is exactly what I want in the end. But, I'm annoyed at myself for not 
> thinking of a more efficient way to restructure this without using a loop.
>
> df <- matrix(NA, ncol = length(item.pool), nrow = nrow(dat))
> colnames(df) <- uni

Re: [R] Problem accessing sub-methods of functions stored in a vector




On 26.02.2010 16:33, Matt Asher wrote:

Hi folks,

I am having trouble accessing sub-functions when the main function is
stored in an array. For example, the following test code works fine:

fcns = c(abs, sqrt)
fcns[[1]](-2)
fcns[[2]](2)

However, when I try to access sub-functions declared within list() in a
function, this only works directly. When I try to access these within an
array only the first declared sub-function is run. For example I have
the function:

agent <- function(id) {

    # MANY VARIABLES DECLARED

    list( set_id = function(newid) {
              id <<- newid
          },
          get_id = function(newid) {
              return(id)
          },

          # LOTS MORE SUB FUNCTIONS
    )
}

If I create a variable to hold this function, I can then access all the
subfunctions without problem Example:

myAgent = agent(1)
myAgent$get_id() # Works fine

However, once this function is stored in a vector, I can no longer
access the subfunctions.

agents = c(agent(1), agent(2))



agents is still a list (or in other words a vector of mode "list"), but 
since you c()'ed, it has one hierarchy level less than you expect.


In order to make your code below work, you rather need:

agents <- list(agent(1), agent(2))
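
To see the difference, here is a small sketch with a cut-down `agent` (two methods only, a simplification of the function above):

```r
agent <- function(id) {
  list(set_id = function(newid) id <<- newid,
       get_id = function() id)
}

flat   <- c(agent(1), agent(2))      # c() concatenates the two lists
nested <- list(agent(1), agent(2))   # list() keeps each agent as one element

length(flat)             # 4: set_id, get_id, set_id, get_id
names(flat)              # "set_id" "get_id" "set_id" "get_id"
is.function(flat[[1]])   # TRUE, so flat[[1]]$get_id cannot work

length(nested)           # 2
nested[[2]]$get_id()     # 2
```

So with c(), agents[[1]] is the first agent's set_id function itself, which is why `$get_id` on it fails with "object of type 'closure' is not subsettable".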

Anyway, I hope you know that, because of lexical scoping, each of those 
functions keeps the environment it was generated in attached to it, and 
that you know about the possible consequences. If not, you really should 
not be doing this ... (nor using <<- ) ...




agents[[1]] # This shows the set_id function only, unnamed

agents[[1]]$get_id() # Leads to error below:

Error in agents[[1]]$get_id : object of type 'closure' is not subsettable

How can I access these sub methods within the vector?

I am using R version 2.8.1



... and upgrade to some recent version of R.


Uwe Ligges



TIA for the help!


Re: [R] text editors


Dwayne Blind wrote:

Dear all,

Do you use a text editor ? What would you recommend for Windows users ? What
about Tinn-R ?



Dwayne,

Perhaps you have seen http://www.sciviews.org/_rgui/ , it has 
information on several possibilities.  It would be hard to pull me away 
from using Emacs with ESS (http://ess.r-project.org/), both on Windows 
and Linux.  I use Emacs for a lot of things now, but ESS was the gateway 
that helped me learn it.  The fact that there is always a version of 
Emacs on all the platforms I might be faced with helps a lot too.  I 
know nothing about Tinn-R, but my recollection is that people who use it 
seem to like it just fine.


Erik


Re: [R] Legend's attribute