Re: [R] Taking the sum of only some columns of a data frame

2017-03-31 Thread William Michels via R-help
Thank you Jeff for pointing out bad spreadsheet practices in R,
seconded by Mathew and Bert.

I should have considered creating a second dataframe ("test1_summary")
to distinguish raw from processed data. Those who want to address
memory issues caused by unnecessary duplication, feel free to chime
in.
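
A minimal sketch of what I mean, keeping the totals in their own object; the seed and the choice of columns 1 and 3 are only assumptions for illustration:

```r
# Raw data stay untouched in test1; the totals live in a separate object.
set.seed(1)  # assumption: seed added only so the example is reproducible
test1 <- as.data.frame(matrix(runif(50), nrow = 10, ncol = 5))
cols <- c(1, 3)
test1_summary <- colSums(test1[, cols])  # named numeric vector (V1, V3)
test1_summary
```

test1 still has ten rows and stays fully numeric, so it can be used for further calculations.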

Finally, thank you Bert for your most informative post on adding
attributes to dataframes. I really learned a lot!

Best Regards,

Bill.

William Michels, Ph.D.



On Fri, Mar 31, 2017 at 4:59 PM, Bert Gunter  wrote:
> All:
>
> 1. I agree wholeheartedly with prior responses.
>
> 2. But let's suppose that for some reason, you *did* want to carry
> around some "calculated values" with the data frame. Then one way to
> do it is to add them as attributes to the data frame. This way they
> cannot "pollute" the data in the way Jeff warned against; e.g.
>
> attr(your_frame,"colsums") <- colSums(your_frame)
>
> This calculates them all, but you can of course attach just
> some, e.g. colSums(your_frame[,c(1,3)])
>
> 3. This, of course, has the disadvantage of requiring recalculation of
> the attribute if the data changes, which is an invitation to problems.
> A better approach might be to attach the *function* that does the
> calculation as an attribute, which when invoked always uses the
> current data:
>
> attr(your_frame,"colsums") <- function(x)colSums(x)
>
> For example:
>
> df <- data.frame(x=1:5,y=21:25)
> attr(df,"colsums")<- function(x)colSums(x)
>
> ## then:
>> attr(df,"colsums")(df)
>   x   y
>  15 115
>
> ## add a row
>> df[6,] <- rep(100,2)
>> attr(df,"colsums")(df)
>   x   y
> 115 215
>
>
> This survives changing the name of df:
>
>> dat <- df
>> attr(dat,"colsums")(dat)
>   x   y
> 115 215
>
> As it stands, the call: attr(df,"colsums")(df)  is a bit clumsy; one
> could easily write a function that does this sort of thing more
> cleanly, as, for example, is done via the "selfStart" functionality
> for nonlinear models.
>
> But all this presupposes that the OP is familiar with R programming
> paradigms, especially the use of functions as first class objects, and
> the language in general. While I may have missed this, his posts do
> not seem to me to indicate such familiarity, so as others have
> suggested, perhaps the best answer is to first spend some time with an
> R tutorial or two and *not* try to mimic bad spreadsheet practices in
> R.
>
> Cheers,
> Bert
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Fri, Mar 31, 2017 at 2:49 PM, Jeff Newmiller
>  wrote:
>> You can also look at the knitr-RMarkdown work flow, or the knitr-latex work 
>> flow. In both of these it is reasonable to convert your data frame to a 
>> temporary character-only form purely for output purposes. However, one can 
>> usually use an existing function to push your results out without damaging 
>> your working data.
>>
>> It is important to separate your data from your output because mixing 
>> results (totals) with data makes using the data further extremely difficult. 
>> Mixing them is one of the major flaws of the spreadsheet model of 
>> computation, and it causes problems there as well as in R.
>> --
>> Sent from my phone. Please excuse my brevity.
>>
>> On March 31, 2017 1:05:09 PM PDT, William Michels via R-help 
>>  wrote:
>>>Again, you should always copy the R-help list on replies to your OP.
>>>
>>>The short answer is you **shouldn't** replace NAs with blanks in your
>>>matrix or dataframe.  NA is the proper designation for those cell
>>>positions. Replacing NA with a "blank" in a dataframe will convert
>>>that column to a "character" mode, precluding further numeric
>>>manipulation of those columns.
>>>
>>>Consider your workflow:  are you trying to export a table? If so, take
>>>a look at installing pander (see 'missing' argument on webpage below):
>>>
>>>https://cran.r-project.org/web/packages/pander/README.html
>>>
>>>Finally, please review the Introductory PDF, available here:
>>>
>>>https://cran.r-project.org/doc/manuals/R-intro.pdf
>>>
>>>HTH, Bill.
>>>
>>>William Michels, Ph.D.
>>>
>>>
>>>
>>>On Fri, Mar 31, 2017 at 11:21 AM, BR_email  wrote:
>>>> William:
>>>> How can I replace the "NAs" with blanks?
>>>> Bruce
>>>>
>>>> Bruce Ratner, Ph.D.
>>>> The Significant Statistician™
>>>>
>>>>
>>>> William Michels wrote:
>>>>>
>>>>> I'm sure there are more efficient ways, but this works:
>>>>>
>>>>>> test1 <- matrix(runif(50), nrow=10, ncol=5)
>>>>>> ## test1 <- as.data.frame(test1)
>>>>>> test1 <- rbind(test1, NA)
>>>>>> test1[11, c(1,3)] <- colSums(test1[1:10,c(1,3)])
>>>>>> test1
>>>>>
>>>>>
>>>>> HTH,
>>>>>
>>>>> Bill.
>>>>>
>>>>> William Michels, Ph.D.
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Mar 31, 2017 at 9:20 AM, Bruce Ratner PhD wrote:

Re: [R] Variation of bubble sort (based on divisors)

2017-03-31 Thread Boris Steipe
This looks opaque and hard to maintain.
It seems to me that a better strategy is to subset your vector with modulo 
expressions, use a normal sort on each of the subsets, and concatenate the 
results. 0 and 1 need to be special-cased.


myPrimes <- c(2, 3, 5)
mySource <- sample(0:10)

# special case 0,1
sel <- mySource < 2  
myTarget <- sort(mySource[sel])
mySource <- mySource[!sel]

# Iterate over requested primes
for (num in myPrimes) {
  sel <- !as.logical(mySource %% num)
  myTarget <- c(myTarget, sort(mySource[sel]))
  mySource <- mySource[!sel]
}

# Add remaining elements
myTarget <- c(myTarget, sort(mySource))  
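
For reuse, the steps above could be wrapped in a function. This is only a sketch, assuming non-negative integer input; the name sort_by_divisors and its arguments are made up for illustration:

```r
# Hypothetical wrapper around the steps above (names are illustrative).
sort_by_divisors <- function(x, divisors = c(2, 3, 5)) {
  small <- x < 2                       # zeros and ones are special-cased
  out <- base::sort(x[small])
  x <- x[!small]
  for (d in divisors) {                # peel off multiples of each divisor in turn
    hit <- x %% d == 0
    out <- c(out, base::sort(x[hit]))
    x <- x[!hit]
  }
  c(out, base::sort(x))                # whatever is left goes last
}

sort_by_divisors(sample(0:10))
# 0 1 2 4 6 8 10 3 9 5 7
```

Calling base::sort explicitly keeps this working even if a user-defined function named sort is present in the workspace.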


B.






> On Mar 31, 2017, at 2:16 PM, Piotr Koller  wrote:
> 
> Hi, I'd like to create a function that will sort values of a vector on a
> given basis:
> 
> -zeros
> 
> -ones
> 
> -numbers divisible by 2
> 
> -numbers divisible by 3 (but not by 2)
> 
> -numbers divisible by 5 (but not by 2 and 3)
> 
> etc.
> 
> I also want to omit zeros in those turns. So when I have a given vector of
> c(0:10), I want to receive 0 1 2 4 6 8 10 3 9 5 7. I think it'd be best
> to use some variation of bubble sort, so it'd look like this:
> 
> # named bubblesort so that base::sort() is not masked
> bubblesort <- function(x, divisor) {
>   for (j in (length(x)-1):1) {
>     for (i in j:(length(x)-1)) {
>       if (x[i+1] %% divisor == 0 && x[i] %% divisor != 0) {
>         temp <- x[i]
>         x[i] <- x[i+1]
>         x[i+1] <- temp
>       }
>     }
>   }
>   return(x)
> }
> 
> This function works well for a given divisor and increasing sequences.
> 
> bubblesort <- function(x) {
>   for (j in (length(x)-1):1) {
>     for (i in j:(length(x)-1)) {
>       if (x[i+1] %% 5 == 0 && x[i] %% 5 != 0) {
>         temp <- x[i]
>         x[i] <- x[i+1]
>         x[i+1] <- temp
>       }
>     }
>   }
>   return(x)
> }
> 
> x <- c(1:10)
> print(x)
> print(bubblesort(x))
> 
> This function does its job. It moves values divisible by 5 to the
> beginning. The question is how to increase the divisor every "round"?
> 
> Thanks for any kind of help
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] pull stat out of summary

2017-03-31 Thread Bert Gunter
?str

tells you the structure of any object. *Learn to use it!*

It may well be that you *cannot* do what you describe. As you
should know by now in your "learning curve", invoking

> obj

at the console silently invokes the print method for obj, and what is
printed may in fact be calculated on the fly in the print method and
not stored in an object anywhere.

?print.summary.lm

is such an example:  the p-value of the overall F-test is calculated and
printed, but not stored (the coefficient p-values are stored in the
coefficients matrix).
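
A short sketch of how one might check this for an lm summary. The model on the built-in cars data is just an assumed example; for summary.lm the coefficient p-values sit in the coefficients matrix, while the overall F-test p-value has to be recomputed from the stored F statistic:

```r
fit <- lm(dist ~ speed, data = cars)   # cars ships with R's datasets package
s <- summary(fit)
str(s, max.level = 1)                  # shows what is actually stored

coef_p <- s$coefficients[, "Pr(>|t|)"] # stored: one p-value per coefficient
f <- s$fstatistic                      # stored: value, numdf, dendf
# not stored: recompute the overall p-value from the F statistic
f_p <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)
f_p
```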


Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Mar 31, 2017 at 4:54 PM, Sarah Goslee  wrote:
> The short answer is that hold isn't a list-like object, and $ only
> works with list-like objects (lists and data frames, mainly).
>
> You can get the full explanation (VERY full), at
> ?Extract
> or any of its aliases, like
> ?'$'
> or
> ?'['
>
> Sarah
>
> On Fri, Mar 31, 2017 at 7:11 PM, Evan Cooch  wrote:
>> Continuing my learning curve after 25+ years using SAS. Want to pull
>> the "Mean" from the summary of something...
>>
>> test <- rnorm(1000,1.5,1.25)
>>
>> hold <- summary(test)
>>
>> names(hold)
>> [1] "Min."    "1st Qu." "Median"  "Mean"    "3rd Qu." "Max."
>>
>> OK, so "Mean" is in there.
>> So, is there a short form answer for why hold$Mean throws an error, and
>> hold["Mean"] returns the mean (as desired)?
>>
>> Silly question I know,  but gotta start somewhere...
>>
>> Thanks...
>>
>
> --
> Sarah Goslee
> http://www.functionaldiversity.org


Re: [R] pull stat out of summary

2017-03-31 Thread Sarah Goslee
The short answer is that hold isn't a list-like object, and $ only
works with list-like objects (lists and data frames, mainly).

You can get the full explanation (VERY full), at
?Extract
or any of its aliases, like
?'$'
or
?'['

Sarah

On Fri, Mar 31, 2017 at 7:11 PM, Evan Cooch  wrote:
> Continuing my learning curve after 25+ years using SAS. Want to pull
> the "Mean" from the summary of something...
>
> test <- rnorm(1000,1.5,1.25)
>
> hold <- summary(test)
>
> names(hold)
> [1] "Min."    "1st Qu." "Median"  "Mean"    "3rd Qu." "Max."
>
> OK, so "Mean" is in there.
> So, is there a short form answer for why hold$Mean throws an error, and
> hold["Mean"] returns the mean (as desired)?
>
> Silly question I know,  but gotta start somewhere...
>
> Thanks...
>

-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] Taking the sum of only some columns of a data frame

2017-03-31 Thread Bert Gunter
All:

1. I agree wholeheartedly with prior responses.

2. But let's suppose that for some reason, you *did* want to carry
around some "calculated values" with the data frame. Then one way to
do it is to add them as attributes to the data frame. This way they
cannot "pollute" the data in the way Jeff warned against; e.g.

attr(your_frame,"colsums") <- colSums(your_frame)

This calculates them all, but you can of course attach just
some, e.g. colSums(your_frame[,c(1,3)])

3. This, of course, has the disadvantage of requiring recalculation of
the attribute if the data changes, which is an invitation to problems.
A better approach might be to attach the *function* that does the
calculation as an attribute, which when invoked always uses the
current data:

attr(your_frame,"colsums") <- function(x)colSums(x)

For example:

df <- data.frame(x=1:5,y=21:25)
attr(df,"colsums")<- function(x)colSums(x)

## then:
> attr(df,"colsums")(df)
  x   y
 15 115

## add a row
> df[6,] <- rep(100,2)
> attr(df,"colsums")(df)
  x   y
115 215


This survives changing the name of df:

> dat <- df
> attr(dat,"colsums")(dat)
  x   y
115 215

As it stands, the call: attr(df,"colsums")(df)  is a bit clumsy; one
could easily write a function that does this sort of thing more
cleanly, as, for example, is done via the "selfStart" functionality
for nonlinear models.
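
A minimal sketch of such a helper; call_attr is a hypothetical name made up here, not an existing R function:

```r
# Hypothetical helper: look up a function stored as an attribute
# and call it on the object that carries it.
call_attr <- function(obj, name, ...) {
  f <- attr(obj, name, exact = TRUE)
  stopifnot(is.function(f))
  f(obj, ...)
}

df <- data.frame(x = 1:5, y = 21:25)
attr(df, "colsums") <- function(x) colSums(x)

call_attr(df, "colsums")   # x = 15, y = 115, as in the transcript above
```

The call is then call_attr(df, "colsums") instead of attr(df, "colsums")(df), and it fails early if the attribute is missing or is not a function.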

But all this presupposes that the OP is familiar with R programming
paradigms, especially the use of functions as first class objects, and
the language in general. While I may have missed this, his posts do
not seem to me to indicate such familiarity, so as others have
suggested, perhaps the best answer is to first spend some time with an
R tutorial or two and *not* try to mimic bad spreadsheet practices in
R.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Mar 31, 2017 at 2:49 PM, Jeff Newmiller
 wrote:
> You can also look at the knitr-RMarkdown work flow, or the knitr-latex work 
> flow. In both of these it is reasonable to convert your data frame to a 
> temporary character-only form purely for output purposes. However, one can 
> usually use an existing function to push your results out without damaging 
> your working data.
>
> It is important to separate your data from your output because mixing results 
> (totals) with data makes using the data further extremely difficult. Mixing 
> them is one of the major flaws of the spreadsheet model of computation, and 
> it causes problems there as well as in R.
> --
> Sent from my phone. Please excuse my brevity.
>
> On March 31, 2017 1:05:09 PM PDT, William Michels via R-help 
>  wrote:
>>Again, you should always copy the R-help list on replies to your OP.
>>
>>The short answer is you **shouldn't** replace NAs with blanks in your
>>matrix or dataframe.  NA is the proper designation for those cell
>>positions. Replacing NA with a "blank" in a dataframe will convert
>>that column to a "character" mode, precluding further numeric
>>manipulation of those columns.
>>
>>Consider your workflow:  are you trying to export a table? If so, take
>>a look at installing pander (see 'missing' argument on webpage below):
>>
>>https://cran.r-project.org/web/packages/pander/README.html
>>
>>Finally, please review the Introductory PDF, available here:
>>
>>https://cran.r-project.org/doc/manuals/R-intro.pdf
>>
>>HTH, Bill.
>>
>>William Michels, Ph.D.
>>
>>
>>
>>On Fri, Mar 31, 2017 at 11:21 AM, BR_email  wrote:
>>> William:
>>> How can I replace the "NAs" with blanks?
>>> Bruce
>>>
>>> Bruce Ratner, Ph.D.
>>> The Significant Statistician™
>>>
>>>
>>> William Michels wrote:

>>>> I'm sure there are more efficient ways, but this works:
>>>>
>>>>> test1 <- matrix(runif(50), nrow=10, ncol=5)
>>>>> ## test1 <- as.data.frame(test1)
>>>>> test1 <- rbind(test1, NA)
>>>>> test1[11, c(1,3)] <- colSums(test1[1:10,c(1,3)])
>>>>> test1
>>>>
>>>>
>>>> HTH,
>>>>
>>>> Bill.
>>>>
>>>> William Michels, Ph.D.
>>>>
>>>>
>>>>
>>>> On Fri, Mar 31, 2017 at 9:20 AM, Bruce Ratner PhD wrote:
>>>>>
>>>>> Hi R'ers:
>>>>> Given a data.frame of five columns and ten rows.
>>>>> I would like to take the sum of, say, the first and third columns only.
>>>>> For the remaining columns, I do not want any calculations, thus rendering
>>>>> their "values" on the "total" row blank. The sum/total row is to be combined
>>>>> to the original data.frame, yielding a data.frame with five columns and
>>>>> eleven rows.
>>>>>
>>>>> Thanks, in advance.
>>>>> Bruce
>>>>>
>>>>>
>>>>> __
>>>>> Bruce Ratner PhD
>>>>> The Significant Statistician™

Re: [R] pull stat out of summary

2017-03-31 Thread David L Carlson
This is your answer:

> str(hold)
Classes 'summaryDefault', 'table'  Named num [1:6] -2.602 0.636 1.514 1.54 2.369 ...
  ..- attr(*, "names")= chr [1:6] "Min." "1st Qu." "Median" "Mean" ...

hold is a table of named numbers, i.e. a vector with a names attribute. It is 
not a data.frame so it does not have column names. The error message sort of 
tells you this when it says hold is an atomic vector (i.e. not a list or a data 
frame which are built from other objects such as vectors).
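
A short sketch of the difference, using the same setup as the original post; the seed is an assumption added only for reproducibility:

```r
set.seed(42)                      # assumption: seed added for reproducibility
test <- rnorm(1000, 1.5, 1.25)
hold <- summary(test)

is.atomic(hold)                   # TRUE: a named numeric vector
is.list(hold)                     # FALSE: so $ does not apply

hold["Mean"]                      # single bracket keeps the name
hold[["Mean"]]                    # double bracket returns the bare number

# $ on an atomic vector is an error; capture it instead of stopping
res <- tryCatch(hold$Mean, error = function(e) e)
conditionMessage(res)
```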

David Carlson
Anthropology Department
Texas A&M University

From: R-help  on behalf of Evan Cooch 

Sent: Friday, March 31, 2017 6:11 PM
To: r-help@r-project.org
Subject: [R] pull stat out of summary

Continuing my learning curve after 25+ years using SAS. Want to
pull the "Mean" from the summary of something...

test <- rnorm(1000,1.5,1.25)

hold <- summary(test)

names(hold)
[1] "Min."    "1st Qu." "Median"  "Mean"    "3rd Qu." "Max."

OK, so "Mean" is in there.
So, is there a short form answer for why hold$Mean throws an error, and
hold["Mean"] returns the mean (as desired)?

Silly question I know,  but gotta start somewhere...

Thanks...



[R] Variation of bubble sort (based on divisors)

2017-03-31 Thread Piotr Koller
Hi, I'd like to create a function that will sort values of a vector on a
given basis:

-zeros

-ones

-numbers divisible by 2

-numbers divisible by 3 (but not by 2)

-numbers divisible by 5 (but not by 2 and 3)

etc.

I also want to omit zeros in those turns. So when I have a given vector of
c(0:10), I want to receive 0 1 2 4 6 8 10 3 9 5 7. I think it'd be best
to use some variation of bubble sort, so it'd look like this:

# named bubblesort so that base::sort() is not masked
bubblesort <- function(x, divisor) {
  for (j in (length(x)-1):1) {
    for (i in j:(length(x)-1)) {
      if (x[i+1] %% divisor == 0 && x[i] %% divisor != 0) {
        temp <- x[i]
        x[i] <- x[i+1]
        x[i+1] <- temp
      }
    }
  }
  return(x)
}

This function works well for a given divisor and increasing sequences.

bubblesort <- function(x) {
  for (j in (length(x)-1):1) {
    for (i in j:(length(x)-1)) {
      if (x[i+1] %% 5 == 0 && x[i] %% 5 != 0) {
        temp <- x[i]
        x[i] <- x[i+1]
        x[i+1] <- temp
      }
    }
  }
  return(x)
}

x <- c(1:10)
print(x)
print(bubblesort(x))

This function does its job. It moves values divisible by 5 to the
beginning. The question is how to increase the divisor every "round"?

Thanks for any kind of help



[R] pull stat out of summary

2017-03-31 Thread Evan Cooch
Continuing my learning curve after 25+ years using SAS. Want to 
pull the "Mean" from the summary of something...


test <- rnorm(1000,1.5,1.25)

hold <- summary(test)

names(hold)
[1] "Min."    "1st Qu." "Median"  "Mean"    "3rd Qu." "Max."

OK, so "Mean" is in there.
So, is there a short form answer for why hold$Mean throws an error, and 
hold["Mean"] returns the mean (as desired)?


Silly question I know,  but gotta start somewhere...

Thanks...



Re: [R] Taking the sum of only some columns of a data frame

2017-03-31 Thread Mathew Guilfoyle
This does the summation you want in one line:

#create example data and column selection
d = as.data.frame(matrix(rnorm(50),ncol=5))
cols = c(1,3)

#sum selected columns and put results in new row
d[nrow(d)+1,cols] = colSums(d[,cols])

However, I would agree with the sentiments that this is a bad idea; far better 
to have the column sums stored in a new object, leaving the original data table 
untainted.  


> On 31 Mar 2017, at 17:20, Bruce Ratner PhD  wrote:
> 
> Hi R'ers:
> Given a data.frame of five columns and ten rows. 
> I would like to take the sum of, say, the first and third columns only.
> For the remaining columns, I do not want any calculations, thus rendering their 
> "values" on the "total" row blank. The sum/total row is to be combined to the 
> original data.frame, yielding a data.frame with five columns and eleven rows. 
> 
> Thanks, in advance. 
> Bruce 
> 
> 
> __
> Bruce Ratner PhD
> The Significant Statistician™
> 
> 
> 
> 

Re: [R] Taking the sum of only some columns of a data frame

2017-03-31 Thread Jeff Newmiller
You can also look at the knitr-RMarkdown work flow, or the knitr-latex work 
flow. In both of these it is reasonable to convert your data frame to a 
temporary character-only form purely for output purposes. However, one can 
usually use an existing function to push your results out without damaging your 
working data. 

It is important to separate your data from your output because mixing results 
(totals) with data makes using the data further extremely difficult. Mixing 
them is one of the major flaws of the spreadsheet model of computation, and it 
causes problems there as well as in R.
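
A minimal base-R sketch of that separation (it does not use knitr, and the small data frame and the name report are assumptions for illustration): the totals row exists only in a copy made for output, and the blanks appear only when the copy is written out.

```r
d <- data.frame(a = c(1.5, 2.5), b = c(NA, 4))   # working data, never modified

# The totals row is added to a copy used purely for output
report <- rbind(d, data.frame(a = sum(d$a), b = NA))

# NA is rendered as a blank only at output time
write.csv(report, na = "", row.names = FALSE)
```

The working data d keeps its two rows and its numeric NA, so it remains usable for further computation.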
-- 
Sent from my phone. Please excuse my brevity.

On March 31, 2017 1:05:09 PM PDT, William Michels via R-help 
 wrote:
>Again, you should always copy the R-help list on replies to your OP.
>
>The short answer is you **shouldn't** replace NAs with blanks in your
>matrix or dataframe.  NA is the proper designation for those cell
>positions. Replacing NA with a "blank" in a dataframe will convert
>that column to a "character" mode, precluding further numeric
>manipulation of those columns.
>
>Consider your workflow:  are you trying to export a table? If so, take
>a look at installing pander (see 'missing' argument on webpage below):
>
>https://cran.r-project.org/web/packages/pander/README.html
>
>Finally, please review the Introductory PDF, available here:
>
>https://cran.r-project.org/doc/manuals/R-intro.pdf
>
>HTH, Bill.
>
>William Michels, Ph.D.
>
>
>
>On Fri, Mar 31, 2017 at 11:21 AM, BR_email  wrote:
>> William:
>> How can I replace the "NAs" with blanks?
>> Bruce
>>
>> Bruce Ratner, Ph.D.
>> The Significant Statistician™
>>
>>
>> William Michels wrote:
>>>
>>> I'm sure there are more efficient ways, but this works:
>>>
>>>> test1 <- matrix(runif(50), nrow=10, ncol=5)
>>>> ## test1 <- as.data.frame(test1)
>>>> test1 <- rbind(test1, NA)
>>>> test1[11, c(1,3)] <- colSums(test1[1:10,c(1,3)])
>>>> test1
>>>
>>>
>>> HTH,
>>>
>>> Bill.
>>>
>>> William Michels, Ph.D.
>>>
>>>
>>>
>>> On Fri, Mar 31, 2017 at 9:20 AM, Bruce Ratner PhD wrote:
>>>>
>>>> Hi R'ers:
>>>> Given a data.frame of five columns and ten rows.
>>>> I would like to take the sum of, say, the first and third columns only.
>>>> For the remaining columns, I do not want any calculations, thus rendering
>>>> their "values" on the "total" row blank. The sum/total row is to be combined
>>>> to the original data.frame, yielding a data.frame with five columns and
>>>> eleven rows.
>>>>
>>>> Thanks, in advance.
>>>> Bruce
>>>>
>>>>
>>>> __
>>>> Bruce Ratner PhD
>>>> The Significant Statistician™





[R] Deploying R on the cloud - Help Please

2017-03-31 Thread Axel Urbiz
Hello,

I work for a large organization who is looking to productionize (deploy)
models built in R on the cloud. Currently, we were looking into IBM
Bluemix, but I’ve been told only Python is supported for model deployment.

I’d appreciate if anyone can point me to the right direction here in terms
of best practices / companies that support deploying R models on the cloud.



Thank you for your help.

Regards,

Axel.


Re: [R] Taking the sum of only some columns of a data frame

2017-03-31 Thread William Michels via R-help
Again, you should always copy the R-help list on replies to your OP.

The short answer is you **shouldn't** replace NAs with blanks in your
matrix or dataframe.  NA is the proper designation for those cell
positions. Replacing NA with a "blank" in a dataframe will convert
that column to a "character" mode, precluding further numeric
manipulation of those columns.
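
A small illustration of the coercion, using a made-up one-column data frame:

```r
x <- data.frame(v = c(1.5, NA, 3))

y <- x
y$v[is.na(y$v)] <- ""          # assigning "" coerces the whole column to character

class(x$v)                     # "numeric"
class(y$v)                     # "character"

sum(x$v, na.rm = TRUE)         # 4.5: NA is handled by na.rm
# sum(y$v) would now stop with an error, since the column is character
```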

Consider your workflow:  are you trying to export a table? If so, take
a look at installing pander (see 'missing' argument on webpage below):

https://cran.r-project.org/web/packages/pander/README.html

Finally, please review the Introductory PDF, available here:

https://cran.r-project.org/doc/manuals/R-intro.pdf

HTH, Bill.

William Michels, Ph.D.



On Fri, Mar 31, 2017 at 11:21 AM, BR_email  wrote:
> William:
> How can I replace the "NAs" with blanks?
> Bruce
>
> Bruce Ratner, Ph.D.
> The Significant Statistician™
>
>
> William Michels wrote:
>>
>> I'm sure there are more efficient ways, but this works:
>>
>>> test1 <- matrix(runif(50), nrow=10, ncol=5)
>>> ## test1 <- as.data.frame(test1)
>>> test1 <- rbind(test1, NA)
>>> test1[11, c(1,3)] <- colSums(test1[1:10,c(1,3)])
>>> test1
>>
>>
>> HTH,
>>
>> Bill.
>>
>> William Michels, Ph.D.
>>
>>
>>
>> On Fri, Mar 31, 2017 at 9:20 AM, Bruce Ratner PhD  wrote:
>>>
>>> Hi R'ers:
>>> Given a data.frame of five columns and ten rows.
>>> I would like to take the sum of, say, the first and third columns only.
>>> For the remaining columns, I do not want any calculations, thus rendering
>>> their "values" on the "total" row blank. The sum/total row is to be combined
>>> to the original data.frame, yielding a data.frame with five columns and
>>> eleven rows.
>>>
>>> Thanks, in advance.
>>> Bruce
>>>
>>>
>>> __
>>> Bruce Ratner PhD
>>> The Significant Statistician™
>>>
>>>
>>>
>>>

Re: [R] Using R and Python together

2017-03-31 Thread Wensui Liu
In https://statcompute.wordpress.com/?s=rpy2, you can find examples of rpy2.

In https://statcompute.wordpress.com/?s=pyper, you can find examples of pyper.

On Fri, Mar 31, 2017 at 11:38 AM, Kankana Shukla  wrote:
> I'm not great at rpy2.  Are there any good examples I could see to learn
> how to do that?  My R code is very long and complicated.
>
> On Fri, Mar 31, 2017 at 7:08 AM, Stefan Evert 
> wrote:
>
>>
>> > On 30 Mar 2017, at 23:37, Kankana Shukla  wrote:
>> >
>> > I have searched for examples using R and Python together, and rpy2 seems
>> > like the way to go, but is there another (easier) way to do it?
>>
>> Rpy2 would seem to be a very easy and convenient solution.  What do you
>> need that can't easily be done with rpy2?
>>
>> Best regards,
>> Stefan
>


Re: [R] Taking the sum of only some columns of a data frame

2017-03-31 Thread William Michels via R-help
I'm sure there are more efficient ways, but this works:

> test1 <- matrix(runif(50), nrow=10, ncol=5)
> ## test1 <- as.data.frame(test1)
> test1 <- rbind(test1, NA)
> test1[11, c(1,3)] <- colSums(test1[1:10,c(1,3)])
> test1


HTH,

Bill.

William Michels, Ph.D.



On Fri, Mar 31, 2017 at 9:20 AM, Bruce Ratner PhD  wrote:
>
> Hi R'ers:
> Given a data.frame of five columns and ten rows.
> I would like to take the sum of, say, the first and third columns only.
> For the remaining columns, I do not want any calculations, thus rendering their 
> "values" on the "total" row blank. The sum/total row is to be combined to the 
> original data.frame, yielding a data.frame with five columns and eleven rows.
>
> Thanks, in advance.
> Bruce
>
>
> __
> Bruce Ratner PhD
> The Significant Statistician™
>
>
>
>

Re: [R] Difference between R for the Mac and for Windows

2017-03-31 Thread Berend Hasselman

> On 31 Mar 2017, at 19:28, John McKown  wrote:
> 
> On Fri, Mar 31, 2017 at 12:15 PM, Berend Hasselman  wrote:
> 
> I have noted a difference between R on macOS and on Kubuntu Trusty (64bits) 
> with complex division.
> I don't know what would happen with R on Windows.
> 
> R.3.3.3:
> 
> macOS (10.11.6)
> -
> > (1+2i)/0
> [1] NaN+NaNi
> > (-1+2i)/0
> [1] NaN+NaNi
> >
> > 1i/0
> [1] NaN+NaNi
> > 1i/(0+0i)
> [1] NaN+NaNi
> 
> 
> KubuntuTrusty
> -
> > (1+2i)/0
> [1] Inf+Infi
> > (-1+2i)/0
> [1] -Inf+Infi
> >
> > 1i/0
> [1] NaN+Infi
> > 1i/(0+0i)
> [1] NaN+Infi
> 
> Interesting to see what R on Windows delivers.
> 
> > (1+2i)/0
> [1] Inf+Infi
> > (-1+2i)/0
> [1] -Inf+Infi
> > 1i/0
> [1] NaN+Infi
> > 1i/(0+0i)
> [1] NaN+Infi
> > Sys.info()
>                      sysname                      release
>                    "Windows"                      "7 x64"
>                      version                     nodename
> "build 7601, Service Pack 1"                 "IT-JMCKOWN"
>                      machine                        login
>                     "x86-64"                "John.Mckown"
>                         user               effective_user
>                "John.Mckown"                "John.Mckown"
> > 
> 
> Same as Kubuntu. I am _guessing_ that macOS somehow sets up the floating 
> point processing to work differently, since they are all on Intel machines 
> nowadays. Or R was customized to detect division by zero in software and 
> not really do any floating point processing at all.
> 
> 

I think it's the system math library that does this.

I have assumed that the Kubuntu Trusty (and Windows) give the correct result.
In my package geigen I have taken that into account and made a specialized 
complexdivision function that tries to detect a possibly wrong outcome (which 
appears to happen only on macOS).
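
To illustrate the idea of guarding against a platform-dependent result, here is a
minimal sketch. It is NOT the function from geigen; the name safe_cdiv and its
behaviour are assumptions made up for this example. It treats an exactly zero
denominator as a special case and divides the real and imaginary parts
separately, which reproduces the Kubuntu/Windows results quoted in this thread.
It ignores the sign of a negative zero denominator.

```r
# Hypothetical helper (not geigen's implementation): complex division that
# handles an exactly zero denominator the same way on every platform.
safe_cdiv <- function(num, den) {
  num <- as.complex(num)
  den <- as.complex(den)
  n <- max(length(num), length(den))
  num <- rep_len(num, n)
  den <- rep_len(den, n)

  out <- num / den                      # ordinary division for the general case

  # Positions where the denominator is exactly 0+0i (NA/NaN are left alone).
  zero <- !is.na(den) & Re(den) == 0 & Im(den) == 0

  # Divide the parts separately: x/0 is Inf or -Inf, and 0/0 is NaN.
  out[zero] <- complex(real      = Re(num[zero]) / 0,
                       imaginary = Im(num[zero]) / 0)
  out
}

z <- safe_cdiv(c(1+2i, -1+2i, 1i), 0)
Re(z)   # Inf -Inf NaN
Im(z)   # Inf  Inf Inf
```

Whether Inf or NaN is the "right" answer is a separate question; the sketch only
makes the outcome independent of the system math library.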

Berend Hasselman

> Berend Hasselman
> 
> 
> 
> -- 
> "Irrigation of the land with seawater desalinated by fusion power is ancient. 
> It's called 'rain'." -- Michael McClary, in alt.fusion
> 
> Maranatha! <><
> John McKown


Re: [R] Difference between R for the Mac and for Windows

2017-03-31 Thread John McKown
On Fri, Mar 31, 2017 at 12:15 PM, Berend Hasselman  wrote:

>
> I have noted a difference between R on macOS and on Kubuntu Trusty (64-bit)
> with complex division.
> I don't know what would happen with R on Windows.
>
> R.3.3.3:
>
> macOS (10.11.6)
> -
> > (1+2i)/0
> [1] NaN+NaNi
> > (-1+2i)/0
> [1] NaN+NaNi
> >
> > 1i/0
> [1] NaN+NaNi
> > 1i/(0+0i)
> [1] NaN+NaNi
>
>
> KubuntuTrusty
> -
> > (1+2i)/0
> [1] Inf+Infi
> > (-1+2i)/0
> [1] -Inf+Infi
> >
> > 1i/0
> [1] NaN+Infi
> > 1i/(0+0i)
> [1] NaN+Infi
>
> Interesting to see what R on Windows delivers.
>

> (1+2i)/0
[1] Inf+Infi
> (-1+2i)/0
[1] -Inf+Infi
> 1i/0
[1] NaN+Infi
> 1i/(0+0i)
[1] NaN+Infi
> Sys.info()
                     sysname                      release
                   "Windows"                      "7 x64"
                     version                     nodename
"build 7601, Service Pack 1"                 "IT-JMCKOWN"
                     machine                        login
                    "x86-64"                "John.Mckown"
                        user               effective_user
               "John.Mckown"                "John.Mckown"
>

Same as Kubuntu. I am _guessing_ that macOS somehow sets up the
floating point processing to work differently, since they are all on Intel
machines nowadays. Or R was customized to detect division by zero in
software and not really do any floating point processing at all.


>
> Berend Hasselman
>
>

-- 
"Irrigation of the land with seawater desalinated by fusion power is
ancient. It's called 'rain'." -- Michael McClary, in alt.fusion

Maranatha! <><
John McKown


Re: [R] Difference between R for the Mac and for Windows

2017-03-31 Thread Uwe Ligges



On 31.03.2017 19:15, Berend Hasselman wrote:


> I have noted a difference between R on macOS and on Kubuntu Trusty (64-bit) with
> complex division.
> I don't know what would happen with R on Windows.
>
> R.3.3.3:
>
> macOS (10.11.6)
> -
> > (1+2i)/0
> [1] NaN+NaNi
> > (-1+2i)/0
> [1] NaN+NaNi
> >
> > 1i/0
> [1] NaN+NaNi
> > 1i/(0+0i)
> [1] NaN+NaNi
>
>
> KubuntuTrusty
> -
> > (1+2i)/0
> [1] Inf+Infi
> > (-1+2i)/0
> [1] -Inf+Infi
> >
> > 1i/0
> [1] NaN+Infi
> > 1i/(0+0i)
> [1] NaN+Infi
>
> Interesting to see what R on Windows delivers.

Same as KubuntuTrusty and what I would expect.

Best,
Uwe Ligges

> Berend Hasselman



Re: [R] Taking the sum of only some columns of a data frame

2017-03-31 Thread William Dunlap via R-help
> dat <- data.frame(Group=LETTERS[1:5], X=1:5, Y=11:15)
> pos <- c(2,3)
> rbind(dat, Sum=lapply(seq_len(ncol(dat)), function(i) if (i %in% pos)
+ sum(dat[,i]) else NA_real_))
    Group  X  Y
1       A  1 11
2       B  2 12
3       C  3 13
4       D  4 14
5       E  5 15
Sum  <NA> 15 65
> str(.Last.value)
'data.frame':   6 obs. of  3 variables:
 $ Group: Factor w/ 5 levels "A","B","C","D",..: 1 2 3 4 5 NA
 $ X    : int  1 2 3 4 5 15
 $ Y    : int  11 12 13 14 15 65

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Fri, Mar 31, 2017 at 9:20 AM, Bruce Ratner PhD  wrote:
> Hi R'ers:
> Given a data.frame of five columns and ten rows.
> I would like to take the sum of, say, the first and third columns only.
> For the remaining columns, I do not want any calculations, thus rending their 
> "values" on the "total" row blank. The sum/total row is to be combined to the 
> original data.frame, yielding a data.frame with five columns and eleven rows.
>
> Thanks, in advance.
> Bruce
>
>
> __
> Bruce Ratner PhD
> The Significant Statistician™
>
>
>
>

Re: [R] Difference between R for the Mac and for Windows

2017-03-31 Thread Berend Hasselman

I have noted a difference between R on macOS and on Kubuntu Trusty (64-bit) with
complex division.
I don't know what would happen with R on Windows.

R.3.3.3:

macOS (10.11.6)
-
> (1+2i)/0
[1] NaN+NaNi
> (-1+2i)/0
[1] NaN+NaNi
> 
> 1i/0
[1] NaN+NaNi
> 1i/(0+0i)
[1] NaN+NaNi


KubuntuTrusty
-
> (1+2i)/0
[1] Inf+Infi
> (-1+2i)/0
[1] -Inf+Infi
> 
> 1i/0
[1] NaN+Infi
> 1i/(0+0i)
[1] NaN+Infi

Interesting to see what R on Windows delivers.

Berend Hasselman



Re: [R] Taking the sum of only some columns of a data frame

2017-03-31 Thread Doran, Harold
Let's keep the R-help list on the email, per the usual protocol. apply() is a
function in base R, so you don't need to install it.
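
A quick way to check this for yourself is sketched below; the small matrix m is
made up purely for illustration.

```r
# apply() is defined in the base namespace, so no install.packages() or
# library() call is needed before using it.
environmentName(environment(apply))   # "base"

m <- matrix(1:6, nrow = 2)            # illustrative data only
col_totals <- apply(m, 2, sum)        # MARGIN = 2 means "over columns"
col_totals                            # 3 7 11
colSums(m)                            # same totals from the dedicated base function
```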

-Original Message-
From: Bruce Ratner PhD [mailto:b...@dmstat1.com] 
Sent: Friday, March 31, 2017 1:06 PM
To: Doran, Harold 
Subject: Re: [R] Taking the sum of only some columns of a data frame

Hey Harold:
Thanks for quick reply. 
But, I can't install "apply."

Is there anything you can suggest to get my install of apply on R 3.3.3, or a 
work around of your original answer?

Thanks, so much. 
Bruce 

__
Bruce Ratner PhD
The Significant Statistician™




> On Mar 31, 2017, at 12:33 PM, Doran, Harold  wrote:
> 
> apply


Re: [R] Using R and Python together

2017-03-31 Thread Kankana Shukla
I'm not great at rpy2.  Are there any good examples I could see to learn
how to do that?  My R code is very long and complicated.
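
For what it is worth, the loop described earlier in this thread (run the R step,
score its output in Python, stop if the score exceeds x, otherwise increment
the parameter and repeat) can also be driven from the R side. The sketch below
is only an outline under assumptions: tune_until, r_step and py_score are
hypothetical names, and the Python step is represented by an ordinary R function
that could, for example, wrap a system2() call to a Python script.

```r
# Hypothetical driver: repeat "R step -> Python score" until the score
# exceeds a threshold, incrementing the parameter after each failed attempt.
tune_until <- function(r_step, py_score, param, threshold,
                       step = 1, max_iter = 100) {
  for (i in seq_len(max_iter)) {
    r_out <- r_step(param)        # 1. parameters used in R code generate output
    score <- py_score(r_out)      # 2. that output is the input to the Python code
    if (score > threshold) {      # 3. stop once the score exceeds the threshold
      return(list(param = param, score = score, iterations = i))
    }
    param <- param + step         # 4. otherwise increment and repeat
  }
  stop("threshold not reached within max_iter iterations")
}

# Stand-ins for the real steps, for illustration only.
res <- tune_until(r_step   = function(p) p^2,
                  py_score = function(x) x / 10,
                  param = 1, threshold = 2)
res$param        # 5
res$iterations   # 5
```

Keeping both steps as function arguments means the long R code does not have to
be rewritten; only the thin wrapper around the Python call changes.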

On Fri, Mar 31, 2017 at 7:08 AM, Stefan Evert 
wrote:

>
> > On 30 Mar 2017, at 23:37, Kankana Shukla  wrote:
> >
> > I have searched for examples using R and Python together, and rpy2 seems
> > like the way to go, but is there another (easier) way to do it?
>
> Rpy2 would seem to be a very easy and convenient solution.  What do you
> need that can't easily be done with rpy2?
>
> Best regards,
> Stefan



Re: [R] Date operation Question in R

2017-03-31 Thread David Winsemius

> On Mar 30, 2017, at 3:16 PM, Thomas Petzoldt  wrote:
> 
> On 30.03.2017 23:34, Paul Bernal wrote:
>> Hello everyone,
>> 
>> Is there a way to use the function seq to generate a date sequence in
>> this kind of format: jan-2007?
> 
> format(seq(ISOdate(2017,1,1), ISOdate(2017,12,31), "months"), "%b-%Y")

But since the original post asked for a starting point of Sys.Date(), on this 31st
day of March, it might be useful to demonstrate that there are pitfalls for the
uninitiated useR. Note the many duplicate "months":

> format(seq(ISOdate(2017,1,31), ISOdate(2018,12,31), "months"), "%b-%Y")
 [1] "Jan-2017" "Mar-2017" "Mar-2017" "May-2017" "May-2017" "Jul-2017" "Jul-2017"
 [8] "Aug-2017" "Oct-2017" "Oct-2017" "Dec-2017" "Dec-2017" "Jan-2018" "Mar-2018"
[15] "Mar-2018" "May-2018" "May-2018" "Jul-2018" "Jul-2018" "Aug-2018" "Oct-2018"
[22] "Oct-2018" "Dec-2018" "Dec-2018"
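
The duplicates arise because stepping by month from the 31st lands on
non-existent dates (e.g. "31 February"), which roll over into the next month.
One way around it, sketched here with made-up variable names, is to anchor the
sequence on the first day of the month before stepping:

```r
# Start from the first of the month so that adding a month never overflows.
start        <- as.Date("2017-01-31")                  # e.g. Sys.Date() on the 31st
month_starts <- seq(as.Date(format(start, "%Y-%m-01")),
                    by = "month", length.out = 24)

# Month abbreviations produced by "%b" depend on the locale.
labels <- format(month_starts, "%b-%Y")

anyDuplicated(labels)   # 0, i.e. every month appears exactly once
```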

-- 
David.
> 
>> 
>> Also, is there a way to change the Sys.Date() format to the one
>> mentioned above (jan-2007)?
> 
> format(Sys.Date(), "%b-%Y")
> 
> see ?strptime for details.
> 
> Thomas
> 
>> 
>> Thanks in advance for your valuable help,
>> 
>> Best regards,
>> 
>> Paul
> 

David Winsemius
Alameda, CA, USA



Re: [R] Taking the sum of only some columns of a data frame

2017-03-31 Thread Doran, Harold
Apologies, my code below has an error that recycles the vector x. Hopefully, 
the concept is clear.
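
For reference, a corrected sketch of the same idea follows. The recycling
happens because matrix(rnorm(50), 5) has 5 rows and 10 columns while x has
length 5. The version below assumes the layout from the original question (ten
rows, five columns) and uses NA for the columns that are not summed; the names
total and out are illustrative.

```r
set.seed(1)                                    # reproducible illustrative data
dat <- data.frame(matrix(rnorm(50), nrow = 10, ncol = 5))

pos <- c(1, 3)                                 # columns to total

# One entry per column; NA marks the columns with no calculation.
total <- rep(NA_real_, ncol(dat))
total[pos] <- colSums(dat[, pos])

out <- rbind(dat, total)                       # eleven rows, five columns
dim(out)
```

A truly blank cell is not available in a numeric column, so NA is the closest
equivalent; it can be shown as an empty string when printing, e.g. with
print(out, na.print = "").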

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Doran, Harold
Sent: Friday, March 31, 2017 12:34 PM
To: 'Bruce Ratner PhD' ; r-help@r-project.org
Subject: Re: [R] Taking the sum of only some columns of a data frame

I do not believe this can be done in one step

dat <- data.frame(matrix(rnorm(50), 5))

 pos <- c(1,3)
res <-  apply(dat[, pos], 2, sum)

 x <- numeric(5)
 x[pos] <- res

rbind(dat,x)

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Bruce Ratner PhD
Sent: Friday, March 31, 2017 12:20 PM
To: r-help@r-project.org
Subject: [R] Taking the sum of only some columns of a data frame

Hi R'ers:
Given a data.frame of five columns and ten rows. 
I would like to take the sum of, say, the first and third columns only.
For the remaining columns, I do not want any calculations, thus rending their 
"values" on the "total" row blank. The sum/total row is to be combined to the 
original data.frame, yielding a data.frame with five columns and eleven rows. 

Thanks, in advance. 
Bruce 


__
Bruce Ratner PhD
The Significant Statistician™





Re: [R] Taking the sum of only some columns of a data frame

2017-03-31 Thread Doran, Harold
I do not believe this can be done in one step

dat <- data.frame(matrix(rnorm(50), 5))

 pos <- c(1,3)
res <-  apply(dat[, pos], 2, sum)

 x <- numeric(5)
 x[pos] <- res

rbind(dat,x)

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Bruce Ratner PhD
Sent: Friday, March 31, 2017 12:20 PM
To: r-help@r-project.org
Subject: [R] Taking the sum of only some columns of a data frame

Hi R'ers:
Given a data.frame of five columns and ten rows. 
I would like to take the sum of, say, the first and third columns only.
For the remaining columns, I do not want any calculations, thus rending their 
"values" on the "total" row blank. The sum/total row is to be combined to the 
original data.frame, yielding a data.frame with five columns and eleven rows. 

Thanks, in advance. 
Bruce 


__
Bruce Ratner PhD
The Significant Statistician™





[R] Taking the sum of only some columns of a data frame

2017-03-31 Thread Bruce Ratner PhD
Hi R'ers:
Given a data.frame of five columns and ten rows. 
I would like to take the sum of, say, the first and third columns only.
For the remaining columns, I do not want any calculations, thus rending their 
"values" on the "total" row blank. The sum/total row is to be combined to the 
original data.frame, yielding a data.frame with five columns and eleven rows. 

Thanks, in advance. 
Bruce 


__
Bruce Ratner PhD
The Significant Statistician™





[R] conditional regression with mgcv

2017-03-31 Thread Dean Force
Hello,


As a part of a larger project, I am trying to run a conditional logistic
regression to look at whether maternal age is implicated in the risk of
developing gestational diabetes. I am using a matched case-control design,
where mothers with GDM were individually matched with up to 6 controls
based on several parameters.


I run the following model:


model <- gam(gdm ~ s(maternal_age, bs="cr") + strata(risk_set) +
as.factor(district) + as.factor(riskfactor1)+as.factor(riskfactor2), data =
dt, family=cox.ph(), weights = wt)



Weights are defined as 0 for censoring and 1 for an event, and each subject has
one event/censoring time and one row of covariate values. In total there
are 1000 cases matched to 5500 controls, so there are 1000 values of risk_set,
which I define as strata.

When running the model, I keep getting the following error: “Error in
xat[[i]] : subscript out of bounds”. Am I doing something wrong?

Using mgcv_1.8.



Thank you!


Re: [R] Difference between R for the Mac and for Windows

2017-03-31 Thread Ista Zahn
The only place I've noticed differences is in encoding and string sorting,
both of which are locale and library dependent.

Best,
Ista

On Mar 31, 2017 8:14 AM, "Neil Salkind"  wrote:

> Can someone please direct me to an answer to the question as to how R
> differs for these two operating systems, if at all? Thanks - Neil
>



Re: [R] fisher.test function error

2017-03-31 Thread Marc Schwartz

> On Mar 31, 2017, at 7:04 AM, Stefan Evert  wrote:
> 
> 
>> On 30 Mar 2017, at 11:51, Eshi Vaz  wrote:
>> 
>> When trying to compute a Fisher's exact test using the fisher.test function
>> from the gmodels() package,  <
> 
> The problem seems to be with a different fisher.test() function from the 
> gmodels package, not with stats::fisher.test.
> 
> The usual recommendation is to contact the package authors for help.
> 
> Best regards,
> Stefan


There is no fisher.test() function in the gmodels package. 

The error message is being generated from compiled code in stats::fisher.test().

A Google search for the error message indicates that there are reports going 
back at least as far as 2002, suggesting that the underlying issue is an 
integer overflow:

  https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=1662

with at least one example resolved back in 2005:

  https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=6986

The former report has recent reports from 2014/2015 suggesting that the 
original 2002 issue is still present, at least in specific situations:

d4 <- matrix(c(0, 0, 0, 0, 0,  0, 3, 0, 1, 0,  0, 0, 0, 0, 0,
   1, 0, 0, 0, 0,  1, 0, 0, 2, 0,  0, 0, 1, 0, 0,
   0, 1, 0, 1, 0,  4, 0, 2, 0, 0,  0, 1, 0, 0, 0,  0, 0, 0, 0, 0,
   0, 1, 0, 0, 2,  0, 0, 0, 2, 2,  0, 1, 0, 0, 0,
   0, 0, 1, 1, 0,  0, 0, 0, 0, 0,  0, 1, 0, 0, 0,
   1, 0, 0, 0, 2,  0, 0, 0, 3, 0,  0, 0, 0, 0, 1,  0, 0, 0, 0, 0,
   2, 0, 0, 0, 0,  0, 0, 0, 0, 0,  0, 0, 1, 0, 1,
   0, 0, 0, 0, 2,  0, 0, 0, 0, 8,  0, 0, 0, 3, 0,
   0, 0, 0, 0, 0,  0, 0, 0, 0, 0,  0, 0, 0, 0, 0,  0, 0, 0, 0, 1,
   0, 0, 1, 0, 0,  0, 0, 0, 0, 0,  2, 0, 0, 1, 0,
   0, 2, 0, 0, 0,  0, 2, 0, 0, 1,  3, 0, 0, 0, 0,
   0, 0, 0, 0, 0,  0, 0, 0, 0, 1,  0, 0, 1, 0, 0,  4, 0, 0, 0, 0),
 nr=50)

> fisher.test(d4)
Error in fisher.test(d4) : FEXACT error 30.
Stack length exceeded in f3xact.
This problem should not occur.


tab <- structure(list(V1 = c(1, 0, 0, 0, 0, 0), 
  V2 = c(323, 4, 1, 0, 0, 22), 
  V3 = c(3, 0, 0, 0, 0, 1), 
  V4 = c(2, 0, 1, 0, 1, 3), 
  V5 = c(1, 0, 0, 0, 0, 4), 
  V6 = c(1, 0, 0, 0, 0, 0), 
  V7 = c(0, 0, 0, 1, 0, 1), 
  V8 = c(96, 0, 0, 0, 0, 2)), 
  .Names = c("V1", "V2", "V3", "V4", "V5", "V6", "V7", 
"V8"), 
  class = "data.frame", row.names = c(NA, -6L))

> fisher.test(tab)
Error in fisher.test(tab) : Bug in FEXACT: gave negative key


Note that in the second example, the data frame is coerced to a matrix inside 
fisher.test().

The above two examples were run using R version 3.3.3 on macOS 10.12.4 in a CLI 
console.
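
One possible workaround, offered as a sketch rather than a fix for the
underlying bug: for tables larger than 2 x 2, stats::fisher.test() accepts
simulate.p.value = TRUE, which estimates the p-value by Monte Carlo simulation
instead of running the exact network algorithm. The seed and the number of
replicates B below are arbitrary choices for illustration.

```r
# Same counts as the second example above, entered column by column (V1..V8).
tab <- matrix(c(  1, 0, 0, 0, 0,  0,
                323, 4, 1, 0, 0, 22,
                  3, 0, 0, 0, 0,  1,
                  2, 0, 1, 0, 1,  3,
                  1, 0, 0, 0, 0,  4,
                  1, 0, 0, 0, 0,  0,
                  0, 0, 0, 1, 0,  1,
                 96, 0, 0, 0, 0,  2),
              nrow = 6)

set.seed(42)                                   # arbitrary seed, for reproducibility
res <- fisher.test(tab, simulate.p.value = TRUE, B = 10000)
res$p.value                                    # Monte Carlo estimate, not exact
```

The result is an approximation whose precision depends on B, so it is not a
drop-in replacement when an exact p-value is required.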

Regards,

Marc Schwartz


Re: [R] fisher.test function error

2017-03-31 Thread peter dalgaard

> On 31 Mar 2017, at 14:04 , Stefan Evert  wrote:
> 
> 
>> On 30 Mar 2017, at 11:51, Eshi Vaz  wrote:
>> 
>> When trying to compute a Fisher's exact test using the fisher.test function
>> from the gmodels() package,  <
> 
> The problem seems to be with a different fisher.test() function from the 
> gmodels package, not with stats::fisher.test.

That's what I thought, but there is no fisher.test variant in gmodels. There is 
CrossTable, which calls fisher.test in its print method, but as far as I can 
tell, that is the usual one from stats.

At any rate, it would be useful to know what the table looks like. If it has a
huge number of rows or columns, then

(a) it could be the result of a coding blunder
(b) it would be quite meaningless to attack with a Fisher exact test


-pd

> 
> The usual recommendation is to contact the package authors for help.
> 
> Best regards,
> Stefan

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] Using R and Python together

2017-03-31 Thread Stefan Evert

> On 30 Mar 2017, at 23:37, Kankana Shukla  wrote:
> 
> I have searched for examples using R and Python together, and rpy2 seems
> like the way to go, but is there another (easier) way to do it? 

Rpy2 would seem to be a very easy and convenient solution.  What do you need
that can't easily be done with rpy2?

Best regards,
Stefan


Re: [R] fisher.test function error

2017-03-31 Thread Stefan Evert

> On 30 Mar 2017, at 11:51, Eshi Vaz  wrote:
> 
> When trying to compute a Fisher's exact test using the fisher.test function
> from the gmodels() package,  <

The problem seems to be with a different fisher.test() function from the 
gmodels package, not with stats::fisher.test.

The usual recommendation is to contact the package authors for help.

Best regards,
Stefan

Re: [R] Difference between R for the Mac and for Windows

2017-03-31 Thread peter dalgaard
File encodings differ when you move outside of standard ASCII code. Not really 
R's problem, but it is a fly in the ointment when teaching classes with mixed 
laptop armoury and there are also differences between classroom and desktop 
computers. RStudio does have features to switch encodings, but I usually 
sidestep the issue by commenting scripts in English.
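
As a small illustration of what "differ" means here, the sketch below takes the
bytes a Latin-1 (Windows-style) file would contain for an accented word and
converts them to UTF-8 with iconv(). The example word and variable names are
made up; the point is only that the same text is stored as different bytes.

```r
# "résumé" as it would be stored in a Latin-1 encoded file: 6 bytes.
latin1_text <- rawToChar(as.raw(c(0x72, 0xe9, 0x73, 0x75, 0x6d, 0xe9)))
Encoding(latin1_text) <- "latin1"              # declare what the bytes mean

# Re-encode as UTF-8: each accented letter becomes two bytes.
utf8_text <- iconv(latin1_text, from = "latin1", to = "UTF-8")

charToRaw(latin1_text)   # 72 e9 73 75 6d e9
charToRaw(utf8_text)     # 72 c3 a9 73 75 6d c3 a9
```

Plain ASCII text is byte-for-byte identical in both encodings, which is why
English-only comments avoid the problem.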

-pd 

> On 31 Mar 2017, at 05:40 , Boris Steipe  wrote:
> 
> I can't remember having seen my students write code that runs correctly on 
> one platform but not the other. Obviously under the hood there are 
> significant differences, but as far as code goes, R seems quite foolproof. 
> There are GUI differences in base R - but AFAIK no such differences in the 
> RStudio IDE.
> 
> B. 
> 
> 
> 
> 
>> On Mar 30, 2017, at 9:21 PM, Neil Salkind  wrote:
>> 
>> Can someone please direct me to an answer to the question as to how R 
>> differs for these two operating systems, if at all? Thanks - Neil 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] Using R and Python together

2017-03-31 Thread Ulrik Stervbo
'Snakemake' (https://snakemake.readthedocs.io/en/stable/) was created to
ease pipelines through different tools so it might be useful.

In all honesty I only know of Snakemake, so it might be the completely
wrong horse.

HTH
Ulrik

On Fri, 31 Mar 2017 at 06:01 Wensui Liu  wrote:

> How about pyper?
>
> On Thu, Mar 30, 2017 at 10:42 PM Kankana Shukla 
> wrote:
>
> > Hello,
> >
> > I am running a deep neural network in Python.  The input to the NN is the
> > output from my R code. I am currently running the python script and calling
> > the R code using a subprocess call, but this does not allow me to
> > recursively change (increment) parameters used in the R code that would be
> > the inputs to the python code.  So in short, I would like to follow this
> > automated process:
> >
> >1. Parameters used in R code generate output
> >2. This output is input to Python code
> >3. If output of Python code > x,  stop
> >4. Else, increment parameters used as input in R code (step 1) and
> >repeat all steps
> >
> > I have searched for examples using R and Python together, and rpy2 seems
> > like the way to go, but is there another (easier) way to do it?  I would
> > highly appreciate the help.
> >
> > Thanks in advance,
> >
> > Kankana
> >