[R] Help If

2013-08-29 Thread Mª Teresa Martinez Soriano
Hi everyone, and sorry for my question. I would like to use if in an example
like this:


If((condition1 and condition2) Or (condition 3 and condition4)) {print uhvef}


But I don't know how to write it correctly.

Thanks in advance 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help If

2013-08-29 Thread Rui Barradas

Hello,

"and" is && ; "or" is || ; and print() needs parentheses around its argument:

if((condition1 && condition2) || (condition3 && condition4)) {print(uhvef)}


Hope this helps,

Rui Barradas
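
A minimal sketch of the corrected form with made-up values. The variables x and y and the four conditions are placeholders for illustration, not from the original post. Note that && and || work on single TRUE/FALSE values, which is what if() expects; & and | are the element-wise versions for vectors.

```r
# Placeholder values, purely for illustration
x <- 5
y <- 12

condition1 <- x > 0
condition2 <- x < 10
condition3 <- y > 100
condition4 <- y < 200

# (condition1 AND condition2) OR (condition3 AND condition4)
result <- (condition1 && condition2) || (condition3 && condition4)

if (result) {
  print("at least one pair of conditions holds")
}
```

Because || short-circuits, the second pair is not even evaluated here once the first pair turns out to be TRUE.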



Re: [R] Help If

2013-08-29 Thread Zsurzsa Laszlo
Hey

if (((1 == 1) && (2 == 2)) || (3 == 3)) { print("hello world") }

-
- László-András Zsurzsa,-
- Msc. Infromatics, Technical University Munich, Germany -
- Scientific Employee, TUM -
-




[R] XLSX package + Excel creation question

2013-08-29 Thread Zsurzsa Laszlo
Dear R users,


I have a question about the xlsx package. It can create Excel files, color
cells, and so on.

My question is whether it is possible to color only part of the data held
in a cell. Let's assume I have the following data: 167,153,120,100 and I
want to color red everything that is greater than 120. How can I achieve
this using R?

An example file setup with a few lines is in the attachment. (The SEL_MASS
column can be used as an example.)




Thank you in advance,
-
- László-András Zsurzsa,-
- Msc. Infromatics, Technical University Munich, Germany -
- Scientific Employee, TUM -
-


Re: [R] Plotting time vs number

2013-08-29 Thread Jim Lemon

On 08/29/2013 02:19 PM, mohan.radhakrish...@polarisft.com wrote:

Hi,
...
The plots are all there but the x-axis labels are not. The only axis
labels are '12:30', '13:30' and '14:30'.

I think I need to use your code to get all the values.



Hi Mohan,
Try this:

plot(strptime(data$Time,"%H:%M:%S"),data$Kbytes,pch=0,
 type="b",col="red",col.axis="red", ylab="",
 xlab="",las=2,lwd=2.5,xaxt="n")
library(plotrix)
staxlab(at=as.numeric(strptime(data$Time,"%H:%M:%S")),
 labels=as.character(data$Time),nlines=3)

Jim
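
For readers without plotrix, roughly the same idea can be sketched in base R: suppress the default x-axis with xaxt="n" and then draw one label per observation with axis(). The data frame below is made up and only stands in for the poster's data (columns Time and Kbytes are taken from the thread, the values are not).

```r
# Made-up data standing in for the poster's `data`
data <- data.frame(
  Time = c("12:30:00", "12:45:00", "13:00:00", "13:15:00"),
  Kbytes = c(120, 340, 290, 410),
  stringsAsFactors = FALSE
)

# Parse the clock times; strptime() fills in today's date
when <- as.POSIXct(strptime(data$Time, "%H:%M:%S"))

pdf(NULL)  # null device, so the sketch also runs without a display
plot(when, data$Kbytes, type = "b", xaxt = "n", xlab = "", ylab = "Kbytes")
# One tick and label per observation, drawn perpendicular to the axis
axis(1, at = as.numeric(when), labels = data$Time, las = 2)
dev.off()
```

Unlike staxlab(), axis() does not stagger the labels, so with many time points some labels may be dropped to avoid overlap.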



[R] Unsuccessful beginner's struggle with lm

2013-08-29 Thread David Epstein
I have two data frames, train and response. Here is my attempt to do a
linear regression. All entries of both data frames are numeric. I am
expecting the intercept value to lie between 2 and 3 (in particular,
non-zero).

Here is a record of my interaction with R:

> class(response)
[1] "data.frame"
> c(nrow(response),ncol(response))
[1] 1389    1
> class(train)
[1] "data.frame"
> c(nrow(train),ncol(train))
[1] 1389  256
> beta.lm <- lm(response ~ train)
Error in model.frame.default(formula = response ~ train, drop.unused.levels
= TRUE) :
  invalid type (list) for variable 'response'

What elementary syntax error am I making in my call to lm? And why does R
think at first that the class of response is data.frame, but that its
class is list when I call lm?

Thanks
David



Re: [R] Sensitivy / Specificity and nulls

2013-08-29 Thread Michael Dewey

At 15:18 28/08/2013, Donald Catanzaro wrote:

Good Day All,

I am working with a diagnostic test and comparing the new test to an old
test.  Normally I would be able to calculate sensitivity and specificity
quite easily.

However, the 'gold standard' that I am comparing my new diagnostic with is
really 'gold-plated' in that sometimes the 'gold standard' fails completely
and I have no data from the 'gold standard' but I might have data from the
diagnostic test.  Of course sometimes my new diagnostic fails but I have
data from my 'gold standard'


I am not sure I completely understand the situation, my crystal ball 
is becoming rather opaque, but it sounds as though you are looking 
for some form of meta-analysis of diagnostic tests when there is no 
reference standard. HSROC, available from CRAN, claims to provide 
this although I have never used it myself.




To me this really starts moving towards classification but I cannot seem to
find the appropriate calculations.

Can someone point me to some web resources to determine the appropriate
method to be able to deal with the NULLs ?  Resources within the medical
realm would be better (because the rest of the folks would understand them
better) but not required.

--
- Don

Donald Catanzaro PhD
dgcatanz...@gmail.com
16144 Sigmond Lane
Lowell, AR 72745
479-751-3616



Michael Dewey
i...@aghmed.fsnet.co.uk
http://www.aghmed.fsnet.co.uk/home.html



Re: [R] Unsuccessful beginner's struggle with lm

2013-08-29 Thread Duncan Murdoch

On 13-08-29 8:23 AM, David Epstein wrote:

I have two data frames, train and response. Here is my attempt to do a
linear regression. All entries of both data frames are numeric. I am
expecting the intercept value to lie between 2 and 3 (in particular,
non-zero).


lm expects the variables in the formula to be numeric vectors (or 
factors).  They are often columns of a dataframe, but they won't be 
dataframes themselves.





What elementary syntax error am I making in my call to lm? And why does R
think at first that the class of response is data.frame, but that its
class is list when I call lm?


dataframes are lists with some extra rules added.  lm() is just 
reporting the low level type, rather than the high level one.


The way to do what you want is to include the response as a column in 
the same dataframe that includes the predictor variables.  If you call 
the dataframe df and the response column name response, then the lm 
call would look like


lm(response ~ ., data=df)

The . here means all the other columns.  You could also list them 
explicitly, but 256 of them sounds like a lot...


Duncan Murdoch
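
A small sketch of the approach described above, using simulated data. The column names x1..x3 and y, the coefficients, and the sample size are made up for illustration; the original poster's data had 1389 rows and 256 predictors.

```r
set.seed(1)

# Simulated stand-ins for the poster's two data frames
train <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))
response <- data.frame(
  y = 2.5 + train$x1 - 0.5 * train$x2 + rnorm(50, sd = 0.1)
)

# Put the response into the same data frame as the predictors
df <- cbind(response = response[[1]], train)

# "." stands for every column of df other than the response
fit <- lm(response ~ ., data = df)
coef(fit)
```

Here response[[1]] pulls the single column out of the one-column data frame as a plain numeric vector, which is the kind of object lm() expects on the left-hand side.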



Re: [R] Narrowing values collected from .txt file

2013-08-29 Thread jim holtman
Here is how I would do it, since you are reading in the entire file.  This
breaks on each Flow Budget section, extracts the RECHARGE values, and
puts them in a list named after the Flow Budget:


> # read entire file
> input <- readLines("C:\\Users\\jh52822\\Downloads\\MCR_Budgets.txt")
> # determine the lines of interest
> indx <- grep("Flow Budget for Zone|RECHARGE =", input)
> # remove everything else
> input <- input[indx]
> # split by Flow Budget
> sep <- split(input, cumsum(grepl("Flow Budget", input)))
> # process the list extracting data
> result <- lapply(sep, function(.lines){
+ as.numeric(sub(".*=(.*)", "\\1", .lines[-1]))
+ })
>
> # extract the names for each Flow
> fNames <- sapply(sep, '[', 1)
>
> # add to the list
> names(result) <- fNames
> result
$` Flow Budget for Zone  1 at Time Step   1 of Stress Period   2`
[1] 128980  0  0  0

$` Flow Budget for Zone  2 at Time Step   1 of Stress Period   2`
[1] 274160  0  0  0

$` Flow Budget for Zone  3 at Time Step   1 of Stress Period   2`
[1] 81084 0 0 0

$` Flow Budget for Zone  1 at Time Step   1 of Stress Period   3`
[1] 128980  0  0  0

$` Flow Budget for Zone  2 at Time Step   1 of Stress Period   3`
[1] 274160  0  0  0

$` Flow Budget for Zone  3 at Time Step   1 of Stress Period   3`
[1] 81084 0 0 0

$` Flow Budget for Zone  1 at Time Step   1 of Stress Period   4`
[1] 128980  0  0  0

$` Flow Budget for Zone  2 at Time Step   1 of Stress Period   4`
[1] 274160  0  0  0

$` Flow Budget for Zone  3 at Time Step   1 of Stress Period   4`
[1] 81084 0 0 0

$` Flow Budget for Zone  1 at Time Step   1 of Stress Period   5`
[1] 128980  0  0  0

$` Flow Budget for Zone  2 at Time Step   1 of Stress Period   5`
[1] 274160  0  0  0

$` Flow Budget for Zone  3 at Time Step   1 of Stress Period   5`
[1] 81084 0 0 0

$` Flow Budget for Zone  1 at Time Step   1 of Stress Period   6`
[1] 128980  0  0  0

$` Flow Budget for Zone  2 at Time Step   1 of Stress Period   6`
[1] 274160  0  0  0

$` Flow Budget for Zone  3 at Time Step   1 of Stress Period   6`
[1] 81084 0 0 0

$` Flow Budget for Zone  1 at Time Step   1 of Stress Period   7`
[1] 128980  0  0  0

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
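
The key step in the code above is split(input, cumsum(grepl(...))): every header line bumps a running counter, so each line is tagged with the number of the section it belongs to. A self-contained sketch on a few made-up lines (the real file layout is only assumed to look roughly like this):

```r
# Made-up lines imitating the header / "RECHARGE =" pattern
lines <- c(
  " Flow Budget for Zone  1 at Time Step   1 of Stress Period   2",
  "   RECHARGE =   128980.0",
  "   RECHARGE =        0.0",
  " Flow Budget for Zone  2 at Time Step   1 of Stress Period   2",
  "   RECHARGE =   274160.0",
  "   RECHARGE =        0.0"
)

# TRUE on header lines; the running sum gives a section id per line
is_header <- grepl("Flow Budget", lines)
group_id <- cumsum(is_header)

# One character vector per section, header first
sections <- split(lines, group_id)

# Drop the header, keep the text after "=", convert to numbers
values <- lapply(sections, function(s) {
  as.numeric(sub(".*=(.*)", "\\1", s[-1]))
})
names(values) <- trimws(sapply(sections, `[`, 1))
values
```

Since the file is read once and everything else is vectorised, there is no per-section re-read of the file.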


On Wed, Aug 28, 2013 at 7:45 PM, Morway, Eric emor...@usgs.gov wrote:
 It looks as though the attachment to my last post didn't make the cut (or
 at least it's not appearing on the Nabble forum), for one reason or
 another.  I'm reattaching a smaller version so folks can run the code
 (won't work without a text file to operate on).  So, while the attached
 file is only a small sample of the larger file and will therefore run
 quickly, it would still be helpful if someone knows a more efficient
 approach to the code in the previous post.


 On Wed, Aug 28, 2013 at 11:28 AM,


 A relatively concise, commented, working solution to the problem
 originally motivating this thread was found (below).  I suspect the
 approach I've taken has a major inefficiency through the use of the
 scan statement appearing inside the function g.  The way the code
 works right now, it has to re-open and read the file 'length(matched)
 times' rather than sequentially reading through to the next pertinent
 section of the txt file.  Does anyone have a more efficient approach in
 mind so I don't have to wait 1/2 hour to get the results? (The only
 adjustment to the code that follows is to point txt to wherever the
 attached file is placed)


 # where is the file?
 txt <- "c:/temp/MCR_Budgets.txt"

 # Demarcation header
 hdr_str <- "Flow Budget for Zone  2"

 # string to identify lines with desired values
 srch_str <- "  RECHARGE ="

 # retrieves desired values
 g <- function(txt_con, hdr_str, srch_str, from, to, ...) {

   L <- readLines(txt_con)

   # matched contains the line numbers with hdr_str
   matched <- grep(hdr_str, L, value = FALSE, ...)

   # initialize output vector
   fetched_list <- numeric()

   # for each instance of hdr_str, loop
   for(i in 1:(length(matched))){

     # retrieve a section of text following each hdr_str
     snippet <- scan(txt_con, what=character(), skip=matched[i]-1, n=42,
                     sep='\n')

     # get data within the short section of retrieved text
     fetched <- grep(srch_str, snippet, value=TRUE)

     # append output vector for plotting time series
     fetched_list <- c(fetched_list, as.numeric(substring(fetched, from,
                                                           to)))

     # monitor
     print(i)
   }

   # return desired values
   as.numeric(fetched_list)
 }

 # The results of system.time reflect full 147 MB file,
 # only half of which is attached.
 system.time(
   rech_z2 <- g(txt,hdr_str,srch_str,37,51)
 )
 #   user  system elapsed
 #1740.48   36.08 1825.77



 

[R] Fwd: Narrowing values collected from .txt file

2013-08-29 Thread jim holtman
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



-- Forwarded message --
From: jim holtman jholt...@gmail.com
Date: Thu, Aug 29, 2013 at 8:43 AM
Subject: Re: [R] Narrowing values collected from .txt file
To: Morway, Eric emor...@usgs.gov


FYI, I duplicated your data up to a 100 MB file and it took less than 10 seconds
to process.


Re: [R] XLSX package + Excel creation question

2013-08-29 Thread Rainer Hurling
Am 29.08.2013 12:08 (UTC+1) schrieb Zsurzsa Laszlo:
 Dear R users,
 
 I have a question about the xlsx package. It's possible to create excel
 files and color cells and etc.

Yes, with package xlsx you can colourize your data sheets, even the
fonts. See for example ?CellStyle .

A good demonstration of the capabilities is on
http://tradeblotter.wordpress.com/2013/05/02/writing-from-r-to-excel-with-xlsx/

 
 My question is whether it is possible to color only part of the
 data held in a cell. Let's assume I have the following data:
 167,153,120,100 and I want to color red everything that is greater than
 120. How can I achieve this using R?
 
 Example file setup with a few lines in attachment. (SEL_MASS column can be
 used for example)

Attachment missing ...

HTH,
Rainer

 


Re: [R] XLSX package + Excel creation question

2013-08-29 Thread Zsurzsa Laszlo
First of all, thank you for the quick response.

I know I can color and set up every cell. I will take another look at
*CellStyle*, but is it possible, for example, to write an array to a single
cell with different colors for some of the data? Basically, the color
depends on the data.

-
- László-András Zsurzsa,-
- Msc. Infromatics, Technical University Munich, Germany -
- Scientific Employee, TUM -
-




[R] Few doubts about ANOVA

2013-08-29 Thread bala chand
Hi,
   can you please give a brief explanation of ANOVA?

  What is the purpose of the null hypothesis in ANOVA?

 How can we predict future values from existing data?





Re: [R] Calculation with Times Series

2013-08-29 Thread arun
Hi,
Maybe this helps:

ts1 <- ts(1:20)
ts2 <- ts(1:25)
ts1[-(1:3)] <- ts1[-(1:3)] + ts2[1:17]

as.numeric(ts1)
# [1]  1  2  3  5  7  9 11 13 15 17 19 21 23 25 27 29 31 33 35 37


A.K.
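
The original question also asked for a way to change the offset with a variable. A possible generalisation of the lines above is sketched here; the helper name add_lagged and its arguments are made up for illustration and are not from any package.

```r
# Hypothetical helper: add element (i - lag) of `b` to element i of `a`,
# for every i where both positions exist
add_lagged <- function(a, b, lag = 3) {
  idx <- seq_along(a)
  idx <- idx[idx > lag & (idx - lag) <= length(b)]
  a[idx] <- a[idx] + b[idx - lag]
  a
}

ts1 <- ts(1:20)
ts2 <- ts(1:25)

# lag = 3 pairs the 10th element of ts1 with the 7th element of ts2
res <- add_lagged(ts1, ts2, lag = 3)
as.numeric(res)
```

With lag = 3 this gives the same values as the code above; the first three elements of ts1 are left unchanged because they have no partner in ts2.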


Hey everyone, 

I'm an absolute beginner in R and need some help with an exercise:

I want to do ordinary calculations with 2 time series. The issue
is that I want to combine elements at different positions of the two series.
Let me give you an example:

I want to sum, say, the 10th element of time series 1 with
the 7th element of time series 2, the 9th element of TS 1 with the 6th
element of TS 2, the 8th element of TS 1 with the 5th element of TS 2, ...

This pattern of summation should run over the whole time series.

Is there a function that allows me to do this, ideally a
function in which I can change the difference between the positions with a
variable?

Thanks a lot for your support. I'm thankful for any advice!



Re: [R] XLSX package + Excel creation question

2013-08-29 Thread Rainer Hurling
Am 29.08.2013 15:03 (UTC+1) schrieb Zsurzsa Laszlo:
 First of all, thank you for the quick response.
 
 I know I can color and set up every cell. I will take another look at
 *CellStyle*, but is it possible, for example, to write an array to a single
 cell with different colors for some of the data? Basically, the color
 depends on the data.

As far as I know there is no ready to use functionality to mask groups
of selected cells. You have to write your own function, which selects
the right cells and changes their style with setCellStyle(cell, cellStyle).

Some hints are given in the examples section of ?CellStyle.
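
The selection step of such a function can be sketched in base R. This only works out which of the values from the example exceed 120; applying a style with setCellStyle() is not shown here, and the variable names are made up.

```r
# The example data from the question, as it might sit in one cell
cell_text <- "167,153,120,100"

# Split the text into its individual numbers
values <- as.numeric(strsplit(cell_text, ",", fixed = TRUE)[[1]])

# TRUE for every value that should be shown in red
highlight <- values > 120

data.frame(value = values, highlight = highlight)
```

If each value is written to its own cell, the highlight flags tell the styling step which cells to change.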

 


Re: [R] Sensitivy / Specificity and nulls

2013-08-29 Thread Donald Catanzaro
Hi All,

I apologize for the opaqueness and I will try to make it clearer.

I am comparing two diagnostic tests, G (gold standard) and N (new).  Both
are real tests, real experiments.  G is currently the gold standard because
it is the best test available, not because it is a perfect test.  G is a
growth-based test, and sometimes the test fails (the sample is contaminated
with multiple species of bacteria, so there are no results).  The new test
is molecular based: if DNA is present you get a result; however, sometimes
this test fails as well (the quality control parameters have been exceeded).
Both tests are one-shots, so there is no opportunity for retesting.

Thus what happens is that sometimes G fails while N detects DNA in the
sample, and sometimes the reverse is true: G has growth and N fails.

So I guess the simplest way to think of this is that both G and N have some
level of measurement error that is unknown but that I would like to account
for in my calculations.  So rather than having data for a 'traditional' 2x2
matrix for sensitivity/specificity, my data (where 1 = positive, 0 =
negative, and F = failed test) look like this:

G N
1 0
1 1
F 1
0 F
1 1
1 1
1 0
F 0







-- 
- Don

Donald Catanzaro PhD
dgcatanz...@gmail.com
16144 Sigmond Lane
Lowell, AR 72745
479-751-3616
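
As a starting point only, the eight example rows above can be cross-tabulated with the failures kept as their own category, so that it is at least visible how many pairs are lost. The "naive" sensitivity below uses complete pairs only and treats G as if it were error-free; it is an illustration of the bookkeeping, not a recommended estimator for an imperfect reference test.

```r
# The eight example pairs from the post ("F" = failed test)
G <- c("1", "1", "F", "0", "1", "1", "1", "F")
N <- c("0", "1", "1", "F", "1", "1", "0", "0")

# 3 x 3 table that keeps failures as a separate level
lev <- c("1", "0", "F")
tab <- table(G = factor(G, levels = lev), N = factor(N, levels = lev))
print(tab)

# Pairs where both tests produced a result
complete <- G != "F" & N != "F"
n_complete <- sum(complete)
n_failed <- sum(!complete)

# Naive sensitivity of N among complete pairs that G calls positive
g_pos <- complete & G == "1"
naive_sens <- sum(N[g_pos] == "1") / sum(g_pos)
naive_sens
```

In this tiny example no complete pair has G = 0, so a naive specificity cannot be computed at all, which shows how quickly the failures eat into the usable data.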



Re: [R] XLSX package + Excel creation question

2013-08-29 Thread Zsurzsa Laszlo
I understand your response, but it does not solve the problem. I'm aware
that one can color every cell in an Excel file using one's own
algorithm.

The question was whether I can write my data to a *single* cell and use
different formatting for every piece of data.

-
- László-András Zsurzsa,-
- Msc. Infromatics, Technical University Munich, Germany -
- Scientific Employee, TUM -
-




[R] Help R

2013-08-29 Thread Mª Teresa Martinez Soriano
Hi to everyone,
 
I would like to replace some values in a data.frame (D)

> str(D)
'data.frame':   116 obs. of  10 variables:
 $ X.     : int  1108 1591 3408 3872 5823 8099 10640 12600 14680 14698 ...
 $ media  : num  22 86.6 807 103.2 73 ...
 $ IE.2003: num  32 92 166 237 161 ...
 $ IE.2004: num  63 122.8 290 117.8 73.6 ...
 $ IE.2005: num  60 277 302 154 134 ...
 $ IE.2006: num  39 87 322 113 70 ...
 $ IE.2007: num  4 95 621 116 80 ...
 $ IE.2008: num  8 94 1071 90 74 ...
 $ IE.2009: num  16 81 1301 94 69 ...
 $ IE.2010: num  5 76 1225 1911 72 ...

> D

      X.      media IE.2003 IE.2004 IE.2005 IE.2006 IE.2007 IE.2008 IE.2009 IE.2010
1   1108       22.0    32.0    63.0    60.0      39     4.0       8      16     5.0
2   1591       86.6    92.0   122.8   276.6      87    95.0      94      81    76.0
3   3408      807.0   166.0   290.0   302.0     322   621.0    1071    1301  1225.0
4   3872  103.25000   237.2   117.8   154.4     113   116.0      90      94  1911.2
5   5823       73.0   160.6    73.6   133.6      70    80.0      74      69    72.0
6   8099  125.16667   169.0   206.0   196.0     161   150.0      94      72    78.0
7  10640       67.3   494.8   168.2   424.8     476   670.6      74      77    51.0
8  12600     2417.0  1958.0  1871.0  1960.0    2383  2453.0    2506    2758  2442.0
9  14680       38.0   142.2    46.0    30.0      61   404.0      42      19   243.8
10 14698  698.16667   505.0   482.0   553.0     664   847.0     800     679   646.0



What I really want to do is:

 for( i in 1: nrow(D))
 {
   for( j in 5:ncol(D))
   {
D[((D[i,j]/D[i,2])1.5)]15999)]-100
   }
 
 }






Error in `[<-.data.frame`(`*tmp*`, (D[i, j] 15999), value = 1e+06) : 
  missing values are not allowed in subscripted assignments of data frames
  


Re: [R] Help R

2013-08-29 Thread Zsurzsa Laszlo
Do you have NA/NaN values in your data set? If so, either check for them
with an if or substitute them with a value that fits your needs.

I hope I understood your problem correctly.
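
A minimal sketch of such a check, on a small made-up data frame (the
values, the column selection and the replacement value 100 are assumptions
for illustration, not the poster's data). The condition is built so that
it is never NA, assuming 'media' itself has no missing values:

```r
# Toy data: made-up values, not the poster's data set
D <- data.frame(media   = c(10, 20, 30),
                IE.2005 = c(30, NA, 31),
                IE.2006 = c(12, 50, NA))
year_cols <- c("IE.2005", "IE.2006")
new_value <- 100   # assumed replacement value

for (j in year_cols) {
  # TRUE only where the value is present and exceeds 1.5 times 'media';
  # rows with a missing value give FALSE instead of NA
  hit <- !is.na(D[[j]]) & D[[j]] / D$media > 1.5
  D[[j]][hit] <- new_value
}
D
```

Missing cells are left as NA; only cells that are present and meet the
condition are overwritten.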

-
- László-András Zsurzsa,-
- Msc. Infromatics, Technical University Munich, Germany -
- Scientific Employee, TUM -
-


On Thu, Aug 29, 2013 at 3:49 PM, Mª Teresa Martinez Soriano 
teresama...@hotmail.com wrote:

 Hi to everyone,

 I would like to replace some values in a data.frame (D)



Re: [R] Help R

2013-08-29 Thread arun


HI,
Your code is not clear:
 for( i in 1: nrow(D)) 
 { 
   for( j in 5:ncol(D)) 
   { 
    D[((D[i,j]/D[i,2])1.5)]15999)]-100  
##  1.5)]15999)]
   }
  
^^^
  
 } 

D <- structure(list(X. = c(1108L, 1591L, 3408L, 3872L, 5823L, 8099L, 
10640L, 12600L, 14680L, 14698L), media = c(22, 86.6, 807, 103.25, 
73, 125.16667, 67.3, 2417, 38, 698.16667), IE.2003 = c(32, 
92, 166, 237.2, 160.6, 169, 494.8, 1958, 142.2, 505), IE.2004 = c(63, 
122.8, 290, 117.8, 73.6, 206, 168.2, 1871, 46, 482), IE.2005 = c(60, 
276.6, 302, 154.4, 133.6, 196, 424.8, 1960, 30, 553), IE.2006 = c(39L, 
87L, 322L, 113L, 70L, 161L, 476L, 2383L, 61L, 664L), IE.2007 = c(4, 
95, 621, 116, 80, 150, 670.6, 2453, 404, 847), IE.2008 = c(8L, 
94L, 1071L, 90L, 74L, 94L, 74L, 2506L, 42L, 800L), IE.2009 = c(16L, 
81L, 1301L, 94L, 69L, 72L, 77L, 2758L, 19L, 679L), IE.2010 = c(5, 
76, 1225, 1911.2, 72, 78, 51, 2442, 243.8, 646)), .Names = c("X.", 
"media", "IE.2003", "IE.2004", "IE.2005", "IE.2006", "IE.2007", 
"IE.2008", "IE.2009", "IE.2010"), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10"))
# values in columns 5:10 that are more than 1.5 times the row's 'media'
D[,-c(1:4)][D[,-c(1:4)]/D[,2] > 1.5]
# [1]   60.0  276.6  133.6  196.0  424.8   39.0  476.0   61.0  670.6  404.0
#[11] 1301.0 1225.0 1911.2  243.8
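
To see where those values sit rather than just what they are, which() with
arr.ind = TRUE can be applied to the same kind of ratio matrix. This is
only a sketch on a small made-up data frame; the column names and numbers
are assumptions for illustration:

```r
# Toy data: made-up values
D <- data.frame(media   = c(10, 20, 30),
                IE.2005 = c(30, 25, 31),
                IE.2006 = c(12, 50, 90))

# Each yearly column divided by the row's 'media'
ratio <- as.matrix(D[, -1]) / D$media

# Row and column positions (within the yearly columns) of the flagged cells
flagged <- which(ratio > 1.5, arr.ind = TRUE)
flagged

# The flagged values themselves, in column-major order
values <- as.matrix(D[, -1])[ratio > 1.5]
values
```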


A.K.
 








Re: [R] Few doubts about ANOVA

2013-08-29 Thread John Kane
Looks like school is starting up again.

We don't usually help with homework especially at this level. Read a text book

John Kane
Kingston ON Canada


 -Original Message-
 From: bal.chan...@gmail.com
 Sent: Thu, 29 Aug 2013 15:57:29 +0530
 To: r-help@r-project.org
 Subject: [R] Few doubts about ANOVA
 
 Hi,
 can you please give a brief explanation of ANOVA?
 
 What is the purpose of the null hypothesis in ANOVA?
 
 How can we predict future values from existing data?
 


Re: [R] Help R

2013-08-29 Thread Jose Iparraguirre
As said by arun, the code is not clear.
Ma Teresa, what is it that you actually want to do?
Regards,
José


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of arun
Sent: 29 August 2013 15:12
To: R help
Subject: Re: [R] Help R



HI,
Your code is not clear:

Re: [R] Plotting time vs number

2013-08-29 Thread John Kane
Please use dput() to supply data. It's a lot easier for readers to just copy 
and paste into R.

I have no idea of what variables are associated with the columns below.
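
As a rough illustration of what that looks like (the two-row data frame
here is made up, with column names borrowed from the thread): dput()
prints R code that rebuilds the object, so a reader can paste it straight
into a session.

```r
# Made-up sample standing in for the poster's data
x <- data.frame(Time = c("11:42:02", "11:43:42"),
                Kbytes = c(2691296, 2691396),
                stringsAsFactors = FALSE)

# Prints a structure(...) call that can be pasted into a message
dput(x)

# Round trip: evaluating the printed text gives the same object back
txt <- paste(capture.output(dput(x)), collapse = "\n")
y <- eval(parse(text = txt))
identical(x, y)
```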

John Kane
Kingston ON Canada


 -Original Message-
 From: mohan.radhakrish...@polarisft.com
 Sent: Thu, 29 Aug 2013 09:49:36 +0530
 To: jholt...@gmail.com
 Subject: Re: [R] Plotting time vs number
 
 Hi,
 
 plot(strptime(data$Time,"%H:%M:%S"),data$Kbytes,pch=0,type="b",col =
 "red", col.axis="red", ylab="", xlab="",las=2,lwd=2.5,cex.axis=1.5)
 title("",cex.main=3,xlab="Seconds", line=5.2,ylab="Kbytes", cex.lab=2,1)
 
 Hope I am not simplifying this in a bad way. These  lines plot everything
 properly except the number of labels on the x-axis.
 
 The plots are all there but the x=axis labels are not there. The graph
 labels are only '12:30', '13:30' and '14:30'
 
 I think I need to use your code to get all the values.
 
 13:18:45 2691296 1601996 1584936
 13:20:25 2691296 1603548 1586488
 13:22:05 2691296 1603556 1586496
 13:23:45 2691296 1606760 1589700
 13:25:25 2691296 1611020 1593960
 13:27:05 2691296 1614348 1597288
 13:28:45 2691296 1614356 1597296
 13:30:25 2691296 1614380 1597320
 13:32:05 2691296 1614388 1597328
 13:33:45 2691296 1614392 1597332
 13:35:25 2691296 1614408 1597352
 13:37:05 2691296 1614416 1597356
 13:38:45 2691296 161 1597384
 13:40:26 2691296 1614624 1597564
 13:42:06 2691296 1614716 1597660
 13:43:46 2691296 1614740 1597680
 13:45:26 2691296 1614744 1597684
 13:47:06 2756832 1631728 1614668
 13:48:46 2756832 1631768 1614708
 13:50:26 2756832 1631892 1614832
 
 Tell me what you want to do, not how you want to do it.
 
 If I don't show working code I don't get any response from the forum. So
 I need basic code to show how it works :-)
 
 Thanks,
 Mohan
 
 
 
 From:   jim holtman jholt...@gmail.com
 To: mohan.radhakrish...@polarisft.com
 Cc: Jannis bt_jan...@yahoo.de, R mailing list
 r-help@r-project.org, r-help-boun...@r-project.org
 Date:   08/29/2013 01:32 AM
 Subject:Re: [R] Plotting time vs number
 
 
 
 What you need to do is to create the plot without an x-axis (xaxt =
 'n') and then add your own values on the axis with 'axis'
 
 
 x <- read.table(text = "  Time  Kbytes RSS Dirty_Mode
 1 11:42:02 2691296 1599796 1582736
 2 11:43:42 2691396 1599804 1582744
 3 11:45:22 2691496 1599804 1582744
 4 11:47:02 2691596 1599812 1582752
 5 11:48:42 2691696 1599816 1582756
 6 11:50:22 2691796 1599820 1582760", as.is = TRUE, header = TRUE)
 x$tod <- as.POSIXct(paste('2013-08-28', x$Time))
 plot(x$tod,x$Kbytes,type="b",col = "blue",  ylab="", xaxt = 'n',
 xlab="",las=2,lwd=2.5, lty=1,cex.axis=1.5)
 # now plot your times
 axis(1, at = x$tod, labels = x$Time, las = 2)
 Jim Holtman
 Data Munger Guru
 
 What is the problem that you are trying to solve?
 Tell me what you want to do, not how you want to do it.
 
 
 On Wed, Aug 28, 2013 at 8:35 AM,  mohan.radhakrish...@polarisft.com
 wrote:
 Hi,
 
 plot(strptime(data$Time,"%H:%M:%S"),data$Kbytes,type="l",col = "blue",
 ylab="", xlab="",las=2,lwd=2.5, lty=1,cex.axis=1.5)
 
 strptime functions draws a proper graph but now all the time values are
 not in the x-axis.
 
 1 11:42:02 2691296
 2 11:43:42 2691396
 3 11:45:22 2691496
 4 11:47:02 2691596
 5 11:48:42 2691696
 
 I mean that each time value is not shown. It shows only a few values.
 Each
 individual pair is not plotted.
 
 Thanks.
 
 
 
 From:   mohan.radhakrish...@polarisft.com
 To: Jannis bt_jan...@yahoo.de
 Cc: r-help@r-project.org, r-help-boun...@r-project.org
 Date:   08/28/2013 05:39 PM
 Subject:Re: [R] Plotting time vs number
 Sent by:r-help-boun...@r-project.org
 
 
 
 Hi Jannis,
 
 I have tried that. It doesn't work. The jumps are not there in my other
 graphs that use numbers. Does this have anything to do with time series?
 
 Can I just convert this time representation into milliseconds and plot
 the graph? The x-axis should still show this time format though
 (names.arg?).
 
 
 this.dir <- dirname(parent.frame(2)$ofile)
 setwd(this.dir)
 
 
 data = read.table("D:\\Log analysis\\pmapdata-node1.1",header=F)
 colnames(data) <- c("Time","Kbytes","RSS","Dirty Mode")
 
 
 png(
   "pmapanalysis4705.png",
   width = 2224, height = 768)
 par(mar=c(5, 6, 5, 8) + 0.1)
 
 plot(data$Time,names.org="Test",data$Kbytes,type="b",col = "blue",
 ylab="", xlab="",las=2,lwd=2.5, lty=1,cex.axis=1.5)
 
 box()
 
 dev.off()
 
 
 
 
 From:   Jannis bt_jan...@yahoo.de
 To: r-help@r-project.org
 Date:   08/28/2013 05:32 PM
 Subject:Re: [R] Plotting time vs number
 Sent by:r-help-boun...@r-project.org
 
 
 
 Hi Mohan,
 
 I am not sure whether I understand your question correctly. Without
 being able to easily reproduce your plot, I would guess that the
 breaks come from the type='b' option you chose. When you use type
 ='l', the line would be continuous (though the jumps 

Re: [R] Narrowing values collected from .txt file

2013-08-29 Thread Morway, Eric
On Thu, Aug 29, 2013 at 5:40 AM, jim holtman jholt...@gmail.com wrote:

 Here is how I would do it since you are reading in the entire file.  This
 breaks on each Flow Budget section, extracts the RECHARGE values and
 puts them in a list with the name of the Flow Budget:


I learned more R in studying your solution than I could've in a week
devoted to googling R.  Thank you for making short work of the problem.


 What is the problem that you are trying to solve?
 Tell me what you want to do, not how you want to do it.


To answer your question, I'm simply trying to plot various components of
the flow budget (e.g., recharge, lake seepage) for any zone that I name
through time.  For example, I tried altering your solution to restrict the
retrieved output to zone 2 only:

indx <- grep("Flow Budget for Zone  2|  RECHARGE =", input)

But this was wholly unsatisfactory because I got recharge for all the other
zones as well:

 [1]   RECHARGE =   0.12898E+06
 [2]   RECHARGE =0.
 [3]  Flow Budget for Zone  2 at Time Step   1 of Stress Period   2
 [4]   RECHARGE =   0.27416E+06
 [5]   RECHARGE =0.
 [6]   RECHARGE =81084.
 [7]   RECHARGE =0.
 [8]   RECHARGE =45295.
 [9]   RECHARGE =0.
[10]   RECHARGE =71834.
[11]   RECHARGE =0.
[12]   RECHARGE =97739.
[13]   RECHARGE =0.
[14]   RECHARGE =   0.12100E-01
[15]   RECHARGE =0.
[16]   RECHARGE =25350.
[17]   RECHARGE =0.
[18]   RECHARGE =6167.2
[19]   RECHARGE =0.
[20]   RECHARGE =28608.


My thinking at this point is to amend your original solution to account for
composite zones:

indx <- grep("Flow Budget for Zone|Flow Budget for Composite Zone|RECHARGE =", input)

and from this extract zone 2's RECHARGE, or composite zone 10's LAKE
SEEPAGE, etc.  So from the output as I now have it (example shown below),
how does one search this form of output for zone 2, or composite zone 10?



and leaving the rest of the R code as is, I get processed results not unlike
what you showed, only with composite zones taken into account (shown below).
So, the final step of what I'd like to do is to plot a time series of
RECHARGE (not including UZF RECHARGE) for Zone 2, or plot LAKE SEEPAGE
for Composite Zone 10 over all 574 stress periods.



result[1:100]
$` Flow Budget for Zone  1 at Time Step   1 of Stress Period   2`
[1] 128980  0

$` Flow Budget for Zone  2 at Time Step   1 of Stress Period   2`
[1] 274160  0

$` Flow Budget for Zone  3 at Time Step   1 of Stress Period   2`
[1] 81084 0

$` Flow Budget for Zone  4 at Time Step   1 of Stress Period   2`
[1] 45295 0

$` Flow Budget for Zone  5 at Time Step   1 of Stress Period   2`
[1] 71834 0

$` Flow Budget for Zone  6 at Time Step   1 of Stress Period   2`
[1] 97739 0

$` Flow Budget for Zone  7 at Time Step   1 of Stress Period   2`
[1] 0.0121 0.

...
$` Flow Budget for Zone 94 at Time Step   1 of Stress Period   2`
[1] 0 0

$` Flow Budget for Zone 95 at Time Step   1 of Stress Period   2`
[1] 0 0

$` Flow Budget for Zone 96 at Time Step   1 of Stress Period   2`
[1] 0 0

$` Flow Budget for *Composite Zone* CZ001  at Time Step   1 of
Stress Period   2`
[1] 587810  0

$` Flow Budget for *Composite Zone* CZ002  at Time Step   1 of
Stress Period   2`
[1] 725030  0

$` Flow Budget for *Composite Zone* CZ003  at Time Step   1 of
Stress Period   2`
[1] 1312800   0
...
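
One possible way to search a list of this shape is to match on its names.
The sketch below uses a small made-up stand-in for 'result' (the Stress
Period 3 entry and its value are invented, and the asterisks around
"Composite Zone" are left out on the assumption that they are e-mail
formatting residue). pick_zone() is a hypothetical helper, and taking the
first element of each vector as the RECHARGE value is also an assumption.

```r
# Made-up stand-in for the 'result' list shown above
result <- list(
  " Flow Budget for Zone  1 at Time Step   1 of Stress Period   2" = c(128980, 0),
  " Flow Budget for Zone  2 at Time Step   1 of Stress Period   2" = c(274160, 0),
  " Flow Budget for Composite Zone CZ001  at Time Step   1 of Stress Period   2" = c(587810, 0),
  " Flow Budget for Zone  2 at Time Step   1 of Stress Period   3" = c(280000, 0)  # invented
)

# Keep the entries whose name matches a pattern (hypothetical helper)
pick_zone <- function(res, pattern) res[grepl(pattern, names(res))]

# [[:space:]]+ tolerates the variable padding; requiring " at" after the
# number keeps "Zone 2" from also matching "Zone 20"
zone2 <- pick_zone(result, "for Zone[[:space:]]+2[[:space:]]+at")

# Stress period parsed from the end of each name
period <- as.integer(sub(".*Stress Period[[:space:]]+([0-9]+)[[:space:]]*$",
                         "\\1", names(zone2)))
recharge <- vapply(zone2, function(v) v[1], numeric(1))

plot(period, recharge, type = "b",
     xlab = "Stress period", ylab = "RECHARGE")
```

A composite zone could be picked the same way, for instance with a pattern
such as "Composite Zone[[:space:]]+CZ001".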



Re: [R] scale breaks

2013-08-29 Thread Shane Carey
Hello all,

I have decided to go ahead with gap.boxplot. I am trying to suppress the
axis labels, both x and y labels. I tried using axis.labels=NULL but it
would not work.

gap.boxplot(DATA$Conductivity~factor(DATA$UnitName_1),ylim=c(LOWER_Y_Conductivity,UPPER_Y_Conductivity_int),gap=gap_Conductivity,
axes=FALSE,col=colours,outwex=one,whisklty =
"solid",whisklwd=lwth,outcol= "black", outpch=dtsym,  outcex=dtsize,
axis.labels=NULL,range=1.5)

I would also like to display a y-axis value in the upper box, but I am
unable to do that and wonder whether it is possible with this package.
Is it possible to remove the horizontal lines of the upper and lower boxes
and replace the gap symbol with axis.break on the y-axis instead? Any advice
would be greatly appreciated!

Thanks




On Thu, Aug 29, 2013 at 9:38 AM, Shane Carey careys...@gmail.com wrote:

 Ok, thanks all :-)


 On Thu, Aug 29, 2013 at 2:39 AM, Jim Lemon j...@bitwrit.com.au wrote:

 On 08/29/2013 02:52 AM, Shane Carey wrote:

 Hi,

 Has anyone ever created scale breaks in R something like what is shown
 here
 in the section,
 Use a Scale Break

 http://www.r-bloggers.com/graphing-highly-skewed-data/

 Thanks

  Hi Shane,
 As Sarah answered, axis.break in the plotrix package is a start.
 gap.barplot (also in plotrix) does the whole thing. If they won't give you
 lunch until you do it that way, like Sarah I say, Go for it

 Jim




 --
 Shane




-- 
Shane



Re: [R] scale breaks

2013-08-29 Thread Shane Carey
I would also like to display a y-axis value in the upper box

I got this part working now.


On Thu, Aug 29, 2013 at 4:28 PM, Shane Carey careys...@gmail.com wrote:

 Hello all,

 I have decided to go ahead with gap.boxplot. I am trying to suppress the
 axis labels, both x and y labels. I tried using axis.labels=NULL but it
 would not work.




Re: [R] Help for a function

2013-08-29 Thread Rui Barradas

Hello,

You should post your questions to r-help@r-project.org, the odds of 
getting more and better answers are greater.


As for the question, try the following. Note that the functions now have 
an extra argument.




# Sum of Symptomes over the current row and the n - 1 rows before it
incub <- function(x, n = 2){
  x$Incubation <- 0
  x$Incubation[1] <- x$Symptomes[1]
  if(nrow(x) >= n)
    x$Incubation[2] <- sum(x$Symptomes[seq_len(n)])
  for(i in seq_len(nrow(x))[-seq_len(n)])
    x$Incubation[i] <- sum(x$Symptomes[i - (seq_len(n) - 1)])
  x
}


# Running total of Symptomes, restricted to the last n rows
contag <- function(x, n = 7){
  x$CONTAGIEUX <- 0
  for(i in 1:min(nrow(x), n))
    x$CONTAGIEUX[i] <- sum(x$Symptomes[1:i], na.rm = TRUE)
  for (i in seq_len(nrow(x))[-seq_len(n)]) {
    x$CONTAGIEUX[i] <- x$Symptomes[i] + x$CONTAGIEUX[i-1] -
      x$Symptomes[i-n]
  }
  x
}

incub_ARGENTINA <- incub(ARGENTINA, 2)
incub_ARGENTINA
contag_ARGENTINA <- contag(ARGENTINA, 7)
contag_ARGENTINA
derdata_ARGENTINA <- merge(contag_ARGENTINA, incub_ARGENTINA)
derdata_ARGENTINA
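
Both columns amount to a sum over the last n rows, so they could also be
computed without loops. The following is only a sketch: roll_sum() is a
hypothetical helper, the counts are made up, and it assumes Symptomes has
no missing values. It matches incub() for the default n = 2 and contag()
for any n.

```r
# Hypothetical helper: sum of v over the current and previous n - 1 elements
roll_sum <- function(v, n) {
  cs <- cumsum(v)
  # subtract the cumulative sum from n positions earlier (0 for the first n)
  cs - c(rep(0, min(n, length(v))), head(cs, -n))
}

symptomes  <- c(0, 1, 1, 2, 5, 19, 37, 100, 131)   # made-up counts
incubation <- roll_sum(symptomes, 2)
contagieux <- roll_sum(symptomes, 7)
cbind(symptomes, incubation, contagieux)
```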


Hope this helps,

Rui Barradas


Em 29-08-2013 08:31, teko maurice escreveu:



Dear Rui,
Long time!
I am writing to ask for advice and help, if you have time.
I am working on my PhD and am trying to model a pandemic.
I have posted on R-help but nobody answered me; maybe the question is too specific.
So I am coming back to you to see if you can help me.
Hello all,
I have the following data set for a pandemic virus.
  DATE Algeria Antigua.and.Barbuda ARGENTINA AUSTRALIA AUSTRIA Bahamas
1  2009-04-24   0   0 0 0   0   0
2  2009-04-26   0   0 0 0   0   0
3  2009-04-27   0   0 0 0   0   0
4  2009-04-28   0   0 0 0   0   0
5  2009-04-29   0   0 0 1   0   0
6  2009-04-30   0   0 0 1   0   0
7  2009-05-01   0   0 0 1   0   0
8  2009-05-02   0   0 0 1   0   0
9  2009-05-03   0   0 0 1   0   0
10 2009-05-04   0   0 0 1   0   0
11 2009-05-05   0   0 0 1   0   0
12 2009-05-06   0   0 0 1   0   0
13 2009-05-07   0   0 0 1   0   0
14 2009-05-08   0   0 0 1   0   0
15 2009-05-09   0   0 1 2   0   0
16 2009-05-10   0   0 1 2   0   0
17 2009-05-11   0   0 1 1   1   0
18 2009-05-12   0   0 1 1   1   0
19 2009-05-13   0   0 1 1   1   0
20 2009-05-14   0   0 1 1   1   0
21 2009-05-15   0   0 1 1   1   0
22 2009-05-16   0   0 1 1   1   0
23 2009-05-17   0   0 1 1   1   0
24 2009-05-18   0   0 1 1   1   0
25 2009-05-19   0   0 1 1   1   0
26 2009-05-20   0   0 1 1   1   0
27 2009-05-21   0   0 1 3   1   0
28 2009-05-22   0   0 1 7   1   0
29 2009-05-23   0   0 1 12   1   0
30 2009-05-25   0   0 2 16   1   0
31 2009-05-26   0   0 5 19   1   0
32 2009-05-27   0   0 19 39   1   0
33 2009-05-29   0   0 37   147   1   0
34 2009-06-01   0   0   100   297   1   1
35 2009-06-03   0   0   131   501   1   1
36 2009-06-05   0   0   147   876   2   1
37 2009-06-08   0   0   202  1051   5   1
38 2009-06-10   0   0   235  1224   5   2
39 2009-06-11   0   0   256  1307   7   1
40 2009-06-12   0   0   343  1307   7   1
41 2009-06-15   0   0   343  1823   7   1
42 2009-06-17   0   0   733  2112   7   2
43 2009-06-19   0   0   918  2199   8   2
44 2009-06-22   1   0  1010  2436   9   2
45 2009-06-24   3   2  1213  2857  12   6
46 2009-06-26   2   2  1391  3280  12   4
47 2009-06-29   2 

[R] A question about multivariate normal distribution with a diagonal covariance matrix

2013-08-29 Thread Marino David
Hi all R users:



I am a little bit confused about the following results. See as follows:



library(mvtnorm)



xMean <- c(24.12,66.92,77.65,131.97,158.8)

xVar <- c(0.01,0.06,0.32,0.18,0.95)

xFloor <- floor(xMean)



# use the "mvtnorm" package

p1 <- dmvnorm(xFloor,mean=xMean,sigma=diag(xVar))

p2 <- dmvnorm(xFloor[1],mean=xMean[1],sigma=matrix(xVar[1]))*dmvnorm(xFloor[2],mean=xMean[2],sigma=matrix(xVar[2]))*dmvnorm(xFloor[3],mean=xMean[3],sigma=matrix(xVar[3]))



# use the basic package "stats"

p3 <- dnorm(xFloor[1],mean=xMean[1],sd=sqrt(xVar[1]))*dnorm(xFloor[2],mean=xMean[2],sd=sqrt(xVar[2]))*dnorm(xFloor[3],mean=xMean[3],sd=sqrt(xVar[3]))



The result is: p1 = 2.006403e-05, p2 = p3 = 0.00099646. My question is why p1
does not equal p2 when the covariance matrix is diagonal, meaning no
correlation among the variates. From p2 = p3, it seems that the "mvtnorm"
package agrees well with the base R package. Any explanation will be
greatly appreciated.



Thanks in advance!



David



[R] Add new calculated column to data frame

2013-08-29 Thread srecko joksimovic
Hi,

I have the following data set:
id event    time (in sec)
1  add      1373502892
2  add      1373502972
3  delete   1373502995
4  view     1373503896
5  add      1373503996
...

I'd like to add a new column, "time on task", which is the time elapsed
between two consecutive events (id2 - id1, ...). What would be the best
approach to do that?

Thanks,
Srecko



Re: [R] A question about multivariate normal distribution with a diagonal covariance matrix

2013-08-29 Thread Duncan Murdoch

On 29/08/2013 1:37 PM, Marino David wrote:

Hi all R users:



I am a little bit confused about the following results. See as follows:





Why would you expect p1=p2? p1 is the density in 5 dimensions, p2 is 
only the first 3 components.
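
A quick way to check this with base R only, reusing the numbers from the
question (the names p_all and p_3 are introduced here for illustration):
with a diagonal covariance matrix the joint density is the product of the
univariate densities, so multiplying all five factors should reproduce p1,
while stopping at three gives p2.

```r
xMean  <- c(24.12, 66.92, 77.65, 131.97, 158.8)
xVar   <- c(0.01, 0.06, 0.32, 0.18, 0.95)
xFloor <- floor(xMean)

# Product over all five components: comparable to p1
p_all <- prod(dnorm(xFloor, mean = xMean, sd = sqrt(xVar)))

# Product over the first three components only: comparable to p2 and p3
p_3 <- prod(dnorm(xFloor[1:3], mean = xMean[1:3], sd = sqrt(xVar[1:3])))

c(p_all = p_all, p_3 = p_3)
```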


Duncan Murdoch



Re: [R] A question about multivariate normal distribution with a diagonal covariance matrix

2013-08-29 Thread Marino David
You got the point. Thank you for pointing out the problem.

Thanks again.
David


2013/8/30 Duncan Murdoch murdoch.dun...@gmail.com

 Why would you expect p1=p2? p1 is the density in 5 dimensions, p2 is only
 the first 3 components.

 Duncan Murdoch


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Add new calculated column to data frame

2013-08-29 Thread arun


Hi,
Try:
dat1 <- read.table(text="
id    event    time
1    add  1373502892
2    add  1373502972
3    delete  1373502995
4    view  1373503896
5    add  1373503996
", sep="", header=TRUE, stringsAsFactors=FALSE)
dat1$time_on_task <- c(NA, diff(dat1$time))
dat1
#  id  event       time time_on_task
#1  1    add 1373502892           NA
#2  2    add 1373502972           80
#3  3 delete 1373502995           23
#4  4   view 1373503896          901
#5  5    add 1373503996          100

#Not sure whether this depends on the values of event or not..
A.K.





- Original Message -
From: srecko joksimovic sreckojoksimo...@gmail.com
To: R help R-help@r-project.org
Cc: 
Sent: Thursday, August 29, 2013 1:52 PM
Subject: [R] Add new calculated column to data frame

Hi,

I have the following data set:
id    event    time (in sec)
1     add      1373502892
2     add      1373502972
3     delete   1373502995
4     view     1373503896
5     add      1373503996
...

I'd like to add a new column "time on task", which is the time elapsed between two
events (id2 - id1...). What would be the best approach to do that?

Thanks,
Srecko

    [[alternative HTML version deleted]]



Re: [R] Add new calculated column to data frame

2013-08-29 Thread srecko joksimovic
Thanks Arun,

this is great. However, it should be just a little bit different:

#  id  event       time time_on_task
#1  1    add 1373502892           80
#2  2    add 1373502972           23
#3  3 delete 1373502995          901
#4  4   view 1373503896          100
#5  5    add 1373503996           NA

When I calculate difference, I need to know how long each activity was. It
is id2-id1 for the first activity...




Re: [R] Add new calculated column to data frame

2013-08-29 Thread srecko joksimovic
Hi Arun,

There is one more question... you explained to me how to
use split(dat1, cumsum(dat1$action == "login")) in one of my previous questions,
and that is great.
Now, if I have something like this:

id  module  event   time        time_on_task
1   sys     login   1373502892   80
2   task    add     1373502892   80
3   task    add     1373502972   23
4   sys     login   1373502892   80
5   list    delete  1373502995  901
6   list    view    1373503896  100
7   task    add     1373503996   NA

I know how to split at each login occurrence, and I know how to add a new
column with time differences. But how do I add a new column "category" which
will be calculated based on the columns module and event? For example, if
module == "task" and event == "add", then category = "A"...

Srecko



On Thu, Aug 29, 2013 at 11:22 AM, arun smartpink...@yahoo.com wrote:

 Hi Srecko,
 No problem.
 Regards,
 Arun






 
 From: srecko joksimovic sreckojoksimo...@gmail.com
 To: arun smartpink...@yahoo.com
 Sent: Thursday, August 29, 2013 2:22 PM
 Subject: Re: [R] Add new calculated column to data frame



 Sorry... I should figure it out...

 thanks so much!
 Srecko



 On Thu, Aug 29, 2013 at 11:21 AM, arun smartpink...@yahoo.com wrote:

 Hi,
 The one you showed is:
 
 dat1$time_on_task <- c(diff(dat1$time), NA)

 dat1
 #  id  event       time time_on_task
 #1  1    add 1373502892           80
 #2  2    add 1373502972           23
 #3  3 delete 1373502995          901
 #4  4   view 1373503896          100
 #5  5    add 1373503996           NA
 
 
 
 
 


Re: [R] Add new calculated column to data frame

2013-08-29 Thread Berend Hasselman

On 29-08-2013, at 20:15, srecko joksimovic sreckojoksimo...@gmail.com wrote:

 Thanks Arun,
 
 this is great. However, it should be just a little bit different:
 
 #  id  event       time time_on_task
 #1  1    add 1373502892           80
 #2  2    add 1373502972           23
 #3  3 delete 1373502995          901
 #4  4   view 1373503896          100
 #5  5    add 1373503996           NA
 
 When I calculate difference, I need to know how long each activity was. It
 is id2-id1 for the first activity...

then why don't you try

dat1$time_on_task <- c(diff(dat1$time), NA)

Berend
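
The two placements of NA discussed in this thread differ only in which row is left without a value. A minimal sketch, assuming the five rows from the original post; the column name time_since_prev is made up here to show both variants side by side.

```r
dat1 <- data.frame(
  id    = 1:5,
  event = c("add", "add", "delete", "view", "add"),
  time  = c(1373502892, 1373502972, 1373502995, 1373503896, 1373503996),
  stringsAsFactors = FALSE
)

# Duration of each activity: time until the next event.
# The last row has no successor, so it gets NA.
dat1$time_on_task <- c(diff(dat1$time), NA)

# Time since the previous event: the first row has no predecessor.
dat1$time_since_prev <- c(NA, diff(dat1$time))

print(dat1)
```

Note that diff() returns one element fewer than its input, which is why exactly one NA has to be added at one end.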



Re: [R] Add new calculated column to data frame

2013-08-29 Thread srecko joksimovic
Thanks Berend,

I don't know why I didn't try that before posting the question... but...
anyways, thanks for your help

Srecko




Re: [R] Add new calculated column to data frame

2013-08-29 Thread arun


Hi,
You could try this:
dat1 <- read.table(text="
id  module    event   time   time_on_task
1   sys login 1373502892   80
2   task    add  1373502892   80
3   task    add  1373502972   23
4   sys login 1373502892   80
5   list delete   1373502995  901
6   list  view 1373503896  100
7   task    add  1373503996   NA
", sep="", header=TRUE, stringsAsFactors=FALSE)

dat1$Categ <- as.character(factor(with(dat1, paste(module, event, sep="_")),
    levels=c("task_add", "sys_login", "list_delete", "list_view"),
    labels=LETTERS[1:4]))

dat1
#  id module  event       time time_on_task Categ
#1  1    sys  login 1373502892           80     B
#2  2   task    add 1373502892           80     A
#3  3   task    add 1373502972           23     A
#4  4    sys  login 1373502892           80     B
#5  5   list delete 1373502995          901     C
#6  6   list   view 1373503896          100     D
#7  7   task    add 1373503996           NA     A
A.K.
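
If the mapping from (module, event) to category already lives in a table, one possible alternative to hard-coding levels and labels is a lookup with match(). This is only a sketch: the data frame categories and the key vectors are hypothetical and not part of the thread.

```r
dat1 <- data.frame(
  id     = 1:7,
  module = c("sys", "task", "task", "sys", "list", "list", "task"),
  event  = c("login", "add", "add", "login", "delete", "view", "add"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: one row per (module, event) combination
categories <- data.frame(
  module   = c("task", "sys", "list", "list"),
  event    = c("add", "login", "delete", "view"),
  category = c("A", "B", "C", "D"),
  stringsAsFactors = FALSE
)

# Build a combined key in both tables and look each row up with match();
# combinations missing from the lookup table come back as NA
key_dat <- paste(dat1$module, dat1$event, sep = "_")
key_cat <- paste(categories$module, categories$event, sep = "_")
dat1$Categ <- categories$category[match(key_dat, key_cat)]

print(dat1)
```

With this layout, adding a new category means adding a row to the lookup table rather than editing the code.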



Re: [R] calculate with different columns from different datasets

2013-08-29 Thread arun
Hi,
Try:
dat1 <- read.table(text="
V1 V2 V3
2 6 8
4 3 4
1 9 8
", sep="", header=TRUE)

dat2 <- read.table(text="
V1 V2 V3
6 8 4
2 0 7
8 1 3
", sep="", header=TRUE)

res1 <- as.matrix(dat1 - dat2)
res1
#     V1 V2 V3
#[1,] -4 -2  4
#[2,]  2  3 -3
#[3,] -7  8  5


res2 <- t(t(dat1) - colMeans(dat2))
res2
#        V1 V2     V3
#[1,] -3.33  3  3.333
#[2,] -1.33  0 -0.667
#[3,] -4.33  6  3.333


A.K.


Hi there 

I've got two datasets of the following form (just an example, the real dataset 
got a lot more columns) 

dataset1 

V1  V2  V3 
2   6   8 
4   3   4 
1   9   8 

and dataset 2 

V1   V2 V3 
6   8   4 
2   0   7 
8   1   3 

First, I'd like to calculate the following:

V1 from dataset1 minus V1 from dataset2,
then
V2 from dataset1 minus V2 from dataset2
...
and so on (always Vn - Vn, for n = 1, 2, ..., N) and save the result vectors in a
new matrix.

Second, I'd like to run other functions over the two matching
columns (for example: V1 from dataset1 minus mean(V1) from dataset2, V2
from dataset1 minus mean(V2) from dataset2, ...).

So I'm looking for a simple solution that always takes the
matching columns from the different datasets, so that I can then just change
the function applied to the two.

Thank you for your help! 

Kind regards
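
One way to get the "change only the function" behaviour asked for above is to pair the columns with mapply(). The following is a sketch under the assumption that both data frames have the same column names in the same order; the helper pairwise_cols is a made-up name, not an existing R function.

```r
dat1 <- data.frame(V1 = c(2, 4, 1), V2 = c(6, 3, 9), V3 = c(8, 4, 8))
dat2 <- data.frame(V1 = c(6, 2, 8), V2 = c(8, 0, 1), V3 = c(4, 7, 3))

# Hypothetical helper: apply f(x, y) to each pair of matching columns,
# where x comes from d1 and y from d2
pairwise_cols <- function(d1, d2, f) {
  stopifnot(identical(names(d1), names(d2)))
  mapply(f, d1, d2)
}

# Column-wise difference, Vn from dat1 minus Vn from dat2
diff_mat <- pairwise_cols(dat1, dat2, function(x, y) x - y)

# Vn from dat1 minus the mean of Vn from dat2
center_mat <- pairwise_cols(dat1, dat2, function(x, y) x - mean(y))

print(diff_mat)
print(center_mat)
```

When f returns a vector of the same length for every column pair, mapply() simplifies the result to a matrix with one column per variable.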



Re: [R] calculate with different columns from different datasets

2013-08-29 Thread arun
Hi,
Try:
res <- sapply(seq_len(ncol(dat1)), function(i)
  setNames(((1 - coef(lm(dat1[,i] ~ dat2[,i]))[2])^2) * var(dat2[,i]), NULL))
res
#[1] 21.00000 16.11842 18.69231
A.K.


Thank you for your answer. But further calculations will be much more 
difficult, like 


(1-b)^2 * Var(V1)       for all matching columns

where b is the slope from a regression of V1 (from dataset 1) on V1 (from dataset 2) and
Var(V1) is the variance of V1 (from dataset 2).

So what I'm looking for is something like a loop function... 
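
The same calculation can also be written as an explicit loop, which may be easier to extend than the one-line sapply() call. A minimal sketch, assuming the two example data frames from this thread; the variable names y, x and b are chosen here for readability.

```r
dat1 <- data.frame(V1 = c(2, 4, 1), V2 = c(6, 3, 9), V3 = c(8, 4, 8))
dat2 <- data.frame(V1 = c(6, 2, 8), V2 = c(8, 0, 1), V3 = c(4, 7, 3))

res <- numeric(ncol(dat1))
names(res) <- names(dat1)

for (i in seq_len(ncol(dat1))) {
  y <- dat1[[i]]                 # column from dataset 1 (response)
  x <- dat2[[i]]                 # matching column from dataset 2 (predictor)
  b <- coef(lm(y ~ x))[["x"]]    # slope of the regression of y on x
  res[i] <- (1 - b)^2 * var(x)
}

print(res)
```

Using [[i]] extracts each column as a plain vector, so the coefficient is reliably named "x" inside the loop.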



[R] spacing problem in main title using car package scatterplot

2013-08-29 Thread Gerard Smits
Hi All,

I'm using R 3.0.0.  I'm trying to add the sample size of the paired data 
(calculated by a function n(), which returns a value of 70, correctly).

My main title works fine except that the '70' appears far to the right on the 
line as in:

  at Month 18 (N=   70)

Is there a way of left justifying the result of .(ss), or some other way of 
removing the whitespace between N= and 70?

Thanks for any suggestions.

Gerard




library (car)
data <- read.csv("//users//smits//r_work//data.csv", header = TRUE)
attach(data);

##
ss <- n(m18_das28*b_score)

scatterplot(m18_das28~b_score,
 jitter=list(x=1, y=1),
 grid=F,
 smooth=F,
 las=1,
 pch=c(1),
 col='blue',
 main=bquote(paste("Hypothesis 9.4.1\nBaseline XYZ with Disease Activity (DAS28)\nat Month 18 (N=", .(ss), ")")),
 xlab="Baseline XYZ",
 ylab="Month 18 DAS28",
 legend.plot=F)



Re: [R] Help If

2013-08-29 Thread MacQueen, Don
In addition to the other suggestions, try typing

  help('&&')

-Don
-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
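
For reference, the condition from the original question can be written with && and || as in the sketch below. The function name check and its arguments are placeholders, and "uhvef" is just the string from the question.

```r
# Hypothetical wrapper around the condition from the question:
# (condition1 and condition2) or (condition3 and condition4)
check <- function(condition1, condition2, condition3, condition4) {
  if ((condition1 && condition2) || (condition3 && condition4)) {
    print("uhvef")
    TRUE
  } else {
    FALSE
  }
}

check(TRUE, FALSE, TRUE, TRUE)    # second pair is TRUE, so the branch runs
check(TRUE, FALSE, FALSE, TRUE)   # neither pair is TRUE
```

&& and || work on single logical values and stop evaluating as soon as the result is known; the element-wise operators & and | are meant for vectors.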







Re: [R] spacing problem in main title using car package scatterplot

2013-08-29 Thread John Fox
Dear Gerard,

Without your data, it's not possible to reproduce your problem exactly, but
it's clear that it isn't specific to the scatterplot() function in the car
package. For example, try

plot(1:10)
title(main=bquote(paste("Hypothesis 9.4.1\nBaseline XYZ with Disease Activity (DAS28)\nat Month 18 (N=", 100, ")")), adj=0)

You should be able to adapt the following solution:

plot(1:10)
mtext("Hypothesis 9.4.1\nBaseline XYZ with Disease Activity (DAS28)",
      side=3, line=2)
mtext(paste("at Month 18 (N=", 100, ")", sep=""), side=3, line=1)

I hope this helps,
 John

---
John Fox
McMaster University
Hamilton, Ontario, Canada
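
If no plotmath is needed in the title, another option may be to build it as an ordinary character string with paste0() instead of bquote(). This is a sketch only: ss is set to 70 by hand here (in the original post it comes from the poster's own n() function), and whether it removes the extra space in the scatterplot() call has not been checked against the poster's data.

```r
ss <- 70  # sample size; assumed value taken from the original post

# Plain character string: "\n" is an ordinary line break here, and
# paste0() inserts no separator between its arguments
main_title <- paste0(
  "Hypothesis 9.4.1\n",
  "Baseline XYZ with Disease Activity (DAS28)\n",
  "at Month 18 (N=", ss, ")"
)

plot(1:10, main = main_title)
```

The same string could be passed as the main argument of other plotting functions that accept a character title.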






Re: [R] Add new calculated column to data frame

2013-08-29 Thread arun
HI,
It's not really clear, but you can try this:
dat1 <- read.table(text="
id module  event   time time_on_task Categ    url
1    sys  login 1373502892   80 B http://post/add?id=42&idp=45
2   task    add 1373502892   80 A http://post/add?id=33&idp=45
3   task    add 1373502972   23 A http://post/add?id=34&idp=45
4    sys  login 1373502892   80 B http://post/add?id=39&idp=42
5   list delete 1373502995  901 C http://post/add?id=37&idp=41
6   list   view 1373503896  100 D http://post/add?id=36&idp=46
7   task    add 1373503996   NA A http://post/add?id=31&idp=45
", sep="", header=TRUE, stringsAsFactors=FALSE)
vec1 <- as.numeric(gsub(".*\\?.*=(\\d+)\\&.*", "\\1", dat1$url[dat1$Categ == "A"]))
vec1
#[1] 33 34 31

dat2 <- read.table(text="
id idpost idtopic iduser
1   45  33   101
2   46  34   102
3   47  33   103
4   48  33   101
5   49  35   104
", sep="", header=TRUE)
dat1$Categ[dat1$Categ == "A"][!vec1 %in% dat2$idtopic] <- "F"
dat1
#  id module  event       time time_on_task Categ                          url
#1  1    sys  login 1373502892           80     B http://post/add?id=42&idp=45
#2  2   task    add 1373502892           80     A http://post/add?id=33&idp=45
#3  3   task    add 1373502972           23     A http://post/add?id=34&idp=45
#4  4    sys  login 1373502892           80     B http://post/add?id=39&idp=42
#5  5   list delete 1373502995          901     C http://post/add?id=37&idp=41
#6  6   list   view 1373503896          100     D http://post/add?id=36&idp=46
#7  7   task    add 1373503996           NA     F http://post/add?id=31&idp=45


A.K.
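
The id extraction step can also be written so that the pattern names the parameter explicitly. A sketch, assuming the URLs have the form http://post/add?id=33&idp=45 (the "&" separator is an assumption, since the archive dropped that character); the vector urls is illustrative.

```r
urls <- c("http://post/add?id=33&idp=45",
          "http://post/add?id=34&idp=45",
          "http://post/add?id=31&idp=45")

# Capture the digits that follow "?id=" or "&id="; requiring "id=" right
# after the separator keeps the pattern from matching the "idp" parameter
ids <- as.numeric(sub(".*[?&]id=(\\d+).*", "\\1", urls))

print(ids)
```

A URL without an id parameter is returned unchanged by sub(), so as.numeric() would yield NA with a warning for such rows.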







From: srecko joksimovic sreckojoksimo...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Thursday, August 29, 2013 5:38 PM
Subject: Re: [R] Add new calculated column to data frame



Hi Arun,

I really appreciate your help, and we did a great job :)
but, now I think that R can do anything, so I'd like to try one more thing, if 
you don't mind...

from the table with categories, 

#  id module  event       time time_on_task Categ url
#1  1    sys  login 1373502892           80     B http:
#2  2   task    add 1373502892           80     A http:
#3  3   task    add 1373502972           23     A http:
#4  4    sys  login 1373502892           80     B http:
#5  5   list delete 1373502995          901     C
#6  6   list   view 1373503896          100     D
#7  7   task    add 1373503996           NA     A


I'd like to use only a certain category (for example A). Each of these rows has 
a url whose format is something like http://post/add?id=33&idp=45. The first step 
would be to extract this id (33 in this case). Based on that value, I want to 
find all iduser values in the following table:

id idpost idtopic iduser
1   45      33       101
2   46      34       102
3   47      33       103
4   48      33       101
5   49      35       104


The next step would be to check whether at least one of these values (iduser) is not 
in the vector of users (only ids). If that is the case, I want to change the 
category to F; if not, I want to keep the same category.

If this is too much for one question, I'll implement this in Java, but I'd 
really like to try this with R. Maybe this id extraction from url is the most 
important problem... I tried most of these steps, but still not able to put 
them all together...

Thank you so much for your time.
Srecko








On Thu, Aug 29, 2013 at 12:22 PM, arun smartpink...@yahoo.com wrote:

Hi Srecko,
No problem.

Arun







From: srecko joksimovic sreckojoksimo...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Thursday, August 29, 2013 3:19 PM

Subject: Re: [R] Add new calculated column to data frame



This is great Arun, thank you again.

I was thinking of using sqldf and issuing a query for each module-action 
combination, but this is much better. Since I have a table with categories 
(module, action, category), I could create the vector levels based on the first 
two columns and the vector labels based on the category column, and that should 
do the work...

Best,
Srecko



On Thu, Aug 29, 2013 at 12:16 PM, arun smartpink...@yahoo.com wrote:

Hi Srecko,

You didn't mention the order in which the letters are assigned.  If you need 
a different order, just change the order in the ,levels=c(),.
Arun





Re: [R] Add new calculated column to data frame

2013-08-29 Thread srecko joksimovic
Hi Arun,

this could do the work...

Thanks so much!


On Thu, Aug 29, 2013 at 3:10 PM, arun smartpink...@yahoo.com wrote:

 HI,
 It's not really clear, but you can try this:
 dat1- read.table(text=
 id module  event   time time_on_task Categurl
   1sys  login 1373502892   80 B
 http://post/add?id=42idp=45
  2   taskadd 1373502892   80 A
 http://post/add?id=33idp=45
  3   taskadd 1373502972   23 A
 http://post/add?id=34idp=45
  4sys  login 1373502892   80 B
 http://post/add?id=39idp=42
  5   list delete 1373502995  901 C
 http://post/add?id=37idp=41
  6   list   view 1373503896  100 D
 http://post/add?id=36idp=46
  7   taskadd 1373503996   NA A
 http://post/add?id=31idp=45
 ,sep=,header=TRUE,stringsAsFactors=FALSE)

 vec1-as.numeric(gsub(.*\\?.*=(\\d+)\\.*,\\1,dat1$url[dat1$Categ==A]))
  vec1
 #[1] 33 34 31

 dat2- read.table(text=
 id idpost idtopic iduser
 1   45  33   101
 2   46  34   102
 3   47  33   103
 4   48  33   101
 5   49  35   104
 ,sep=,header=TRUE)
  dat1$Categ[dat1$Categ==A][!vec1%in%dat2$idtopic]-F
  dat1
 #  id module  event   time time_on_task Categ
 url
 #1  1sys  login 1373502892   80 B
 http://post/add?id=42idp=45
 #2  2   taskadd 1373502892   80 A
 http://post/add?id=33idp=45
 #3  3   taskadd 1373502972   23 A
 http://post/add?id=34idp=45
 #4  4sys  login 1373502892   80 B
 http://post/add?id=39idp=42
 #5  5   list delete 1373502995  901 C
 http://post/add?id=37idp=41
 #6  6   list   view 1373503896  100 D
 http://post/add?id=36idp=46
 #7  7   taskadd 1373503996   NA F
 http://post/add?id=31idp=45


 A.K.






 
 From: srecko joksimovic sreckojoksimo...@gmail.com
 To: arun smartpink...@yahoo.com
 Sent: Thursday, August 29, 2013 5:38 PM
 Subject: Re: [R] Add new calculated column to data frame



 Hi Arun,

 I really appreciate your help, and we did a great job :)
 but, now I think that R can do anything, so I'd like to try one more
 thing, if you don't mind...

 from the table with categories,

 #  id module  event   time time_on_task Categurl
 #1  1sys  login 1373502892   80 B http:
 #2  2   taskadd 1373502892   80 A http:
 #3  3   taskadd 1373502972   23 A http:
 #4  4sys  login 1373502892   80 B  http:
 #5  5   list delete 1373502995  901 C
 #6  6   list   view 1373503896  100 D
 #7  7   taskadd 1373503996   NA A


 I'd like to use only certain category (for example A). Each of these
 fields has an url whose format is something like
 http://post/add?id=33idp=45. First step would be to extract this id (33
 in this case). Based on that value, I want to find all iduser from the
 following table:

 id idpost idtopic iduser
 1   45  33   101
 2   46  34   102

 3   47  33   103

 4   48  33   101

 5   49  35   104


 The next step would be to check if at least one of these values (iduser)
 is not in the vectors users (only ids). If that is the case, I want to
 change category to F, if not, I want to keep the same category.

 If this is too much for one question, I'll implement this in Java, but I'd
 really like to try this with R. Maybe this id extraction from url is the
 most important problem... I tried most of these steps, but still not able
 to put them all together...

 Thank you so much for your time.
 Srecko








 On Thu, Aug 29, 2013 at 12:22 PM, arun smartpink...@yahoo.com wrote:

 Hi Srecko,
 No problem.
 
 Arun
 
 
 
 
 
 
 
 From: srecko joksimovic sreckojoksimo...@gmail.com
 To: arun smartpink...@yahoo.com
 Sent: Thursday, August 29, 2013 3:19 PM
 
 Subject: Re: [R] Add new calculated column to data frame
 
 
 
 This is great Arun, thank you again.
 
 I was thinking to use sqldf and issue query for each module-action
 combination, but this is much better. Since I have table with categories
 (module, action, category), I could create vector levels based on the
 first two columns and vector labels based on the category column and that
 should to the work...
 
 Best,
 Srecko
 
 
 
 On Thu, Aug 29, 2013 at 12:16 PM, arun smartpink...@yahoo.com wrote:
 
 Hi Srecko,
 
 You didn't mention the order in which the letters are assigned.  If you
 need a different order, just change the order in the ,levels=c(),.
 Arun
 
 
 
 
 - Original Message -
 From: arun smartpink...@yahoo.com
 To: srecko joksimovic sreckojoksimo...@gmail.com
 Cc: R help r-help@r-project.org
 
 Sent: Thursday, August 29, 2013 3:13 PM
 Subject: Re: [R] Add new calculated column to data frame
 
 
 
 Hi,
 You could try this:
 dat1 <- read.table(text="
 id  moduleevent   time

Re: [R] Add new calculated column to data frame

2013-08-29 Thread arun


Hi Srecko,
Try this:
dat1 <- read.table(text="
id module  event   time time_on_task Categ    url
1    sys  login 1373502892   80 B http://
2   task    add 1373502892   80 A http://post/add?id=33&idp=67
3   task    add 1373502972   23 A http://post/add?id=34&idp=67
4    sys  login 1373502892   80 B  http://
5   list delete 1373502995  901 C  http://
6   list   view 1373503896  100 D   http://
7   task    add 1373503996   NA A http://post/add?id=35&idp=99
",sep="",header=TRUE,stringsAsFactors=FALSE)
vec1 <- as.numeric(gsub(".*\\?.*=(\\d+)\\&.*","\\1",dat1$url[dat1$Categ=="A"]))

dat2 <- read.table(text="
id idpost idtopic iduser
1   45  33   101
2   46  34   102
3   47  33   103
4   48  33   101
5   49  35   104
",sep="",header=TRUE)
student_list <- c(101:102,104:107)
vec2 <- with(dat2,tapply(iduser,list(idtopic),FUN=function(x) all(x%in% student_list)))
dat1$Categ[dat1$Categ=="A"][match(vec1,as.numeric(names(vec2)))[!vec2]] <- "F"
dat1
#  id module  event   time time_on_task Categ  url
#1  1    sys  login 1373502892   80 B  http://
#2  2   task    add 1373502892   80 F http://post/add?id=33&idp=67
#3  3   task    add 1373502972   23 A http://post/add?id=34&idp=67
#4  4    sys  login 1373502892   80 B  http://
#5  5   list delete 1373502995  901 C  http://
#6  6   list   view 1373503896  100 D  http://
#7  7   task    add 1373503996   NA A http://post/add?id=35&idp=99
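
The three steps (extract the id, check the users per topic, relabel) can also be written so that each row's topic is looked up by name rather than by position. This is only a sketch on toy data shaped like the tables in this thread: the `&` in the URLs is assumed, and `is_a`, `topic`, `all_known` and `ok` are illustrative names, not part of the code above.

```r
# Toy data mirroring the thread (only the columns that matter here).
dat1 <- data.frame(
  id    = 1:7,
  Categ = c("B", "A", "A", "B", "C", "D", "A"),
  url   = c("http://", "http://post/add?id=33&idp=67",
            "http://post/add?id=34&idp=67", "http://", "http://",
            "http://", "http://post/add?id=35&idp=99"),
  stringsAsFactors = FALSE
)
dat2 <- data.frame(idtopic = c(33, 34, 33, 33, 35),
                   iduser  = c(101, 102, 103, 101, 104))
student_list <- c(101:102, 104:107)

# Step 1: pull the number that follows "?id=" out of each category-A URL.
is_a  <- dat1$Categ == "A"
topic <- as.numeric(sub(".*[?&]id=(\\d+).*", "\\1", dat1$url[is_a]))

# Step 2: per topic, are all of its users in student_list?
all_known <- tapply(dat2$iduser %in% student_list, dat2$idtopic, all)

# Step 3: look up each row's topic by name and relabel the failures.
# A topic with no rows in dat2 gives NA here and is left unchanged.
ok <- all_known[as.character(topic)]
dat1$Categ[is_a][!is.na(ok) & !ok] <- "F"
dat1$Categ
```

Looking the topic up by name means the result does not depend on the category-A rows appearing in the same order as the topics in the second table.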

A.K.


From: srecko joksimovic sreckojoksimo...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Thursday, August 29, 2013 6:04 PM
Subject: Re: [R] Add new calculated column to data frame



"Did you mean to separate the number 33 from the link?" Yes, that is correct.
It should be something like this:

#  id module  event   time time_on_task Categ    url
#1  1    sys  login 1373502892   80 B  http://
#2  2   task    add 1373502892   80 A  http://post/add?id=33&idp=67
#3  3   task    add 1373502972   23 A  http://post/add?id=34&idp=67
#4  4    sys  login 1373502892   80 B  http://
#5  5   list delete 1373502995  901 C  http://
#6  6   list   view 1373503896  100 D  http://
#7  7   task    add 1373503996   NA A  http://post/add?id=35&idp=99

From this table I should get 3 rows with 3 URLs: http://post/add?id=33&idp=67,
http://post/add?id=34&idp=67, and http://post/add?id=35&idp=99.
For each of them, I need to extract the id (33, 34, and 35). Once I do that, I need
to obtain the users from this table:

id idpost idtopic iduser
1   45      33       101
2   46      34       102
3   47      33       103
4   48      33       101
5   49      35       104

again, for each id. This means:

id = 33 => 101, 103
id = 34 => 102
id = 35 => 104

Next, for each vector I need to check whether or not all its values are in the
students list (101, 102, 104, 105, 106, 107):

id = 33 => FALSE (since 103 is not in the list)
id = 34 => TRUE
id = 35 => TRUE

This means that the category for row 2 in the first table is not A any more, but
F...

Thanks,
Srecko





On Thu, Aug 29, 2013 at 2:56 PM, arun smartpink...@yahoo.com wrote:

HI Srecko,
Did you mean to separate the number 33 from the link? Could you provide a 
reproducible example with the output you expected?
Tx.


Arun






From: srecko joksimovic sreckojoksimo...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Thursday, August 29, 2013 5:38 PM

Subject: Re: [R] Add new calculated column to data frame



Hi Arun,

I really appreciate your help, and we did a great job :)
but, now I think that R can do anything, so I'd like to try one more thing, if 
you don't mind...

from the table with categories, 

#  id module  event   time time_on_task Categ    url
#1  1    sys  login 1373502892   80 B         http:
#2  2   task    add 1373502892   80 A         http:
#3  3   task    add 1373502972   23 A         http:
#4  4    sys  login 1373502892   80 B          http:
#5  5   list delete 1373502995  901 C
#6  6   list   view 1373503896  100 D
#7  7   task    add 1373503996   NA A


I'd like to use only certain category (for example A). Each of these fields
has an url whose format is something like http://post/add?id=33&idp=45. First
step would be to extract this id (33 in this case). Based on that value, I
want to find all iduser from the following table:

id idpost idtopic iduser
1   45      33       101
2   46      34       102
3   47      33       103
4   48      33       101
5   49      35       104



Re: [R] Add new calculated column to data frame

2013-08-29 Thread srecko joksimovic
Thanks, I'll try this as well.

Srecko



[R] Vectorized version of colMeans/rowMeans for higher dimension arrays?

2013-08-29 Thread Jonathan Greenberg
For matrices, colMeans/rowMeans are quick, vectorized functions.  But
say I have a higher dimensional array:

moo <- array(runif(400*9*3),dim=c(400,9,3))

And I want to get the mean along the 2nd dimension.  I can, of course,
use apply:

moo1 <- apply(moo,c(1,3),mean)

But this is not a vectorized operation (so it doesn't execute as
quickly).  How would one vectorize this operation (if possible)?  Is
there an array equivalent of colMeans/rowMeans?

--j

-- 
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007



Re: [R] why is this a factor?

2013-08-29 Thread Rolf Turner

On 29/08/13 12:10, Ista Zahn wrote:

On Wed, Aug 28, 2013 at 7:44 PM, Steve Lianoglou
lianoglou.st...@gene.com wrote:

Hi,

On Wed, Aug 28, 2013 at 3:58 PM, Ista Zahn istaz...@gmail.com wrote:

Or go all the way and put

options(stringsAsFactors = FALSE)

at the top your script or in your .Rprofile. This will prevent this
kind of annoyance in the future without having to say stringsAsFactors
= FALSE all the time.

I go back and forth about doing this too (setting a global hammer to
stringsAsFactors), but then other things might mess up -- imagine a
scenario where a package is written with the assumption that the
default `stringsAsFactors=TRUE` setting hasn't been changed, which
could then break when you go the nuclear-global-override route.

Yes, possibly, but I've yet to have that problem, whereas before I
started changing it globally things used to break fairly regularly.


Like Ista I have never had a problem arising from a package's assuming that
`stringsAsFactors=TRUE` --- and I would opine that any package making such
an assumption is badly written.  (Of course there is a lot of bad code
out there )

I have once or twice stumbled over a conundrum in respect of questions
posed on r-help where the poster assumed `stringsAsFactors=TRUE`.  But I
eventually figured out what was going on.  (And anyway that's the poster's
problem, as far as I'm concerned.)

cheers,

Rolf



Re: [R] Vectorized version of colMeans/rowMeans for higher dimension arrays?

2013-08-29 Thread arun
Hi,
You could try:
res <- colMeans(aperm(moo,c(2,1,3)))
resOld <- apply(moo,c(1,3),mean)
identical(res,resOld)
#[1] TRUE

#Speed:
set.seed(285)
moo1 <- array(runif(1400*9*15),dim=c(1400,9,15))

system.time({res1 <- colMeans(aperm(moo1,c(2,1,3)))})
 #user  system elapsed 
 # 0.004   0.000   0.002 

system.time({res2 <- apply(moo1,c(1,3),mean)})
 # user  system elapsed 
 # 0.180   0.000   0.178 
identical(res1,res2)
#[1] TRUE
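
Why this works, as far as I understand it: colMeans() on an array averages over the first dimension (its `dims` argument defaults to 1) and returns an array with the remaining dimensions, so aperm() only has to move the dimension to be averaged to the front. A small generalization is sketched below; `mean_over_dim` is a made-up helper name, not a base R function, and the comparison uses all.equal() because exact floating-point equality between the two routes is not something I would rely on.

```r
# Average an array over dimension `d`, keeping all other dimensions.
mean_over_dim <- function(a, d) {
  perm <- c(d, seq_along(dim(a))[-d])   # move dimension d to the front
  colMeans(aperm(a, perm))              # colMeans averages over dimension 1
}

set.seed(1)
moo  <- array(runif(400 * 9 * 3), dim = c(400, 9, 3))
fast <- mean_over_dim(moo, 2)
slow <- apply(moo, c(1, 3), mean)

dim(fast)               # 400 3
all.equal(fast, slow)   # TRUE
```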


A.K.





Re: [R] why is this a factor?

2013-08-29 Thread Steve Lianoglou
Hi,

On Thu, Aug 29, 2013 at 3:03 PM, Rolf Turner rolf.tur...@xtra.co.nz wrote:

 Like Ista I have never had a problem arising from a package's assuming that
 `stringsAsFactors=TRUE` --- and I would opine that any package making such
 an assumption is badly written.  (Of course there is a lot of bad code out
 there )

It never happened to me either, except when code that *I* wrote was
dependent on the global options settings to stringsAsFactors=FALSE.

I had to hand over a codebase to a colleague in my lab when I left.
Her options(stringsAsFactors) was at the default (TRUE), and things
mysteriously broke until we (eventually) sorted out what was the what
-- it took a while to find because I *totally* forgot I had set
`options(stringsAsFactors=FALSE)` in my ~/.Rprofile several years prior
(a testament to how little it breaks things I guess).

Of course, I can't argue with your premise that code written that
depends on the defaults (or changed defaults) is, in the end, poorly
written code ... sometimes we have to own up to being the ones who
write poorly written code ;-)
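
One way to sidestep the problem, sketched here rather than taken from the thread: state stringsAsFactors explicitly at each call, so the code behaves the same whatever the global option is. (From R 4.0.0 on the default is FALSE, but spelling it out keeps older installations consistent.) The column names and values below are invented.

```r
csv_text <- "name,score\nalice,1\nbob,2"

# Explicit argument: the result does not depend on options("stringsAsFactors").
as_chr <- read.csv(text = csv_text, stringsAsFactors = FALSE)
as_fac <- read.csv(text = csv_text, stringsAsFactors = TRUE)

class(as_chr$name)   # "character"
class(as_fac$name)   # "factor"
```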

I only posted my original warning here to serve, more or less, as the
sentiment put forth in this poster since a decent amount of time was
lost chasing our tails:

http://www.despair.com/mistakes.html

;-)

-steve

-- 
Steve Lianoglou
Computational Biologist
Bioinformatics and Computational Biology
Genentech



Re: [R] scale breaks

2013-08-29 Thread Jim Lemon

On 08/30/2013 01:28 AM, Shane Carey wrote:

Hello all,

I have decided to go ahead with gap.boxplot. I am trying to suppress the
axis labels, both x and y labels. I tried using axis.labels=NULL but it
would not work.


Hi Shane,
To suppress the axis labels, pass an empty string:

gap.barplot(...,xlab="",ylab="",...)

Many default values of NULL tell the function to work out labels from 
the data, usually names.


Jim



Re: [R] new.env() and attach for write?

2013-08-29 Thread Greg Snow
Or you can use with:

> a <- new.env()
> with(a, b <- function(x) x )
> a$b
function(x) x
<environment: 0x06e9e0b8>
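
For comparison, a few base R ways of creating objects inside an environment rather than in the global workspace. This is a sketch; the object names are arbitrary.

```r
a <- new.env()

with(a, b <- function(x) x)       # evaluate the assignment inside a
evalq({ d <- 2 }, a)              # same idea with evalq()
local({ e <- 3 }, envir = a)      # local() with an explicit environment
assign("f", 4, envir = a)         # assignment by name

sort(ls(a))                       # "b" "d" "e" "f"
a$b(10)                           # 10
```

If the code lives in a file, sys.source() with its `envir` argument does the same thing for a whole script.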




On Wed, Aug 28, 2013 at 3:45 PM, ivo welch ivo.we...@gmail.com wrote:

 duh!


 
 Ivo Welch (ivo.we...@gmail.com)
 http://www.ivo-welch.info/
 J. Fred Weston Professor of Finance
 Anderson School at UCLA, C519
 Director, UCLA Anderson Fink Center for Finance and Investments
 Free Finance Textbook, http://book.ivo-welch.info/
 Editor, Critical Finance Review, http://www.critical-finance-review.org/



 On Wed, Aug 28, 2013 at 2:42 PM, Hadley Wickham h.wick...@gmail.com
 wrote:

  On Wed, Aug 28, 2013 at 4:32 PM, ivo welch ivo.we...@anderson.ucla.edu
  wrote:
   is it possible to temporarily change the destination environment where
   objects are written to?  I am thinking
  
 a <- new.env()
 attach(a)
 ### run some code, such as...
 b <- function(x) x
 detach(a)
 a$b
  
   obviously, this is wrong.  attach() only attaches for read access.  I
  could
   copy the globalenv, run my code, see what objects have been changed
  (how?),
   move the changed and new functions into my a environment, and then
  restore
   globalenv.  or is this already done somewhere else?
 
  within?
 
  Or just:
 
  evalq({
    b <- function(x) x
  }, a)
 
  Hadley
 
  --
  Chief Scientist, RStudio
  http://had.co.nz/
 





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] calculate with different columns from different datasets

2013-08-29 Thread laro
Thank you for your answer. But further calculations will be much more
difficult, like


(1-b)^2 * Var(V1)   for all matching columns
 
where b is the slope from a regression of V1 (from dataset 1) on V1 (from
dataset 2) and Var(V1) is the variance of V1 (from dataset 2).

So what I'm looking for is something like a loop function...
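
A sketch of one way to do this without writing an explicit loop: select the columns by their shared names and let sapply() run the per-column function. The numbers are the example values from the original post as best they can be read, and `dataset1`, `dataset2`, `common`, `diffs`, `centered` and `stat` are just illustrative names.

```r
dataset1 <- data.frame(V1 = c(2, 4, 1), V2 = c(6, 3, 9), V3 = c(8, 4, 8))
dataset2 <- data.frame(V1 = c(6, 2, 8), V2 = c(8, 0, 1), V3 = c(4, 7, 3))

# Columns present in both datasets, matched by name.
common <- intersect(names(dataset1), names(dataset2))

# Column-wise differences Vn - Vn, collected in a matrix.
diffs <- as.matrix(dataset1[common]) - as.matrix(dataset2[common])

# Each column of dataset1 minus the mean of the matching column of dataset2.
centered <- sweep(as.matrix(dataset1[common]), 2, colMeans(dataset2[common]))

# Any function of the two matching columns: here (1 - b)^2 * Var(V), with b
# the slope from regressing dataset1's column on dataset2's column.
stat <- sapply(common, function(v) {
  y <- dataset1[[v]]
  x <- dataset2[[v]]
  b <- coef(lm(y ~ x))[["x"]]
  (1 - b)^2 * var(x)
})
stat
```

To change the calculation, only the body of the function passed to sapply() needs to change.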




--
View this message in context: 
http://r.789695.n4.nabble.com/calculate-with-different-columns-from-different-datasets-tp4674918p4674926.html
Sent from the R help mailing list archive at Nabble.com.



[R] calculate with different columns from different datasets

2013-08-29 Thread laro
Hi there,

I've got two datasets of the following form (just an example, the
real dataset has a lot more columns):

dataset1
V1  V2  V3
2   6   8
4   3   4
1   9   8

and dataset2
V1  V2  V3
6   8   4
2   0   7
8   1   3

First, I'd like to calculate the following: V1 from dataset1 minus V1 from
dataset2, then V2 from dataset1 minus V2 from dataset2, and so on (always
Vn-Vn, where n=1,2,...), and save the solution vectors in a new matrix.

Second, I'd like to run other functions over the two matching columns (for
example: V1 from dataset1 minus mean(V1) from dataset2, V2 from dataset1
minus mean(V2) from dataset2, ...).

So I'm looking for a simple solution that always takes the matching columns
from the different datasets, so that I can then just change the function
applied to the two.

Thank you for your help!
Kind regards



--
View this message in context: 
http://r.789695.n4.nabble.com/calculate-with-different-columns-from-different-datasets-tp4674918.html
Sent from the R help mailing list archive at Nabble.com.


[R] Running pre R.14 version of R with R3.0.0

2013-08-29 Thread Luvalle, Michael J (Michael)
I upgraded R from 2.12.1 to 3.0.0 (on Windows XP), and as soon as I saved the
3.0.0 workspace, I was unable to access the .RData from 2.12.1.  The message in
the R console is "Error in loadNamespace(name): there is no package called
'parallel'", followed by a popup window that says "Fatal error: unable to
restore saved data in .RData".

Is there anything that can be done to access the old .RData without destroying
the new one?

Thanks



[R] Missing value handling for felm function in lfe package

2013-08-29 Thread Megha Patnaik
Dear All,

I am trying to use the felm function in the lfe package. However it does
not seem to deal with missing values the way the lm function does. I wish
to tell it na.omit or na.action = na.omit but it does not recognize this. I
need to allow for missing values as I have different specifications and
don't want to remove observations for all. Help on this will be greatly
appreciated!

Thanks in advance and hope this is clear.
Megha


Megha Patnaik
PhD candidate
Dept of Economics
Stanford University
650-868-6084



[R] Problem with Peaks package - followup…

2013-08-29 Thread Wildgruber, Christoph U.
Hi,

I apologize for not following the posting rules…

Here is the text from my previous post:

I started evaluating the 'Peaks' package a couple of months ago and found it
to be quite useful. Getting back to it last week, I had to set up my R
environment again due to hardware changes. The Peaks package loads with no
problem. After successfully reinstalling all packages (RedHat  4.4.7-3 and
OS X 10.8.4) I am getting the following error message:

Error in .Call("R_SpectrumSearchHighRes", as.vector(y), as.numeric(sigma),  : 
  "R_SpectrumSearchHighRes" not available for .Call() for package "Peaks"

I was not able to identify a problem with my installation. The script calling
this function is the same, and the actual call is the same as it was when I
stopped working with this script.
Any suggestion for how to fix this issue will be greatly appreciated.

The following script (OS X 10.8.4) fails in a reproducible way:

###
## CUW 08/2013
## Demo: Peaksearch issue (package 'Peaks')
##
library(Peaks)

## Signal with well defined peaks
x <- seq( 0, 50, len=1024)
y <- 1/x * sin(x)

## Plot signal...
plot(x, y, type='s')

## Call SpectrumSearch with default parameters
res <- SpectrumSearch(y, sigma=3.0, threshold=1.0, background=FALSE, 
                      iterations=13, markov=FALSE, window=3)
##

Error message:

Error in .Call("R_SpectrumSearchHighRes", as.vector(y), as.numeric(sigma),  : 
  "R_SpectrumSearchHighRes" not available for .Call() for package "Peaks"


Any suggestion for fixing this issue is very much appreciated!

Thanks,

Uli


[R] Omitted/blank variables in R function

2013-08-29 Thread newruser12345
Hi All,

I'm a very green user and have little programming background, but appreciate
any and all help/direction.  I have a spreadsheet that successfully sends
values from Excel cells to R as variables for a function, which then runs
and generates a plot.  I cannot figure out how to make R recognize those
variables as NA if one of the cells in Excel is blank - or, for that matter,
I don't know how to get an R function  to recognize variables as NA if no
value is assigned to that variable.

I have unsuccessfully tried using :

if(is.na(four)) return(NA)

My function is very simple: 

mtmatches <- c(one,two,three,four)

Everything runs smoothly if the four variables have values assigned to them. 
Any advice on how to get it to run when one of the variables has no value? 
No worries about the Excel element...figure I can decipher that puzzle
later!


Thanks!



--
View this message in context: 
http://r.789695.n4.nabble.com/Omitted-blank-variables-in-R-function-tp4674931.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Problem with Peaks package - followup…

2013-08-29 Thread arun


Hi,
I am getting the same error with R 3.0.1
SpectrumSearch(y, sigma=3.0, threshold=1.0, background=TRUE, iterations=13, 
markov=FALSE, window=3)
#Error in .Call("R_SpectrumSearchHighRes", as.vector(y), as.numeric(sigma),  : 
 # "R_SpectrumSearchHighRes" not available for .Call() for package "Peaks"


But, it worked with R 2.15.2 



It would be better to contact  the package maintainer
maintainer("Peaks")
#[1] "M.Kondrin <mkond...@hppi.troitsk.ru>"



 sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_CA.UTF-8   LC_NUMERIC=C  
 [3] LC_TIME=en_CA.UTF-8    LC_COLLATE=en_CA.UTF-8    
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8   
 [7] LC_PAPER=C LC_NAME=C 
 [9] LC_ADDRESS=C   LC_TELEPHONE=C    
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] Peaks_0.2  stringr_0.6.2  reshape2_1.2.2

loaded via a namespace (and not attached):
[1] plyr_1.8    tcltk_3.0.1 tools_3.0.1


A.K.






Re: [R] Omitted/blank variables in R function

2013-08-29 Thread Greg Snow
Look at the missing function.

Or set the default value of the arguments to NA.
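
Both suggestions might look like this in practice. The calculation is wrapped in a function here purely for illustration (`mtmatches2` is an invented name); the argument names come from the original post. Note that is.na(four) cannot be used for this test, because it fails with an error when `four` has not been supplied at all.

```r
# Option 1: default every argument to NA, so omitted values stay NA.
mtmatches <- function(one = NA, two = NA, three = NA, four = NA) {
  c(one, two, three, four)
}
mtmatches(1, 2, 3)            # 1 2 3 NA

# Option 2: test inside the function whether an argument was supplied.
mtmatches2 <- function(one, two, three, four) {
  if (missing(four)) four <- NA
  c(one, two, three, four)
}
mtmatches2(1, 2, 3)           # 1 2 3 NA
```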


-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



[R] Validating data type

2013-08-29 Thread jeffjohn


I'm very new to R. I have a data file that I have read in via read.csv. I
expect one of the columns to be of type "date", for example. However, at
least one value in that column is not of date type. I know this because
another program I am trying to process the file with is erroring, yet it
doesn't tell me which row/value is causing the error. Does R have a way to
treat column x as date type, and print out all values/row numbers that do not
conform to that type for that specified column?

Many thanks!
Jeff


Re: [R] Validating data type

2013-08-29 Thread Jeff Newmiller
The answer to your question is yes. You can convert a column of values to Date 
using the as.Date function with the appropriate format, and then test if any 
values are NA using the is.na function, and find them with the which function.
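
A minimal sketch of that approach, under stated assumptions: the data frame
`dat`, its column name `visit_date`, the sample values, and the "%Y-%m-%d"
format are all hypothetical, since the original post included no sample data.
Substitute the real column and the format your file is expected to use.

```r
# Hypothetical sample data standing in for the result of read.csv();
# the column name and the values are assumptions for illustration only.
dat <- data.frame(
  id = 1:5,
  visit_date = c("2013-08-01", "2013-08-15", "15/08/2013",
                 "not a date", "2013-12-31"),
  stringsAsFactors = FALSE
)

# With an explicit format, as.Date() yields NA for values that do not parse.
# The format string is an assumption about how the dates should look.
parsed <- as.Date(dat$visit_date, format = "%Y-%m-%d")

# Row numbers of the values that failed to convert.
bad_rows <- which(is.na(parsed))

# Show the offending rows together with their original text.
print(dat[bad_rows, ])
```

Note that values which were already missing in the file also come out as NA,
so they will be listed alongside the malformed ones.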

If you want something less vague, then you should read the Posting Guide
mentioned at the bottom of this message and follow the advice about using plain
text and providing a sample of data that exhibits the issue and your attempts
to solve the problem (code). Sample data is almost always needed. If you
don't make it, then we have to do so in order to illustrate the solution, but we
would be guessing, and that is just a waste of time. You may find the following
link helpful also:
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.