[R] Format utils::bibentry with a 'corporate name'

2021-10-29 Thread Sam Albers
Hi all,

Does anyone know of a way to force utils::bibentry to mimic the BibTeX
behaviour of using double braces ({{ }}) to force a "corporate name" in the
author field to print correctly? For example, take this bibentry:

entry <- utils::bibentry(
  bibtype = "Manual",
  title = "The Thing",
  author = "The Data People",
  organization = "The Data Org",
  year = format(Sys.Date(), "%Y")
)

entry
#> People TD (2021). _The Thing_. The Data Org.
print(entry, style = "citation")
#>
#> People TD (2021). _The Thing_. The Data Org.
#>
#> A BibTeX entry for LaTeX users is
#>
#>   @Manual{,
#> title = {The Thing},
#> author = {The Data People},
#> organization = {The Data Org},
#> year = {2021},
#>   }

I can simply add "{" and "}" right in the author string, which then passes
them through to the BibTeX entry, but the author field is still parsed as a
person's name and I also get some warnings:

entry <- utils::bibentry(
  bibtype = "Manual",
  title = "The Thing",
  author = "{The Data People}",
  organization = "The Data Org",
  year = format(Sys.Date(), "%Y")
)


print(entry, style = "citation")
#> Warning in parseLatex(x): x:1: unexpected '}'
#> Warning in parseLatex(x): x:1: unexpected END_OF_INPUT 'The'
#> Warning in parseLatex(x): x:1: unexpected '}'
#> Warning in parseLatex(x): x:1: unexpected END_OF_INPUT 'The'
#> Warning in withCallingHandlers(.External2(C_parseRd, tcon, srcfile, "UTF-8", :
#> :1: unexpected '}'
#> Warning in withCallingHandlers(.External2(C_parseRd, tcon, srcfile, "UTF-8", : :4: unexpected END_OF_INPUT 'The Data Org.
#> '
#>
#> People D (2021). _The Thing_. The Data Org.
#>
#> A BibTeX entry for LaTeX users is
#>
#>   @Manual{,
#> title = {The Thing},
#> author = {{The Data People}},
#> organization = {The Data Org},
#> year = {2021},
#>   }

Any thoughts?

Thanks in advance,

Sam
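
One possible approach, offered as a sketch rather than a confirmed fix for
every R version: pass the author as a person object with only a given name,
so the string is not split into given and family parts. The expectation is
that the text style then prints the name as one unit; how toBibtex() braces
the author field should be checked on your own R version.

```r
# Sketch: supply the corporate author as a person() with only `given` set,
# so utils does not try to split it into given and family names.
entry <- utils::bibentry(
  bibtype = "Manual",
  title = "The Thing",
  author = utils::person(given = "The Data People"),
  organization = "The Data Org",
  year = format(Sys.Date(), "%Y")
)

# Text rendering of the entry; the author should appear as one unit.
txt <- format(entry, style = "text")
print(txt)

# BibTeX rendering, for checking how the author field is braced.
print(utils::toBibtex(entry))
```

This avoids putting literal braces in the author string, which is what
triggered the parseLatex warnings above.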

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] file.access returning -1 for a file on remote Windows drive.

2020-02-28 Thread Sam Albers
Thanks Jeff. For future readers, head here:

https://github.com/eddelbuettel/digest/issues/49

and here:

https://github.com/eddelbuettel/digest/issues/13


Sam

On Fri, Feb 28, 2020 at 3:40 PM Jeff Newmiller  wrote:
>
> Read the closed issues in his digest Github repo first... this discussion has 
> already occurred there.
>
> On February 28, 2020 3:35:09 PM PST, Sam Albers  
> wrote:
> >Great question Will. If it were my code I would definitely do this.
> >However the problem is manifesting itself for my work with Dirk's
> >great digest package here:
> >
> >https://github.com/eddelbuettel/digest/blob/947b77e82b97024a874a808a4644be21fc329275/R/digest.R#L170-L173
> >
> >So because file.access is saying the permissions aren't right, I get
> >an error message from digest and can't create a hash. Knowing full
> >well that this is some weird Windows thing but also knowing I am stuck
> >in that environment, I wanted to figure where I was seeing a
> >difference between those two functions before I went asked Dirk if
> >he'd be interested in a change to that particular bit of code.
> >
> >
> >On Fri, Feb 28, 2020 at 3:28 PM William Dunlap 
> >wrote:
> >>
> >> If file.access() says the file is unreadable but file() says it can
> >be opened, why don't you
> >> just open the file and read it?  You can use tryCatch to deal with
> >problems opening or
> >> reading the file.
> >>
> >> Bill Dunlap
> >> TIBCO Software
> >> wdunlap tibco.com
> >>
> >>
> >> On Fri, Feb 28, 2020 at 2:54 PM Sam Albers
> > wrote:
> >>>
> >>> Thanks Jeff. I am probably not explaining myself very well but my
> >>> question under what circumstances would
> >>>
> >>> summary(file(remote_file, "rb"))$`can read`
> >>>
> >>> be different from:
> >>>
> >>> file.access(remote_file, 4)
> >>>
> >>> If my permissions were different across remote and local should that
> >>> not be reflected in both of these functions?
> >>>
> >>> On Fri, Feb 28, 2020 at 2:37 PM Jeff Newmiller
> > wrote:
> >>> >
> >>> > Dunno. They agree for me. Maybe look closer at all permissions via
> >Windows File Manager?
> >>> >
> >>> > On February 28, 2020 2:06:34 PM PST, Sam Albers
> > wrote:
> >>> > >Some additional follow-up:
> >>> > >
> >>> > >> summary(file(remote_file, "rb"))$`can read`
> >>> > >[1] "yes"
> >>> > >
> >>> > >> summary(file(local_file, "rb"))$`can read`
> >>> > >[1] "yes"
> >>> > >
> >>> > >compared to:
> >>> > >
> >>> > >> file.access(local_file, 4)
> >>> > >local.R
> >>> > > 0
> >>> > >
> >>> > >> file.access(remote_file, 4)
> >>> > >remote.R
> >>> > >-1
> >>> > >
> >>> > >Can anyone think why file.access and file would be contradicting
> >each
> >>> > >other?
> >>> > >
> >>> > >Sam
> >>> > >
> >>> > >On Fri, Feb 28, 2020 at 10:47 AM Sam Albers
> >>> > > wrote:
> >>> > >>
> >>> > >> Hi there,
> >>> > >>
> >>> > >> Looking for some help in diagnosing or developing a work around
> >to a
> >>> > >> problem I am having on a Windows machine. I am running R 3.6.2.
> >>> > >>
> >>> > >> I have two identical files, one stored locally and the other
> >stored
> >>> > >on
> >>> > >> a network drive.
> >>> > >>
> >>> > >> For access:
> >>> > >>
> >>> > >> > file.access(local_file, 4)
> >>> > >> local.R
> >>> > >>  0
> >>> > >>
> >>> > >> > file.access(remote_file, 4)
> >>> > >> remote.R
> >>> > >> -1
> >>> > >>
> >>> > >> Also for file.info
> >>> > >>
> >>> > >> > file.info(local_file)$mode:
> >>> > >> [1] "666"

Re: [R] file.access returning -1 for a file on remote Windows drive.

2020-02-28 Thread Sam Albers
Great question Will. If it were my code I would definitely do this.
However the problem is manifesting itself for my work with Dirk's
great digest package here:

https://github.com/eddelbuettel/digest/blob/947b77e82b97024a874a808a4644be21fc329275/R/digest.R#L170-L173

So because file.access is saying the permissions aren't right, I get
an error message from digest and can't create a hash. Knowing full
well that this is some weird Windows thing but also knowing I am stuck
in that environment, I wanted to figure out where I was seeing a
difference between those two functions before I went and asked Dirk if
he'd be interested in a change to that particular bit of code.
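
For code you do control, Bill's suggestion can be sketched roughly as
follows. The helper name can_open_for_reading() is made up for this
illustration and is not part of base R or digest; it simply tries to open
the file and reports whether that worked, instead of trusting file.access().

```r
# Hypothetical helper: test readability by actually opening the file.
can_open_for_reading <- function(path) {
  tryCatch({
    # file() errors if the file cannot be opened; the accompanying
    # warning is suppressed so only the error path decides the result.
    con <- suppressWarnings(file(path, open = "rb"))
    close(con)
    TRUE
  }, error = function(e) FALSE)
}

# Demonstration on a temporary file and on a path that does not exist.
existing <- tempfile()
writeLines("x <- 1", existing)
missing <- file.path(tempdir(), "no-such-file-for-this-demo.R")

print(can_open_for_reading(existing))
print(can_open_for_reading(missing))
```

Whether this agrees with file.access() on a network drive is exactly the
open question in this thread, so treat it as a workaround, not an
explanation.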


On Fri, Feb 28, 2020 at 3:28 PM William Dunlap  wrote:
>
> If file.access() says the file is unreadable but file() says it can be 
> opened, why don't you
> just open the file and read it?  You can use tryCatch to deal with problems 
> opening or
> reading the file.
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
>
> On Fri, Feb 28, 2020 at 2:54 PM Sam Albers  wrote:
>>
>> Thanks Jeff. I am probably not explaining myself very well but my
>> question under what circumstances would
>>
>> summary(file(remote_file, "rb"))$`can read`
>>
>> be different from:
>>
>> file.access(remote_file, 4)
>>
>> If my permissions were different across remote and local should that
>> not be reflected in both of these functions?
>>
>> On Fri, Feb 28, 2020 at 2:37 PM Jeff Newmiller  
>> wrote:
>> >
>> > Dunno. They agree for me. Maybe look closer at all permissions via Windows 
>> > File Manager?
>> >
>> > On February 28, 2020 2:06:34 PM PST, Sam Albers 
>> >  wrote:
>> > >Some additional follow-up:
>> > >
>> > >> summary(file(remote_file, "rb"))$`can read`
>> > >[1] "yes"
>> > >
>> > >> summary(file(local_file, "rb"))$`can read`
>> > >[1] "yes"
>> > >
>> > >compared to:
>> > >
>> > >> file.access(local_file, 4)
>> > >local.R
>> > > 0
>> > >
>> > >> file.access(remote_file, 4)
>> > >remote.R
>> > >-1
>> > >
>> > >Can anyone think why file.access and file would be contradicting each
>> > >other?
>> > >
>> > >Sam
>> > >
>> > >On Fri, Feb 28, 2020 at 10:47 AM Sam Albers
>> > > wrote:
>> > >>
>> > >> Hi there,
>> > >>
>> > >> Looking for some help in diagnosing or developing a work around to a
>> > >> problem I am having on a Windows machine. I am running R 3.6.2.
>> > >>
>> > >> I have two identical files, one stored locally and the other stored
>> > >on
>> > >> a network drive.
>> > >>
>> > >> For access:
>> > >>
>> > >> > file.access(local_file, 4)
>> > >> local.R
>> > >>  0
>> > >>
>> > >> > file.access(remote_file, 4)
>> > >> remote.R
>> > >> -1
>> > >>
>> > >> Also for file.info
>> > >>
>> > >> > file.info(local_file)$mode:
>> > >> [1] "666"
>> > >>
>> > >> > file.info(remote_file)$mode:
>> > >> [1] "666"
>> > >>
>> > >> Ok so I am access issues. Maybe they are ephemeral and I can change
>> > >> the permissions:
>> > >>
>> > >> > Sys.chmod('remote.R', mode = '666')
>> > >> > file.access(remote_file, 4)
>> > >> remote.R
>> > >> -1
>> > >>
>> > >> Nope. I am thoroughly stumped and maybe can't make it any further
>> > >> because of Windows.
>> > >>
>> > >> Downstream I am trying to use digest::digest to create a hash but
>> > >> digest thinks we don't have permission because file.access is
>> > >failing.
>> > >> Any thoughts on how I can get file.access to return 0 for the
>> > >remote.R
>> > >> file? Any ideas?
>> > >>
>> > >> Thanks in advance,
>> > >>
>> > >> Sam
>> > >
>> >
>> > --
>> > Sent from my phone. Please excuse my brevity.
>>


Re: [R] file.access returning -1 for a file on remote Windows drive.

2020-02-28 Thread Sam Albers
Thanks Jeff. I am probably not explaining myself very well, but my
question is: under what circumstances would

summary(file(remote_file, "rb"))$`can read`

be different from:

file.access(remote_file, 4)

If my permissions were different across remote and local should that
not be reflected in both of these functions?

On Fri, Feb 28, 2020 at 2:37 PM Jeff Newmiller  wrote:
>
> Dunno. They agree for me. Maybe look closer at all permissions via Windows 
> File Manager?
>
> On February 28, 2020 2:06:34 PM PST, Sam Albers  
> wrote:
> >Some additional follow-up:
> >
> >> summary(file(remote_file, "rb"))$`can read`
> >[1] "yes"
> >
> >> summary(file(local_file, "rb"))$`can read`
> >[1] "yes"
> >
> >compared to:
> >
> >> file.access(local_file, 4)
> >local.R
> > 0
> >
> >> file.access(remote_file, 4)
> >remote.R
> >-1
> >
> >Can anyone think why file.access and file would be contradicting each
> >other?
> >
> >Sam
> >
> >On Fri, Feb 28, 2020 at 10:47 AM Sam Albers
> > wrote:
> >>
> >> Hi there,
> >>
> >> Looking for some help in diagnosing or developing a work around to a
> >> problem I am having on a Windows machine. I am running R 3.6.2.
> >>
> >> I have two identical files, one stored locally and the other stored
> >on
> >> a network drive.
> >>
> >> For access:
> >>
> >> > file.access(local_file, 4)
> >> local.R
> >>  0
> >>
> >> > file.access(remote_file, 4)
> >> remote.R
> >> -1
> >>
> >> Also for file.info
> >>
> >> > file.info(local_file)$mode:
> >> [1] "666"
> >>
> >> > file.info(remote_file)$mode:
> >> [1] "666"
> >>
> >> Ok so I am access issues. Maybe they are ephemeral and I can change
> >> the permissions:
> >>
> >> > Sys.chmod('remote.R', mode = '666')
> >> > file.access(remote_file, 4)
> >> remote.R
> >> -1
> >>
> >> Nope. I am thoroughly stumped and maybe can't make it any further
> >> because of Windows.
> >>
> >> Downstream I am trying to use digest::digest to create a hash but
> >> digest thinks we don't have permission because file.access is
> >failing.
> >> Any thoughts on how I can get file.access to return 0 for the
> >remote.R
> >> file? Any ideas?
> >>
> >> Thanks in advance,
> >>
> >> Sam
> >
>
> --
> Sent from my phone. Please excuse my brevity.



Re: [R] file.access returning -1 for a file on remote Windows drive.

2020-02-28 Thread Sam Albers
Some additional follow-up:

> summary(file(remote_file, "rb"))$`can read`
[1] "yes"

> summary(file(local_file, "rb"))$`can read`
[1] "yes"

compared to:

> file.access(local_file, 4)
local.R
 0

> file.access(remote_file, 4)
remote.R
-1

Can anyone think why file.access and file would be contradicting each other?

Sam

On Fri, Feb 28, 2020 at 10:47 AM Sam Albers  wrote:
>
> Hi there,
>
> Looking for some help in diagnosing or developing a work around to a
> problem I am having on a Windows machine. I am running R 3.6.2.
>
> I have two identical files, one stored locally and the other stored on
> a network drive.
>
> For access:
>
> > file.access(local_file, 4)
> local.R
>  0
>
> > file.access(remote_file, 4)
> remote.R
> -1
>
> Also for file.info
>
> > file.info(local_file)$mode:
> [1] "666"
>
> > file.info(remote_file)$mode:
> [1] "666"
>
> Ok so I am access issues. Maybe they are ephemeral and I can change
> the permissions:
>
> > Sys.chmod('remote.R', mode = '666')
> > file.access(remote_file, 4)
> remote.R
> -1
>
> Nope. I am thoroughly stumped and maybe can't make it any further
> because of Windows.
>
> Downstream I am trying to use digest::digest to create a hash but
> digest thinks we don't have permission because file.access is failing.
> Any thoughts on how I can get file.access to return 0 for the remote.R
> file? Any ideas?
>
> Thanks in advance,
>
> Sam



[R] file.access returning -1 for a file on remote Windows drive.

2020-02-28 Thread Sam Albers
Hi there,

Looking for some help in diagnosing or developing a workaround to a
problem I am having on a Windows machine. I am running R 3.6.2.

I have two identical files, one stored locally and the other stored on
a network drive.

For access:

> file.access(local_file, 4)
local.R
 0

> file.access(remote_file, 4)
remote.R
-1

Also for file.info:

> file.info(local_file)$mode
[1] "666"

> file.info(remote_file)$mode
[1] "666"

OK, so I am having access issues. Maybe they are ephemeral and I can change
the permissions:

> Sys.chmod('remote.R', mode = '666')
> file.access(remote_file, 4)
remote.R
-1

Nope. I am thoroughly stumped and maybe can't make it any further
because of Windows.

Downstream I am trying to use digest::digest to create a hash but
digest thinks we don't have permission because file.access is failing.
Any thoughts on how I can get file.access to return 0 for the remote.R
file? Any ideas?

Thanks in advance,

Sam



[R] (no subject)

2019-01-10 Thread Sam Albers
Hello all,

I am experiencing some issues with building a package that we are
hosting on GitHub. The package itself is quite large. It is a data
package with a bunch of spatial files stored as .rds files.

The repo is located here: https://github.com/bcgov/bcmaps.rdata

If we clone that package to a local machine via:
git clone https://github.com/bcgov/bcmaps.rdata

The first oddity is that the package installs successfully using this:

$ R CMD INSTALL "./bcmaps.rdata"

But fails when I try to build the package:

$ R CMD build "./bcmaps.rdata"
* checking for file './bcmaps.rdata/DESCRIPTION' ... OK
* preparing 'bcmaps.rdata':
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* looking to see if a 'data/datalist' file should be added
Warning in gzfile(file, "rb") :
  cannot open compressed file 'bcmaps.rdata', probable reason 'Permission denied'
Error in gzfile(file, "rb") : cannot open the connection
Execution halted


The second oddity is that if I remove the . from the Package name in
the DESCRIPTION file, the build proceeds smoothly:

$ R CMD build "./bcmaps.rdata"
* checking for file './bcmaps.rdata/DESCRIPTION' ... OK
* preparing 'bcmapsrdata':
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* looking to see if a 'data/datalist' file should be added
* building 'bcmapsrdata_0.2.0.tar.gz'

I am assuming that R CMD INSTALL builds the package internally, so I
find it confusing that I am not able to build it myself. Similarly
confusing: is the lack of a . in the package name indicative of
anything?

Does anyone have any idea what's going on here? Am I missing something obvious?

Thanks in advance,

Sam



[R] Unable to return gmtoff from as.POSIXlt without converting date string to as.POSIXct first

2018-06-28 Thread Sam Albers
Is it possible for someone to explain what is going on here? I would expect
that `as.POSIXlt` would be able to accept `datestring` and return all the
elements without having to convert it using `as.POSIXct` first. Do
`as.POSIXlt` and `as.POSIXct` do different things with the `tz` arg?

datestring <- "2017-01-01 12:00:00"
foo <- as.POSIXlt(datestring, tz = "America/Moncton")
foo
[1] "2017-01-01 12:00:00 AST"
foo$gmtoff
[1] NA

bar <- as.POSIXlt(as.POSIXct(datestring, tz = "America/Moncton"))
bar
[1] "2017-01-01 12:00:00 AST"
bar$gmtoff
[1] -14400

Thanks in advance,

Sam
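
As a cross-check that does not depend on the gmtoff field at all, the
offset can also be derived by comparing the same wall-clock string
interpreted in the target zone and in UTC. This is only a sketch and it
assumes the system time zone database knows "America/Moncton".

```r
datestring <- "2017-01-01 12:00:00"
tz <- "America/Moncton"

# The same wall-clock time interpreted in the target zone and in UTC.
local_instant <- as.POSIXct(datestring, tz = tz)
utc_instant <- as.POSIXct(datestring, tz = "UTC")

# Offset from UTC in seconds (negative means west of Greenwich).
offset_secs <- as.numeric(difftime(utc_instant, local_instant, units = "secs"))
print(offset_secs)

# The same information as a formatted string.
print(format(local_instant, "%z"))
```

The numeric result should match the bar$gmtoff value shown above.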



[R] getOption() versus Sys.getenv

2017-08-25 Thread Sam Albers
Hi there,

I am trying to distinguish between getOption() and Sys.getenv(). My
understanding is that both are used to retrieve values that were set
elsewhere. An option is set something like this: options(var = "A"). This
can be placed in an .Rprofile or at the top of a script. It is retrieved
like this: getOption("var").

Environment variables are set in the .Renviron file like this: var=A
and retrieved like this: Sys.getenv("var"). I've seen mention in the httr
package documentation that credentials for APIs should be stored in this
way.

So my question is: how does one decide which path is most appropriate? For
example, I am working on a package that has to query a database in almost
every function call. I want to give users a way to skip having to
specify that path in every function call. So in this case, should I
recommend users store the path as an option or as an environment
variable? If I am storing credentials in an .Renviron file, then maybe I
should store the path there as well?

More generally the question is can anyone recommend some good
discussion/documentation on this topic?

Thanks in advance,

Sam
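
To make the trade-off concrete, here is one pattern a package could use,
sketched under assumptions: the option name "mypkg.db_path", the variable
name MYPKG_DB_PATH and the helper get_db_path() are all invented for this
illustration. An explicit argument wins, then the R option, then the
environment variable.

```r
# Hypothetical lookup helper: argument > option > environment variable.
get_db_path <- function(path = NULL) {
  if (!is.null(path)) return(path)

  # Options live inside the R session (set via options(), e.g. in .Rprofile).
  opt <- getOption("mypkg.db_path")
  if (!is.null(opt)) return(opt)

  # Environment variables belong to the process (set e.g. in .Renviron).
  env <- Sys.getenv("MYPKG_DB_PATH", unset = NA)
  if (!is.na(env) && nzchar(env)) return(env)

  stop("No database path supplied; set the option or environment variable.")
}

# Demonstration with made-up paths.
options(mypkg.db_path = NULL)
Sys.setenv(MYPKG_DB_PATH = "/tmp/from-env.sqlite")
from_env <- get_db_path()

options(mypkg.db_path = "/tmp/from-option.sqlite")
from_option <- get_db_path()

from_arg <- get_db_path("/tmp/from-argument.sqlite")
print(c(from_env, from_option, from_arg))
```

Supporting both keeps the choice with the user: options suit per-session
settings, while environment variables are also visible to other tools.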



[R] [R-pkgs] Introducing the rsoi package

2017-04-11 Thread Sam Albers
Hi folks,

I am pleased to announce that the rsoi package is now up on CRAN (v0.2.1
https://CRAN.R-project.org/package=rsoi). rsoi is a minimal but
hopefully useful package for folks who are looking for easy access in
R to Southern Oscillation Index and Oceanic Nino Index data.

rsoi uses data collected by the National Oceanic and Atmospheric
Administration. Their data are usually updated monthly. Data are
downloaded and formatted for use in R by the `download_enso()`
function. El Nino, La Nina and neutral periods of ENSO are categorized
by temperature anomalies from a 30 year base period in the Central
South Pacific Ocean.

Suggestions and contributions are very much welcome at the rsoi GitHub
page: https://github.com/boshek/rsoi

-Sam

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



[R] R implementation of the Split and Merge Algorithm

2016-12-21 Thread Sam Albers
Hello there,

I am wondering if anyone on this list has ever encountered an
implementation of the Split and Merge algorithm for R. This algorithm
is reasonably well known and was first developed in this paper:

https://www.computer.org/csdl/trans/tc/1974/08/01672634-abs.html

From that paper here is the essence of what the Split and Merge
Algorithm accomplishes:

"Given a set of points S = {xi,yi | i = 1,2... N} determine the
minimum number n such that S is divided in n subsets S1, S2...Sn,
where on each of them the data points are approximated by a polynomial
of order at most m - 1 with an error norm less than a prespecified
quantity e."

There has been work on this in MATLAB but so far I've not been able to
find the approach in R. The `segmented` package comes close but as far
as I understand, is not what I am looking for. I am happy to do this
myself but wanted to check first to see if someone has already
accomplished this. I know this is a stretch but I thought it wouldn't
hurt to ask.

Thanks in advance.

Sam
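
To make the quoted problem statement concrete, here is a deliberately
simplified sketch of the split-then-merge idea. It is not the algorithm
from the paper: it halves segments at the midpoint, merges greedily from
left to right, uses the maximum absolute residual as the error norm, and
does not guarantee the minimum number of segments. All function names are
invented for this illustration, x is assumed to be sorted with distinct
values, and degree must be at least 1.

```r
# Largest absolute residual of a least-squares polynomial fit.
fit_error <- function(x, y, degree) {
  # With degree + 1 points or fewer the polynomial can interpolate exactly.
  if (length(x) <= degree + 1) return(0)
  max(abs(residuals(lm(y ~ poly(x, degree, raw = TRUE)))))
}

# Split step: halve a segment until each piece fits within `tol`.
split_segments <- function(x, y, lo, hi, degree, tol) {
  idx <- lo:hi
  if (fit_error(x[idx], y[idx], degree) <= tol) return(list(c(lo, hi)))
  mid <- (lo + hi) %/% 2
  c(split_segments(x, y, lo, mid, degree, tol),
    split_segments(x, y, mid + 1, hi, degree, tol))
}

# Merge step: join neighbouring segments whose union still fits within `tol`.
merge_segments <- function(segs, x, y, degree, tol) {
  i <- 1
  while (i < length(segs)) {
    lo <- segs[[i]][1]
    hi <- segs[[i + 1]][2]
    if (fit_error(x[lo:hi], y[lo:hi], degree) <= tol) {
      segs[[i]] <- c(lo, hi)
      segs[[i + 1]] <- NULL
    } else {
      i <- i + 1
    }
  }
  segs
}

split_and_merge <- function(x, y, degree = 1, tol = 1e-8) {
  segs <- split_segments(x, y, 1, length(x), degree, tol)
  merge_segments(segs, x, y, degree, tol)
}

# Made-up data: a ramp over the first 8 points, then a plateau.
x <- 1:20
y <- c(1:8, rep(8, 12))
segs <- split_and_merge(x, y)
print(segs)
```

Each list element holds the first and last index of one segment; the merge
pass is what lets the breakpoint land away from the midpoints chosen by
the split pass.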



[R] Extract a number from a character

2015-11-23 Thread Sam Albers
Hello,

I have a problem to which I am certain grep or gsub or that family of
functions is the solution. However, I just can't seem to wrap my mind
around exactly how. I have a dataframe below that has the dimensions
of a net. I am given the data in the "W x H" format. For calculations
I'd like to have each number as a separate column. I have been using
ifelse(). However, that seems like a poor solution to this problem,
especially once dataframes get larger and larger.

So my question is, can anyone describe a way to extract the numbers
from the variable y in the example below? I had also tried substr()
but that falls apart with the 2.5 x 2.5 net.


Thanks in advance!

Sam

Example:
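## dataframe
df <- data.frame(x = rnorm(10),
                 y = c("7 x 3", "7 x 3", "7 x 3", "7 x 3", "7 x 3",
                       "2.5 x 2.5", "2.5 x 2.5", "2.5 x 2.5", "2.5 x 2.5",
                       "2.5 x 2.5"))

## ifelse() approach: works, but needs one condition per net size
df$Width <- as.numeric(ifelse(df$y == "7 x 3", "7", "2.5"))
df$Height <- as.numeric(ifelse(df$y == "7 x 3", "3", "2.5"))

## substr() approach: fixed positions break for the "2.5 x 2.5" net
df$Width <- as.numeric(substr(df$y, 1, 1))
df$Height <- as.numeric(substr(df$y, 5, 5))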
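
One way this could be handled without hard-coding positions, sketched on
the assumption that every value follows the "W x H" pattern with " x " as
the separator, is to split each string on the separator and convert the
two pieces:

```r
# Same example data as above, built compactly.
df <- data.frame(x = rnorm(10),
                 y = rep(c("7 x 3", "2.5 x 2.5"), each = 5),
                 stringsAsFactors = FALSE)

# Split each "W x H" string on the literal separator " x ".
parts <- strsplit(as.character(df$y), " x ", fixed = TRUE)

# Take the first piece as width and the second as height.
df$Width <- as.numeric(vapply(parts, `[`, character(1), 1))
df$Height <- as.numeric(vapply(parts, `[`, character(1), 2))

print(df[, c("y", "Width", "Height")])
```

Because the split is driven by the separator rather than by character
positions, it does not care how many digits each dimension has.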



[R] Choosing columns by number

2015-08-25 Thread Sam Albers
Hi all,

This is a process question. How do folks efficiently identify column
numbers in a dataframe without manually counting them? For example, if I
want to choose columns from the iris dataframe I know of two options. I can
do this:

> str(iris)
'data.frame':	150 obs. of  5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

or this:

> names(iris)
[1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"

Neither option explicitly identifies the column number so that I can
do something like this:

iris[,c(2,4)]

I feel like there must be a better way to do this so I wanted to ask
the collective wisdom here what people do to accomplish this.
Obviously this is a trivial example, but the issue really becomes
problematic when you have a large dataframe.

Thanks in advance!

Sam

[[alternative HTML version deleted]]



Re: [R] Choosing columns by number

2015-08-25 Thread Sam Albers
Thierry's answer of:

data.frame(
  seq_along(iris),
  colnames(iris)
)

is exactly what I was looking for. Apologies for vagueness and HTML.
It was unintended.

Sam
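
A small variation on the same idea, shown only as a sketch: a named index
vector lets you look up positions by name, and selecting by name avoids
the numbers altogether. The object name col_index is arbitrary.

```r
# Named integer vector: names are column names, values are positions.
col_index <- setNames(seq_along(iris), names(iris))
print(col_index)

# Look up the positions of specific columns by name.
wanted <- col_index[c("Sepal.Width", "Petal.Width")]
print(wanted)

# Selecting by position and by name gives the same columns.
by_position <- iris[, unname(wanted)]
by_name <- iris[, c("Sepal.Width", "Petal.Width")]
print(identical(by_position, by_name))
```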

On Tue, Aug 25, 2015 at 8:32 AM, stephen sefick ssef...@gmail.com wrote:
 ?grep

 I think this will do what you want.

 #something like
 a <- data.frame(a=rnorm(10), b=rnorm(10), c=rnorm(10), d=rnorm(10))

 toMatch <- c("a", "d")

 grep(paste(toMatch,collapse="|"), colnames(a))

 #to subset
 a[,grep(paste(toMatch,collapse="|"), colnames(a))]



 --
 Stephen Sefick
 **
 Auburn University
 Biological Sciences
 331 Funchess Hall
 Auburn, Alabama
 36849
 **
 sas0...@auburn.edu
 http://www.auburn.edu/~sas0025
 **

 Let's not spend our time and resources thinking about things that are so
 little or so large that all they really do for us is puff us up and make us
 feel like gods.  We are mammals, and have not exhausted the annoying little
 problems of being mammals.

 -K. Mullis

 A big computer, a complex algorithm and a long time does not equal
 science.

   -Robert Gentleman




[R] Automatically updating a plot from a regularly updated data file

2015-05-29 Thread Sam Albers
Hi all,

I have a question about using R in a way that may not be correct but I
thought I would ask anyway.

I have an instrument that outputs a text file with comma separated data. A
new line is added to the file each time the instrument takes a new reading.
Is there any way to configure R such that a script to generate a plot from
said text file is re-run each time the file is modified (i.e. a new line is
added)? So basically, update an exported plot each time a new line of data
is collected.

Is this type of thing possible in R? If not, can anyone recommend some
Windows (or Linux if need be) tools that could help me accomplish this,
preferably still utilizing R's plotting capabilities? I know that there are
other tools that can do all of this but nothing makes figures as nicely as R.

I suppose more generally this is a question about ways to automate processes
with R to take advantage of R's functionality.

Thanks in advance.

Sam
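
One low-tech possibility, sketched under assumptions, is to poll the
file's modification time and redraw only when it changes. The function
name replot_if_changed(), the two-column file layout and the PDF output
are all choices made for this illustration, not a description of any
existing tool.

```r
# Redraw the plot only if the data file changed since `last_mtime`.
# Returns the modification time that the current plot corresponds to.
replot_if_changed <- function(data_file, plot_file, last_mtime = NULL) {
  mtime <- file.info(data_file)$mtime
  if (is.na(mtime)) return(last_mtime)              # file not there yet
  if (!is.null(last_mtime) && mtime <= last_mtime) return(last_mtime)

  dat <- read.csv(data_file)
  pdf(plot_file)
  on.exit(dev.off())
  # Assumes the first column is the x variable and the second the reading.
  plot(dat[[1]], dat[[2]], type = "b",
       xlab = names(dat)[1], ylab = names(dat)[2])
  mtime
}

# Demonstration with a temporary file standing in for the instrument output.
csv <- tempfile(fileext = ".csv")
out <- tempfile(fileext = ".pdf")
write.csv(data.frame(time = 1:3, reading = c(2, 4, 3)), csv, row.names = FALSE)

first <- replot_if_changed(csv, out)          # draws the plot
second <- replot_if_changed(csv, out, first)  # unchanged file, no redraw

# In real use this would sit in a loop, for example:
# last <- NULL
# repeat { last <- replot_if_changed(csv, out, last); Sys.sleep(5) }
```

Polling every few seconds is crude but needs nothing beyond base R; a
scheduled task that runs the script with Rscript would be an alternative.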



[R] Working with < and > in data sets

2015-01-26 Thread Sam Albers
Hello,

I am having some trouble figuring out how to deal with data in which some
observations are detection limits, denoted by greater-than and less-than
symbols, and others are plain numbers. Ideally I would like a column that has
the data as numbers, then another column with values "Measured" or "Limit"
or something like that. Data and further clarification below.

##Data
zp <- structure(list(variable = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L),
.Label = c("ZP.1", "ZP.3", "ZP.5",
"ZP.7", "ZP.9"), class = "factor"),
   value = structure(c(3L, 4L, 2L, 1L, 7L, 8L, 6L, 5L, 12L,
11L, 10L, 9L, 15L, 16L, 14L, 13L, 19L, 18L, 17L, 9L),
 .Label = c("0.030", "1.2", "1160",
"27.3", "0.025", "0.85", "1870", "45.7", "0.0020",
"0.050", "31.9", "695",
"0.0060", "0.20", "311", "8.84", "0.090", "12", "646"), class =
"factor")),
  .Names = c("variable", "value"), row.names = c(NA, -20L),
class = "data.frame")

## As expected, converting everything to numeric results in a slew of NA values
zp$valuefactor <- as.numeric(as.character(zp$value))

## At this point I am unsure how to proceed.

zp

###

So I am just wondering how folks deal with this type of data. Any advice
would be much appreciated as I am looking for something that will reliably
work on a large data set.

Thanks in advance!

Sam
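
A possible starting point, sketched with made-up values: strip an optional
leading "<" or ">" to get the number, and record whether a symbol was
present. The function name split_censored() and the "Measured"/"Limit"
labels are taken from the wish list above, not from any package.

```r
# Hypothetical helper: separate the symbol from the number.
split_censored <- function(x) {
  x <- trimws(as.character(x))
  # Keep the leading symbol, if any, so no information is lost.
  symbol <- ifelse(grepl("^[<>]", x), substr(x, 1, 1), "")
  data.frame(
    value = as.numeric(sub("^[<>]\\s*", "", x)),
    status = ifelse(symbol == "", "Measured", "Limit"),
    symbol = symbol,
    stringsAsFactors = FALSE
  )
}

# Illustrative input only; these are not the values from the data above.
res <- split_censored(c("<0.030", "1.2", ">1160", "27.3"))
print(res)
```

Keeping the symbol in its own column preserves the direction of the limit
for any later censored-data analysis.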



[R] symbols in a data frame

2014-07-09 Thread Sam Albers
Hello,

I have recently received a dataset from a metal analysis company. The
dataset is filled with less-than (<) symbols. What I am looking for is an
efficient way to subset for any whole numbers from the dataset. The column
is automatically formatted as a factor because of the < symbols, making it
difficult to deal with the numbers in a useful way.

So in sum any ideas on how I could subset the example below for only whole
numbers?

Thanks in advance!

Sam

#code

metals <-
structure(list(Parameter = structure(c(1L, 2L, 3L, 4L, 6L, 7L,
8L, 9L, 10L, 11L, 12L, 13L, 15L, 16L, 17L, 18L, 19L, 20L, 1L), .Label
= c("Antimony",
"Arsenic", "Barium", "Beryllium", "Boron (Hot Water Soluble)",
"Cadmium", "Chromium", "Cobalt", "Copper", "Lead", "Mercury",
"Molybdenum", "Nickel", "pH 1:2", "Selenium", "Silver", "Thallium",
"Tin", "Vanadium", "Zinc"), class = "factor"), Cedar.Creek = structure(c(3L,
3L, 7L, 3L, 2L, 4L, 3L, 34L, 36L, 2L, 5L, 7L, 3L, 7L, 3L, 45L,
4L, 4L, 3L), .Label = c("<1", "<10", "<100", "<1000", "<200",
"<5", "<500", "0.1", "0.13", "0.5", "0.8", "1.07", "1.1", "1.4",
"1.5", "137", "154", "163", "165", "169", "178", "2.3", "2.4",
"22", "24", "244", "27.2", "274", "3", "3.1", "40.2", "43", "50",
"516", "53.3", "550", "569", "65", "66.1", "68", "7.6", "72",
"77", "89", "951"), class = "factor")), .Names = c("Parameter",
"Cedar.Creek"), row.names = c(NA, 19L), class = "data.frame")



Re: [R] symbols in a data frame

2014-07-09 Thread Sam Albers
Thanks for all the responses. It is sometimes difficult to outline
exactly what you need. These responses were helpful to get there.
Speaking to Bert's point a bit, I needed a column to identify where
the < symbol was used. If I knew more about R I think I might be
embarrassed to post my solution to that problem but here is how I used
Sarah's solution but still kept the info about detection limits. I'm
sure there is a more elegant way:

metals <-
structure(list(Parameter = structure(c(1L, 2L, 3L, 4L, 6L, 7L,
8L, 9L, 10L, 11L, 12L, 13L, 15L, 16L, 17L, 18L, 19L, 20L, 1L), .Label
= c("Antimony",
"Arsenic", "Barium", "Beryllium", "Boron (Hot Water Soluble)",
"Cadmium", "Chromium", "Cobalt", "Copper", "Lead", "Mercury",
"Molybdenum", "Nickel", "pH 1:2", "Selenium", "Silver", "Thallium",
"Tin", "Vanadium", "Zinc"), class = "factor"), Cedar.Creek = structure(c(3L,
3L, 7L, 3L, 2L, 4L, 3L, 34L, 36L, 2L, 5L, 7L, 3L, 7L, 3L, 45L,
4L, 4L, 3L), .Label = c("<1", "<10", "<100", "<1000", "<200",
"<5", "<500", "0.1", "0.13", "0.5", "0.8", "1.07", "1.1", "1.4",
"1.5", "137", "154", "163", "165", "169", "178", "2.3", "2.4",
"22", "24", "244", "27.2", "274", "3", "3.1", "40.2", "43", "50",
"516", "53.3", "550", "569", "65", "66.1", "68", "7.6", "72",
"77", "89", "951"), class = "factor")), .Names = c("Parameter",
"Cedar.Creek"), row.names = c(NA, 19L), class = "data.frame")



metals$temp1 <- metals$Cedar.Creek
metals$Cedar.Creek <- as.character(metals$Cedar.Creek)
metals$Cedar.Creek <- gsub("<", "", metals$Cedar.Creek)
metals$Cedar.Creek <- as.numeric(metals$Cedar.Creek)

metals$temp2 <- metals$temp1==metals$Cedar.Creek
metals$Detection <- factor(ifelse(metals$temp2==TRUE,"Measured","Limit"))
metals[,c(1,2,5)]
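
A slightly shorter route, sketched here on a three-row stand-in for the Cedar.Creek column, is to record where the "<" occurs before stripping it, which avoids the two temporary columns. The small data frame is illustrative only, and `censored` is a name chosen for this example.

```r
## Illustrative subset in the style of the metals data
metals <- data.frame(
  Parameter   = c("Antimony", "Copper", "Lead"),
  Cedar.Creek = c("<100", "516", "550"),
  stringsAsFactors = FALSE
)

## Flag detection limits first, while the "<" is still present
censored <- grepl("<", metals$Cedar.Creek, fixed = TRUE)
metals$Detection <- factor(ifelse(censored, "Limit", "Measured"))

## Then strip the symbol and convert to numeric
metals$Cedar.Creek <- as.numeric(sub("<", "", metals$Cedar.Creek, fixed = TRUE))
metals
```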


Thanks again!

Sam

On Wed, Jul 9, 2014 at 10:41 AM, Bert Gunter gunter.ber...@gene.com wrote:
> Well, ?grep and ?regex are clearly apropos here -- dealing with
> character data is an essential skill for handling input from diverse
> sources with various formatting conventions. I suggest you go through
> one of the many regular expression tutorials on the web to learn more.
>
> But this may not be the important issue here at all. If "<k" means the
> value is left censored at k -- i.e. we know it's less than k but not
> how much less -- then Sarah's proposal is not what you want to do.
> Exactly what you do want to do depends on context, and as it concerns
> statistical methodology, is not something that should be discussed
> here. Consult a local statistician if this is a correct guess.
> Otherwise ignore.
>
> ... and please post in plain text in future (as requested) as HTML can
> get garbled.
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
> (650) 467-7374
>
> "Data is not information. Information is not knowledge. And knowledge
> is certainly not wisdom."
> Clifford Stoll
>
>
> On Wed, Jul 9, 2014 at 10:26 AM, Sarah Goslee sarah.gos...@gmail.com wrote:
>> Hi Sam,
>>
>> I'd take the similar tack of removing the < instead. Note that if you
>> import the data frame using the stringsAsFactors=FALSE argument, you
>> don't need the first step.
>>
>> metals$Cedar.Creek <- as.character(metals$Cedar.Creek)
>> metals$Cedar.Creek <- gsub("<", "", metals$Cedar.Creek)
>> metals$Cedar.Creek <- as.numeric(metals$Cedar.Creek)
>>
>> R> str(metals)
>> 'data.frame':   19 obs. of  2 variables:
>>  $ Parameter  : Factor w/ 20 levels "Antimony","Arsenic",..: 1 2 3 4 6 7 8 9 10 11 ...
>>  $ Cedar.Creek: num  100 100 500 100 10 1000 100 516 550 10 ...
>>
>> Sarah


>> --
>> Sarah Goslee
>> http://www.functionaldiversity.org


[R] Define a variable on a non-standard year interval (Water Years)

2012-06-11 Thread Sam Albers
Hello,

I am trying to define a different interval for a year. In hydrology,
a water year is defined as the period between October 1st and
September 30 of the following year. I was wondering how I might do
this in R. Say I have a data.frame like the following and I want to
extract a variable with the water year specs as defined above:

df <- data.frame(Date=seq(as.Date("2000/10/1"), as.Date("2003/9/30"), "days"))

## Extract the normal year
df$year <- factor(format(as.Date(df$Date), "%Y"))

So the question is how might I define a variable that extends from
October 1st to September 30th rather than the normal January 1st to
December 31st?

Thanks in advance!

Sam
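
A minimal sketch of one way to do this: take the calendar year and add one for dates on or after October 1st. It assumes the common convention of naming a water year after the calendar year in which it ends, which the post does not state; `water_year` and `start_month` are names made up for this example.

```r
## Water year labelled by the calendar year in which it ends (assumed convention)
water_year <- function(dates, start_month = 10L) {
  yr <- as.integer(format(dates, "%Y"))
  mo <- as.integer(format(dates, "%m"))
  ## Months from start_month onwards belong to the next water year
  yr + as.integer(mo >= start_month)
}

df <- data.frame(Date = seq(as.Date("2000-10-01"), as.Date("2003-09-30"), by = "day"))
df$wyear <- factor(water_year(df$Date))
table(df$wyear)
```

If the water year should instead carry the year in which it starts, subtracting one for months before `start_month` gives the other labelling.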



[R] Drop values of one dataframe based on the value of another

2012-06-01 Thread Sam Albers
Hello all,

Let me first say that this isn't a question about outliers. I am using
the outlier function from the outliers package but I am using it only
because it is a convenient wrapper to determine the value that has the
largest difference between itself and the sample mean. Where I am
running into problems is that I have several groups where I want to
calculate the outlier within that group. Then I want to create two
data.frames, one with the outliers and the other with those values
dropped. Both data.frames need to include the additional columns of
data present before the subset. The first case is easy but I can't
seem to figure out how to do the second. So for example:

library(plyr)
library(outliers)

## A dataframe with some obviously extreme values
dfa <- data.frame(Mins=runif(15, 0,1),
Fac=rep(c("Test1","Test2","Test3"), each=5))
df.out <- data.frame(Mins=c(3,4,5), Fac=c("Test1","Test2","Test3"))
df <- rbind(dfa, df.out)
df$Meta <- runif(18,4,5); df

## Dataframe with the extreme value
To_remove <- ddply(df, c("Fac"), subset, Mins==outlier(Mins)); To_remove

So now my question is how can I use this dataframe (To_remove) to
remove all these values from df and create a new dataframe. Given a df
(To_remove) with a list of values, how can I choose all values of
another dataframe (df) that aren't those values in the To_remove
dataframe? There is a rm.outlier function in this same package but I
am having trouble with that and would like to try another approach.

Thanks in advance!

Sam
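
One base R way to drop the rows listed in a second data frame is to build a key from the identifying columns and keep only rows whose key is absent. The sketch below uses a small hand-made data frame rather than the plyr/outliers result, and the helper `key` is a name invented for this example.

```r
## Small stand-in for the data: two groups, each with one extreme value
df <- data.frame(
  Mins = c(0.2, 0.5, 0.9, 3, 0.1, 0.4, 4, 0.7),
  Fac  = rep(c("Test1", "Test2"), each = 4),
  Meta = c(4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8)
)

## Stand-in for the ddply()/outlier() output: the rows to drop
To_remove <- df[df$Mins %in% c(3, 4), ]

## Combine group and value into one key per row
key <- function(d) paste(d$Fac, d$Mins, sep = "|")

## Keep every row of df whose key does not appear in To_remove
df_kept <- df[!(key(df) %in% key(To_remove)), ]
df_kept
```

All other columns (here `Meta`) come along because the subsetting is done on rows of the original data frame.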



[R] Extract time from irregular date and time data records

2012-05-29 Thread Sam Albers
Hello,

I am having a problem making use of some data outputted from an
instrument in a somewhat weird format. The instrument outputs two
columns - one called JulianDay.Hour and one called Minutes.Seconds. I
would like to convert these columns into a single column with a time.
So I was using substr() and paste to extract that info. This works
fine for the JulianDay.Hour column as there are always five characters
in an entry. However in the Minutes.Seconds column any leading zeroes
are dropped by the instrument. So if I use substr() to select based
on character position I end up with incorrect times. So for example:

## df

df <- structure(list(Temperature = c(18.63, 18.4, 18.18, 16.99, 16.86,
11.39, 11.39, 11.37, 11.37, 11.37, 11.37), JulianDay.Hour = c(22610L,
22610L, 22610L, 22610L, 22610L, 22611L, 22611L, 22611L, 22611L,
22611L, 22611L), Minutes.Seconds = c(4608L, 4611L, 4614L, 4638L,
4641L, 141L, 144L, 208L, 211L, 214L, 238L)), .Names = c("Temperature",
"JulianDay.Hour", "Minutes.Seconds"), row.names = c(3176L, 3177L,
3178L, 3179L, 3180L, 3079L, 3080L, 3054L, 3055L, 3056L, 3057L
), class = "data.frame")

## Extraction method for times
df$Time.Incorrect <- paste(substr(df$JulianDay.Hour, 4,5),":",
 substr(df$Minutes.Seconds,1,2),":",
 substr(df$Minutes.Seconds,3,4),
 sep="")


## Manual generation of desired time
df$Time.Correct <- c("10:46:08",
"10:46:11","10:46:14","10:46:38","10:46:41","11:01:41","11:01:44","11:02:08","11:02:11","11:02:14","11:02:38")


## Note the absence of leading zeroes in the Minutes.Seconds leading
## to incomplete time records (df$Time.Incorrect)
df

##

So can anyone recommend a good way to extract a time from variables
like these two? Basically this is a string-subsetting issue.

Thanks in advance!

Sam



Re: [R] Extract time from irregular date and time data records

2012-05-29 Thread Sam Albers
Apologies. I was searching using the wrong search terms. This is
clearly a string issue. I've added the solution below.

Sam


## Addition of leading zeroes

df$Time.Correct <- paste(substr(df$JulianDay.Hour, 4,5),":",
   substr(formatC(df$Minutes.Seconds, width = 4, format = "d", flag = "0"),1,2),":",
   substr(formatC(df$Minutes.Seconds, width = 4, format = "d", flag = "0"),3,4),
   sep="")
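
An alternative to padding and then cutting strings is to treat both columns as integers and split them arithmetically. This is only a sketch: it assumes, as the substr() positions above imply, that the last two digits of JulianDay.Hour are the hour and that Minutes.Seconds is minutes times 100 plus seconds. The four values are taken from the example data.

```r
JulianDay.Hour  <- c(22610L, 22610L, 22611L, 22611L)
Minutes.Seconds <- c(4608L, 4641L, 141L, 208L)

hour   <- JulianDay.Hour %% 100L      # last two digits
minute <- Minutes.Seconds %/% 100L    # everything above the seconds
second <- Minutes.Seconds %% 100L     # last two digits

## sprintf() zero-pads each field to two digits
Time <- sprintf("%02d:%02d:%02d", hour, minute, second)
Time
```

Because the split is numeric, missing leading zeroes in the input never matter.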






[R] Displayed Date Format in Plot Title.

2012-04-13 Thread Sam Albers
Hello all,

I can't seem to figure out how to format a date as a title. I have
something like this:

plot(x=1:10, y=runif(10,1,18), main=paste(as.Date("2011-05-03",
format="%Y-%m-%d")))

## When I would really like this
plot(x=1:10, y=runif(10,1,18), main=paste("May-03-2011"))

## I thought to try this but that produces an NA.
plot(x=1:10, y=runif(10,1,18), main=paste(as.Date("2011-05-03",
format="%Y-%b-%d")))

How do folks usually accomplish something like this?

Thanks so much in advance!

Sam
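
The `format` argument of as.Date() describes how the input string is laid out, which is why "%Y-%b-%d" gives NA for "2011-05-03". To control how a Date is displayed, format() can be applied to the parsed Date. A minimal sketch (the month abbreviation from "%b" depends on the locale; `title_txt` is a name chosen for this example):

```r
## Parse using the layout of the input string
d <- as.Date("2011-05-03", format = "%Y-%m-%d")

## Format for display; "%b" is the abbreviated month name in the current locale
title_txt <- format(d, "%b-%d-%Y")

plot(x = 1:10, y = runif(10, 1, 18), main = title_txt)
```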



[R] Convert day of year back into a date format.

2012-03-27 Thread Sam Albers
Hello,

I am having trouble figuring out how to convert a Day of Year integer
back into a Date format. For example I have the following:

date <-
c('2008-01-01','2008-01-02','2008-01-03','2008-01-04','2008-01-05','2008-01-06','2008-01-07',
'2008-01-08','2008-01-09','2008-01-10','2008-01-11','2008-01-12','2008-01-13','2008-01-14','2008-01-15',
'2008-01-16','2008-01-17','2008-01-18','2008-01-19','2008-01-20','2008-01-21','2008-01-22','2008-01-23')

## this is then converted into a number corresponding to the day of
## the year like so:

dayofyear <- strptime(date, format="%Y-%m-%d")$yday + 1

## Now my question is how do I get back to a date format (obviously
## omitting the year).
## The end result is that I'd like to be able to have axis labels as
## something like Month-Day or just Month
## instead of just integers, which aren't always intuitive for people,
## but I can't seem to figure out how to tell R
## to recognize an integer as a date.

Any suggestions?

Many thanks in advance!

Sam
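
A day-of-year integer becomes a Date once a year is supplied: either as an offset from January 1st, or by parsing with the "%j" (day of year) conversion. A minimal sketch, assuming the year 2008 from the example data; `dates` and `labels` are names chosen here.

```r
dayofyear <- 1:23

## Offset from 1 January: day 1 is the origin itself, hence the "- 1"
dates <- as.Date(dayofyear - 1, origin = "2008-01-01")

## Month-Day labels without the year, e.g. for axis annotation
labels <- format(dates, "%m-%d")

## Equivalent parse using the day-of-year conversion
dates_j <- as.Date(paste(2008, dayofyear), format = "%Y %j")
```

Because leap years shift every day after February 28th, the year used for the conversion should match the year the integers came from.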



[R] Memory limits for MDSplot in randomForest package

2012-03-23 Thread Sam Albers
Hello,

I am struggling to produce an MDS plot using the randomForest package
with a moderately large data set. My data set has one categorical
response variable, 7 predictor variables and just under 19000
observations. That means my proximity matrix is approximately 19000
by 19000, which is quite large. To train a random forest on this large
a dataset I have to use my institution's high performance computer.
Using this setup I was able to train a randomForest with the proximity
argument set to TRUE. At this point I wanted to construct an MDSplot
using the following:

MDSplot(nech.rf, nech.d$pd.fl, palette=c(1,2,3), pch=as.numeric(nech.d$pd.fl))

where nech.rf is the randomForest object and nech.d$pd.fl is the
classification factor. Now with the architecture listed below, I've
been waiting for approximately 2 days for this to run. My issue is
that I am not sure if this will ever run.

Can anyone recommend a way to tweak the MDSplot function to run a
little faster? I tried changing the cmdscale arguments (i.e.
eigenvalues) within the MDSplot function a little, but that didn't seem
to have any effect on the overall running time using a much smaller
data set. Or could someone even comment on whether I am dreaming that
this will ever actually run?

This is probably the best computer that I will have access to so I was
hoping that somehow I could get this to run. I was just hoping that
someone reading the list might have some experience with randomForests
and using large datasets and might be able to comment on my situation.
Below the architecture information I have constructed a dummy example
to illustrate what I am doing but given the nature of the problem,
this doesn't completely reflect my situation.

Any help would be much appreciated!

Thanks!

Sam



Computer specs and sessionInfo()

OS: Suse Linux
Memory: 64 GB
Processors: Intel Itanium 2, 64 x 1500 MHz

And:

 sessionInfo()
R version 2.6.2 (2008-02-08)
ia64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] randomForest_4.6-6

loaded via a namespace (and not attached):
[1] rcompgen_0.1-17


###
# Dummy Example
###

require(randomForest)
set.seed(17)

## Number of points
x <- 10

df <- rbind(
data.frame(var1=runif(x, 10, 50),
   var2=runif(x, 2, 7),
   var3=runif(x, 0.2, 0.35),
   var4=runif(x, 1, 2),
   var5=runif(x, 5, 8),
   var6=runif(x, 1, 2),
   var7=runif(x, 5, 8),
   cls=factor("CLASS-2")
   )
  ,
data.frame(var1=runif(x, 10, 50),
   var2=runif(x, -3, 3),
   var3=runif(x, 0.1, 0.25),
   var4=runif(x, 1, 2),
   var5=runif(x, 5, 8),
   var6=runif(x, 1, 2),
   var7=runif(x, 5, 8),
   cls=factor("CLASS-1")
   )

)


df.rf <- randomForest(y=df[,8],x=df[,1:7], proximity=TRUE, importance=TRUE)

MDSplot(df.rf, df$cls, k=2, palette=c(1,2,3,4), pch=as.numeric(df$cls))



Re: [R] Lag based on Date objects with non-consecutive values

2012-03-20 Thread Sam Albers
On Mon, Mar 19, 2012 at 9:11 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
>
> On Mon, Mar 19, 2012 at 8:03 PM, Sam Albers tonightstheni...@gmail.com
> wrote:
>>
>> Hello R-ers,
>>
>> I just wanted to update this post. I've made some progress on this but
>> am still not quite where I need to be. I feel like I am close so I
>> just wanted to share my work so far.
>>
>
> Try this:
>
> Lines <- "Date      Dis1
> 1967-06-05  1.146405
> 1967-06-06  9.732887
> 1967-06-07 -9.279462
> 1967-06-08  7.856646
> 1967-06-09  5.494370
> 1967-06-15  5.070176
> 1967-06-16  3.847314
> 1967-06-17 -5.243094
> 1967-06-18  9.396560
> 1967-06-19  4.112792"
>
> # read in data
> library(zoo)
> z <- read.zoo(text = Lines, header = TRUE)
>
> # process it
> g <- seq(start(z), end(z), "day") # all days
> zg <- merge(z, zoo(, g)) # fill in missing days
> lag(zg, 0:-2)[time(z)]
>

Thanks Gabor. I was, however, hoping for a base R solution. I think I've
got it and I will post the result here just to be complete. A big
thanks to Brian Cade for an off-list suggestion.

set.seed(32)
df1 <- data.frame(
   Date=seq(as.Date("1967-06-05","%Y-%m-%d"),by="day", length=5),
   Dis1=rnorm(5, 1,10)
   )
df2 <- data.frame(
  Date=seq(as.Date("1967-06-15","%Y-%m-%d"),by="day", length=5),
  Dis1=rnorm(5, 1,10)
  )

df <- rbind(df1,df2)
df$Dis2 <- df$Dis1*2


lag.base <- function (lag.date, lag.by, lag.var) {
  time_dif <- as.numeric(lag.date)-c(rep(NA,lag.by), head(lag.date, -lag.by))
  lag.tmp <- c(rep(NA,lag.by), head(lag.var, -lag.by))
  lv <- ifelse(time_dif<=lag.by,lag.tmp,NA)
  return(lv)
}

df$lag <- lag.base(lag.date=df$Date, lag.var=df$Dis1, lag.by=3);df
df$lag2 <- lag.base(lag.date=df$Date, lag.var=df$Dis2, lag.by=3);df


> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com




[R] Lag based on Date objects with non-consecutive values

2012-03-19 Thread Sam Albers
Hello all,

I need to figure out a way to lag a variable by a number of days
without using the zoo package. I need to use a remote R connection
that doesn't have the zoo package installed, and installing it there
is not an option. So that is, I want a function where I can specify the
number of days to lag a variable against a Date formatted column. That
is relatively easy to do. The problem arises when I don't have
consecutive dates. I can't seem to figure out a way to insert an NA when
there is a non-consecutive date. So for example:


## A dataframe with non-consecutive dates
set.seed(32)
df1 <- data.frame(
   Date=seq(as.Date("1967-06-05","%Y-%m-%d"),by="day", length=5),
   Dis1=rnorm(5, 1,10)
   )
df2 <- data.frame(
  Date=seq(as.Date("1967-06-15","%Y-%m-%d"),by="day", length=5),
  Dis1=rnorm(5, 1,10)
  )

df <- rbind(df1,df2); df

## A function to lag the variable by a specified number of days
lag.day <- function (lag.by, data) {
  c(rep(NA,lag.by), head(data$Dis1, -lag.by))
}

## Using the function
df$lag1 <- lag.day(lag.by=1, data=df); df
## returns this data frame

         Date      Dis1      lag1
1  1967-06-05  1.146405        NA
2  1967-06-06  9.732887  1.146405
3  1967-06-07 -9.279462  9.732887
4  1967-06-08  7.856646 -9.279462
5  1967-06-09  5.494370  7.856646
6  1967-06-15  5.070176  5.494370
7  1967-06-16  3.847314  5.070176
8  1967-06-17 -5.243094  3.847314
9  1967-06-18  9.396560 -5.243094
10 1967-06-19  4.112792  9.396560


## When really what I would like is something like this:

         Date      Dis1      lag1
1  1967-06-05  1.146405        NA
2  1967-06-06  9.732887  1.146405
3  1967-06-07 -9.279462  9.732887
4  1967-06-08  7.856646 -9.279462
5  1967-06-09  5.494370  7.856646
6  1967-06-15  5.070176        NA
7  1967-06-16  3.847314  5.070176
8  1967-06-17 -5.243094  3.847314
9  1967-06-18  9.396560 -5.243094
10 1967-06-19  4.112792  9.396560

So can anyone recommend a way (either using my function or any other
approaches) that I might be able to consistently lag values based on a
lag.by value and consecutive dates?

Thanks so much in advance!

Sam



Re: [R] Lag based on Date objects with non-consecutive values

2012-03-19 Thread Sam Albers
Hello R-ers,

I just wanted to update this post. I've made some progress on this but
am still not quite where I need to be. I feel like I am close so I
just wanted to share my work so far.

Thanks in advance!

Sam

I've now gotten this far but have realized that my approach is flawed
because if I increase the lag.by value to anything greater than 1, an NA
is no longer entered into the correct position. So here is my updated
effort:

lag.by <- function (data, lag.by) {
  tmp <- data.frame(
    ## Difference in days between dates
    diff=c(diff(data$Date), NA),
    lag.tmp=c(rep(NA,lag.by), head(data$Dis1, -lag.by))
  )
  ## Diff calculates difference to next row so all the difference
  ## values need to be lagged
  ifelse(c(rep(NA,lag.by), head(tmp$diff, -lag.by))<=1,tmp$lag.tmp,NA)
}


df$lag <- lag.by(df, lag.by=1)
df$lag2 <- lag.by(df, lag.by=2); df

         Date      Dis1       lag      lag2
1  1967-06-05  1.146405        NA        NA
2  1967-06-06  9.732887  1.146405        NA
3  1967-06-07 -9.279462  9.732887  1.146405
4  1967-06-08  7.856646 -9.279462  9.732887
5  1967-06-09  5.494370  7.856646 -9.279462
6  1967-06-15  5.070176        NA  7.856646 <- Need this to be an NA
7  1967-06-16  3.847314  5.070176        NA
8  1967-06-17 -5.243094  3.847314  5.070176
9  1967-06-18  9.396560 -5.243094  3.847314
10 1967-06-19  4.112792  9.396560 -5.243094

So, I should have NA's in the lag2 column at rows 6 and 7. Any help or
thoughts would be much appreciated here.







[R] Strategies to deal with unbalanced classification data in randomForest

2012-03-02 Thread Sam Albers
Hello all,

I have become somewhat confused with options available for dealing
with a highly unbalanced data set (1 in one class, 50 in the
other). As a summary I am unsure:

a) if I am performing the two class weighting methods properly,
b) if the data are too unbalanced and whether this type of analysis is
appropriate and
c) if there is any interaction between the weighting for class
imbalances and the number of trees in a forest.

An example will illustrate this best. Say I have a data set like the following:

df <- rbind(
data.frame(var1=runif(1, 10, 50),
   var2=runif(1, -3, 3),
   var3=runif(1, 0.1, 0.25),
   cls=factor("CLASS-1")
   ),
data.frame(var1=runif(50, 10, 50),
   var2=runif(50, 2, 7),
   var3=runif(50, 0.2, 0.35),
   cls=factor("CLASS-2")
   )
)

## Where the response vector is highly imbalanced like so:
summary(df$cls)

library(randomForest)
set.seed(17)

## Now this is obviously an extreme case but I am wondering what the
## options are to deal with something like this.
## The problem with this situation manifests itself when I try to
## train a random forest
## without accounting for this imbalance

df.rf <- randomForest(cls~var1+var2+var3, data=df,importance=TRUE)

## Now one option is to down sample the majority variable. However, I
## can't seem to find exactly
## how to do this. Does this seem correct?

df.rf.downsamp <- randomForest(cls~var1+var2+var3,
data=df,sampsize=c(50,50), importance=TRUE)
## 50 being the number of observations in the minority variable

## The other option, which there seems to be some confusion over, is to
## establish some class weights
## to balance the error rate. This approach I've mostly drawn from here:
## http://stat-www.berkeley.edu/users/breiman/RandomForests/cc_home.htm#balance
## This might not be appropriate, however, as of September it looks
## like Breiman's method wasn't used in R
df.rf.weights <- randomForest(cls~var1+var2+var3, data=df,classwt=c(1,
600), importance=TRUE)

## Nevertheless, what I am concerned about is the effect an
## unbalanced data set has on my randomForest model
## For example:

par(mfrow=c(1,3))
plot(df.rf)
plot(df.rf.downsamp)
plot(df.rf.weights)

presents three very different scenarios and I am having trouble resolving
the issues I mentioned above. I am extremely grateful for all the work
that has been done on randomForests in R up to this point. I was
hoping that someone, with more experience, might be able to advise
what the best strategy is to deal with this problem. Which of these
approaches are best and am I using them right?

Thanks so much in advance for any help.

Sam
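
For reference, down-sampling the majority class can also be done by hand in base R before any model is fitted: draw as many rows from each class as the smallest class contains. The sketch below is a one-off manual version and does not call randomForest at all; the class sizes (1000 and 50) are hypothetical, and `df.balanced` is a name chosen for this example.

```r
set.seed(17)

## Hypothetical class sizes, chosen only for illustration
df <- rbind(
  data.frame(var1 = runif(1000, 10, 50), cls = factor("CLASS-1")),
  data.frame(var1 = runif(50, 10, 50),   cls = factor("CLASS-2"))
)

## Size of the smallest class
n_min <- min(table(df$cls))

## Row indices per class, then n_min rows drawn at random from each
rows_by_class <- split(seq_len(nrow(df)), df$cls)
keep <- unlist(lapply(rows_by_class,
                      function(i) i[sample.int(length(i), n_min)]))

df.balanced <- df[keep, ]
table(df.balanced$cls)
```

A single draw like this throws away most of the majority class, so results can vary from one draw to the next.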


 sessionInfo()
R version 2.14.2 (2012-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_CA.UTF-8   LC_NUMERIC=C
LC_TIME=en_CA.UTF-8
 [4] LC_COLLATE=en_CA.UTF-8 LC_MONETARY=en_CA.UTF-8
LC_MESSAGES=en_CA.UTF-8
 [7] LC_PAPER=C LC_NAME=C
LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_CA.UTF-8
LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] ggplot2_0.8.9 plyr_1.7.1tools_2.14.2



[R] Combining month and year into a single variable

2012-02-01 Thread Sam Albers
Hello all,

## I am trying to convert some year and month data into a single
## variable that has a date format so I can plot a proper x axis.
## I've made a few tries at this and searched around but I haven't found
## anything. I am looking for something of the format %Y-%m

## A data.frame
df <- data.frame(x=rnorm(36, 1, 10), month=rep(1:12, each = 3),
year=c(2000,2001,2002))

## One option. I'm not totally sure why this doesn't work
df$Date <- as.Date(paste(df$year, df$month,sep="-"), "%Y-%m")

## This adds the month number to the day rather than to the month, and this option
## is messy anyway as I am adding a day to the variable
or = format(ISOdate(df$year-1, 12, 31), "%Y-%m-%d")
df$Date2 = as.Date(or) + df$month

## Just a plot to illustrate this.
plot(x~Date2, data=df)

## Any thoughts on how I can combine the year and the month into a
## form that is useful for plotting?

Thanks in advance!

Sam
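
A Date always needs a day, so a year and month alone cannot be parsed into one. A common workaround, sketched here, is to pin every record to the first of its month and use format() only when a "%Y-%m" label is wanted. The choice of day 1 is an assumption made for this example.

```r
df <- data.frame(x = rnorm(36, 1, 10), month = rep(1:12, each = 3),
                 year = c(2000, 2001, 2002))

## Supply day 1 so the string is a complete date
df$Date <- as.Date(paste(df$year, df$month, 1, sep = "-"), format = "%Y-%m-%d")

## Sort chronologically for plotting
df <- df[order(df$Date), ]

## Year-month label for display
format(df$Date[1], "%Y-%m")
```

Because `Date` is a real Date column, `plot(x ~ Date, data = df)` will draw a date axis.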



[R] Subsetting for the ten highest values by group in a dataframe

2012-01-27 Thread Sam Albers
Hello,

I am looking for a way to subset a data frame by choosing the ten
highest values in that data frame. I also need to do this within each
level of a factor.

## I've used plyr here but I'm not married to this approach
require(plyr)

## I've created a data.frame with two groups and then an id variable (y)
df <- data.frame(x=rnorm(400, mean=20), y=1:400, z=c("A","B"))

## So using ddply I can find the highest value of x
df.max1 <- ddply(df, c("z"), subset, x==sort(x, TRUE)[1])

## Or the 2nd highest value
df.max2 <- ddply(df, c("z"), subset, x==sort(x, TRUE)[2])

## And so on but when I try to make a series of numbers like so
## to get the top ten values, I don't get a warning message but
## two values that don't really make sense to me
df.max <- ddply(df, c("z"), subset, x==sort(x, TRUE)[1:10])

## So no error message when I use the method above, which is clearly wrong.
## But I really am not sure how to diagnose the problem.

## Can anyone suggest a way to subset a data.frame with groups to
## select the top ten max values in that data.frame for each group?

## Thanks so much in advance!

Sam
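
The comparison x == sort(x, TRUE)[1:10] is elementwise: the ten sorted values are recycled along x, so a row is kept only when it happens to line up with its own value, and no warning appears because 200 is a multiple of 10. A minimal base R sketch of one alternative, ordering within each group and keeping the first ten rows (top10 is an illustrative name, and set.seed() is only there to make the example repeatable):

```r
set.seed(1)
df <- data.frame(x = rnorm(400, mean = 20), y = 1:400, z = c("A", "B"))

# Split by group, sort each piece by x (largest first), keep ten rows
top10 <- do.call(rbind, lapply(split(df, df$z), function(d) {
  head(d[order(d$x, decreasing = TRUE), ], 10)
}))

table(top10$z)
```

The same idea should carry over to plyr as ddply(df, "z", function(d) head(d[order(-d$x), ], 10)).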



[R] Establishing groups using something other than ifelse()

2012-01-19 Thread Sam Albers
Hello all,

This is one of those "Is there a better way to do this?" questions. Say
I have a dataframe (df) with a grouping variable (z). This is my base
data. Now I know that there is a higher-order level of grouping that
exists for my group variable. So what I want to do is create a new
column that expresses that higher-order level of grouping based on
values in the sub-group (z in this case). In the past I have used
ifelse() but this tends to get fairly redundant and messy with a large
number of sub-groupings (z). I've created a sample dataset below. Can
anyone recommend a better way of achieving what I am currently
achieving with ifelse()? A long series of ifelse statements makes me
think that there is something better for this.

## Dataframe creation
df <- data.frame(x=runif(36, 0, 120),
   y=runif(36, 0, 120),
   z=factor(c("A1","A1","A2","A2","B1","B1","B2","B2","C1","C","C2","C2"))
   )

## Current method of grouping
df$Big.Group <- with(df, ifelse(df$z=="A1","A", ifelse(df$z=="A2","A",
ifelse(df$z=="B1", "B", ifelse(df$z=="B2", "B", "C")))))


So any suggestions? Thanks in advance!

Sam



Re: [R] Establishing groups using something other than ifelse()

2012-01-19 Thread Sam Albers
On Thu, Jan 19, 2012 at 3:34 PM, Justin Haynes jto...@gmail.com wrote:
> how bout
>
> levels(df$z)[grep('A',levels(df$z))] <- 'A'
> levels(df$z)[grep('B',levels(df$z))] <- 'B'
> levels(df$z)[grep('C',levels(df$z))] <- 'C'
>
> does that do what you're wanting?

Shoot. Might have made my example confusing, sorry. First of all I
want to retain the information in the sub.group (z) here but more
importantly, I used A1 and A2 to illustrate the grouping under the
larger group A but the pattern of the group names is irrelevant for my
purposes. So to modify the example I wanted to achieve this without
pattern matching like the above:

df <- data.frame(x=runif(36, 0, 120),
   y=runif(36, 0, 120),
   z=factor(c("G1","G1","G2","G2","H1","H1","H2","H2","I1","I1","I2","I2"))
   )

df$Big.Group <- with(df, ifelse(df$z=="G1","A", ifelse(df$z=="G2","A",
ifelse(df$z=="H1", "B", ifelse(df$z=="H2", "B", "C")))))

Thanks for the response!

Sam
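
One way to get this without pattern matching or nested ifelse() is a named lookup vector that maps each sub-group to its larger group. A minimal sketch, assuming the mapping is known up front (lookup is an illustrative name, not something from the thread):

```r
df <- data.frame(x = runif(36, 0, 120),
                 y = runif(36, 0, 120),
                 z = factor(c("G1", "G1", "G2", "G2", "H1", "H1",
                              "H2", "H2", "I1", "I1", "I2", "I2")))

# Names are the sub-groups, values are the larger groups
lookup <- c(G1 = "A", G2 = "A", H1 = "B", H2 = "B", I1 = "C", I2 = "C")

# Index by the character value of z; z itself is left untouched
df$Big.Group <- factor(unname(lookup[as.character(df$z)]))
```

Adding a sub-group then means adding one entry to lookup; any value of z missing from lookup comes back as NA, which makes gaps in the mapping easy to spot.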





Re: [R] Establishing groups using something other than ifelse()

2012-01-19 Thread Sam Albers
That is great Jorge. Thanks! Just to complete this, I will include
an example using recode() from the car package:

df$Big.Group2 <- recode(df$z, "c('G1','G2')='A'; c('H1','H2')='B'; else='C'")

Sam

On Thu, Jan 19, 2012 at 3:49 PM, Jorge I Velez jorgeivanve...@gmail.com wrote:
> Hi Sam,
>
> Check the examples in
>
> require(car)
> ?recode
>
> HTH,
> Jorge.-




Re: [R] Calculating rolling mean by group

2012-01-10 Thread Sam Albers
Thanks for getting me on the right path Gabor! I have one outstanding
issue though.

On Mon, Jan 9, 2012 at 4:21 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
> On Mon, Jan 9, 2012 at 6:39 PM, Sam Albers tonightstheni...@gmail.com wrote:
>> Hello all,
>>
>> I am trying to determine how to calculate rolling means in R using a
>> grouping variable. Say I have a dataframe like so:
>>
>> dat1 <- data.frame(x = runif(2190, 0, 125), year=rep(1995:2000,
>> each=365), jday=1:365, site="here")
>> dat2 <- data.frame(x = runif(2190, 0, 200), year=rep(1995:2000,
>> each=365), jday=1:365, site="there")
>> dat <- rbind(dat1,dat2)
>>
>> ## What I would like to do is calculate a rolling 7 day mean
>> ## separately for each site. I have looked at both
>> ## rollmean() in the zoo package and running.mean() in the igraph
>> ## package but neither seem to have led
>> ## me to calculating a rolling mean by group. My first thought was to
>> ## use the plyr package but I am confused
>> ## by this output:
>>
>> library(plyr)
>> library(zoo)
>>
>> ddply(dat, c("site"), function(df) return(c(roll=rollmean(df$x, 7))))
>>
>> ## Can anyone recommend a better way to do this or shed some light on
>> ## this output?
>
> Using dat in the question, try this:
>
> library(zoo)
> z <- read.zoo(dat, index = 2:3, split = 4, format = "%Y %j")
> zz <- rollmean(z, 7)
>
> The result, zz, is a multivariate zoo series with one column per group.

Using the zoo approach works well except that a wrinkle in my dataset
not reflected in the sample data caused some problems. I am actually
dealing with a situation where there is an unequal number of
observations in each group, as in the data set below:

library(zoo)

dat1 <- data.frame(x = runif(2190, 0, 125), year=rep(1995:2000,
each=365), jday=1:365, site="here")
dat2 <- data.frame(x = runif(4380, 0, 200), year=rep(1989:2000,
each=365), jday=1:365, site="there")
dat <- rbind(dat1,dat2)

## When I use read.zoo everything is read in fine
z <- read.zoo(dat, index = 2:3, split = 4, format = "%Y %j")

## But when I use rollmean to get a 7 day average for both the 'here'
## and 'there' columns only the 'there' column 7 day
## average is calculated
zz <- rollmean(z, 7)

Any thoughts on how I can then calculate a rolling mean on groups
where there is an unequal number of observations in each group?

Thanks for the previous post and in advance.

Sam
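
One way around the unequal group sizes is to compute the rolling mean inside each group rather than on the merged series. The sketch below is a base R stand-in, not the zoo approach from the thread: it uses stats::filter() for a centred 7-point mean and ave() to apply it per site, and it assumes the rows are already in date order within each site (roll7 and the roll column are illustrative names):

```r
set.seed(42)
dat1 <- data.frame(x = runif(2190, 0, 125), year = rep(1995:2000, each = 365),
                   jday = 1:365, site = "here")
dat2 <- data.frame(x = runif(4380, 0, 200), year = rep(1989:2000, each = 365),
                   jday = 1:365, site = "there")
dat <- rbind(dat1, dat2)

# Centred 7-point moving average; the first and last 3 values are NA
roll7 <- function(v) as.numeric(stats::filter(v, rep(1 / 7, 7), sides = 2))

# ave() applies roll7 separately to each site and keeps the row order
dat$roll <- ave(dat$x, dat$site, FUN = roll7)
```

Because each site is filtered on its own, the shorter 'here' series is unaffected by the years in which only 'there' has data.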


> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com



[R] Calculating rolling mean by group

2012-01-09 Thread Sam Albers
Hello all,

I am trying to determine how to calculate rolling means in R using a
grouping variable. Say I have a dataframe like so:

dat1 <- data.frame(x = runif(2190, 0, 125), year=rep(1995:2000,
each=365), jday=1:365, site="here")
dat2 <- data.frame(x = runif(2190, 0, 200), year=rep(1995:2000,
each=365), jday=1:365, site="there")
dat <- rbind(dat1,dat2)

## What I would like to do is calculate a rolling 7 day mean
## separately for each site. I have looked at both
## rollmean() in the zoo package and running.mean() in the igraph
## package but neither seem to have led
## me to calculating a rolling mean by group. My first thought was to
## use the plyr package but I am confused
## by this output:

library(plyr)
library(zoo)

ddply(dat, c("site"), function(df) return(c(roll=rollmean(df$x, 7))))

## Can anyone recommend a better way to do this or shed some light on
## this output?

Thanks so much in advance!

Sam



[R] Specifying argument values in a function

2011-08-25 Thread Sam Albers
Hello all,

I am trying to write a fairly simple function that provides a quick way to
calculate several distributions for a data set. The function should have
an argument that specifies which distribution is
returned (here "norm" or "cumu"). I also have a melt argument but
that seems to be working fine. I have been able to get my function
working well for just one distribution but when I add another and try
to add a dist.type argument (with potential values "cumu" and
"norm"), I get an error message (see below). I am having trouble
finding material that explains how to add an argument that isn't a
TRUE/FALSE situation. Could anyone explain what I am doing wrong with the
second, distribution-specifying argument? I apologize as I am sure
this is a simple problem but I am just getting my feet wet with this
type of thing in R and am having a little trouble diagnosing the
problem.

#Example below:

library(reshape)

dat <- data.frame(`v1`=runif(6, 0, 125),
  `v2`=runif(6, 50, 75),
  `v3`=runif(6, 0, 100),
  `v4`=runif(6, 0, 200)
  )


my.norm <- function(x, melt=TRUE)
{
  #Normalized distribution
  N.dist <- as.data.frame(sapply(1:length(x), function(i)
    (x[[i]]/rowSums(x[,c(1:4)]))*100 ))
  norm.melt <- melt.data.frame(N.dist)
  if (melt == TRUE) ##Default is a melted data frame
    return(norm.melt)
  if (melt == FALSE)
    return(N.dist)
}

## So this single distribution function works fine
my.norm(dat, melt=TRUE)

my.fun <- function(x, melt=TRUE, dist.type=norm)
{
  #Normalized distribution
  N.dist <- as.data.frame(sapply(1:length(x), function(i)
    (x[[i]]/rowSums(x[,c(1:4)]))*100 ))
  norm.melt <- melt.data.frame(N.dist)
  if (melt == TRUE && dist.type == norm) ##Default is a melted data frame
    return(norm.melt)
  if (melt == FALSE && dist.type == norm)
    return(N.dist)

  ## Cumulative distribution
  C.dist <- as.data.frame(t(apply(N.dist, 1, cumsum)))
  cumu.melt <- melt.data.frame(C.dist)
  if (melt == TRUE && dist.type == cumu) ##Default is a melted data frame
    return(cumu.melt)
  if (melt == FALSE && dist.type == cumu)
    return(C.dist)
}

## But this function when used yields two different error messages
## depending on the value used for dist.type:
my.fun(dat, melt=TRUE, dist.type = norm)
## Error in dist.type == norm :
##  comparison (1) is possible only for atomic and list types

my.fun(dat, melt=TRUE, dist.type = cumu)
## Error in my.fun(dat, dist.type = cumu) : object 'cumu' not found

Thanks in advance!

Sam
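
Both errors are consistent with norm and cumu being written without quotes: R then looks them up as objects, finding the base function norm() in one case and nothing at all in the other. A minimal sketch of the usual pattern for a multiple-choice argument, passing character strings and validating them with match.arg(). To keep it self-contained it drops the melt argument and the reshape dependency, so it is a simplified stand-in for the function above rather than a drop-in replacement:

```r
dat <- data.frame(v1 = runif(6, 0, 125), v2 = runif(6, 50, 75),
                  v3 = runif(6, 0, 100), v4 = runif(6, 0, 200))

my.fun <- function(x, dist.type = c("norm", "cumu")) {
  # Picks the first choice by default and errors on anything unexpected
  dist.type <- match.arg(dist.type)

  # Normalized distribution: each row expressed as percentages of its sum
  m <- as.matrix(x)
  n.dist <- m / rowSums(m) * 100
  if (dist.type == "norm") return(as.data.frame(n.dist))

  # Cumulative distribution: running sum across each row
  as.data.frame(t(apply(n.dist, 1, cumsum)))
}

norm.out <- my.fun(dat)                      # default is "norm"
cumu.out <- my.fun(dat, dist.type = "cumu")
```

A call such as my.fun(dat, dist.type = "other") then stops with an error listing the allowed values instead of failing later inside a comparison.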



[R] Utilizing column names to multiply over all columns

2011-08-16 Thread Sam Albers
## Hello there,
## I have an issue where I need to use the value of column names to
## multiply with the individual values in a column and I have many
## columns to do this over. I have data like this where the column names
## are numbers:

mydf <- data.frame(`2.72`=runif(20, 0, 125),
  `3.2`=runif(20, 50, 75),
  `3.78`=runif(20, 0, 100),
  yy= head(letters,2), check.names=FALSE)

## I had been doing something like this but this seems rather tedious
## and clunky. These append the correct values to my dataframe but is
## there any way that I can do this generally over each column, also
## using each column name as the multiplier for that column?

mydf$vd2.72 <- mydf$'2.72'*2.72
mydf$vd3.2 <- mydf$'3.2'*3.2
mydf$vd3.78 <- mydf$'3.78'*3.78

## So can I get to this point with a more generalized solution? For
## now, I would also prefer to keep this in wide format and I am aware
## (thanks to the list!) that I could use melt() to get the values I
## want.
mydf

## Thanks so much in advance!

Sam



Re: [R] Utilizing column names to multiply over all columns

2011-08-16 Thread Sam Albers
Thanks for the response David.

On Tue, Aug 16, 2011 at 1:13 PM, David Winsemius dwinsem...@comcast.net wrote:

> On Aug 16, 2011, at 3:37 PM, Sam Albers wrote:
>
>> ## Hello there,
>> ## I have an issue where I need to use the value of column names to
>> multiply with the individual values in a column and I have many
>> columns to do this over. I have data like this where the column names
>> are numbers:
>>
>> mydf <- data.frame(`2.72`=runif(20, 0, 125),
>>                 `3.2`=runif(20, 50, 75),
>>                 `3.78`=runif(20, 0, 100),
>>                 yy= head(letters,2), check.names=FALSE)
>
> mydf
>        2.72      3.2      3.78 yy
> 1   31.07874 74.48555 89.306591  a
> 2  123.68290 74.30030 11.943576  b
> 3   89.64024 68.26378 97.627211  a
> 4   81.46604 59.79607 91.005217  b
>
>> ## I had been doing something like this but this seems rather tedious
>> and clunky. These append the correct values to my dataframe but is
>> there any way that I can do this generally over each column, also
>> using each column name as the multiplier for that column?
>>
>> mydf$vd2.72 <- mydf$'2.72'*2.72
>> mydf$vd3.2 <- mydf$'3.2'*3.2
>> mydf$vd3.78 <- mydf$'3.78'*3.78
>>
>> ## So can I get to this point with a more generalized solution? For
>> now, I would also prefer to keep this in wide format and I am aware
>> (thanks to the list!) that I could use melt() to get the values I
>> want.
>
> You will get the warning that the last column is not going right but
> otherwise this returns what you asked for:
>
> sapply(1:length(mydf), function(i) mydf[[i]]* as.numeric(names(mydf)[i])  )

This suits my purposes well with a couple slight modifications:

## I made this into a data.frame so I could append it to the other one (mydf)
mydf.vd <- as.data.frame(sapply(1:length(mydf), function(i)
mydf[[i]]*as.numeric(names(mydf)[i]) ))

## I also renamed all the columns accordingly.
colnames(mydf.vd) <- paste("vd",names(mydf), sep="")

##Then added the new data.frame to the old one.
out <- cbind(mydf,mydf.vd)

Thanks for your help with this! (Also thanks Bert for the other
helpful suggestion)

>            [,1]     [,2]      [,3] [,4]
>  [1,]  84.53416 238.3538 337.57891   NA
>  [2,] 336.41748 237.7610  45.14672   NA
>  [3,] 243.82145 218.4441 369.03086   NA
>  [4,] 221.58762 191.3474 343.99972   NA
>  [5,]  81.78911 213.0770  97.90072   NA
> snipped remainder
>
>
>
> --
>
> David Winsemius, MD
> West Hartford, CT



Sam
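
The NA column and the warning come from as.numeric("yy"). A minimal sketch of a variant that first picks out only the columns whose names parse as numbers, so the yy column is left alone (num.cols and mult are illustrative names; set.seed() is only for repeatability):

```r
set.seed(1)
mydf <- data.frame(`2.72` = runif(20, 0, 125),
                   `3.2`  = runif(20, 50, 75),
                   `3.78` = runif(20, 0, 100),
                   yy = head(letters, 2), check.names = FALSE)

# Keep only the column names that convert cleanly to numbers
num.cols <- names(mydf)[!is.na(suppressWarnings(as.numeric(names(mydf))))]
mult <- as.numeric(num.cols)

# sweep() multiplies column j of the matrix by mult[j]
mydf.vd <- as.data.frame(sweep(as.matrix(mydf[num.cols]), 2, mult, `*`))
names(mydf.vd) <- paste0("vd", num.cols)

out <- cbind(mydf, mydf.vd)
```

The result has the original four columns followed by vd2.72, vd3.2 and vd3.78, with no all-NA column to drop afterwards.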



[R] Alternative and more efficient data manipulation

2011-08-15 Thread Sam Albers
Hello list,

## I have been doing the following process to convert data from one
## form to another for a while but it occurs to me that there is probably
## an easier way to do this. I am often given data that have column names
## which are actually data and I much prefer dealing with data that are
## sorted by factors. So to convert the columns I have previously made
## use of make.groups() in the lattice package which works completely
## satisfactorily. However, it is a bit clunky for what I am using it for
## and I have to carry the other variables forward. Can anyone suggest a
## better way of converting data like this?

library(lattice)

dat <- data.frame(`x1`=runif(6, 0, 125),
  `x2`=runif(6, 50, 75),
  `x3`=runif(6, 0, 100),
  `x4`=runif(6, 0, 200),
  date =
as.Date(c("2009-09-25","2009-09-28","2009-10-02","2009-10-07","2009-10-15","2009-10-21")),
  yy= head(letters,2), check.names=FALSE)
## Here is an example of the type of data that NEED converting
dat

dat.group <- with(dat, make.groups(x1,x2,x3,x4))
## Carrying the other variables forward
dat.group$date <- dat$date
dat.group$yy <- dat$yy
## Here is an example of what I would like the data to look like
dat.group

## The point of this all is so that I can use the data in a manner
## such as this:
with(dat.group, xyplot(data ~ as.numeric(substr(which, 2,2))|yy, groups=date))

## So I suppose what I am asking is if there is a more efficient way
## of doing this?

Thanks so much in advance!

Sam
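
One way to do the wide-to-long conversion in a single step, carrying the other variables along, is to build the long data frame directly. A minimal base R sketch, assuming the measurement columns can be listed by name (vars and dat.long are illustrative names; the data and which column names mirror what make.groups() produces):

```r
dat <- data.frame(x1 = runif(6, 0, 125), x2 = runif(6, 50, 75),
                  x3 = runif(6, 0, 100), x4 = runif(6, 0, 200),
                  date = as.Date(c("2009-09-25", "2009-09-28", "2009-10-02",
                                   "2009-10-07", "2009-10-15", "2009-10-21")),
                  yy = head(letters, 2))

vars <- c("x1", "x2", "x3", "x4")

dat.long <- data.frame(
  data  = unlist(dat[vars], use.names = FALSE),           # x1 values, then x2, ...
  which = factor(rep(vars, each = nrow(dat)), levels = vars),
  date  = rep(dat$date, times = length(vars)),            # repeated once per variable
  yy    = rep(dat$yy, times = length(vars))
)
```

dat.long can then be passed to xyplot() in the same way as dat.group above; melt() from the reshape package is another route to a similar layout.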



[R] Standardizing the number of records by group

2011-07-25 Thread Sam Albers
Hello R-help,

I have some data collected at regular intervals but for a varying
length of time. I would like to standardize the length of time
collected and I can do this by standardizing the number of records I
use for my analysis.

Take for example the data set below:


library(plyr)
x <- runif(18,10, 15)
df <- as.data.frame(x)
df$fac <- factor(c("Test1","Test1","Test1","Test1","Test1","Test1","Test1",
 "Test2","Test2","Test2","Test2","Test2",
 "Test3","Test3","Test3","Test3","Test3","Test3"))

## Here is where I would like to standardize the number of records

df.avg <- ddply(df, c("fac"), function(df) return(c(x.avg=mean(df$x),
n=length(df$x))))
df.avg

Here there is a different number of records for each factor level. Say
I only wanted to use the first 4 records at each factor level. Prior
to taking the mean of these values how might I drop all the records
after 4? Can anyone suggest a good way to do this?

I am using R 2.12.1 and Emacs + ESS.

Thanks so much in advance.

Sam
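
One way to keep only the first 4 records per factor level is to number the rows within each group and filter on that counter before averaging. A minimal base R sketch, assuming the rows are already in collection order (idx and df4 are illustrative names):

```r
set.seed(1)
df <- data.frame(x = runif(18, 10, 15),
                 fac = factor(rep(c("Test1", "Test2", "Test3"),
                                  times = c(7, 5, 6))))

# Within-group row counter: 1, 2, 3, ... restarting at each factor level
idx <- ave(seq_along(df$fac), df$fac, FUN = seq_along)

# Keep the first 4 records of each level, then average
df4 <- df[idx <= 4, ]
df.avg <- aggregate(x ~ fac, data = df4, FUN = mean)
df.avg
```

Inside the existing ddply() call the same effect should be possible with head(df$x, 4) in place of df$x.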



Re: [R] Italicized greek symbols in PDF plots

2011-07-01 Thread Sam Albers
Many thanks Dr. Ripley. I was made aware that the problem for me was that I
was sending HTML in my email message. Transgression noted.


Re: [R] Italicized greek symbols in PDF plots

2011-06-30 Thread Sam Albers
Thanks for the response Dr. Ripley. Much appreciated.

On Wed, Jun 29, 2011 at 10:17 PM, Prof Brian Ripley
rip...@stats.ox.ac.uk wrote:

> On Wed, 29 Jun 2011, Sam Albers wrote:
>
>> I know that this has been asked before in other variations but I just can't
>> seem to figure out my particular application from previous posts. My
>> apologies if I have missed the answer to this question somewhere in the
>> archives. I have indeed looked.
>>
>> I am running Ubuntu 11.04, with R 2.12.1 and ESS+Emacs.
>>
>> For journal formatting requirements, I need to italicize all the greek
>> letters in any plot. This is reasonably straight forward to do and I
>> accomplished this task like so:
>>
>> library(ggplot2)
>>
>> label_parseall <- function(variable, value) {
>>   plyr::llply(value, function(x) parse(text = paste(x)))
>> }
>>
>> dat <- data.frame(x = runif(270, 0, 125), z = rep(LETTERS[1:3], each = 3),
>> yy = 1:9, stringsAsFactors = TRUE)
>> #unicode italicized delta
>> dat$gltr =
>> factor(c("italic(\u03b4)^14*N","italic(\u03b4)^15*N","italic(\u03b4)^13*C"))
>>
>> #So this is what I want my plot to look like:
>> plt <- ggplot(data = dat, aes(x = yy, y = x)) +
>>   geom_point(aes(x= yy, y=x, shape=z, group=z), alpha=0.4,position =
>> position_dodge(width = 0.8)) +
>>   facet_grid(gltr~.,labeller= label_parseall, scales="free_y")
>> plt
>>
>> #So then I exported my plot as a PDF like so:
>> pdf("Times_regular.pdf", family='Times')
>> plt
>> dev.off()
>> #The problem with this was that the delta symbols turned into dots.
>
> You forgot to set the encoding: see the ?pdf help file.  Greek is most
> likely not covered by the default encoding (and you also forgot the 'at a
> minimum' information required by the posting guide, so we don't know what
> your defaults would be).


Here are the results of sessionInfo(). Is this what you meant by defaults?

sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: i686-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_CA.UTF-8
 [7] LC_PAPER=en_CA.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  grid      methods
[8] base

other attached packages:
[1] Cairo_1.4-9   ggplot2_0.8.9 reshape_0.8.4 plyr_1.5.2    proto_0.3-9.2

loaded via a namespace (and not attached):
[1] digest_0.4.2


I also tried using all the encodings found in /usr/lib/R/library/grDevices/enc:

AdobeStd.enc  CP1250.enc  CP1253.enc  Cyrillic.enc  ISOLatin1.enc
ISOLatin7.enc  KOI8-R.enc  MacRoman.enc  TeXtext.enc
AdobeSym.enc  CP1251.enc  CP1257.enc  Greek.enc  ISOLatin2.enc
ISOLatin9.enc  KOI8-U.enc  PDFDoc.enc  WinAnsi.enc

None of these seemed to produce italicized greek letters. It seems like
encoding is ignored in CairoPDF so I never tried it with that command.



>> #I solved this problem using Cairo
>> library(Cairo)
>> cairo_pdf("Cairo.pdf")
>> plt
>> dev.off()
>>
>>
>> The problem that I face now is that I am unsure how to output a figure that
>> maintains the greek symbols but outputs everything in the plot as Times New
>> Roman, another requirement of the journal. So I can produce a Times New
>> Roman PDF plot and an italicized greek symbol unicode PDF plot but not both.
>> Does anyone have any idea how I might accomplish both of these things
>> together in a single PDF?
>
> I would use cairo_pdf() in base R (and not package Cairo).  Use grid
> facilities to change font, or use the version in R-devel which has a family=
> argument.


I tried this using CairoPDF() like so:

CairoPDF("Cairo.pdf", 6, 6, family="Times")
plt
dev.off()

But this omitted the greek symbols AND didn't produce the figure in the
desired font. It seems like other folks have also experienced this problem
before:

https://stat.ethz.ch/pipermail/r-help/2011-January/266657.html

Have I missed something? Are there any other strategies that you could suggest
to get italicized greek letters? Thanks again.

Sam
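
For reference, the suggestion to use the base cairo_pdf() device with a family= argument can be sketched as below. This is a minimal illustration with a base plot rather than the ggplot2 figure, and it rests on assumptions: the R build must have cairo support, a font that fontconfig resolves as "Times" must be installed, and that font must contain the Greek glyph. The file name out.file is illustrative.

```r
out.file <- tempfile(fileext = ".pdf")

if (capabilities("cairo")) {
  # family= sets the font for all text drawn on the device
  cairo_pdf(out.file, width = 6, height = 6, family = "Times")

  # The delta is passed as a Unicode string so that italic() applies to it
  plot(1:10, (1:10)^2,
       xlab = expression(italic("\u03b4")^15 * N),
       ylab = "x squared")

  dev.off()
}
```

Whether the delta actually appears in italic Times depends on the installed font; if the glyph is missing, the device may substitute another font for it.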



>> Thanks so much in advance,
>>
>> Sam
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>
> That does mean you!


Apologies for the inadequate posting. I will try to be clearer in the
future.


> --
> Brian D. Ripley,                  rip...@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595


[[alternative HTML version deleted]]


[R] Italicized greek symbols in PDF plots

2011-06-29 Thread Sam Albers
I know that this has been asked before in other variations but I just can't
seem to figure out my particular application from previous posts. My
apologies if I have missed the answer to this question somewhere in the
archives. I have indeed looked.

I am running Ubuntu 11.04, with R 2.12.1 and ESS+Emacs.

For journal formatting requirements, I need to italicize all the Greek
letters in any plot. This is reasonably straightforward to do and I
accomplished this task like so:

library(ggplot2)

label_parseall <- function(variable, value) {
   plyr::llply(value, function(x) parse(text = paste(x)))
}

dat <- data.frame(x = runif(270, 0, 125), z = rep(LETTERS[1:3], each = 3),
 yy = 1:9, stringsAsFactors = TRUE)
#unicode italicized delta
dat$gltr =
factor(c("italic(\u03b4)^14*N","italic(\u03b4)^15*N","italic(\u03b4)^13*C"))

#So this is what I want my plot to look like:
plt <- ggplot(data = dat, aes(x = yy, y = x)) +
geom_point(aes(x= yy, y=x, shape=z, group=z), alpha=0.4,position =
position_dodge(width = 0.8)) +
facet_grid(gltr~.,labeller= label_parseall, scales="free_y")
plt

#So then I exported my plot as a PDF like so:
pdf("Times_regular.pdf", family='Times')
plt
dev.off()
#The problem with this was that the delta symbols turned into dots.

#I solved this problem using Cairo
library(Cairo)
cairo_pdf("Cairo.pdf")
plt
dev.off()


The problem that I face now is that I am unsure how to output a figure that
maintains the Greek symbols but outputs everything in the plot as Times New
Roman, another requirement of the journal. So I can produce a Times New
Roman PDF plot and a PDF plot with italicized Unicode Greek symbols, but not
both. Does anyone have any idea how I might accomplish both of these things
together in a single PDF?

Thanks so much in advance,

Sam

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Axes labels, greek letters and spaces

2011-06-28 Thread Sam Albers
Hello all,

I can't seem to figure how to use a greek character in expression() in
plot() labels without adding a space. So for example below when plotting
this out

x <- 1:10
plot(x,x^2, xlab=expression(Chlorophyll~italic(a)~mu~g~cm^-2))

the axis label reads as "μ g cm^-2" because I have a space there from the tilde.

But if I remove the tilde then my units are "mug cm^-2".

Can anyone recommend a way that I can modify the axis label to look like
this: μg cm^-2

Thanks in advance!

Sam
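
One possible way to get this, as a small sketch: in plotmath, `~` puts a space between two terms while `*` joins them with no space, so `mu*g` should be drawn as μg. The parentheses around the units are an addition for readability and are not part of the original label.

```r
# In plotmath, `~` adds a space between terms and `*` joins them with none.
lab <- expression(Chlorophyll~italic(a)~(mu*g~cm^-2))

x <- 1:10
plot(x, x^2, xlab = lab)
```

Dropping the parentheses, `expression(Chlorophyll~italic(a)~mu*g~cm^-2)`, uses the same idea.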



Re: [R] extract data from a column

2011-06-19 Thread Sam Albers
If I understand what you want (which I may very well not) you could use
something like this:

If this is an example of your type of data:
x <- "564589,+"

x <- substr(x, 1, 6)
as.numeric(x)

Please try to post something more thorough if you would like a better
answer.

Sam



[R] Calculating a mean based on a factor range

2011-06-09 Thread Sam Albers
Hello all,

I have been using an instrument that collects a temperature profile of a
water column. The instrument records the temperature and depth any time it
takes a reading. I was sampling many times at discrete depths rather than a
complete profile of the water column (e.g. interested in the 5m, 10m and 20m
depth positions only). The issue was that these measurements were taken with
the instrument hanging off the side of a boat, so a big enough wave moved the
instrument such that it recorded a slightly different depth. For my
purposes, however, this difference is negligible and I wish to consider all
those different readings at close depths as a single depth. So for example:


library(ggplot2)
library(plyr)  # ddply() comes from plyr

eg <- read.csv("http://dl.dropbox.com/u/1574243/example_data.csv",
header=TRUE, sep=",")

## Calculating an average value from all the readings for each depth reading
eg.avg <- ddply(eg, c("site", "depth"), function(df)
return(c(temp=mean(df$temperature),
         num_samp=length(df$temperature))))

## An example of my problem
eg.avg[eg.avg$num_samp > 10 & eg.avg$site == "Station 3",]
         site    depth     temp num_samp
154 Station 3  1.09000 4.073667       30
159 Station 3  2.49744 3.950972       72
175 Station 3  7.96332 3.903188       69
208 Station 3 19.37708 4.066393       61
209 Station 3 19.54096 4.025385       13

## So here you will notice that record 208 and 209, by my criteria, should
be considered a sample at the same depth and lumped together. Yet I can't
figure out a way to coerce R to calculate a mean value of temperature based
on a threshold range depth (say +/- 0.25). Generally speaking this can be
said to be calculating a mean (temperature) based on a factor (depth) range.

Any thoughts on this? I am using R 2.12.1.

Thanks in advance!

Sam
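
A minimal sketch of one way to pool readings at nearby depths, using only the five Station 3 rows quoted above. The 0.25 tolerance is the value suggested in the post; `tol`, `depth_grp` and `pooled` are names made up for this illustration. Depths are sorted, and a new group starts whenever the gap to the previous depth exceeds the tolerance.

```r
# Summary rows copied from the post (Station 3 only).
eg.avg <- data.frame(
  depth    = c(1.09000, 2.49744, 7.96332, 19.37708, 19.54096),
  temp     = c(4.073667, 3.950972, 3.903188, 4.066393, 4.025385),
  num_samp = c(30, 72, 69, 61, 13)
)

tol <- 0.25  # assumed threshold: a larger gap starts a new depth group

eg.avg <- eg.avg[order(eg.avg$depth), ]
eg.avg$depth_grp <- cumsum(c(TRUE, diff(eg.avg$depth) > tol))

# Pool each group, weighting the per-depth means by their sample counts.
pooled <- do.call(rbind, lapply(split(eg.avg, eg.avg$depth_grp), function(g) {
  data.frame(depth    = weighted.mean(g$depth, g$num_samp),
             temp     = weighted.mean(g$temp, g$num_samp),
             num_samp = sum(g$num_samp))
}))
pooled
```

Gap-based grouping can chain together a long run of slowly drifting depths, so for real profiles it may be safer to assign each reading to the nearest target depth (5, 10, 20 m) instead.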



[R] Subsetting depth profiles based on maximum depth

2011-05-16 Thread Sam Albers
Hello,

I am having a little trouble finding the right set of criteria to subset a
portion of data. I am using an instrument that does depth profiles of a
water column. The instrument records on the way down as well as the way up.

## So I am left with data like this:

dat <- data.frame(var = runif(11, 0, 10))
dat$depth <- c(1:5,5,5:1)

# So for the example
dat

## I am trying to figure out how to subset the data so that all data
collected at the maximum depth and those collected on the way UP the water
column are used and the data collected on the way DOWN through the water
column are discarded. I got stumped by the fact that I can't just ask R for
all values less than the maximum depth.

## So I've tried determining the row number of the maximum depth value and
discarding all values above that but so far I haven't been able to figure
this out.

which.max(dat$depth)


Can anyone recommend a better strategy to figure this out?

Thanks so much in advance.

Sam



[R] Subsetting depth profiles based on maximum depth by group with plyr

2011-05-16 Thread Sam Albers
Hello,

Apologies for a similar earlier post. I didn't include enough details in
that one.

I am having a little trouble subsetting some data based on a grouping
variable. I am using an instrument that does depth profiles of a water
column. The instrument records on the way down as well as the way up. So
thanks to an off-list reply I can subset the data so that all data collected
at the maximum depth and those collected on the way UP the water column are
used and the data collected on the way DOWN through the water column are
discarded. This is illustrated by the following:

dat1 <- data.frame(var=100*(0:10), depth=c(1:5,5,5:1))
dat1[ seq_len(nrow(dat1)) >= which.max(dat1$depth), ]

However, I have a data frame where I would like to perform this subset for
several groups. My data.frame looks like the following:

dat1 <- data.frame(var=100*(0:10), depth=c(1:5,5,5:1))
dat1$group <- "A"
dat2 <- data.frame(var=100*(0:10), depth=c(1:5,7,5:1))
dat2$group <- "B"
dat <- rbind(dat1,dat2)

I thought I might be able to use the plyr package to do this but for some
reason the following gives me almost the opposite of what I was hoping for:


library(plyr)
ddply(dat, .(group), function(.df) {
   .df[seq_len(nrow(.df) >= which.max(.df$depth)),]
  })

Can anyone recommend a way to subset based on a grouping variable
preferably?

Thanks in advance.

Sam
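
A possible explanation and fix, offered as a sketch: in the ddply() call the closing parenthesis of seq_len() sits after the comparison, so `nrow(.df) >= which.max(.df$depth)` is evaluated first, gives TRUE, and seq_len(TRUE) selects only row 1 of each group. Closing seq_len() before the `>=` should restore the intended subset. The example uses base R so it runs without plyr; `keep_upcast` is a hypothetical helper name.

```r
dat1 <- data.frame(var = 100 * (0:10), depth = c(1:5, 5, 5:1), group = "A")
dat2 <- data.frame(var = 100 * (0:10), depth = c(1:5, 7, 5:1), group = "B")
dat  <- rbind(dat1, dat2)

# Keep rows from the first occurrence of the maximum depth onwards.
# Note that the parenthesis closes seq_len() *before* the comparison.
keep_upcast <- function(.df) {
  .df[seq_len(nrow(.df)) >= which.max(.df$depth), ]
}

# Apply the helper to each group and stack the pieces again.
up <- do.call(rbind, lapply(split(dat, dat$group), keep_upcast))
up
```

With plyr attached, `ddply(dat, .(group), keep_upcast)` should give the same rows.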



[R] Converting ordinal dates and time into sensible formats

2011-05-10 Thread Sam Albers
Hello all,

I am having a little trouble working with strptime and I was hoping
someone might be able to give me a hand. I have an instrument that outputs
an ordinal date and time in two columns something like this:

  day.hour min.sec
1    12525    2050
2    12518    2029
3    12524    2023
4    12524    2028
5    12507    2035

Now the problem I am having is converting these numbers into dates and
times. I am able to convert these into their respective POSIXlt formats but
I am left with two columns where hour is left with the date (data$Only.Date)
and the date is left with the time (data$Only.Time). Can anyone recommend a
good way to convert these ordinal dates into something like the following?

day.hour min.sec       Date     Time
   12511    2033 2011-05-05 11:20:33

## A trivial example
##a data frame
#First 3 digits are the day of the year, last 2 are the hour of the day
day.hour <- as.integer(runif(5, 12500, 12523))
data <- as.data.frame(day.hour)
#First 2 digits are the minute, last 2 are the seconds
data$min.sec <- as.integer(runif(5, 2000, 2060))

##example of how things get a little jumbled. strptime was easy enough to
##use.
data$Date <- strptime(data$day.hour, format="%j%H")
data$Time <- strptime(data$min.sec, format="%M%S")
data
data

Using Ubuntu 10.10 and R 2.11.1. Thanks in advance

Sam
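
One way to get a single date-time out of the two columns, sketched with plain arithmetic rather than strptime(): split each integer into its parts, then add the elapsed time to midnight on 1 January. The year (2011) and the UTC time zone are assumptions, since the instrument output shown above carries neither; `stamp` is a name invented for the example.

```r
day.hour <- c(12511L, 12518L, 12507L)  # DDDHH: day of year, then hour
min.sec  <- c(2033L, 2029L, 2035L)     # MMSS: minute, then second
year     <- 2011                       # assumption: year is not recorded

doy    <- day.hour %/% 100
hour   <- day.hour %% 100
minute <- min.sec %/% 100
second <- min.sec %% 100

# Start at midnight on 1 January and add the elapsed time in seconds.
stamp <- ISOdatetime(year, 1, 1, 0, 0, 0, tz = "UTC") +
  (doy - 1) * 86400 + hour * 3600 + minute * 60 + second

data.frame(day.hour, min.sec,
           Date = format(stamp, "%Y-%m-%d"),
           Time = format(stamp, "%H:%M:%S"))
```

Keeping `stamp` as one POSIXct column is usually more convenient than separate Date and Time columns, which can always be produced with format() when needed.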

-- 
*
Sam Albers
Geography Program
University of Northern British Columbia
 University Way
Prince George, British Columbia
Canada, V2N 4Z9
*



[R] Loop Through Columns to Remove Rows

2011-03-09 Thread Sam Albers
Hello Venerable List,

I am trying to loop (I think) an operation through a list of columns in a
dataframe to remove a set of #DIV/0! values. I am trying to do this like so:

#Data.frame
test <- read.csv("http://dl.dropbox.com/u/1574243/sample_data.csv",
header=TRUE, sep=",")


#This removes all the rows with #DIV/0! values in the mean column.
only.mean <- test[!test$mean=="#DIV/0!",]

#This removes the majority of #DIV/0! values as there is a large block of
#these values that extends over every column.
#However, it doesn't remove them all. Can anyone recommend a way where I can
#cycle through all the columns and remove these values other than manually
#like so:
mean.median <- only.mean[!only.mean$median=="#DIV/0!",] # and so on through each column?

Can anyone recommend a better way of doing this?

Thanks in advance!

Sam
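
A sketch of a vectorised way to do this across every column at once. Comparing the whole data frame with the string gives a logical matrix, and rowSums() then counts the offending cells per row. The toy data frame is a stand-in for the CSV, which is not reproduced here; `bad` and `clean` are illustrative names.

```r
# Toy stand-in for the CSV in the post.
test <- data.frame(
  mean   = c("1.2", "#DIV/0!", "3.4", "5.6"),
  median = c("1.0", "#DIV/0!", "#DIV/0!", "5.5"),
  stringsAsFactors = FALSE
)

# Count the "#DIV/0!" cells in each row and keep the rows with none.
bad   <- rowSums(test == "#DIV/0!", na.rm = TRUE) > 0
clean <- test[!bad, ]

# The remaining columns are still text, so convert them to numbers.
clean[] <- lapply(clean, function(col) as.numeric(as.character(col)))
clean
```

Another option is to treat the marker as missing when reading the file, with `read.csv(..., na.strings = "#DIV/0!")`, and then call na.omit() on the result.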



Re: [R] Loop Through Columns to Remove Rows

2011-03-09 Thread Sam Albers
Many thanks for this Jorge. Exactly what I was looking for. I've never
encountered any() before. Quite useful.

Thanks again!

Sam

On Wed, Mar 9, 2011 at 1:05 PM, Jorge Ivan Velez
jorgeivanve...@gmail.com wrote:

 Hi Sam,

 How about this?

 test[apply(test, 1, function(x) !any(x == '#DIV/0!')), ]

 HTH,
 Jorge




[R] Summarizing a response variable based on an irregular time period

2011-02-11 Thread Sam Albers
Hello,

I have a question about working with dates in R. I would like to summarize a
response variable based on a designated and irregular time period. The
purpose of this is to compare the summarized values (which were sampled
daily) to another variable that was sampled less frequently. Below is a
trivial example where I would like to summarize the response variable dat$x
such that I have average and sum values from Sept25-27 and Sept28-Oct1. Can
anyone suggest an efficient way to deal with dates like this? As an
extremely tedious previous effort, I simply created another grouping
variable but I had to do this manually. For a large dataset this really
isn't a good option.

Thanks in advance!

Sam

library(plyr)
dat <- data.frame(x = runif(7, 0, 125), date =
as.Date(c("2009-09-25","2009-09-26","2009-09-27","2009-09-28","2009-09-29","2009-09-30","2009-10-01"),
format="%Y-%m-%d"), yy = rep(letters[1:2], length.out = 7), stringsAsFactors = TRUE)

#If I was using a regular factor, I would do something like this and this is
#what I would be hoping for as a result (obviously switching yy for date as
#the grouping variable)
ddply(dat, c("yy"), function(df) return(c(avg=mean(df$x), sum=sum(df$x))))

#This is the data.frame that I would like to compare to dat.
dat2 <- data.frame(y = runif(2, 0, 125), date =
as.Date(c("2009-09-27","2009-10-01"), format="%Y-%m-%d"))
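
A sketch of one way to build the grouping variable automatically, assuming each date in dat2 closes a period (so Sept 25-27 belongs to the 27 Sept sample and Sept 28-Oct 1 to the 1 Oct sample). cut() assigns every daily date to the interval that ends at the next coarse sampling date. Fixed x and y values replace runif() so the result is easy to check; `period` and `summ` are names introduced for the illustration.

```r
dat <- data.frame(
  x    = 1:7,  # fixed values so the result is easy to check
  date = seq(as.Date("2009-09-25"), as.Date("2009-10-01"), by = "day")
)
dat2 <- data.frame(
  y    = c(10, 20),
  date = as.Date(c("2009-09-27", "2009-10-01"))
)

# Each coarse sampling date closes a period: (-Inf, Sep 27], (Sep 27, Oct 1].
dat$period <- cut(as.numeric(dat$date),
                  breaks = c(-Inf, as.numeric(dat2$date)),
                  labels = format(dat2$date))

summ <- data.frame(
  date = as.Date(levels(dat$period)),
  avg  = as.vector(tapply(dat$x, dat$period, mean)),
  sum  = as.vector(tapply(dat$x, dat$period, sum))
)
merge(summ, dat2, by = "date")
```

Once `period` exists it can also be used as the grouping variable in the ddply() call from the post.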



[R] Drop non-integers

2010-11-17 Thread Sam Albers
Hello all,

I have a fairly simple data manipulation question. Say I have a dataframe
like this:

dat <- as.data.frame(runif(7, 3, 5))
dat$cat <- factor(c("1","4","13","1","4","13","13A"))

dat
  runif(7, 3, 5) cat
1   3.880020   1
2   4.062800   4
3   4.828950  13
4   4.761850   1
5   4.716962   4
6   3.868348  13
7   3.420944 13A

Under the dat$cat variable the 13A value is an analytical replicate. For my
purposes I would like to drop all values that are not an integer (i.e. 13A)
from the dataframe. Can anyone recommend a way to drop all rows where the
cat value is a non-integer?

Sorry for the simple question and thanks in advance.

Sam


Re: [R] Drop non-integers

2010-11-17 Thread Sam Albers
On Wed, Nov 17, 2010 at 3:49 PM, David Winsemius dwinsem...@comcast.net wrote:


 On Nov 17, 2010, at 6:27 PM, Sam Albers wrote:

  Hello all,

 I have a fairly simple data manipulation question. Say I have a dataframe
 like this:

 dat <- as.data.frame(runif(7, 3, 5))
 dat$cat <- factor(c("1","4","13","1","4","13","13A"))

 dat
  runif(7, 3, 5) cat
 1   3.880020   1
 2   4.062800   4
 3   4.828950  13
 4   4.761850   1
 5   4.716962   4
 6   3.868348  13
 7   3.420944 13A

 Under the dat$cat variable the 13A value is an analytical replicate. For my
 purposes I would like to drop all values that are not an integer (i.e. 13A)
 from the dataframe. Can anyone recommend a way to drop all rows where the
 cat value is a non-integer?


 dat[!is.na(as.numeric(as.character(dat$cat))), ]

 (You do get a warning about coercion to NA's but that is a good sign since
 that is what we were trying to exclude in the first place.)


Apologies. This worked fine but I didn't quite outline that I also wanted to
drop the unused levels of the factor as well. drop=TRUE doesn't seem to
work, so can anyone suggest a way to drop the factor levels in addition to
the values?

> sd <- dat[!is.na(as.numeric(as.character(dat$cat))), ]
Warning message:
In `[.data.frame`(dat, !is.na(as.numeric(as.character(dat$cat))),  :
  NAs introduced by coercion
> str(sd)
'data.frame':   6 obs. of  2 variables:
 $ runif(7, 3, 5): num  3.88 4.06 4.83 4.76 4.72 ...
 $ cat           : Factor w/ 4 levels "1","13","13A",..: 1 4 2 1 4 2
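
A sketch of one way to drop the unused level as well. droplevels() removes factor levels that no longer occur in the data; it needs R 2.12.0 or later, and re-creating the factor with factor() is an alternative on older versions. The column name `val` and the objects `keep` and `sd2` are made up for the example, and grepl() is used for the filter so no coercion warning is raised.

```r
set.seed(1)
dat <- data.frame(val = runif(7, 3, 5),
                  cat = factor(c("1", "4", "13", "1", "4", "13", "13A")))

# Keep rows whose label consists of digits only, then drop unused levels.
keep <- grepl("^[0-9]+$", as.character(dat$cat))
sd2  <- droplevels(dat[keep, ])

levels(sd2$cat)
```

On a single column the same effect comes from `sd2$cat <- factor(sd2$cat)`.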





Re: [R] Change plot order in lattice xyplot

2010-09-08 Thread Sam Albers

Prior to creating a plot I usually just order the factor levels to the order
I want them in. So for your example I would do:

#Create some data
library(lattice)
x <- runif(100, 0, 20)
df <- data.frame(x)
df$y <- (1:10)
df$Month <- c("October", "September", "August", "July", "June")

#Plot the figure
plt <- xyplot(x~y | Month, data =df,
 layout=c(5,1),
 xlab="Log density from hydroacoustics (integration)",
 ylab="Log density from Tucker trawl",
 main="Density estimates, Tucker Trawl",
 cex=1.5)

#Factor levels aren't in the order you want them in. Reorder them how you
#want.
df$Month <- factor(df$Month, levels=c("June","July","August","September",
"October"), ordered=TRUE)

#Plot again. The first plt still holds the old ordering, so rebuild it.
plt <- xyplot(x~y | Month, data =df, layout=c(5,1), cex=1.5)
plt

HTH,

Sam



[R] Hydrology plots in R

2010-07-22 Thread Sam Albers
Hello,

I am trying to create a plot often seen in hydrodynamic work that includes a
contour plot representing the water speed with arrows pointing in the
direction of flow. Does anyone have any idea how I might add arrows based on
wf$angle (in the example below) to the plot below?

Thanks in advance!

Sam

library(lattice)

speed <- runif(100, 0, 20)
wf <- data.frame(speed)
wf$width <- (1:10)
wf$length <- rep(1:10, each=10)
wf$angle <- runif(100, 0, 360)

#How do I add arrows based on wf$angle within each coloured box to represent
#the direction of flow?
#I don't have to use lattice. Just using it as an example.
with(wf, contourplot(speed ~ width*length,
 region=TRUE,
 contour=FALSE
 ))
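
Since lattice is not required, here is a rough sketch with base graphics: image() draws the coloured cells and arrows() overlays one arrow per cell. The angle convention (degrees, counter-clockwise from the positive x axis) is an assumption, as the post does not say how wf$angle is measured; `half` is an arbitrary arrow half-length chosen for the illustration.

```r
set.seed(42)
wf <- data.frame(speed  = runif(100, 0, 20),
                 width  = 1:10,
                 length = rep(1:10, each = 10),
                 angle  = runif(100, 0, 360))

# width varies fastest, so a column-wise fill gives z[width, length].
z <- matrix(wf$speed, nrow = 10, ncol = 10)
image(x = 1:10, y = 1:10, z = z, xlab = "width", ylab = "length")

# Assumption: angle is in degrees, counter-clockwise from the positive x axis.
theta <- wf$angle * pi / 180
half  <- 0.35  # half the arrow length, in axis units
x0 <- wf$width  - half * cos(theta)
y0 <- wf$length - half * sin(theta)
x1 <- wf$width  + half * cos(theta)
y1 <- wf$length + half * sin(theta)
arrows(x0, y0, x1, y1, length = 0.05)
```

For compass bearings (clockwise from north) the sin and cos terms would swap. Scaling `half` by speed is another option if arrow length should carry magnitude.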



Re: [R] Can I set default parameters for the default graphics device?

2010-07-15 Thread Sam Albers

Unless you aren't writing scripts why wouldn't you just use something like
this?

 x <- c(1,2,3)
 pdf("RRules.pdf")
 plot(x,x)
 dev.off()


Re: [R] k-sample Kolmogorov-Smirnov test?

2010-06-22 Thread Sam Albers
Hello,

I am curious if anyone has had any success with finding an R version of a
k-sample Kolmogorov-Smirnov test. Most of the references that I have been
able to find on this are fairly old and I am wondering if this type of
analysis has fallen out of favour. If so, how do people tend to compare
distributions when they have more than two? Is it reasonable to pursue an
adjusted p-value method? That is, could you compare, say, three distributions
by performing three two-sample K-S tests and then applying a Bonferroni
correction?

Just curious what some people's approaches are when they want to compare more
than two distributions.

Thanks in advance.

Sam
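
The pairwise idea described above can be sketched with stats::ks.test() and stats::p.adjust(). The three simulated samples are placeholders, not data from the post, and whether a Bonferroni-adjusted set of pairwise tests is statistically appropriate for a given problem is left open here.

```r
set.seed(123)
samples <- list(a = rnorm(50), b = rnorm(50, mean = 0.5), c = rexp(50))

# All pairwise two-sample Kolmogorov-Smirnov tests.
pairs <- combn(names(samples), 2)
p_raw <- apply(pairs, 2, function(p) {
  ks.test(samples[[p[1]]], samples[[p[2]]])$p.value
})
names(p_raw) <- apply(pairs, 2, paste, collapse = " vs ")

# Bonferroni: multiply each p-value by the number of comparisons (capped at 1).
p_adj <- p.adjust(p_raw, method = "bonferroni")
cbind(p_raw, p_adj)
```

p.adjust() also offers less conservative methods such as "holm", which can be swapped in through the `method` argument.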





Re: [R] plotting data when all you have is the summary data

2010-05-10 Thread Sam Albers

Bimal,

in the memisc package:

?panel.errbars

This might be a good option for you.

HTH,

Sam


Re: [R] plotmeans in trellis view?

2010-04-26 Thread Sam Albers

I'm not sure about plotmeans but this is usually the way I plot means with
lattice:

library(lattice)
x <- runif(48, 2, 70)
data <- data.frame(x)
data$factor1 <- factor(c("A", "B", "C", "D"))
data$factor2 <- factor(c("X", "Y", "Z"))
data.mean <- with(data, aggregate(data$x, by=list(factor1=factor1,
factor2=factor2), mean))
with(data.mean, xyplot(x~factor1 | factor2))

Is this sort of what you were looking for?

HTH,

Sam


[R] Assumptions on Non-Standard F ratios

2010-04-24 Thread Sam Albers
Hello there,

I am trying to run an ANOVA model using a non-standard F ratio. Imagine that
the treatments (treatments 1 & 2) are applied to the row, not to individual
samples. Thus the row is the experimental unit. Therefore the error term in
my ANOVA table should be the error associated with row.

The question is how do I check the assumptions of an ANOVA model when I have
a non-standard F ratio?

For this type of model I would normally use plot(model) to examine the
residuals. However this doesn't seem to work and I expect that R is looking
for residuals that don't exist. Is there some option I can change on the
plot command?

Sorry if this is simple but searching for this answer was a little difficult
as plot() has many uses. Below is an example. I am using R 2.10.1 and Ubuntu
9.04.

Thanks in advance!

Sam

x <- runif(48, 2, 70)
data <- data.frame(x)
data$treat1 <- factor(c("ONE", "TWO", "THREE"))
data$treat2 <- factor(c("PRUNED", "UNPRUNED"))
data$row <- factor(1:12)

model <- with(data, aov(x ~ treat1 + treat2 + treat1*treat2 + Error(row)))

plot(model)
Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf
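
A sketch of one possible workaround, assuming a balanced design like the example: since the row is the experimental unit, average the samples within each row and fit an ordinary aov() to the row means. That gives a plain "aov" object whose residuals are at the row level and can be inspected in the usual way. `row.means`, `fit` and `res` are names introduced for the illustration.

```r
set.seed(1)
data <- data.frame(x      = runif(48, 2, 70),
                   treat1 = factor(c("ONE", "TWO", "THREE")),
                   treat2 = factor(c("PRUNED", "UNPRUNED")),
                   row    = factor(1:12))

# One value per experimental unit: average the samples within each row.
row.means <- aggregate(x ~ row + treat1 + treat2, data = data, FUN = mean)

# With one observation per row, the ordinary residual is the row-level error.
fit <- aov(x ~ treat1 * treat2, data = row.means)
summary(fit)

# Residuals vs fitted values, and a normal Q-Q plot of the residuals.
res <- residuals(fit)
plot(fitted(fit), res, xlab = "Fitted", ylab = "Residuals")
abline(h = 0, lty = 2)
qqnorm(res)
qqline(res)
```

For a balanced layout the F tests from the row means should match those in the "Error: row" stratum of the original model; for unbalanced data a mixed model would probably be the safer route.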



Re: [R] Error Bars in lattice- barcharts

2010-04-09 Thread Sam Albers

 Well, when the error message says argument 'lx' is missing, with no
 default, it really means that argument 'lx' is missing, with no
 default. Your panel function has an argument 'lx', which you forgot to
 change to 'ly' as you did with the prepanel function.

 Hope that helps...

 Thanks for the help Felix! It was a bit obvious what the problem was and I
apologize for not thinking about the error message more clearly.  I am going
to continue this as I think that it would be helpful if there was a working
example of this type of plot. Unfortunately, I do not have it yet. The
modified example below produced a stacked barplot with lovely error bars.
However, I can't seem to produce a plot that doesn't stack the bars.
stack=FALSE doesn't seem to have any effect. Any thoughts?

Thanks in advance.
Sam

#Generating the data
library(lattice)

temp <- abs(rnorm(81*5))
err <- as.data.frame(temp)
err$section=c("down","down","down","mid","mid","mid","up","up","up")

err$depth=c("Surface","D50","2xD50")

err$err.date=c("05/09/2009","12/09/2009","13/10/2009","19/10/2009","21/09/2009")


err.split <-
 with(err,
  split(temp, list(depth,section, err.date)))

#I've tried to alter the panel function according to the thread to produce
#vertical error bars in my barcharts

prepanel.ci <- function(x, y, ly, uy, subscripts, ...) {
 y <- as.numeric(y)
 ly <- as.numeric(ly[subscripts])
 uy <- as.numeric(uy[subscripts])
 list(ylim = range(y, uy, ly, finite = TRUE))
 }

panel.ci <- function(x, y, ly, uy, subscripts, pch = 16, ...) {
 x <- as.numeric(x)
 y <- as.numeric(y)
 ly <- as.numeric(ly[subscripts])
 uy <- as.numeric(uy[subscripts])

 panel.arrows(x, ly, x, uy, col = 'black',
  length = 0.25, unit = "native",
  angle = 90, code = 3)
 panel.barchart(x, y, pch = pch, ...)
 }

se <- function(x) sqrt(var(x)/length(x))

# mean, mean + SE and mean - SE for each depth/section/date combination
err.ucl <- sapply(err.split,
 function(x) {
  c(mean(x), mean(x) + se(x), mean(x) - se(x))
 })

err.ucl <- as.data.frame(t(err.ucl))
names(err.ucl) <- c("mean", "upper.se", "lower.se")
err.ucl$label <- factor(rownames(err.ucl),levels = rownames(err.ucl))

# add factor, grouping and by variables
err.ucl$section=c("down","down","down","mid","mid","mid","up","up","up")
err.ucl$depth=c("Surface","D50","2xD50")
err.ucl$err.date=c("05/09/2009","12/09/2009","13/10/2009","19/10/2009","21/09/2009")

#This produces the figure I am looking for minus the error bars.

with(err.ucl, barchart(mean ~ err.date | section, group=depth,
 layout=c(1,3),
 horizontal=FALSE,
 scales=list(x=list(rot=45))
))

#OK, now that this works and the error bars are drawn, I am curious why
#stack=FALSE doesn't produce each bar beside each other.

with(err.ucl, barchart(mean ~ err.date| section, group=depth,
 layout=c(1,3),
 horizontal=FALSE,
 stack=FALSE,
 scales=list(x=list(rot=45)),
 ly=lower.se,
 uy=upper.se,
 auto.key = list(points = FALSE, rectangles = TRUE, space= "right",
  title = "Depth", border = TRUE),
 #auto.key=TRUE,
 prepanel=prepanel.ci,
 panel=panel.superpose,
 panel.groups=panel.ci
 ))





Re: [R] Error Bars in lattice- barcharts

2010-04-07 Thread Sam Albers
Hi Ivan,

Can you educate me a little bit on the use of barchart?


Unfortunately no... For this post I eventually used barplot2() in the
gplots package. I got bogged down trying to do it in lattice so I looked
for an alternative. It was quite straightforward, which was nice, and I was
able to get what I wanted quite quickly. Sorry I can't be of more help.

Sam



[R] Using ifelse and grep

2010-04-03 Thread Sam Albers
Good Morning,

I am trying to create a new column of character strings based on the first
two letters in a string in another column. I believe that I need to use some
combination of ifelse and grep but I am not totally sure how to combine
them. I am not totally sure why the command below isn't working. Obviously
it isn't finding anything that matches my criteria but I am not sure why.
Any ideas on how I might be able to modify this to get to work? Below is
also a data example of what I would like to achieve with this command.

> section <- ifelse(Sample==grep("^BU", Sample),"up",
ifelse(Sample==grep("^BM", Sample), "mid","down"))
> section
 [1] "down" "down" "down" "down" "down" "down" "down" "down" "down" "down"
[11] "down" "down"

Thanks in advance.

Sam

Sample Transmission section
BU1    0.39353      up
BU2    0.38778      up
BU3    0.42645      up
BM1    0.37510      mid
BM2    0.5103       mid
BM3    0.67224      mid
BD1    0.37482      down
BD2    0.54716      down
BD3    0.50866      down
BU1    0.34869      up
BU2    0.32831      up
BU3    0.59877      up
BM1    0.52518      mid
BM2    0.94387      mid
BM3    0.94387      mid
BD1    0.46872      down
BD2    0.63115      down
BD3    0.45239      down






Re: [R] Writing summary.aov results to a file which can be opened in Excel

2010-04-03 Thread Sam Albers

Julia,

I think exporting to Excel takes you a step back. It would likely be easier
to work solely in R and sort the P values there. I had to do something
similar with a bunch of regressions a while back. I found this post
extremely helpful, as well as the plyr package:

http://www.r-bloggers.com/r-calculating-all-possible-linear-regression-models-for-a-given-set-of-predictors/

Not much help, I realize, but maybe it will point you in the right direction.

Sam


Re: [R] Using ifelse and grep

2010-04-03 Thread Sam Albers
Fantastic. Solved.

Thanks!

On Sat, Apr 3, 2010 at 9:59 AM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:

 # 1
 grep returns an index, not the value, unless you use
 grep(..., value = TRUE).

 Easier might be:

 # 2
 Sample2 <- substr(Sample, 1, 2)
 ifelse(Sample2 == "BU", "up", ifelse(Sample2 == "BM", "mid", "down"))

 or

 # 3 the following, which matches the first 2 characters against the
 given list names and returns the corresponding list values.

 library(gsubfn)
 gsubfn("^(..).", list(BU = "up", BD = "down", BM = "mid"), Sample)

 Note that if Sample is a factor rather than character then use
 as.character(Sample) in place of Sample in the last line.
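
For reference, the point about grep() returning an index can be seen in a small sketch. The Sample vector below is copied from the data example in the question; grepl() and the named-vector lookup are alternatives that were not proposed in the thread and are shown only as one possible way to get the same result.

```r
# Sample IDs taken from the data example in the question
Sample <- c("BU1", "BU2", "BU3", "BM1", "BM2", "BM3", "BD1", "BD2", "BD3")

# grep() returns the positions of the matching elements, not the elements,
# so comparing Sample == grep(...) compares strings with numbers
idx <- grep("^BU", Sample)

# grepl() returns a logical vector that ifelse() can use directly
section <- ifelse(grepl("^BU", Sample), "up",
           ifelse(grepl("^BM", Sample), "mid", "down"))

# Alternative: look up the two-letter prefix in a named vector
lookup   <- c(BU = "up", BM = "mid", BD = "down")
section2 <- unname(lookup[substr(Sample, 1, 2)])
```

Both section and section2 hold one label per element of Sample, so either can be assigned as a new column of the data frame.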




[R] Selecting the first row based on a factor

2010-04-02 Thread Sam Albers
Hello there,

I have a situation where I would like to select the first row of a
particular factor for a data frame (data example below). So that is, I would
like to select the first entry when the factor1 =A and then the first row
when factor1=B etc. I have thousands of entries so I need some general way
of doing this. I have a minimal example that should illustrate what I am
trying to do. I am using R version 2.9.2, ESS version 5.4 and Ubuntu 9.04.

Thanks so much in advance!

Sam

#Minimal example

x <- rnorm(100)
y <- rnorm(100)
xy <- data.frame(x,y)
xy$factor1 <- c("A", "B","C","D")
xy$factor2 <- c("a","b")
#This simply orders the data to look more like the actual data I am working with
xy <- xy[order(xy$factor1),]

#I am trying to use this approach but I am not sure that I am selecting the
#correct row and then the output temp is a total mess.
temp <- with(xy, unlist(lapply(split(xy, list(factor1=factor1,
factor2=factor2)), function(x) x[1,])))

              x            y factor1 factor2
1   0.700042585 -2.481633101       A       a   # I would like to select this row
5   1.402677849 -0.691143942       A       a
9   0.188287765 -1.723823157       A       a
13  0.714946028  0.715361315       A       a
17  0.690177271 -0.112394002       A       a
21  0.333101579 -0.316285321       A       a
25  0.439505793 -3.356415326       A       a
89 -1.001153334 -0.739440288       A       a
93  0.135509539  0.949943380       A       a
97 -1.730936150  0.356133105       A       a
2  -0.399355582 -0.843874548       B       b   # Then I would like to select this row, etc.
6   1.285958969  0.958501988       B       b
10  0.495795836 -0.805012667       B       b
14  0.512486789 -0.968247016       B       b
18 -1.189627025  0.455278250       B       b



Re: [R] Selecting the first row based on a factor

2010-04-02 Thread Sam Albers
Thanks!

On Fri, Apr 2, 2010 at 11:35 AM, Erik Iverson er...@ccbr.umn.edu wrote:

 Hello,



 Does

 xy[!duplicated(xy$factor1),]


This most definitely works. What a beautifully elegant solution. Thanks!


 do what you want?
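
To see why the duplicated() suggestion picks the first row per level, here is a small sketch built on the minimal example from the original post. The seed and the name first_rows are additions for illustration only.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
xy <- data.frame(x = rnorm(100), y = rnorm(100))
xy$factor1 <- c("A", "B", "C", "D")  # recycled over the 100 rows
xy$factor2 <- c("a", "b")
xy <- xy[order(xy$factor1), ]

# duplicated() flags every repeat of a value after its first appearance,
# so negating it keeps only the first row seen for each factor1 level
first_rows <- xy[!duplicated(xy$factor1), ]
print(first_rows)
```

Because the data were ordered by factor1 with ties left in their original order, the rows kept are the original rows 1 to 4, one per level.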






[R] Vertical subtraction in dataframes

2010-03-12 Thread Sam Albers
Hello all,

I have not been able to find an answer to this problem. I feel like it might
be so simple that it might not get a response.

Suppose I have a dataframe like the one I have copied below (minus the
'calib' column). I wish to create a column like calib where I subtract
the 'Count' when 'stain' is 'none' from all other 'Count' data for every
value of 'rep'. This is sort of analogous to putting a $ in front of the
number that identifies a cell in a spreadsheet environment. Specifically, I
need something like this:

mydataframe$calib <- Count - (Count when stain = "none" for each value of rep)

Any thoughts on how I might accomplish this?

Thanks in advance.

Sam

Note: I've already calculated the calib column in gnumeric for clarity.

rep Count stain calib
1 1522 none 0
1 147 syto -1375
1 544.8 sytolec -977.2
1 2432.6 sytolec 910.6
1 234.6 sytolec -1287.4
2 5699.8 none 0
2 265.6 syto -5434.2
2 329.6 sytolec -5370.2
2 383 sytolec -5316.8
2 968.8 sytolec -4731
3 2466.8 none 0
3 1303 syto -1163.8
3 1290.6 sytolec -1176.2
3 110.2 sytolec -2356.6
3 15086.8 sytolec 12620
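
No answer to this question appears in this part of the archive. As one possible approach (a sketch, not a confirmed solution from the thread), the 'none' Count for each rep can be stored in a named vector and looked up row by row. The data frame below is retyped from the table above; the names none_rows and baseline are made up for illustration.

```r
mydataframe <- data.frame(
  rep   = rep(1:3, each = 5),
  Count = c(1522, 147, 544.8, 2432.6, 234.6,
            5699.8, 265.6, 329.6, 383, 968.8,
            2466.8, 1303, 1290.6, 110.2, 15086.8),
  stain = rep(c("none", "syto", "sytolec", "sytolec", "sytolec"), times = 3)
)

# Baseline: the Count where stain is "none", as a vector named by rep
none_rows <- mydataframe$stain == "none"
baseline  <- setNames(mydataframe$Count[none_rows], mydataframe$rep[none_rows])

# Look up each row's baseline by its rep value and subtract it
mydataframe$calib <- mydataframe$Count -
  unname(baseline[as.character(mydataframe$rep)])
print(mydataframe)
```

This assumes exactly one 'none' row per rep, as in the example data; with several, the lookup would need an explicit rule such as taking the mean.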



[R] Error Bars in lattice- barcharts

2010-02-19 Thread Sam Albers
Hello,

I am attempting to write a script that adds error bars to a barchart. I am
basing my attempt heavily on the following thread:

http://tolstoy.newcastle.edu.au/R/e2/help/06/10/2791.html

I can't seem to get around the problem that was discussed in the thread. The
following example should illustrate my problem. Sorry about the messy
example, but 1) I am trying to make it as close as possible to my actual work
and 2) my skill level is spotty at best. Can anyone suggest a way to do this,
or even another way to make a grouped barchart with error bars? I'm not
married to this method, although I prefer working with lattice. Thanks in
advance for any help!

Sam

#Generating the data
library(lattice)

temp <- abs(rnorm(81*5))
err <- as.data.frame(temp)
err$section=c("down","down","down","mid","mid","mid","up","up","up")

err$depth=c("Surface","D50","2xD50")

# each of the nine dates is repeated nine times
err$err.date=rep(c("05/09/2009","05/10/2009","12/09/2009","13/10/2009",
"19/10/2009","21/09/2009","26/10/2009","27/09/2009","28/08/2009"), each=9)


err.split <-
 with(err,
  split(temp, list(depth,section, err.date)))

#I've tried to alter the panel function according to the thread to produce
#vertical error bars in my barcharts

prepanel.ci <- function(x, y, ly, uy, subscripts, ...) {

 y <- as.numeric(y)
 ly <- as.numeric(ly[subscripts])
 uy <- as.numeric(uy[subscripts])
 list(ylim = range(y, uy, ly, finite = TRUE))
 }

panel.ci <- function(x, y, lx, ux, subscripts, pch = 16, ...) {
 x <- as.numeric(x)
 y <- as.numeric(y)
 lx <- as.numeric(lx[subscripts])
 ux <- as.numeric(ux[subscripts])

 panel.arrows(x, ly, x, uy, col = 'black',
  length = 0.25, unit = "native",
  angle = 90, code = 3)
 panel.barchart(x, y, pch = pch, ...)
 }

se <- function(x) sqrt(var(x)/length(x))



err.ucl <- sapply(err.split,
function(x) {
st <- boxplot.stats(x)
c(mean(x), mean(x) + se(x), mean(x) -se(x))
})



err.ucl <- as.data.frame(t(err.ucl))
names(err.ucl) <- c("mean", "upper.se", "lower.se")
err.ucl$label <- factor(rownames(err.ucl),levels = rownames(err.ucl))

# add factor, grouping and by variables
err.ucl$section=c("down","down","down","mid","mid","mid","up","up","up")
err.ucl$depth=c("Surface","D50","2xD50")

#There has got to be a better way of doing this
err.ucl$err.date=rep(c("05/09/2009","05/10/2009","12/09/2009","13/10/2009",
"19/10/2009","21/09/2009","26/10/2009","27/09/2009","28/08/2009"), each=9)

#This produces the figure I am looking for minus the error bars.

with(err.ucl, barchart(mean ~ err.date | section, group=depth,
layout=c(1,3),
 horizontal=FALSE,
 scales=list(x=list(rot=45)),
))


# Deepayan's original example. I am unsure how to diagnose the packet error.
# This is where I run into problems

with(err.ucl, barchart(mean ~ err.date | section, group=depth,
 layout=c(1,3),
 horizontal=FALSE,
 scales=list(x=list(rot=45)),
 ly=lower.se,
 uy=upper.se,
 prepanel=prepanel.ci,
 panel=panel.superpose,
 panel.groups=panel.ci
 ))
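
Since the post asks about "another way to make a grouped barchart with error bars", here is a minimal base-graphics sketch, not the lattice solution and not taken from the thread. It relies on barplot() returning the bar midpoints, which arrows() can then use. The data frame dat and its values are hypothetical; only the depth and section labels follow the post.

```r
set.seed(42)  # arbitrary seed for the made-up data

# Hypothetical data: 5 replicate values for each depth x section combination
dat <- expand.grid(rep     = 1:5,
                   depth   = c("Surface", "D50", "2xD50"),
                   section = c("down", "mid", "up"))
dat$value <- abs(rnorm(nrow(dat)))

# Standard error of the mean, as defined in the post
se <- function(x) sqrt(var(x) / length(x))

# 3 x 3 matrices of means and standard errors (rows = depth, cols = section)
means <- tapply(dat$value, list(dat$depth, dat$section), mean)
ses   <- tapply(dat$value, list(dat$depth, dat$section), se)

# barplot() returns the x midpoints of the bars in the same layout as 'means'
mids <- barplot(means, beside = TRUE,
                ylim = c(0, max(means + ses) * 1.1),
                legend.text = rownames(means), ylab = "Mean")

# code = 3 with angle = 90 draws flat caps at both ends of each bar
arrows(mids, means - ses, mids, means + ses,
       angle = 90, code = 3, length = 0.05)
```

Conditioning on date, as in the lattice call above, would need one such plot per date, for example inside a loop with par(mfrow = ...).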






[R] augPred and nlme

2009-11-25 Thread Sam Albers
Hello there,

Using 'The R Book' (p675-677) I am following instructions on performing a
series of nonlinear regressions fitting the same model to a set of groups. I
have been able to fit the model to my data using the following call to
nlme:
library(nlme)
inorg.model <- nlme(inorg.grv ~ a*exp( - ((numDate-b)^2 / 2*c^2)),
fixed=a+b+c~1,
random=a~1|Sectionf,
start=c(a=adi,b=bdi,c=cdi), verbose = TRUE)

Now, again following the R Book, I would like to plot these models using
augPred. However, I receive the following error:

plot(augPred(inorg.model))
Error in augPred.lme(inorg.model) :
Data in inorg.model call must evaluate to a data frame

I am not even sure how to diagnose this problem. I basically followed the R
Book directions to the letter. I could plot each of these curves
individually but the prospect of R doing it all for me is too tempting.
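
One possible cause, offered as a guess from the wording of the error rather than anything confirmed in this thread: the nlme() call above has no data argument, and augPred() appears to look for the data in the original call. The sketch below does not use the poster's data. It fits the Loblolly example from the nlme help page with data passed explicitly, after which augPred() has a data frame to work with.

```r
library(nlme)

# Model and starting values follow the example on the nlme() help page;
# Loblolly is a grouped data set that ships with R
fm1 <- nlme(height ~ SSasymp(age, Asym, R0, lrc),
            data   = Loblolly,
            fixed  = Asym + R0 + lrc ~ 1,
            random = Asym ~ 1,
            start  = c(Asym = 103, R0 = -8.5, lrc = -3.3))

# With data supplied in the call, augPred() can rebuild the predictions
aug <- augPred(fm1)
plot(aug)
```

If this is the cause, collecting inorg.grv, numDate and Sectionf in one data frame and passing it as data = to nlme() would be the analogous change.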

I am using R 2.8.1-1 and Ubuntu 9.04.

Thanks in advance!

Sam



Re: [R] linear model and by()

2009-11-13 Thread Sam Albers
/09,0.2
 Middle,125.31,04/09/09,0.13
 Middle,125.31,04/09/09,0.11
 Downstream,125.31,04/09/09,0.16
 Downstream,125.31,04/09/09,0.17
 Downstream,125.31,04/09/09,0.17
 Upstream,150.29,04/09/09,0.17
 Upstream,150.29,04/09/09,0.19
 Upstream,150.29,04/09/09,0.14
 Middle,150.29,04/09/09,0.2
 Middle,150.29,04/09/09,0.13
 Middle,150.29,04/09/09,0.11
 Downstream,150.29,04/09/09,0.16
 Downstream,150.29,04/09/09,0.17
 Downstream,150.29,04/09/09,0.17
 Upstream,0,11/09/09,0.12
 Upstream,0,11/09/09,0.16
 Upstream,0,11/09/09,0.12
 Middle,0,11/09/09,0.08
 Middle,0,11/09/09,0.12
 Middle,0,11/09/09,0.1
 Downstream,0,11/09/09,0.11
 Downstream,0,11/09/09,0.13
 Downstream,0,11/09/09,0.13
 Upstream,25,11/09/09,0.12
 Upstream,25,11/09/09,0.16
 Upstream,25,11/09/09,0.12
 Middle,25,11/09/09,0.08
 Middle,25,11/09/09,0.12
 Middle,25,11/09/09,0.1
 Downstream,25,11/09/09,0.11
 Downstream,25,11/09/09,0.13
 Downstream,25,11/09/09,0.13
 Upstream,50,11/09/09,0.12
 Upstream,50,11/09/09,0.16
 Upstream,50,11/09/09,0.12
 Middle,50,11/09/09,0.08
 Middle,50,11/09/09,0.12
 Middle,50,11/09/09,0.1
 Downstream,50,11/09/09,0.11
 Downstream,50,11/09/09,0.13
 Downstream,50,11/09/09,0.13
 Upstream,75,11/09/09,0.12
 Upstream,75,11/09/09,0.16
 Upstream,75,11/09/09,0.12
 Middle,75,11/09/09,0.08
 Middle,75,11/09/09,0.12
 Middle,75,11/09/09,0.1
 Downstream,75,11/09/09,0.11
 Downstream,75,11/09/09,0.13
 Downstream,75,11/09/09,0.13
 Upstream,100,11/09/09,0.12
 Upstream,100,11/09/09,0.16
 Upstream,100,11/09/09,0.12
 Middle,100,11/09/09,0.08
 Middle,100,11/09/09,0.12
 Middle,100,11/09/09,0.1
 Downstream,100,11/09/09,0.11
 Downstream,100,11/09/09,0.13
 Downstream,100,11/09/09,0.13
 Upstream,125.04,11/09/09,0.12
 Upstream,125.04,11/09/09,0.16
 Upstream,125.04,11/09/09,0.12
 Middle,125.04,11/09/09,0.08
 Middle,125.04,11/09/09,0.12
 Middle,125.04,11/09/09,0.1
 Downstream,125.04,11/09/09,0.11
 Downstream,125.04,11/09/09,0.13
 Downstream,125.04,11/09/09,0.13

 --
 *
 Sam Albers
 Geography Program


 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT






[R] linear model and by()

2009-11-12 Thread Sam Albers
,75,11/09/09,0.13
Upstream,100,11/09/09,0.12
Upstream,100,11/09/09,0.16
Upstream,100,11/09/09,0.12
Middle,100,11/09/09,0.08
Middle,100,11/09/09,0.12
Middle,100,11/09/09,0.1
Downstream,100,11/09/09,0.11
Downstream,100,11/09/09,0.13
Downstream,100,11/09/09,0.13
Upstream,125.04,11/09/09,0.12
Upstream,125.04,11/09/09,0.16
Upstream,125.04,11/09/09,0.12
Middle,125.04,11/09/09,0.08
Middle,125.04,11/09/09,0.12
Middle,125.04,11/09/09,0.1
Downstream,125.04,11/09/09,0.11
Downstream,125.04,11/09/09,0.13
Downstream,125.04,11/09/09,0.13








[R] Dates plotting backwards

2009-11-10 Thread Sam Albers
Hello,

I am having a little trouble formatting my dates correctly. When I plot
something using the following commands, R plots the most recent date on the
left of the figure and the earlier dates on the right. Given
that English is read from left to right, I would like to have the dates on my
figure arranged in the same way. I am sure that this is something fairly
simple but I was wondering if someone could help me out. Here is a minimal
example that should reproduce my problem. I've also included some data,
thinking that perhaps my data format was the problem.

Thanks in advance!

Sam

Date=as.Date(test$Date, format= "%d/%m/%Y")
plot(test$D.2D50.SA ~ test$Date)

Date,D.2D50.SA
28/08/2009,60.67
28/08/2009,66.4
28/08/2009,50.19
28/08/2009,38.19
28/08/2009,50.19
12/09/2009,62.2
12/09/2009,93.77
12/09/2009,49.89
12/09/2009,106.34
12/09/2009,42.22
22/09/2009,24.15
22/09/2009,105.17
22/09/2009,15.04
22/09/2009,23.54
22/09/2009,19.6
05/10/2009,74.41
05/10/2009,34.78
05/10/2009,28.74
05/10/2009,41.29
05/10/2009,42.68
12/10/2009,46.26
12/10/2009,13.31
12/10/2009,29.95
12/10/2009,34.28
12/10/2009,74.51
19/10/2009,33.67
19/10/2009,69.86
19/10/2009,61.3
19/10/2009,21.38
19/10/2009,80.37
26/10/2009,20.69
26/10/2009,63.37
26/10/2009,70.91
26/10/2009,22.7
26/10/2009,23.89



Re: [R] Dates plotting backwards

2009-11-10 Thread Sam Albers


 
  Thanks in advance!
 
  Sam
 
  Date=as.Date(test$Date, format= "%d/%m/%Y")

 Change that to
   test$Date <- as.Date(...)
 or plot Date instead of test$Date.


Yes that worked. Silly mistake. Sometimes those are the hardest ones to
spot. Thanks!
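
The fix discussed above can be sketched as follows. The four rows are a subset of the data in the original post, and the data frame is built inline only to keep the sketch self-contained.

```r
# A few rows from the posted data; dates are read in as plain strings
test <- data.frame(
  Date      = c("28/08/2009", "12/09/2009", "22/09/2009", "05/10/2009"),
  D.2D50.SA = c(60.67, 62.2, 24.15, 74.41),
  stringsAsFactors = FALSE
)

# Store the parsed dates back in the data frame; assigning to a separate
# variable called Date leaves test$Date unchanged
test$Date <- as.Date(test$Date, format = "%d/%m/%Y")

# The formula now sees real Date values on the x axis
plot(D.2D50.SA ~ Date, data = test)
```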


 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com



[R] Web implementation of R?

2009-11-05 Thread Sam Albers
Hello,

Can anyone recommend a good example of web implementation of R? Can't seem
to find anything on my own.

Thanks in advance!

Sam



Re: [R] Contour Plot Aspect Ratio

2008-10-03 Thread Sam Albers
I managed to solve this for myself in a very roundabout kind of way, so I
figured that I should share in case anyone else needs something like this.

filled.contour(contour, axes=F, frame.plot=F, color=terrain.colors, ylab="",
key.title = title(main="Velocity\n(m/s)"), asp=2, key.axes = axis(4,
seq(0, 0.6, by = 0.1)), plot.axes = {
axis.mult(side=1,mult=0.005,mult.label="Width (cm)")
axis(side=2, at=x, line = -5, labels=colnames(contour)) })
mtext(side=2, line=-1.5, 'Length Along Flume (m)')

I just removed the frame.plot, then manually shifted the axis and axis label
over until they lined up nicely with the edge of the actual contour plot.
Anyway, this is solved and thank you for your help.

Sam



[R] Contour Plot Aspect Ratio

2008-10-02 Thread Sam Albers
Hello there,

I have a fairly simple request (I hope!)

I have produced a filled contour plot like this:

library(grDevices)
library(gplots)
library(plotrix)

filled.contour(contour, axes=F, frame.plot=TRUE, color=terrain.colors,
ylab="Length Along Flume (m)", key.title = title(main="Velocity\n(m/s)"),
key.axes = axis(4, seq(0, 0.6, by = 0.1)), asp=2, plot.axes = {
axis.mult(side=1,mult=0.005,mult.label="Width (cm)")
axis(side=2, at=x, labels=colnames(contour)) })

Note the asp=2 argument.

I would like to make this plot twice as long as it is wide. I accomplish
this using asp=2 but the actual box that I am plotting now is too big for
the data contained within.

Here is what it looks like:

http://docs.google.com/Doc?id=ddqdnxbq_30ffthshgk

Does anyone know how I might be able to lengthen this graph without it
looking like this? I want to pull in the vertical axis so that it is snug
against the actual contour plot.

Thanks in advance.

Sam



Re: [R] Grid building in R

2008-07-09 Thread Sam Albers
Right, equidistant was clearly the wrong word. Sorry. I just meant that any
given point should be the same distance from the four points immediately
surrounding it (+x, -x, +y, -y), aside from those on the edge, which will
obviously only have two or three points surrounding them.

On Wed, Jul 9, 2008 at 3:12 PM, hadley wickham [EMAIL PROTECTED] wrote:

 What do you mean by equidistant?  You can have three points that are
 equidistant on the plane, but there's no way to add another point and
 have it be the same distance from all of the existing points.  (Unless
 all the points are in the same place)

 Hadley

 On Wed, Jul 9, 2008 at 5:02 PM, hippie dream [EMAIL PROTECTED]
 wrote:
 
  This might not be possible in R but I thought I would give it a shot. I
  have to set up a 40 x 40 cm grid of 181 points equidistant from each
  other. Is there any way to produce a graph with R that can do this for
  me? Actual sizes are unimportant as long as it is to scale. Thanks


Re: [R] Grid building in R

2008-07-09 Thread Sam Albers
Basically, I want 181 points equally spaced over a 40 x 40 cm area. I want
to be able to specify the number of points and the area on which they are
plotted. I think you are right that grid is what I am looking for, but I
want the grid to have axes, which your code below, although appreciated, did
not give me. Sorry to be unclear.

On Wed, Jul 9, 2008 at 3:48 PM, Erik Iverson [EMAIL PROTECTED]
wrote:

 Still not sure exactly what you want, but it sounds like the 'grid' package
 may be of some help.

 It has very flexible ways partitioning regions for plotting.  Is this
 anything like you're after?

 library(grid)

 for(i in 0:10)
  for(j in 0:10)
   grid.points(i / 10, j / 10, default.unit = "npc")




Re: [R] Grid building in R

2008-07-09 Thread Sam Albers
Ahhh. That worked perfectly. Thank you very much.

On Wed, Jul 9, 2008 at 4:19 PM, Dylan Beaudette [EMAIL PROTECTED]
wrote:


 how about:

 # 40cm spacing
 spacings <- 0:13*40

 # a square grid with 196 points
 # sqrt(181) is not an integer, sorry!
 g <- expand.grid(x=spacings, y=spacings)

 # check it out
 plot(g, pch=3, cex=0.5)
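
The same expand.grid() idea can be wrapped so that the number of points per side and the size of the area are both parameters, which is what was asked for earlier in the thread. This is only a sketch: make_grid, n_side and size_cm are made-up names, and like the code above it gives a square number of points (196 here) rather than 181.

```r
# Hypothetical helper: n_side x n_side points spread evenly over a square
# area of size_cm x size_cm
make_grid <- function(n_side, size_cm) {
  ticks <- seq(0, size_cm, length.out = n_side)
  expand.grid(x = ticks, y = ticks)
}

g <- make_grid(n_side = 14, size_cm = 40)

# asp = 1 keeps the plot to scale; plot() draws both axes by default
plot(g, pch = 3, cex = 0.5, asp = 1, xlab = "x (cm)", ylab = "y (cm)")
```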




 --
 Dylan Beaudette
 Soil Resource Laboratory
 http://casoilresource.lawr.ucdavis.edu/
 University of California at Davis
 530.754.7341




[R] Graph Order in xyplot

2008-07-01 Thread Sam Albers
I have constructed a Trellis style xyplot.

lengthf <- factor(length)
xyplot(SLI$velocity ~ SLI$width | SLI$lengthf, layout = c(2,7), xlab =
"Width (cm)", ylab = "Velocity (m/s^2)", col = "black")

This produces a lovely little plot. However, the grouping factor (lengthf)
isn't in the right order. My values range from 2 to 28; the 2 graph is on the
bottom left and the graphs continue sequentially left to right to the top of
the page. I would like to have the 2 at the top and have the graphs shown in
descending order (i.e. have the entire graph read like a book).

I tried the following but it didn't seem to work.

lengthd <- sort(SLI$length, decreasing =TRUE)
lengthdf <- factor(SLI$lengthd)

Then I plotted the graph again:
xyplot(SLI$velocity ~ SLI$width | SLI$lengthdf, layout = c(2,7), xlab =
"Width (cm)", ylab = "Velocity (m/s^2)", col = "black")

This simply gave me the same graph and now I am a little lost. Is there an
easier way to do this? Do I have to rearrange my data or can this be changed
around using the original xyplot command line. Thanks.
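
A possible approach, not confirmed in this thread: lattice fills panels starting from the bottom left by default, and the as.table argument of xyplot() switches that to top left, row by row, without rearranging the data. The sketch below uses made-up values standing in for SLI; only the column names follow the post.

```r
library(lattice)

set.seed(1)  # arbitrary seed for the made-up data

# Hypothetical stand-in for SLI: 14 lengths (2, 4, ..., 28), 10 points each
SLI <- data.frame(length = rep(seq(2, 28, by = 2), each = 10),
                  width  = rep(seq(5, 50, by = 5), times = 14))
SLI$velocity <- runif(nrow(SLI), 0, 0.6)
SLI$lengthf  <- factor(SLI$length)

# as.table = TRUE lays the panels out left to right, top to bottom
p <- xyplot(velocity ~ width | lengthf, data = SLI, layout = c(2, 7),
            as.table = TRUE, xlab = "Width (cm)",
            ylab = "Velocity (m/s^2)", col = "black")
print(p)
```

Sorting the values before calling factor() does not change the panel order because factor() sorts its levels anyway; the levels argument of factor() is what controls that order.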



[R] Problems exporting graphs

2008-06-26 Thread Sam Albers
I have been trying to figure this out all day, so hopefully the answer isn't
too obvious. I am able to view a graph in the viewer window. However, I need
to export the graph outside of the viewer window. Here is the script I am
using:

png("Compare.png")
plot(compare$DepthSLI, compare$DischargeSLI, col="blue",
xlab = "Average Water Depth (cm)", ylab = "Discharge (m^3/s)",
xlim=c(5,40), ylim=c(0.03,0.1), main="Discharge in Flume 1")
dev.off()
null device
          1

When I do this R simply produces an empty file that produces an error
message when I try to open it. I have tried this same script with every file
format available under help(device), all with the same result. I am running
Ubuntu 8.04, so I am unable to simply copy and paste the graph as in Windows.
I am fairly sure I have all the right packages installed. I am running R
2.6.2-2. Any suggestions?
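
No reply to this post appears in this part of the archive, so the following is only a diagnostic sketch and not a known fix. It checks whether the R build reports PNG support at all, writes a trivial plot to a temporary file, and prints the resulting file size. The output path and the test plot are made up for illustration.

```r
# TRUE if this R build reports that it can write PNG files
can_png <- capabilities("png")
print(can_png)

out <- file.path(tempdir(), "Compare.png")  # hypothetical output location
if (can_png) {
  png(out)
  plot(1:10, (1:10)^2, col = "blue", main = "Test plot")
  dev.off()  # the file is only completed once the device is closed
  # a non-zero size suggests the device itself is working
  print(file.info(out)$size)
}
```

If this writes a readable file but the original script does not, the difference is more likely in the data or the working directory than in the graphics device.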
