[R] Bestglm subset analysis

2016-06-29 Thread D Wolf via R-help
Hello All,
I am working on a linear regression model and trying to find the best subset of 
variables for my dataset. I have 21 predictors, 1 response variable, and 79 
observations. I need to find the best 5 or 6 predictors for my model. I've used 
leaps for lm() and I'm now trying bestglm for glm(). I'm following this 
webpage, which gives the code below. 
https://rstudio-pubs-static.s3.amazonaws.com/2897_9220b21cfc0c43a396ff9abf122bb351.html
My code:
library(bestglm)
library(base)
lbw.for.bestglm <- within(df_Chl, {y <- df_Chl$Chloro })
res.bestglm <- bestglm(Xy = lbw.for.bestglm, family = gaussian, 
                       IC = "AIC", method = "exhaustive")
# get coefficients
res.bestglm$BestModels
Here is a sample of my results (I removed the 5th through 21st predictors for 
brevity).
> res.bestglm$BestModels
    R21   R31   R32   R41 Criterion
1 FALSE FALSE FALSE FALSE  326.7327
2 FALSE  TRUE FALSE FALSE  326.9525
3 FALSE FALSE FALSE FALSE  327.0659
4 FALSE  TRUE FALSE FALSE  327.0912
5 FALSE  TRUE FALSE FALSE  327.8208
Is it correct to assume I should keep variables that are TRUE from 1 through 5? 
What do those five rows represent? 
I know the AIC value should be as low as possible. Is it possible to tell 
whether a result is good for any of the information criteria, such as AIC, 
LOOCV, BICg, etc.? If BIC returns lower Criterion values, does that mean I need 
to use the BIC subset instead of the subset from AIC?
Thank You,
Doug
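
A possible way to read the output above, offered as a sketch rather than an authoritative answer: as far as I understand the bestglm documentation, each row of BestModels is one candidate model (the top five by default), TRUE marks a predictor that model includes, and in the posted output the rows are ordered from lowest to highest Criterion. The data frame below is a hypothetical copy of the four columns shown in the post. The second half uses the built-in mtcars data (an assumption, standing in for df_Chl) to illustrate that AIC and BIC differ only in their penalty term, so values are compared between models under one criterion, not across criteria.

```r
# Hypothetical copy of the posted output (only the four predictors shown).
best_models <- data.frame(
  R21 = rep(FALSE, 5),
  R31 = c(FALSE, TRUE, FALSE, TRUE, TRUE),
  R32 = rep(FALSE, 5),
  R41 = rep(FALSE, 5),
  Criterion = c(326.7327, 326.9525, 327.0659, 327.0912, 327.8208)
)

predictors <- setdiff(names(best_models), "Criterion")

# Each row is one candidate model; list the predictors flagged TRUE in it.
selected <- lapply(seq_len(nrow(best_models)), function(i) {
  predictors[unlist(best_models[i, predictors])]
})
names(selected) <- paste0("model_", seq_len(nrow(best_models)))

# Gap between each model and the best one, on the same criterion.
best_models$delta <- best_models$Criterion - min(best_models$Criterion)
print(best_models)
print(selected)

# AIC versus BIC on stand-in data: same likelihood, different penalty.
fit_small <- lm(mpg ~ wt, data = mtcars)
fit_large <- lm(mpg ~ wt + hp + qsec, data = mtcars)

aic <- c(small = AIC(fit_small), large = AIC(fit_large))
bic <- c(small = BIC(fit_small), large = BIC(fit_large))

# k counts estimated parameters (coefficients plus the error variance).
n <- nrow(mtcars)
k <- c(small = attr(logLik(fit_small), "df"),
       large = attr(logLik(fit_large), "df"))

# BIC - AIC equals k * (log(n) - 2), so the two scales are offset by design.
print(rbind(aic, bic, penalty_gap = k * (log(n) - 2)))
```

With only the four columns shown, model_1 lists no predictors because its TRUE entries would sit among the 17 columns removed from the post. A lower number under BIC than under AIC would not by itself favor the BIC subset; the choice between criteria depends on the goal of the analysis.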

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Reading a datetime vector

2016-02-24 Thread D Wolf via R-help
In addition to my previous message, DF_extract_clean.R is the program in the 
dropbox folder that I am currently working on.
Doug 

On Tuesday, February 23, 2016 4:02 AM, Jim Lemon  
wrote:
 

Hi Doug,
It is difficult for us to work out what is happening as we don't have 
access to a toy data set that we can play with. Excel spreadsheets are one of 
those things that you can't just attach to your email to the help list. If 
there is somewhere you can leave a _small_ Excel sample file (take the first 10 
rows, say) that we can download (Google Drive, Dropbox?) and include the URL in 
your email, maybe someone can offer more than guesses.
Jim


   


Re: [R] Reading a datetime vector

2016-02-22 Thread D Wolf via R-help
Hello Everyone, 
The column begins populated with integers, like so:
1/1/2013 0:00 in the spreadsheet equals 41257 in R's dataframe
1/1/2013 0:15 in the spreadsheet equals 41257.01041664 in R's dataframe
...
41257 must be in minutes since 1440 min/day * .01041664 day = 15 minutes. 
41257 minutes is about 29 days: 41257 min / 1440 min/day = 28.65 days. So I 
don't know why the dataframe is showing 41257 for 1/1/2013 0:00. 
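
A note on the arithmetic above, offered as a sketch rather than a diagnosis: values such as 41257.01041664 look like Excel serial dates, which count days (not minutes) from an origin and store the time of day as a fraction of a day. The origin "1899-12-30" is the one commonly used for Excel's 1900 date system and is an assumption about this workbook. Under that convention 1/1/2013 corresponds to 41275, so the 41257 quoted in the post may be a transposition or may refer to a different row.

```r
# Assumption: Excel 1900 date system, where serial 0 is 1899-12-30.
excel_origin <- "1899-12-30"

# 41275 and 41275 + 15 minutes, plus the value quoted in the post.
serial <- c(41275, 41275 + 15 / 1440, 41257)

# The whole part counts days since the origin.
dates <- as.Date(floor(serial), origin = excel_origin)

# The fractional part is a fraction of one day; 1440 minutes per day.
minutes <- round((serial - floor(serial)) * 1440)

print(data.frame(serial, date = dates, minutes_after_midnight = minutes))
```

Read this way, .01041664 of a day is the 15 minutes after midnight, and the whole number is a day count rather than a minute count.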
Oddly, R sees the vector as NULL despite the fact it has integers in each 
record in the column:
data_type = str(df2_TZ$DateTimeStamp)
produces a NULL (empty) variable. 
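
One likely explanation for the NULL, sketched with a hypothetical stand-in for the spreadsheet column: str() prints its description to the console and returns NULL invisibly, so assigning its result always yields NULL regardless of what the column holds. class() returns the type as a value that can be stored.

```r
# Hypothetical stand-in for the column read from the spreadsheet.
df2_TZ <- data.frame(DateTimeStamp = c("41257", "41257.01041664"),
                     stringsAsFactors = FALSE)

# str() prints a summary and returns NULL invisibly.
data_type <- str(df2_TZ$DateTimeStamp)
print(is.null(data_type))

# class() returns the type, so the result can be kept in a variable.
col_class <- class(df2_TZ$DateTimeStamp)
print(col_class)
```

So a NULL in data_type says nothing about the column itself; the printed output of str() is the part to read.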

I tried:
df2_TZ = read.xlsx2("DF_exp.xlsx", sheetName = "Sheet1")
Sys.setenv(TZ = "GMT")
testdtm <- as.POSIXct(df2_TZ$DateTimeStamp, format = "%m/%d/%Y %H:%M")
# Inspect the result
testdtm
str(testdtm)
testdtm is a vector filled with NA values, which figures since DateTimeStamp is 
NULL. 
I noticed in the table on page 32 of the R Help Desk pdf you linked to that 
dp-as.POSIXct(format(dp, tz="GMT")) is the only option listed for time zone 
difference. So I tried:
df2_TZ = read.xlsx2("DF_exp.xlsx", sheetName = "Sheet1")
df2_TZ_seq <- as.POSIXct(format(dt2_TZ, tz="GMT"))
and got:
Error in format(dt2_TZ, tz = "GMT") : object 'dt2_TZ' not found
Is the vector neither character nor factor, since it's NULL? Where do I go from 
here? 
Thank You,
Doug

Hi Doug,
What you have done is to ask whether the character string "DF_exp.xlsx" 
is a character string. I think Yogi Berra, were he still around, could have 
told you that. What will give you some useful information is:
str(DF_exp.xlsx)
which asks for information about the object, not its name.
Jim

On Friday, February 19, 2016 12:41 PM, Jeff Newmiller 
 wrote:
 

 This is a mailing list. I don't know how you are interacting with it... using 
a website rather than an email program can lead to some confusion since there 
can be many ways to accomplish the task of interacting with the mailing list. 
My email program has a "reply-all" button when I am looking at an email. It 
also has an option to write the email in plain text, which often prevents the 
message from getting corrupted (recipient not seeing what you sent to the list).

Using the str function on a literal string (the name of a file) will indeed 
tell you that you gave it a character string. Specifying a column in your data 
might tell you something more interesting... e.g.

str( df2_TZ$DateTimeStamp )

If that says you have character data then Jim Lemon's suggestion would be a 
good next thing to look at. If it is factor data then you should use the 
as.character function on the data column and then follow Jim's suggestion. If 
it is numeric then you probably need to convert it using an appropriate origin 
(e.g. as described at [1] or [2]).
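
The three cases described above could be sketched as one helper. This is an illustration, not code from the thread: the function name to_gmt_time is hypothetical, the origin assumes Excel's 1900 date system, and treating digit-only strings as serial numbers is an added assumption based on the values reported later in the thread.

```r
# Hypothetical helper: convert a column to POSIXct in GMT, depending on type.
to_gmt_time <- function(x, format = "%m/%d/%Y %H:%M", origin = "1899-12-30") {
  # Factor data: go through character first.
  if (is.factor(x)) x <- as.character(x)

  if (is.character(x)) {
    # Assumption: strings made only of digits and dots are serial day numbers.
    if (all(grepl("^[0-9.]+$", x[!is.na(x)]))) {
      x <- as.numeric(x)
    } else {
      return(as.POSIXct(x, format = format, tz = "GMT"))
    }
  }

  # Numeric data: days since the origin, rounded to whole seconds.
  as.POSIXct(round(x * 86400), origin = origin, tz = "GMT")
}

print(to_gmt_time("1/4/2013 23:30"))
print(to_gmt_time(factor("41275")))
print(to_gmt_time(41275 + 15 / 1440))
```

Rounding to whole seconds avoids timestamps that land a fraction of a second before the intended minute because of the spreadsheet's floating-point storage.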

I have had best luck setting the default timezone string when converting to 
POSIXt types... e.g.

# specify timezone assumed by input data
Sys.setenv( TZ="GMT" )
testdtm <- as.POSIXct( "1/1/2016 00:00", format = "%m/%d/%Y %H:%M" )
# inspect the result
testdtm
str( testdtm )
# view data from a different timezone
Sys.setenv( TZ="Etc/GMT+8" )
# no change to the underlying data, but it prints out differently now because 
# the tz attribute is "" which implies using the default TZ
testdtm

[1] http://blog.mollietaylor.com/2013/08/date-formats-in-r.html
[2] https://www.r-project.org/doc/Rnews/Rnews_2004-1.pdf

-- 
Sent from my phone. Please excuse my brevity.

On February 19, 2016 7:48:31 AM PST, D Wolf  wrote:
Hello Jeff,
I ran str() on the vector and it returned character.
> str("DF_exp.xlsx")
 chr "DF_exp.xlsx"
This is my first thread on this forum, and I'm not sure how to reply to the 
thread instead of just sending the reply to your email account; I don't see a 
'reply' link in the thread. I've read this page and I don't think it advises on 
how to reply in the thread: R: Posting Guide: How to ask good questions that 
prompt useful answers



Thank You,
Doug Wolfinger
 

On Friday, February 19, 2016 12:51 AM, Jeff Newmiller 
 wrote:
 

 You are being rather scattershot in your explanation, so I suspect you are not 
being systematic in your troubleshooting. Use the str function to examine the 
data column after you pull it in from excel. It may be numeric, factor, or 
character, and the approach depends on which that function returns. 
-- 
Sent from my phone. Please excuse my brevity.

On F

Re: [R] Reading a datetime vector

2016-02-19 Thread D Wolf via R-help
Hello Jim,
I ran str() on the vector and it returned character:
str("DF_exp.xlsx")
 chr "DF_exp.xlsx"
I tried
df2_TZ$DateTimeStamp <- strptime(as.Date(as.character(df2_TZ$DateTimeStamp, 
format = "%m/%d/%Y %H:%M", tz = "GMT")))
which produced an error:
Error in charToDate(x) :   character string is not in a standard unambiguous 
format
In Excel, the column is formatted to m/d/ h:mm
Removing %S from these lines
df2_TZ$DateTimeStamp = as.POSIXct(df2_TZ$DateTimeStamp, format="%m/%d/%Y %H:%M", tz="GMT")
df2_TZ$DateTimeStamp = as.POSIXct(as.character(df2_TZ$DateTimeStamp), format = "%m/%d/%Y %H:%M")
made the column NA

Thank You,
Doug Wolfinger


 

On Friday, February 19, 2016 1:35 AM, Jim Lemon  
wrote:
 

Hi Doug,
For one thing, you may be using the wrong format. Your example format 
has no seconds field. The other thing to watch is whether the data are in 
%m/%d/%Y or %d/%m/%Y date format. If the latter, you would probably get that 
error on dates like 19/02/2016.
Jim
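
The two format pitfalls mentioned above can be reproduced with literal strings. This is a minimal sketch using made-up timestamps in the style of the post, not the poster's data; it only shows that a format which does not match the text yields NA.

```r
stamp <- "1/4/2013 23:30"

# The format asks for seconds that the text does not contain: result is NA.
with_seconds <- as.POSIXct(stamp, format = "%m/%d/%Y %H:%M:%S", tz = "GMT")

# The format matches the text: parsed as 4 January 2013, 23:30 GMT.
without_seconds <- as.POSIXct(stamp, format = "%m/%d/%Y %H:%M", tz = "GMT")

# Day-first text read with a month-first format: month 19 is invalid, so NA.
day_first <- as.POSIXct("19/02/2016 08:12", format = "%m/%d/%Y %H:%M", tz = "GMT")

print(with_seconds)
print(without_seconds)
print(day_first)
```

Passing the format to as.POSIXct directly, without an intermediate as.Date call, avoids the charToDate error, since as.Date without a format tries only its default formats.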


[R] Reading a datetime vector

2016-02-18 Thread D Wolf via R-help
Hello,
I am trying to read a data frame column named DateTimeStamp. The time is 
in GMT in this format: 1/4/2013 23:30
require(xlsx)
df2_TZ = read.xlsx2("DF_exp.xlsx", sheetName = "Sheet1")

It's good up to that line. But these three lines, which modify the data frame, 
each convert the column's values to NA:
df2_TZ$DateTimeStamp = as.POSIXct(df2_TZ$DateTimeStamp, format="%m/%d/%Y %H:%M:%S", tz="GMT")

and...
df2_TZ$DateTimeStamp = as.POSIXct(as.character(df2_TZ$DateTimeStamp), format = "%m/%d/%Y %H:%M:%S")

and...
df2_TZ$DateTimeStamp = as.Date(df2_TZ$DateTimeStamp, format = "%m/%d/%Y %H:%M:%S")

This line returns an error...
df2_TZ$DateTimeStamp = as.POSIXct(as.Date(df2_TZ$DateTimeStamp), format = "%m/%d/%Y %H:%M:%S")
"Error in charToDate(x) :   character string is not in a standard unambiguous 
format"
Additionally, I need to convert from GMT to North American time zones, and I 
think the advice on this page would be good for that: 
http://blog.revolutionanalytics.com/2009/06/converting-time-zones.html
My ultimate goal is to write an R program that finds data in another variable 
in df2_TZ that corresponds to a date and time that match up with the date and 
time in another data frame. For now, any help reading the column would be much 
appreciated.
Thank You,
Doug
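
For the two follow-on goals stated above (showing GMT times in a North American zone and matching timestamps across data frames), a minimal sketch under stated assumptions: both data frames and their columns are hypothetical, the timestamps are text in the m/d/Y H:M style of the post, and "America/Chicago" is only an illustrative zone name.

```r
# Hypothetical data in the style described in the post.
readings <- data.frame(
  DateTimeStamp = c("1/4/2013 23:30", "1/4/2013 23:45", "1/5/2013 0:00"),
  value = c(10.1, 10.4, 10.9),
  stringsAsFactors = FALSE
)
events <- data.frame(
  DateTimeStamp = "1/4/2013 23:45",
  label = "sample A",
  stringsAsFactors = FALSE
)

# Parse both columns the same way, as GMT instants.
parse_gmt <- function(x) as.POSIXct(x, format = "%m/%d/%Y %H:%M", tz = "GMT")
readings$DateTimeStamp <- parse_gmt(readings$DateTimeStamp)
events$DateTimeStamp <- parse_gmt(events$DateTimeStamp)

# Same instants, displayed in another zone; the stored values do not change.
readings$local <- format(readings$DateTimeStamp, "%Y-%m-%d %H:%M",
                         tz = "America/Chicago")

# Look up the reading whose timestamp equals each event's timestamp.
idx <- match(events$DateTimeStamp, readings$DateTimeStamp)
events$value <- readings$value[idx]

print(readings)
print(events)
```

Matching on the parsed POSIXct values rather than on the original text means the comparison is between instants, so it does not depend on how either source formatted the timestamp.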