Any entry in the weather data marks a good-weather period. That is the
point. And please ignore my mistake about the quarter values getting too
large in weather. I am being swamped with versions, and it does not matter
for this purpose. So the bad weather days are not in the weather data set.

I am trying to get gw=1 in arr if the date and quarter are in weather.
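
One way this date-plus-quarter test might be sketched is below. The two data frames are small made-up stand-ins (not the real arr and weather), and the sketch assumes both have columns named Date and quarter. The idea is to build a single key from each (Date, quarter) pair and test membership on that key, which avoids both the recycling warning and the false hits from testing the two columns separately.

```r
# Toy stand-ins for the two data frames; the values are illustrative only.
arr <- data.frame(
  Date    = as.Date(c("2009-01-01", "2009-01-01", "2009-01-01", "2009-01-02")),
  quarter = c(59L, 60L, 61L, 60L)
)
weather <- data.frame(
  Date    = as.Date(c("2009-01-01", "2009-01-01", "2009-01-02")),
  quarter = c(60L, 61L, 59L)
)

# One key per row, built from the (Date, quarter) pair.
arr_key     <- paste(arr$Date, arr$quarter)
weather_key <- paste(weather$Date, weather$quarter)

# %in% returns one logical per row of arr, so nothing is recycled.
arr$gw <- as.integer(arr_key %in% weather_key)

# For comparison: testing the columns separately flags rows whose date and
# quarter each appear somewhere in weather, but not together.
naive <- as.integer(arr$Date %in% weather$Date &
                    arr$quarter %in% weather$quarter)

# The merge() route gives the same flags; note that merge() may reorder rows.
good <- unique(weather[c("Date", "quarter")])
good$gw2 <- 1L
merged <- merge(arr, good, by = c("Date", "quarter"), all.x = TRUE)
merged$gw2[is.na(merged$gw2)] <- 0L
```

With these toy values the keyed test marks only the second and third arrivals, while the separate-column test marks all four.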

Thanks,
Jim

On 1/17/10 7:46 PM, David Winsemius wrote:
> But, but, but .... there is no weather goodness variable in weather?!?!?!
>
> > str(weather)
> 'data.frame':    155 obs. of  4 variables:
>  $ Date   :Class 'Date'  num [1:155] 14245 14245 14245 14245 14245 ...
>  $ minute : int  5 15 30 45 0 15 30 45 0 15 ...
>  $ hour   : int  15 15 15 15 17 17 17 17 18 18 ...
>  $ quarter: int  65 75 90 105 68 83 98 113 72 87 ...
>
> I thought you said the "weather" dataframe would have some information
> about "goodness" that we were supposed to map to arrivals. What is
> the meaning of those variables? How do we define a "good" quarter
> hour? And why are the values of quarter not 1, 2, 3, 4? They ought to
> be a factor or integer that could be matched to those that are in
> "arr", which are also apparently not so defined. Let's see a better
> codebook or description of these variables.
>
> On Jan 17, 2010, at 6:47 PM, James Rome wrote:
>
>> Here are some sample data sets.
>>
>> I also tried making a combined field in each set such as
>> adq=paste(as.character(arr$Date), as.character(arr$quarter))
>> and similarly for the weather set, so I have unique single things to
>> compare, but that did not seem to help much.
>>
>> Thanks,
>> Jim
>>
>> On 1/17/10 5:50 PM, David Winsemius wrote:
>>> My guess (since we still have no data on which to test these ideas)
>>> is that you need either to merge() or to use a matrix created from the
>>> dates and qtr-hours entries in "gw", since matching on dates and hours
>>> separately will not uniquely classify the good qtr-hours within their
>>> proper corresponding dates. You want a structure (or a matching
>>> process) that takes:
>>>          qhr1    qhr2    qhr3    qhr4 .......
>>> date1    good    bad     good    bad
>>> date2    bad     good    good    good
>>> date3    bad     bad     bad     good
>>> .
>>> .
>>> .
>>> and lets you use the values in "arr" to get values in "gw". Notice
>>> that the notion of arr$Date %in% gw$date & arr$qtrhr %in% gw$qtrhr
>>> simply will not accomplish anything correct.
>>>
>>> Merging by multiple criteria (with the merge function) would do that,
>>> or you could construct a matrix whose entries were the categories
>>> good/bad. The table function could create the matrix for the purpose
>>> of using an indexed solution if you are dead-set against the merge
>>> concept.
>>>
>>>
>>>
>>>
>>> On Jan 17, 2010, at 4:47 PM, James Rome wrote:
>>>
>>>> Thank you Dennis.
>>>> arr$gw <- as.numeric(weather$Date == arr$Date & arr$quarter %in%
>>>> weather$quarter)
>>>> seems to be what I want to do, but in fact, with the full data set, it
>>>> misidentifies the rows, so I think the error message must mean
>>>> something.
>>>>
>>>>> arrr$Date <- as.Date(as.character(ewr$Date),format="%m/%d/%y")
>>>>> weather$Date <- as.Date(as.character(weather$Date),format="%m/%d/%y")
>>>>> gw = c(length(arrr))
>>>>> gw[1:length(arrr[,1])]=FALSE
>>>>> gw[arrr$Date==weather$Date & weather$quarter %in% arr$quarter]
>>>> Warning in `==.default`(arr$Date, weather$Date) :
>>>> longer object length is not a multiple of shorter object length
>>>> Warning in arr$Date == weather$Date & weather$quarter %in% arr$quarter :
>>>> longer object length is not a multiple of shorter object length
>>>> [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>>> 0 0 0 0
>>>> [38] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>>> 0 0 0 0
>>>> [75] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>>> 0 0 0 0
>>>> [112] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>>> 0 0
>>>> 0 0 0 0
>>>> [149] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>>> 0 0
>>>> 0 0 0 0
>>>> [186] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>>> 0 0
>>>> 0 0 0 0
>>>> [223] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>>> 0 0
>>>> 0 0 0 0
>>>> [260] 0 0 0 0 0 0 0 0
>>>>
>>>> There are many many more matches in the 99k line arrival data set.
>>>>
>>>> Thanks a bunch,
>>>> Jim
>>>>
>>>>
>>>> On 1/17/10 3:21 PM, Dennis Murphy wrote:
>>>>> Hi:
>>>>>
>>>>> To read a data set from a R-help message into R, one uses
>>>>> read.table(textConnection("<verbatim text>"), ...)
>>>>>
>>>>> Your weather data set had
>>>>> (a) a variable name with a space in it, which R misread and which
>>>>>     had to be altered manually;
>>>>> (b) a missing value with no NA, which R interpreted as an incomplete
>>>>>     line; again, it had to be altered manually.
>>>>>
>>>>> This is why David suggested the use of dput(), so that these vagaries
>>>>> don't have to be dealt with by those who are trying to help.
>>>>>
>>>>> That being said, for the example that you gave and the desired value
>>>>> that you wanted, try
>>>>>
>>>>> arr$gw <- as.numeric(weather$Date == arr$Date & arr$quarter %in%
>>>>> weather$quarter)
>>>>>
>>>>> (I changed DateTime to Date in the arr data frame...)
>>>>>
>>>>> You'll get warnings like
>>>>>
>>>>> Warning messages:
>>>>> 1: In is.na(e1) | is.na(e2) :
>>>>> longer object length is not a multiple of shorter object length
>>>>>
>>>>> but it seems to do the right thing. The first equality is there to
>>>>> constrain matches for quarter to be within the same day.
>>>>>
>>>>> For future reference,
>>>>>
>>>>>> dput(weather)
>>>>> structure(list(Date = structure(c(1L, 1L, 1L, 1L), .Label = "1/1/09",
>>>>> class = "factor"),
>>>>>   minute = c(5L, 15L, 30L, 45L), hour = c(15L, 15L, 15L, 15L
>>>>>   ), quarter = 60:63, efficiency = c(NA, 72, 63.3, 85.4)), .Names =
>>>>> c("Date",
>>>>> "minute", "hour", "quarter", "efficiency"), class = "data.frame",
>>>>> row.names = c(NA,
>>>>> -4L))
>>>>>> dput(arr)
>>>>> structure(list(Date = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
>>>>> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "1/1/09",
>>>>> class = "factor"),
>>>>>   weekday = c(5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
>>>>>   5L, 5L, 5L, 5L, 5L, 5L, 5L), month = c(1L, 1L, 1L, 1L, 1L,
>>>>>   1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
>>>>>   quarter = c(59L, 59L, 60L, 60L, 60L, 60L, 60L, 60L, 60L,
>>>>>   60L, 60L, 60L, 60L, 61L, 61L, 61L, 61L, 66L, 67L), ICAO =
>>>>> structure(c(6L,
>>>>>   8L, 7L, 3L, 6L, 3L, 5L, 3L, 3L, 1L, 3L, 5L, 3L, 3L, 6L, 6L,
>>>>>   2L, 4L, 3L), .Label = c("AAL", "AWE", "BTA", "CHQ", "CJC",
>>>>>   "COA", "JBU", "NWA"), class = "factor"), Flight = structure(c(15L,
>>>>>   19L, 18L, 6L, 17L, 8L, 12L, 5L, 4L, 1L, 3L, 13L, 9L, 10L,
>>>>>   14L, 16L, 2L, 11L, 7L), .Label = c("AAL842", "AWE307", "BTA1234",
>>>>>   "BTA2064", "BTA2085", "BTA2347", "BTA2405", "BTA2916", "BTA3072",
>>>>>   "BTA3086", "CHQ5312", "CJC3225", "CJC3359", "COA1166", "COA349",
>>>>>   "COA855", "COA886", "JBU554", "NWA9934"), class = "factor"),
>>>>>   gw = c(FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,
>>>>>   TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE,
>>>>>   FALSE)), .Names = c("Date", "weekday", "month", "quarter",
>>>>> "ICAO", "Flight", "gw"), row.names = c(NA, -19L), class =
>>>>> "data.frame")
>>>>>
>>>>> These can be copied and pasted directly into an R session without
>>>>> modification.
>>>>>
>>>>> HTH,
>>>>> Dennis
>>>>>
>>>>> On Sun, Jan 17, 2010 at 10:51 AM, James Rome <jamesr...@gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>   On 1/17/10 1:06 PM, David Winsemius wrote:
>>>>>>
>>>>>> On Jan 17, 2010, at 12:37 PM, James Rome wrote:
>>>>>>
>>>>>>> I don't think it is that simple because it is not a one-to-one
>>>>>>> match. In the arr data frame, there are many arrivals in a quarter
>>>>>>> hour with good weather on a given day. So I need to match the date
>>>>>>> and the quarter hour.
>>>>>>>
>>>>>>> And all of the rows in the weather data frame are times with good
>>>>>>> weather--unique date + quarter hour. That is why I needed the loop.
>>>>>>> For each date and quarter hour in weather, I want to mark all the
>>>>>>> entries with the corresponding date and weather as TRUE in the
>>>>>>> arr$gw column.
>>>>>>>
>>>>>>> I did convert the dates to POSIXlt dates and rewrote my function as
>>>>>>> gooddates = function(all, good) {
>>>>>>> la = length(all)   # All the arrivals
>>>>>>> lw = length(good)  # The good 15-minute periods
>>>>>>> for(j in 1:lw) {
>>>>>>>  d=good$Date[j]
>>>>>>>  q=good$quarter[j]
>>>>>>>  all$gw[all$Date==d && all$quarter==q]=TRUE
>>>>>>
>>>>>>
>>>>>> You are attempting a vectorized test and assignment with "&&" which
>>>>>> seems unlikely to succeed, but even then I am not sure your problems
>>>>>> would be over. (I'm also guessing that you might not have reported a
>>>>>> warning.)
>>>>>
>>>>>   Why shouldn't the && succeed? You are correct there, because I do
>>>>>   get items if I use either part of this "and" test alone, but when I
>>>>>   insert the &&, I get no hits. And I got no warnings.
>>>>>>
>>>>>> Why not merge arr to gw by date and quarter?
>>>>>   The sets contain different data, and the only thing I want from the
>>>>>   weather set is the fact that it has an entry for a given date and
>>>>>   time.
>>>>>>
>>>>>> Answering these questions would be greatly speeded up with a small
>>>>>> sample dataset. Are you aware of the virtues of the dput function?
>>>>>>
>>>>>
>>>>>   What I want is for a 1 to be in the gw column in the quarter
>>>>>   60,61,62,63,...
>>>>>
>>>>>   For example, here is some data from the good weather set:
>>>>>   Date    minute  hour    quarter  Efficiency Val
>>>>>   1/1/09  5       15      60
>>>>>   1/1/09  15      15      61       72
>>>>>   1/1/09  30      15      62       63.3
>>>>>   1/1/09  45      15      63       85.4
>>>>>
>>>>>
>>>>>
>>>>>   And this is from the arrivals set:
>>>>>   DateTime  weekday  month  quarter  ICAO  Flight   gw
>>>>>
>>>>>   1/1/09    5        1      59       COA   COA349   0
>>>>>   1/1/09    5        1      59       NWA   NWA9934  0
>>>>>   1/1/09    5        1      60       JBU   JBU554   0
>>>>>   1/1/09    5        1      60       BTA   BTA2347  0
>>>>>   1/1/09    5        1      60       COA   COA886   0
>>>>>   1/1/09    5        1      60       BTA   BTA2916  0
>>>>>   1/1/09    5        1      60       CJC   CJC3225  0
>>>>>   1/1/09    5        1      60       BTA   BTA2085  0
>>>>>   1/1/09    5        1      60       BTA   BTA2064  0
>>>>>   1/1/09    5        1      60       AAL   AAL842   0
>>>>>   1/1/09    5        1      60       BTA   BTA1234  0
>>>>>   1/1/09    5        1      60       CJC   CJC3359  0
>>>>>   1/1/09    5        1      60       BTA   BTA3072  0
>>>>>   1/1/09    5        1      61       BTA   BTA3086  0
>>>>>   1/1/09    5        1      61       COA   COA1166  0
>>>>>   1/1/09    5        1      61       COA   COA855   0
>>>>>   1/1/09    5        1      61       AWE   AWE307   0
>>>>>   1/1/09    5        1      66       CHQ   CHQ5312  0
>>>>>   1/1/09    5        1      67       BTA   BTA2405  0
>>>>>
>>>>>
>>>>>
>>>>>          [[alternative HTML version deleted]]
>>>>>
>>>>>   ______________________________________________
>>>>>   R-help@r-project.org mailing list
>>>>>   https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>   PLEASE do read the posting guide
>>>>>   http://www.R-project.org/posting-guide.html
>>>>>   and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>>
>>>>
>>>
>>> David Winsemius, MD
>>> Heritage Laboratories
>>> West Hartford, CT
>>>
>> <arr.rda><weather.rda>
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
