I am working on developing a report that groups data into a two-dimensional
array based on date and time. More specifically, date is grouped into
categories:
day, week-to-date, month-to-date, and year-to-date
Then, for each of those categories, I need to get a count of records that fall
into the following categories:
0 minutes late, 1-5 minutes late, and 6-15 minutes late
where minutes late will be calculated based on a known scheduled time and the
time in the record. To further complicate things, there are actually two times
in each record, so under the day, week-to-date, month-to-date etc groups, there
will be two sets of "late" bins, one for each time. In table form it would look
something like this:
| day | week-to-date | month-to-date | year-to-date |
----------------------------------------------------------------------------------------
t1 0min | <counts for each time group>
t1 1-5 min | ...
t1 6-15 min | ...
t2 0min | ...
t2 1-5 min | ...
t2 6-15 min | ...
So in the extreme scenario of a record that is for the current day, it will be
counted into 8 bins: once each for day, week-to-date, month-to-date and
year-to-date under the proper "late" bin for the first time in the record, and
once each into each of the time groups under the proper "late" bin for the
second time in the record. An older record may only be counted twice, under the
year-to-date group. A record with no matching schedule is discarded, as is any
record that is "late" by more than 15 minutes (those are gathered into a
separate report)
My initial approach was to simply make dictionaries for each "row" in the
table, like so:
t10 = {'daily': 0, 'WTD': 0, 'MTD': 0, 'YTD': 0,}
t15 = {'daily': 0, 'WTD': 0, 'MTD': 0, 'YTD': 0,}
.
.
t25 = {'daily': 0, 'WTD': 0, 'MTD': 0, 'YTD': 0,}
t215 = {'daily': 0, 'WTD': 0, 'MTD': 0, 'YTD': 0,}
then loop through the records, find the schedule for that record (if any, if
not move on as mentioned earlier), compare t1 and t2 against the schedule, and
increment the appropriate bin counts using a bunch of if statements.
Functional, if ugly. But then I got to thinking: I keep hearing about all these
fancy numerical analysis tools for python like pandas and numpy - could
something like that help? Might there be a way to simply set up a table with
"rules" for the columns and rows, and drop my records into the table, having
them automatically counted into the proper bins or something? Or am I over
thinking this, and the "simple", if ugly approach is best?
-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------
--
https://mail.python.org/mailman/listinfo/python-list