On Aug 27, 2010, at 9:49 AM, Vincy Pyne wrote:
Hi
I have a large credit portfolio (exceeding 50000 borrowers). For a
particular process I need to add up the exposures within
bands. I am giving a small test data set below.
I would think that cut() would be the accepted method for defining a
factor variable based on specified cutpoints. If you then wanted to
see what the cumsum() was across the range of possible levels, that too
would be a fairly simple task.
df$ead.cat <- cut(df$ead, breaks=c(0, 100000, 500000, 1000000,
2000000, 5000000 , 10000000, 100000000) )
df
with(df, tapply(ead.cat, rating, length))
#   A  AA AAA   B  BB BBB
#  10   8   2   1   4   7
with(df, tapply(ead.cat, rating, table))
# returns a list of table objects by bond rating
lapply( with(df, tapply(ead.cat, rating, table)) , cumsum)
#returns the cumsum of those tables
# sapply gives a more compact output of that result:
sapply( with(df, tapply(ead.cat, rating, table)) , cumsum)
               A AA AAA B BB BBB
(0,1e+05]      4  2   1 0  3   1
(1e+05,5e+05]  8  2   1 1  3   1
(5e+05,1e+06]  9  2   1 1  3   1
(1e+06,2e+06]  9  4   2 1  4   3
(2e+06,5e+06]  9  5   2 1  4   4
(5e+06,1e+07] 10  5   2 1  4   7
(1e+07,1e+08] 10  8   2 1  4   7
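The tables above count borrowers per band. If what is wanted is the total exposure per band, the same cut() factor can be used with tapply() on ead itself. This is only a sketch under that assumption; the names band.totals and cum.totals are made up for illustration, and the test data are copied from the original post.

```r
# Test data copied from the original post.
rating <- c("A", "AAA", "A", "BBB", "AA", "A", "BB", "BBB", "AA", "AA",
            "AA", "A", "A", "AA", "BB", "BBB", "AA", "A", "AAA", "BBB", "BBB", "BB",
            "A", "BB", "A", "AA", "B", "A", "AA", "BBB", "A", "BBB")
ead <- c(169229.93, 100, 5877794.25, 9530148.63, 75040962.06, 21000,
         1028360, 6000000, 17715000, 14430325.24, 1180946.57, 150000,
         167490, 81255.16, 54812.5, 3000, 1275702.94, 9100, 1763142.3,
         3283048.61, 1200000, 11800, 3000, 96894.02, 453671.72, 7590,
         106065.24, 940711.67, 2443000, 9500000, 39000, 1501939.67)
df <- data.frame(rating, ead)
df$ead.cat <- cut(df$ead, breaks = c(0, 1e5, 5e5, 1e6, 2e6, 5e6, 1e7, 1e8))

# Sum of ead in each band (rows) by rating (columns).
# Band/rating combinations with no borrowers come back as NA, so set them to 0.
band.totals <- with(df, tapply(ead, list(ead.cat, rating), sum))
band.totals[is.na(band.totals)] <- 0

# Running total of exposure down the bands, one column per rating.
cum.totals <- apply(band.totals, 2, cumsum)
cum.totals
```

The last row of cum.totals should equal the total exposure for each rating, which is a quick sanity check on the breaks.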
Loops, you say we need loops? We don't need no stinkin' loops.
--
David.
rating <- c("A", "AAA", "A", "BBB","AA","A","BB", "BBB", "AA", "AA",
"AA", "A", "A", "AA","BB","BBB","AA", "A", "AAA","BBB","BBB", "BB",
"A", "BB", "A", "AA", "B","A", "AA", "BBB", "A", "BBB")
ead <- c(169229.93,100, 5877794.25, 9530148.63, 75040962.06, 21000,
1028360, 6000000, 17715000, 14430325.24, 1180946.57, 150000,
167490, 81255.16, 54812.5, 3000, 1275702.94, 9100, 1763142.3,
3283048.61, 1200000, 11800, 3000, 96894.02, 453671.72, 7590,
106065.24, 940711.67, 2443000, 9500000, 39000, 1501939.67)
## First I have sorted the data rating-wise as
df <- data.frame(rating, ead)
df_sorted <- df[order(df$rating), ]
df_sorted_AAA <- subset(df_sorted, rating=="AAA")
df_sorted_AA <- subset(df_sorted, rating=="AA")
df_sorted_A <- subset(df_sorted, rating=="A")
df_sorted_BBB <- subset(df_sorted, rating=="BBB")
df_sorted_BB <- subset(df_sorted, rating=="BB")
df_sorted_B <- subset(df_sorted, rating=="B")
df_sorted_CCC <- subset(df_sorted, rating=="CCC")
## We begin with the BBB rating. The R output for df_sorted_BBB is as
## follows
df_sorted_BBB
   rating     ead
4     BBB 9530149
8     BBB 6000000
16    BBB    3000
20    BBB 3283049
21    BBB 1200000
30    BBB 9500000
32    BBB 1501940
My problem is that I need the totals of the eads falling in the respective
bands. I am defining the bands in millions as
seq_BBB <- seq(1000000, max(df_sorted_BBB$ead), by = 1000000)
# The output is
[1] 1e+06 2e+06 3e+06 4e+06 5e+06 6e+06 7e+06 8e+06 9e+06
So for the subset of data pertaining to rating "BBB", I want the corresponding
ead totals, i.e. I want the ead total where ead < 1e+06, then the ead
totals where 1e+06 < ead < 2e+06, 2e+06 < ead < 3e+06 ... and so on.
I have tried the following code
s_BBB <- NULL
for (i in 1:length(s_BBB))
{
s_BBB[i] = sum(subset(df_sorted_BBB$ead, df_sorted_BBB$ead <
s_BBB[i]))
}
I was trying to find the totals of ead < 1e+06, ead < 2e+06, ead < 3e+06 and
so on,
but the result is
s_BBB
[1] 0
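The loop seems to return 0 because it runs over s_BBB, which is still NULL at that point, instead of over the cutpoints in seq_BBB, so each comparison is made against an empty vector and sum() of nothing is 0. A minimal corrected sketch follows, assuming the cumulative totals below each cutpoint are what is wanted. The vector ead_BBB is a hypothetical stand-in for df_sorted_BBB$ead, filled with the BBB values from the test data.

```r
# BBB exposures from the test data (stand-in for df_sorted_BBB$ead).
ead_BBB <- c(9530148.63, 6000000, 3000, 3283048.61, 1200000, 9500000, 1501939.67)

# Cutpoints in millions, as defined in the original post.
seq_BBB <- seq(1000000, max(ead_BBB), by = 1000000)

# Total of ead strictly below each cutpoint.
# Note that the iteration is over seq_BBB, not over the (empty) result vector.
s_BBB <- sapply(seq_BBB, function(cutpoint) sum(ead_BBB[ead_BBB < cutpoint]))
names(s_BBB) <- seq_BBB
s_BBB
```

Because seq_BBB stops at 9e+06, the two exposures above 9 million are never included in any of these totals; extending the sequence one step past max(ead_BBB) would pick them up.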
I apologize if I have not been able to express my problem properly. My only
objective is first to sort the whole portfolio rating-wise and then,
within each of these rating-wise sorted data sets, to find the
total of eads in various bands: < 1000000, 1000000 - 2000000,
2000000 - 3000000, 3000000 - 4000000 and so on. Since the database
contains more than 50000 records, the ead amounts range from a few
thousand to billions.
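For the whole portfolio at once, one possible sketch is to build million-wide bands with cut() and let xtabs() add up ead by band and rating, with no sorting or per-rating subsetting needed. The assumptions here are mine: bands are half-open, [a, b), so an ead of exactly 1000000 falls in the second band, and the names width, breaks, band and totals are only illustrative.

```r
# Test data copied from the original post.
rating <- c("A", "AAA", "A", "BBB", "AA", "A", "BB", "BBB", "AA", "AA",
            "AA", "A", "A", "AA", "BB", "BBB", "AA", "A", "AAA", "BBB", "BBB", "BB",
            "A", "BB", "A", "AA", "B", "A", "AA", "BBB", "A", "BBB")
ead <- c(169229.93, 100, 5877794.25, 9530148.63, 75040962.06, 21000,
         1028360, 6000000, 17715000, 14430325.24, 1180946.57, 150000,
         167490, 81255.16, 54812.5, 3000, 1275702.94, 9100, 1763142.3,
         3283048.61, 1200000, 11800, 3000, 96894.02, 453671.72, 7590,
         106065.24, 940711.67, 2443000, 9500000, 39000, 1501939.67)
df <- data.frame(rating, ead)

# Million-wide bands [0, 1e6), [1e6, 2e6), ... ending just above max(ead).
width <- 1e6
breaks <- seq(0, (floor(max(df$ead) / width) + 1) * width, by = width)
df$band <- cut(df$ead, breaks = breaks, right = FALSE, dig.lab = 10)

# Total ead per band (rows) and rating (columns); empty cells are 0.
totals <- xtabs(ead ~ band + rating, data = df)

# Show only the bands that contain at least one borrower.
totals[rowSums(totals) > 0, ]
```

With amounts running into the billions this gives a few thousand rows, most of them empty, which is why the last line filters on rowSums().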
Please guide
Thanking you all in advance
Vincy
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
West Hartford, CT