[R] Reading multiple csv files

2010-02-26 Thread Madhavi Bhave
Dear R helpers
 
A particular analysis produces a varying number of output csv files, 
depending on some conditions. Say, e.g., I have output files variable1.csv, 
variable2.csv, ... The problem is that I don't know how many csv files have been 
generated. There could be 4, 5 or even 10. Each file will have a column called 
amount.
 
My problem is to find filewise mean(amount) and sd(amount). I need to write a 
loop where all these individual csv files will be read and after reading each 
file, mean and sd will be calculated.
 
I have tried to write some R code which is very absurd. 
 
for (i in 1 : n)  # n is no of input files
 
{
data[i] = read.csv(file = paste("variable", i, ".csv", sep = ""))$amount
mean(data[i])
sd(data[i])
}
 
I get following error.
 
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'paste("output", i, ".csv", sep = "")': Invalid argument

 
Please guide
 
Regards
 
Madhavi
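
One possible approach, sketched under the assumption that the files follow the variable1.csv, variable2.csv, ... naming and sit in a single directory: let list.files() discover however many files exist, so n need not be known in advance. The error message quoted above suggests the paste(...) call reached read.csv() as a quoted string rather than being evaluated, although that is only a guess from the message text. The helper name summarise_amounts and the demo files are made up for illustration.

```r
# Hypothetical helper: per-file mean and sd of the 'amount' column.
summarise_amounts <- function(dir = ".", pattern = "^variable[0-9]+\\.csv$") {
  files <- list.files(dir, pattern = pattern, full.names = TRUE)
  stats <- lapply(files, function(f) {
    amount <- read.csv(f)$amount
    data.frame(file = basename(f), mean = mean(amount), sd = sd(amount))
  })
  do.call(rbind, stats)   # one row per file
}

# Demo with three throwaway files in a temporary directory.
demo_dir <- tempfile("csvdemo")
dir.create(demo_dir)
for (i in 1:3) {
  write.csv(data.frame(amount = c(1, 2, 3) * i),
            file.path(demo_dir, paste0("variable", i, ".csv")),
            row.names = FALSE)
}
result <- summarise_amounts(demo_dir)
print(result)
```

Because the file list comes from the directory itself, the same call works whether 4, 5 or 10 files were generated.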



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to use same function for different input values

2010-02-19 Thread Madhavi Bhave
Dear R helpers
 
I have written some function (the actual code I have pasted at the end of mail) 
like say
 
indiv_rate = function(n, rate_name, rate, rate_rf1, rate_rf2, rate_rf3, 
rateprob1, rateprob2, rateprob3)

{
some R commands
 
return(data.frame(rate_name, rates = round(rate_data, digits = 4)))
 
}
 
## INPUT
 
rates = indiv_rate(n         = read.csv('number.csv')$n,
                   rate_name = read.csv('rate.csv')$rate_name,
                   rate      = read.csv('rate.csv')$rate,
                   rate_rf1  = read.csv('rate_rf.csv')$rate_rf1,
                   rate_rf2  = read.csv('rate_rf.csv')$rate_rf2,
                   rate_rf3  = read.csv('rate_rf.csv')$rate_rf3,
                   rateprob1 = read.csv('rate_probability.csv')$probability1,
                   rateprob2 = read.csv('rate_probability.csv')$probability2,
                   rateprob3 = read.csv('rate_probability.csv')$probability3)
 
 
## OUTPUT
 
write.csv(data.frame(rates), 'indiv rates generated.csv', row.names = FALSE)

##___ end of code 
 
So for a given rate (which I am defining as rate = read.csv('rate.csv')$rate), I 
get the desired results.
 
My problem is how to use this function for different 'rates', i.e. for any 
given rate, run the function and store the result under the respective rate 
name.
 
Regards
 
Madhavi
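
A minimal sketch of one way to do this, assuming rate.csv holds one row per rate with rate_name and rate columns. Since the body of indiv_rate() is not shown, toy_rate() below is a made-up stand-in, and rate_table stands in for the csv contents.

```r
# Stand-in for indiv_rate(); the real function body is not shown in the post.
toy_rate <- function(rate_name, rate) {
  data.frame(rate_name = rate_name,
             rates = round(rate * c(0.9, 1.0, 1.1), digits = 4))
}

# Stand-in for read.csv('rate.csv'): one row per rate.
rate_table <- data.frame(rate_name = c("rateA", "rateB"),
                         rate = c(0.05, 0.07),
                         stringsAsFactors = FALSE)

# Call the function once per row and keep each result under its rate name.
results <- Map(toy_rate, rate_table$rate_name, rate_table$rate)
names(results) <- rate_table$rate_name

results[["rateB"]]                       # result for a single rate
all_rates <- do.call(rbind, results)     # everything stacked into one data frame
```

Keeping the results in a named list makes it possible to look up any one rate by name, while all_rates is ready for write.csv().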
 
(PS - I am avoiding pasting my actual code, which takes up 60 lines. Still, if 
someone insists, I can post it.)
 
 
 




Re: [R] Total and heading of portfoilo table

2010-02-15 Thread Madhavi Bhave
Hi!
 
I am not an expert in R, but perhaps you can try the following -
 
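X = as.numeric(read.csv('quantity.csv'))   # quantities as a plain numeric vector
Y = read.csv('equity_price.csv')
Y = Y[, -1]                                # drop the sr_no column
 
# Multiply each price column by its own quantity.
# (X*Y would recycle X down the rows instead of across the columns.)
Z = sweep(Y, 2, X, "*")
 
port_val = rowSums(Z)                      # portfolio value for each day
 
write.csv(data.frame(Z, port_val = port_val), 'PORTFOLIO.csv', row.names = FALSE)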


I am sure the experts will have a much simpler way to address this problem.
 
Regards
 
Madhavi

--- On Mon, 15/2/10, Sarah Sanchez  wrote:


From: Sarah Sanchez 
Subject: [R] Total and heading of portfoilo table
To: r-help@r-project.org
Date: Monday, 15 February, 2010, 10:08 PM


Dear R helpers,

I have two input files as 'quantity.csv' and 'equity_price.csv' as (for 
example) given below.

'quantity.csv'
GOOG YHOO
1000 100


'equity_price.csv'
sr_no   GOOG_price   YHOO_price
1    15.22 536.40
2    15.07 532.97
3    15.19 534.05  
4    15.16 531.86 
5    15.11 532.11

My problem is to calculate the portfolio value for each of these 5 days 
(actually my portfolio consists of 47 companies and the prices are for the 
last 1 year).

I had defined 

X = read.csv('quantity.csv')
Y = read.csv('equity_price.csv')

I have tried the loop 

Z = array()

for (i in 1:2)
{
Z[i] = (X[[i]]*Y[i])
}

# When I write this dataframe as

write.csv(data.frame(Z), 'Z.csv', row.names = FALSE)

When I open 'Z.csv' file, I get

c.2500L..3300L..4500L..1000L..4400L.    
c.14000L..45000L..48000L..26000L..15000L.
2500    14000
3300    45000
4500    48000
1000    26000
4400    15000

My requirement is to have the column heads and the portfolio total as
GOOG    YHOO Total
2500       14000 16500
3300       45000 48300
4500       48000 52500
1000       26000 27000
4400       15000 19400


Please guide

Regards

Sarah




      


Re: [R] CORRECTION - Storing results in a loop

2010-02-15 Thread Madhavi Bhave
Dear Sir,
 
Thanks a lot for your quick solution. But I would still like someone to guide me 
to a solution as per my requirement. The problem is that I am actually not looking 
for the square of each term. I used it only as an example since I didn't 
want to confuse the matter. My process is altogether different and I need to 
use a loop. So here is my actual code. I am trying to use historical simulation, 
and for that I need to calculate LN(new rate / old rate) for each 
instrument separately.
 
Here is my actual code -
 
ONS = read.csv('Instrument.csv')
n = length(ONS)
Y = NULL
B = array()
 
for (i in 1 : n)
 {
 Y[i] = ONS[i]
  j <- 1
 for (j in 1:(length(Y[[i]])-1))
   {
  B[j] <- log((Y[[i]][j+1])/(Y[[i]][j]))
 
 
 j <- j+1
 

 }
 
 }
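
A hedged sketch of how the log ratios could be kept for every instrument rather than only the last one: give B one column per instrument and index it with both j and i. The data frame below is a stand-in for read.csv('Instrument.csv'), built from the three columns visible in the sample table.

```r
# Stand-in for read.csv('Instrument.csv'); values taken from the sample table.
ONS <- data.frame(instrument1 = c(12, 11, 14, 8, 11),
                  instrument2 = c(5, 7, 11, 21, 3),
                  instrument7 = c(14, 7, 3, 10, 5))

# Loop version: one column of B per instrument, one row per rate change.
n <- ncol(ONS)
m <- nrow(ONS) - 1
B <- matrix(NA_real_, nrow = m, ncol = n, dimnames = list(NULL, names(ONS)))
for (i in 1:n) {
  for (j in 1:m) {
    B[j, i] <- log(ONS[[i]][j + 1] / ONS[[i]][j])
  }
}

# Same result without explicit loops: log(a/b) equals log(a) - log(b).
B2 <- sapply(ONS, function(x) diff(log(x)))
```

The for loop already advances j by itself, so the extra j <- 1 and j <- j+1 statements in the code above are not needed.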

 


--- On Mon, 15/2/10, Benilton Carvalho  wrote:


From: Benilton Carvalho 
Subject: Re: [R] CORRECTION - Storing results in a loop
To: "Madhavi Bhave" 
Cc: r-help@r-project.org
Date: Monday, 15 February, 2010, 4:29 AM


sorry, meant to type:

B = ONS^2

cheers,
benilton

On Mon, Feb 15, 2010 at 12:28 PM, Benilton Carvalho
 wrote:
> maybe you just want
>
> Y = ONS^2
>
> ?
>
> b
>
> On Mon, Feb 15, 2010 at 12:22 PM, Madhavi Bhave  
> wrote:
>> Dear R Helpers
>>
>> (There is a small correction in my earlier mail. In the 'instrument.csv' 
>> file, I had mentioned only three columns. Actually there are 7 columns. I 
>> regret the error. Rest contents remains the same. Thanks)
>>
>> I have an 'instrument.csv' file with 7 instrument names and 5 rates each 
>> i.e. it has 7 columns and 6 rows (including row names).
>>
>> 'instrument.csv'
>>
>> instrument1  instrument2    instrument7
>> 12 5      14
>> 11 7    7
>> 14   11        3
>>   8   21  10
>> 11 3    5
>>
>>
>> Following is my R code.
>>
>> ONS = read.csv('Instrument.csv')
>> n = length(ONS)
>>
>> Y = NULL
>> B = NULL
>>
>> for (i in 1 : n)
>>
>>  {
>>
>>  Y[i] = ONS[i]
>>
>>   for (j in 1 : length(Y[[i]]))
>>    {
>>    B[j] = (Y[[i]][j])^2
>>    }
>>
>>  }
>>
>> Problem is when I type B, I get the processed result only for the last 
>> column i.e. Y[7]. It doesn't store results for Y[1] to Y[7].
>>
>> I need B[1], B[2]...upto B[7].
>>
>> Please guide me how do I store individual column processed results?
>>
>> Thanking you all in advance
>>
>> Regards
>>
>> Madhavi
>>
>>
>>
>>
>>
>>
>





[R] CORRECTION - Storing results in a loop

2010-02-15 Thread Madhavi Bhave
Dear R Helpers

(There is a small correction to my earlier mail. In the 'instrument.csv' file, 
I had mentioned only three columns. Actually there are 7 columns. I regret the 
error. The rest of the contents remains the same. Thanks) 

I have an 'instrument.csv' file with 7 instrument names and 5 rates each, i.e. 
it has 7 columns and 6 rows (including the header row).
 
'instrument.csv'
 
instrument1  instrument2    instrument7
12            5             14
11            7              7
14           11              3
 8           21             10
11            3              5
 
 
Following is my R code.
 
ONS = read.csv('Instrument.csv')
n = length(ONS)
 
Y = NULL
B = NULL
 
for (i in 1 : n)
 
 {
 
 Y[i] = ONS[i]
 
  for (j in 1 : length(Y[[i]])) 
   {
   B[j] = (Y[[i]][j])^2
   } 
 
 }
 
The problem is that when I type B, I get the processed result only for the last 
column, i.e. Y[7]. It doesn't store results for Y[1] to Y[7].
 
I need B[1], B[2]... up to B[7]. 
 
Please guide me on how to store the processed results for each individual column.
 
Thanking you all in advance
 
Regards
 
Madhavi
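
A possible explanation, offered as a sketch rather than a definitive diagnosis: B[j] is indexed by the row counter only, so every pass of the outer loop overwrites what the previous column stored. One way around this is to make B a list with one slot per column. The two-column data frame below is a stand-in for the csv file.

```r
# Stand-in for read.csv('Instrument.csv'), first two sample columns only.
ONS <- data.frame(instrument1 = c(12, 11, 14, 8, 11),
                  instrument2 = c(5, 7, 11, 21, 3))

# One list element per column, named after the instrument.
B <- vector("list", length(ONS))
names(B) <- names(ONS)

for (i in seq_along(ONS)) {
  B[[i]] <- ONS[[i]]^2      # whole column processed at once
}

B[[2]]                      # results for the second instrument
B[["instrument1"]]          # or look a column up by name
```

Note the double brackets: B[[i]] refers to the i-th stored vector itself, whereas B[i] would be a one-element list.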
 





[R] Storing processed results in a loop

2010-02-15 Thread Madhavi Bhave
Dear R Helpers
 
I have an 'instrument.csv' file with 3 instrument names and 5 rates each i.e. 
it has 7 columns and 6 rows (including row names).
 
'instrument.csv'
 
instrument1  instrument2   instrument3
12            5            14
11            7             7
14           11             3
 8           21            10
11            3             5
 
 
Following is my R code.
 
ONS = read.csv('Instrument.csv')
n = length(ONS)
 
Y = NULL
B = NULL
 
for (i in 1 : n)
 
 {
 
 Y[i] = ONS[i]
 
  for (j in 1 : length(Y[[i]])) 
   {
   B[j] = (Y[[i]][j])^2
   } 
 
 }
 
The problem is that when I type B, I get the processed result only for the last 
column, i.e. Y[7]. It doesn't store results for Y[1] to Y[7].
 
I need B[1], B[2]... up to B[7]. 
 
Please guide me on how to store the processed results for each individual column.
 
Thanking you all in advance
 
Regards
 
Madhavi
 
 




Re: [R] How to repeat the names?

2010-02-10 Thread Madhavi Bhave

Dear Sirs,
 
Thanks a lot for your guidance. It worked wonderfully.
 
Regards
 
Madhavi

--- On Wed, 10/2/10, Ivan Calandra  wrote:


From: Ivan Calandra 
Subject: Re: [R] How to repeat the names?
To: r-help@r-project.org
Date: Wednesday, 10 February, 2010, 1:54 AM


Hi!
I'm kind of a newbie here, but I think it is because read.csv converts 
the character variables to factors. Maybe try setting the argument 
"as.is" in read.csv().
Or try:
rep(c(as.character(city1), as.character(city2)), 5)
That should work.

HTH
Ivan

Le 2/10/2010 10:44, Madhavi Bhave a écrit :
> Dear R helpers
>   
> I have a city.csv file as given below.
>   
> 'city.csv'
> city_name1        city_name2
> New York City    Buffallo      
>   
> So I define
>   
> city_name = read.csv('city.csv')
> city1 = city_name$city_name1
> city2 = city_name$city_name2
>   
> My problem is how do I repeat the names one after other say 10 times i.e. my 
> output should be like
>   
> New York City
> Buffallo
> New York City
> Buffallo
> New York City
> Buffallo
> New York City
> ...
> ...
> ...
> ...
>   
> I have tried the following commands
>   
> rep(c(city1,city2), 5)
>   
> and I got the output something like this
>   
> [1] 1 1 1 1 1 1 ...
>   
> If I try
>   
> rep((city1,city2), 5)
>   
> Error: unexpected ',' in "rep((city1,"
>   
> Please guide
>   
> Regards
>   
> MAdhavi
>
>


[R] How to repeat the names?

2010-02-10 Thread Madhavi Bhave
Dear R helpers
 
I have a city.csv file as given below.
 
'city.csv'
city_name1    city_name2
New York City    Buffallo   
 
So I define
 
city_name = read.csv('city.csv')
city1 = city_name$city_name1
city2 = city_name$city_name2
 
My problem is how do I repeat the names one after other say 10 times i.e. my 
output should be like
 
New York City
Buffallo
New York City
Buffallo
New York City
Buffallo
New York City
...
...
...
...
 
I have tried the following commands 
 
rep(c(city1,city2), 5)
 
and I got the output something like this 
 
[1] 1 1 1 1 1 1 ...
 
If I try
 
rep((city1,city2), 5)
 
Error: unexpected ',' in "rep((city1,"
 
Please guide
 
Regards
 
MAdhavi
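
A short sketch of what may be going on, assuming the columns were read as factors (the default for read.csv before R 4.0.0): combining factors with c() can yield their integer codes, which would explain the run of 1s. Converting each factor to character before combining avoids this. The data frame below stands in for the csv file.

```r
# Stand-in for read.csv('city.csv'); stringsAsFactors = TRUE mimics the old default.
city_name <- data.frame(city_name1 = "New York City",
                        city_name2 = "Buffallo",
                        stringsAsFactors = TRUE)
city1 <- city_name$city_name1
city2 <- city_name$city_name2

# Convert each factor to character first, then combine and repeat.
cities <- rep(c(as.character(city1), as.character(city2)), times = 5)
cities
```

Reading the file with read.csv('city.csv', stringsAsFactors = FALSE) should make the conversion unnecessary.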




Re: [R] Yield to Maturity using R

2010-02-02 Thread Madhavi Bhave
Dear Sir,
 
Thank you for your valuable guidance. Though I have been using R occasionally, it 
was limited to some basics, and in that sense I am new to R. As you suggested, I 
have gone through the said chapter of the Introduction to R manual, even though I have 
some urgent commitments to meet.
 
I have tried writing function as given below.
 
 
f = function(price, tenure, no_comp, coupon_rate, face_value)
 
{
coupon_payment = face_value * coupon_rate / no_comp
cash_flow = c(rep(c(coupon_payment), (no_comp * tenure - 1)), face_value + 
coupon_payment)
 
E = NULL
 for (i in 1 : (tenure * no_comp - 1))
  {
  E[i] = cash_flow[i]/(1+ytm)^i
  }
 
 F = NULL
  {
  F = sum(E) + ((face_value + coupon_payment)/(1+ytm)^(no_comp * tenure)) - 
price
  }
 
return(data.frame(S = uniroot.all(F, interval=c(0,25))))
  
}
 
output = f(1010, 3, 1, 0.10, 1000)
 
##  End of code
 
However, when I try to execute the same, I get following error. 
 
Error: object 'ytm' not found

My objective is to find ytm itself and I am not able to figure out where I am 
going wrong and how to overcome the same.
 
Regards
 
Madhavi
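
A minimal sketch of one way to generalise this, not necessarily what the repliers had in mind: the 'ytm' error arises because the body computes with ytm directly, whereas a root finder needs a function of ytm to evaluate repeatedly. Below, that function is defined inside the outer one, so it can see cash_flow and price through lexical scoping. The names bond_ytm and price_gap are made up, and base uniroot() is used in place of uniroot.all().

```r
bond_ytm <- function(price, tenure, no_comp, coupon_rate, face_value,
                     interval = c(0, 1)) {
  n_periods <- tenure * no_comp
  coupon_payment <- face_value * coupon_rate / no_comp
  cash_flow <- c(rep(coupon_payment, n_periods - 1),
                 face_value + coupon_payment)

  # Function of ytm alone; cash_flow, n_periods and price are free variables
  # resolved in the enclosing function.
  price_gap <- function(ytm) {
    sum(cash_flow / (1 + ytm)^seq_len(n_periods)) - price
  }

  # Root of price_gap = yield per coupon period.
  uniroot(price_gap, interval = interval, tol = 1e-10)$root
}

ytm <- bond_ytm(price = 1010, tenure = 3, no_comp = 1,
                coupon_rate = 0.10, face_value = 1000)
ytm
```

The search interval is an assumption: it must bracket the root, i.e. price_gap must change sign between its two ends. With no_comp greater than 1 the value returned is a per-period yield, not an annual one.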

--- On Tue, 2/2/10, Dennis Murphy  wrote:


From: Dennis Murphy 
Subject: Re: [R] Yield to Maturity using R
To: "Madhavi Bhave" 
Date: Tuesday, 2 February, 2010, 3:49 AM


Hi:


On Tue, Feb 2, 2010 at 3:01 AM, Madhavi Bhave  wrote:



Dear R helpers,
 
 
Yesterday I had raised following query which was addressed by Mr Ellison. The 
query and the wonderful solution as provided by Mr. Ellison are as given below.
  
## PROBLEM
 
I am calculating the 'Yield to Maturity' for the Bond with following 
characteristics.
  
Its a $1000 face value, 3 year bond with 10% annual coupon and is priced at 
101. The yield to maturity can be calculated after solving the equation - 
  
1010 = [100 / (1+ytm)]  + [100 / (1+ytm)^2] + [ 1100 / (1 + ytm)^3]
  
This can be solved by trial and error method s.t. ytm = 9.601%. I wanted to 
find out how to solve this equation in R.
   
## SOLUTION
 
Mr. Elisson had given me following wonderful solution
 
f.ytm<-function(ytm) 100 / (1+ytm)  +100 / ((1+ytm)^2) + 1100 / ((1 +
ytm)^3) -1010

uniroot(f.ytm, interval=c(0,25)) 

#$root has the answer
 
And I got the answer as 9.601.
 
## _
 
I was just trying to generalize this solution to any equation and accordingly 
written a code as given below.
 
The following input I will be reading using csv file and thus my equation will 
change if tenure or no_comp etc. changes. So taking into account the variable 
nature of the input, I am trying to write a generalized code.
 
## Input
 
price = 101   # Price of bond
tenure = 3 
no_comp = 1  # no of times coupon paid in a year.
coupon_rate = 0.10  # i.e. 10%
face_value  = 100
 
# Computations
 
coupon_payment = face_value * coupon_rate
cash_flow = c(rep(c(coupon_payment), (no_comp * tenure - 1)), face_value + 
coupon_payment)
cash_flow
 
## I am trying to customize the code as given by Mr Ellison.
 
f.ytm = function(ytm)
 
{
 
 for (i in 1 : (tenure * no_comp - 1))
 E = NULL
 F = NULL
 {
 E[i] = cash_flow[i]/(1+ytm)^i
 F = (sum(E) + (face_value + coupon_payment)/((1+ytm)^(tenure * no_comp))) - 
price
    }
}
 

For this to work, tenure, no_comp, cash_flow, face_value and coupon_payment 
have to be visible to the function, i.e., they either have to be in the 
environment where the function is defined or in the global environment. These 
are called 'free variables' under the lexical scoping rules of R. (Welcome to 
function writing :)  

You might want to look a little more closely at uniroot(), especially the 
conventions it requires for the *functions* it can evaluate. You want the body 
of what you send to uniroot() for evaluation to be a function of a single 
variable. Your previously supplied solution meets that requirement. This one 
doesn't (yet).

Moreover, it appears that E[i] is never used and nothing is returned from 
f.ytm. I'd suggest an excursion into R function writing fundamentals. Start 
with Ch. 10 of the Introduction to R manual.

HTH,
Dennis


solution = uniroot(f.ytm, interval=c(0,25)) 
 
ytm = solution$root
 
However, when I execute this code I get following error.
 
> solution = uniroot(f.ytm, interval=c(0,25)) 
Error in uniroot(f.ytm, interval = c(0, 25)) : f.lower = f(lower) is NA

Please guide. ytm should be 0.09601 (i.e. 9.601%)
 
 
with regards
 
Madhavi Bhave
 
 



[R] Yield to Maturity using R

2010-02-02 Thread Madhavi Bhave


Dear R helpers,
 
 
Yesterday I had raised following query which was addressed by Mr Ellison. The 
query and the wonderful solution as provided by Mr. Ellison are as given below.
  
## PROBLEM
 
I am calculating the 'Yield to Maturity' for the Bond with following 
characteristics.
  
It's a $1000 face value, 3-year bond with a 10% annual coupon, priced at 
101 (i.e. $1010). The yield to maturity can be calculated by solving the equation - 
  
1010 = [100 / (1+ytm)]  + [100 / (1+ytm)^2] + [ 1100 / (1 + ytm)^3]
  
This can be solved by trial and error method s.t. ytm = 9.601%. I wanted to 
find out how to solve this equation in R.
   
## SOLUTION 
 
Mr. Ellison had given me the following wonderful solution 
 
f.ytm<-function(ytm) 100 / (1+ytm)  +100 / ((1+ytm)^2) + 1100 / ((1 +
ytm)^3) -1010

uniroot(f.ytm, interval=c(0,25))  

#$root has the answer
 
And I got the answer as 0.09601, i.e. 9.601%.
 
## _
 
I was just trying to generalize this solution to any such equation and have 
accordingly written the code given below. 
 
I will be reading the following input from a csv file, and thus my equation will 
change if tenure or no_comp etc. changes. So, taking into account the variable 
nature of the input, I am trying to write generalized code.
 
## Input
 
price = 101   # Price of bond
tenure = 3  
no_comp = 1  # no of times coupon paid in a year.
coupon_rate = 0.10  # i.e. 10%
face_value  = 100
 
# Computations
 
coupon_payment = face_value * coupon_rate
cash_flow = c(rep(c(coupon_payment), (no_comp * tenure - 1)), face_value + 
coupon_payment)
cash_flow
 
## I am trying to customize the code as given by Mr Ellison.
 
f.ytm = function(ytm)
 
{
 
 for (i in 1 : (tenure * no_comp - 1))
 E = NULL
 F = NULL
 {
 E[i] = cash_flow[i]/(1+ytm)^i
 F = (sum(E) + (face_value + coupon_payment)/((1+ytm)^(tenure * no_comp))) - 
price
    }
}
 
solution = uniroot(f.ytm, interval=c(0,25))  
 
ytm = solution$root
 
However, when I execute this code I get following error.
 
> solution = uniroot(f.ytm, interval=c(0,25))  
Error in uniroot(f.ytm, interval = c(0, 25)) : f.lower = f(lower) is NA

Please guide. ytm should be 0.09601 (i.e. 9.601%)
 
 
with regards
 
Madhavi Bhave
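
A likely reading of the NA error reported above, assuming tenure = 3 and no_comp = 1 as in the input: with no brace directly after for (...), only the next statement (E = NULL) is the loop body, so the braced block runs once, after the loop, with i left at its final value of 2. Assigning to position 2 of an empty vector pads position 1 with NA, and the sum is then NA. The snippet below reproduces just that mechanism with a made-up cash flow of 100 and ytm of 0.

```r
# Only 'E = NULL' belongs to the loop; it simply runs twice.
for (i in 1:2) E = NULL
i                          # i keeps its last value: 2

# This is what the braced block then does, once:
E[i] <- 100 / (1 + 0)^i    # position 1 does not exist yet, so it becomes NA
E                          # NA 100
sum(E)                     # NA, which is what uniroot() reports at the lower bound
```

Moving the opening brace to immediately follow for (...), initialising E before the loop, and ending the function with the value of F would address this particular failure.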
 
 




Re: [R] 'R' and 'Yield to Maturity'

2010-02-01 Thread Madhavi Bhave
Dear Sir,
 
That was GREAT!!!. Thanks a lot for the solution. Once again it showed how 
powerful 'R' is otherwise I was breaking my head on Newton-Raphson method.
 
Thanks again Sir. That was really superb.
 
Madhavi

--- On Mon, 1/2/10, S Ellison  wrote:


From: S Ellison 
Subject: Re: [R] 'R' and 'Yield to Maturity'
To: "Craig P. Pyrame" , "Madhavi Bhave" 

Cc: r-help@r-project.org
Date: Monday, 1 February, 2010, 3:41 AM


If you know the likely range, uniroot would do it.

f.ytm <- function(ytm) 100/(1+ytm) + 100/((1+ytm)^2) + 1100/((1+ytm)^3) - 1010

uniroot(f.ytm, interval=c(0,25))  

#$root has the answer


>>> "Craig P. Pyrame"  01/02/2010 10:19 >>>
Madhavi Bhave wrote:
> Dear R helpers
>  
> I am calculating the 'Yield to Maturity' for the Bond with following
characteristics.
>  
> Its a $1000 face value, 3 year bond with 10% annual coupon and is
priced at 101. The yield to maturity can be calculated after solving the
equation - 
>  
> 1010 = [100 / (1+ytm)]  + [100 / (1+ytm)^2] + [ 1100 / (1 + ytm)^3]
>  
> This can be solved by trial and error method s.t. ytm = 9.601%.
>   

Why don't you use sage, for example:

sage: var('ytm');
sage: eqn = 1010 == (100/(1+ytm) + 100/(1+ytm)^2 + 1100/(1+ytm)^3);
sage: [solution.right().n() for solution in solve(eqn, ytm)][2]
0.0960070882662794

The third value is the only real solution.
There may be an R package for doing that, but I don't know one.

Best regards,
Craig
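
For comparison, the same polynomial approach can be sketched in base R with polyroot(), with no extra package. Multiplying the pricing equation by (1 + ytm)^3 and substituting x = 1 + ytm gives a cubic in x. The 1e-8 tolerance used to pick out the real root is an arbitrary choice for this illustration.

```r
# 1010 = 100/x + 100/x^2 + 1100/x^3   with x = 1 + ytm
# =>  1010 x^3 - 100 x^2 - 100 x - 1100 = 0
# polyroot() takes the coefficients in increasing order of power.
roots <- polyroot(c(-1100, -100, -100, 1010))

# Keep the (numerically) real root and convert back to a yield.
real_root <- Re(roots[abs(Im(roots)) < 1e-8])
ytm <- real_root - 1
ytm
```

This only works when the cash flows are equally spaced so that the equation is a polynomial; uniroot() does not have that restriction.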


>  
> My query is (1) if there is any R package which will calcualte ytm or
(2) is there any method in 'R' which can solve the above equation.
>  
> Thanking you all in advance
>  
> Regards
>  
> Madhavi
>  
>  
>  
>
>


[R] 'R' and 'Yield to Maturity'

2010-02-01 Thread Madhavi Bhave
Dear R helpers
 
I am calculating the 'Yield to Maturity' for the Bond with following 
characteristics.
 
It's a $1000 face value, 3-year bond with a 10% annual coupon, priced at 
101 (i.e. $1010). The yield to maturity can be calculated by solving the equation - 
 
1010 = [100 / (1+ytm)]  + [100 / (1+ytm)^2] + [ 1100 / (1 + ytm)^3]
 
This can be solved by trial and error method s.t. ytm = 9.601%.
 
My query is (1) whether there is any R package which will calculate ytm, or (2) 
whether there is any method in 'R' which can solve the above equation.
 
Thanking you all in advance
 
Regards
 
Madhavi
 
 
 




[R] How to sort data.frame

2010-01-27 Thread Madhavi Bhave
Dear R helpers
 
Suppose I have the following data
 
Scenarios  combination_names  series1   series2
Sc1        MAT2 GAU1          7.26554   8.409778
Sc2        MAT2 GAU2          7.438128  8.130275
Sc3        MAT3 GAU1          8.058422  8.06457
Sc4        MAT1 GAU2          8.179855  8.022071
Sc5        MAT3 GAU2          8.184033  8.191831
Sc6        MAT3 GAU2          7.50312   8.232425
Sc7        MAT1 GAU2          7.603291  8.200993
Sc8        MAT1 GAU1          8.221755  8.380097
Sc9        MAT3 GAU2          7.904908  8.088824
Sc10       MAT1 GAU3          7.67034   8.46376
 
 
I wish to sort this data frame based on combination_names. Actually this is 
just indicative data; I am dealing with data having 5000+ records.
 
I just need to find out how to sort this data so that I will get, say, the 
following result
 
Scenarios  combination_names  series1   series2
Sc8        MAT1 GAU1          8.221755  8.380097
Sc4        MAT1 GAU2          8.179855  8.022071
Sc7        MAT1 GAU2          7.603291  8.200993
Sc10       MAT1 GAU3          7.67034   8.46376
 
and so on.
 
I tried to understand the examples given in ?base::order but couldn't crack it.
 
Please guide
 
Madhavi
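
A minimal sketch of the usual idiom: order() returns the row positions that would put the key column in sorted order, and indexing the data frame's rows with that vector does the sorting. The data frame scen below retypes the sample data; its name is made up for the example.

```r
scen <- data.frame(
  Scenarios = paste0("Sc", 1:10),
  combination_names = c("MAT2 GAU1", "MAT2 GAU2", "MAT3 GAU1", "MAT1 GAU2",
                        "MAT3 GAU2", "MAT3 GAU2", "MAT1 GAU2", "MAT1 GAU1",
                        "MAT3 GAU2", "MAT1 GAU3"),
  series1 = c(7.26554, 7.438128, 8.058422, 8.179855, 8.184033,
              7.50312, 7.603291, 8.221755, 7.904908, 7.67034),
  series2 = c(8.409778, 8.130275, 8.06457, 8.022071, 8.191831,
              8.232425, 8.200993, 8.380097, 8.088824, 8.46376),
  stringsAsFactors = FALSE
)

# order() gives the row permutation; use it as a row index.
sorted <- scen[order(scen$combination_names), ]
head(sorted, 4)
```

Rows with the same combination_names keep their original relative order. A second key can be added as a further argument, e.g. order(scen$combination_names, scen$series1).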
 




[R] Please Please Please Help me!!

2010-01-19 Thread Madhavi Bhave
Dear R helpers
 
(I have already written the required R code, which is giving me correct results 
for a given single set of data. I just wish to use it for multiple 
data.)
 
I have defined a function (as given below) which helps me calculate Macaulay 
Duration and Modified Duration and its working fine with given set of data.
 
My Code -
 
## ONS - PPA
 
duration = function(par_value, coupon_rate, frequency_copoun, tenure, ytm)
 
{
macaulay_duration  =   NULL
modified_duration    =   NULL 
freq_coupon_new    =   NULL
 
if(frequency_copoun <= 0)
{
    freq_coupon_new = 365
} 
 
if(frequency_copoun > 0 & frequency_copoun <= 1)
{
    freq_coupon_new = 12
} 
 
if(frequency_copoun > 1 & frequency_copoun <= 2)
{
    freq_coupon_new = 4
} 
 
if(frequency_copoun > 2 & frequency_copoun <= 3)
{
    freq_coupon_new = 2
} 
 
if(frequency_copoun > 3 & frequency_copoun <= 4)
{
    freq_coupon_new = 1
} 
 
## COMPUTATIONS
 
terms_coupon_payment  = (seq(1/freq_coupon_new, tenure, by = 
1/freq_coupon_new))*freq_coupon_new
coupon    = coupon_rate*par_value/100
coupon_amount    = coupon/(freq_coupon_new)
cash_flow1  = rep(c(coupon_amount), (tenure*freq_coupon_new - 
1)) 
cash_flow2  = par_value + coupon_amount
cash_flow   = c(cash_flow1, cash_flow2) 
 
ytm_effective  = ((1+ytm/100)^(1/freq_coupon_new))-1
 
pv = NULL
 
for (i in 1:(tenure*freq_coupon_new))
 {
   pv[i] = cash_flow[i] / ((1+ytm_effective)^terms_coupon_payment[i])
 }
 
macaulay_duration = sum(pv*terms_coupon_payment)/sum(pv)
modified_duration = macaulay_duration / (1+(ytm_effective)/freq_coupon_new)
 
return(data.frame(macaulay_duration, modified_duration))
 
}

### For a given data say 
 
result = duration(par_value = 1000, coupon_rate = 10, frequency_copoun = 0, 
tenure = 5, ytm = 12)

I get the output as 
 
  macaulay_duration modified_duration
1  1423.797  1423.795
 
## __
 
## MY PROBLEM
 
If instead of having only one set of data, suppose I have multiple data (say as 
given below in a csv file), I am not able to get these results.
 
Suppose my 'input.csv' file is as given below.
 
par_value    coupon_rate   frequency_copoun   tenure    ytm
 
1000         10            0                  5         12
100          7             1                  8         11
 
Sir, I am not asking for the modification of existing code as it is running 
fine with a single set of data (and I have checked that the output tallies with 
other methods). I just want to use this code for multiple data so that I get an 
output something like
 
 macaulay_duration    modified_duration
 
 1423.797             1423.795
 
 44.339               44.307
 
I request you to please guide me. I understand it's not my right to seek help, 
but this is the third time I am requesting guidance.
 
Thanking you all
 
Regards
 
Madhavi
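
One way to reuse a scalar function unchanged, sketched with a stand-in: call it once per row of the input and stack the one-row results. toy_duration() below is hypothetical and deliberately trivial (it is not a duration formula); it only mirrors the argument names and return shape of duration() so that the row-wise pattern can be shown in a self-contained way. The input data frame stands in for read.csv('input.csv').

```r
# Stand-in with the same shape as duration(): scalar inputs, one-row data frame out.
# The arithmetic is NOT a duration formula; it only makes the example checkable.
toy_duration <- function(par_value, coupon_rate, frequency_copoun, tenure, ytm) {
  data.frame(first = par_value * coupon_rate / 100, second = tenure + ytm)
}

# Stand-in for read.csv('input.csv').
input <- data.frame(par_value = c(1000, 100), coupon_rate = c(10, 7),
                    frequency_copoun = c(0, 1), tenure = c(5, 8),
                    ytm = c(12, 11))

# Map() calls the function once per row; columns are matched to arguments by name.
rows <- do.call(Map, c(list(f = toy_duration), input))
result <- do.call(rbind, rows)
result
```

Replacing toy_duration with the real duration() should give one output row per input row, provided the csv column names match the function's argument names.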
 
 
 
 




[R] Macaulay Duration code in a Functional Form - Please Help

2010-01-19 Thread Madhavi Bhave

# I have written this code in Notepad++ and copied here.

## ONS - PPA    

Duration = function(par_value, coupon_rate, freq_coupon, tenure, ytm)

{
macaulay_duration  =   NULL
modified_duration    =   NULL 
freq_coupon_new    =   NULL

if(freq_coupon <= 0)
{
    freq_coupon_new = 365
} 

if(freq_coupon > 0 & freq_coupon <= 1)
{
    freq_coupon_new = 12
} 

if(freq_coupon > 1 & freq_coupon <= 2)
{
    freq_coupon_new = 4
} 

if(freq_coupon > 2 & freq_coupon <= 3)
{
    freq_coupon_new = 2
} 

if(freq_coupon > 3 & freq_coupon <= 4)
{
    freq_coupon_new = 1
} 

## COMPUTATIONS

terms_coupon_payment  = (seq(1/freq_coupon_new, tenure, by = 
1/freq_coupon_new))*freq_coupon_new
coupon    = coupon_rate*par_value/100
coupon_amount    = coupon/(freq_coupon_new)
cash_flow1  = rep(c(coupon_amount), (tenure*freq_coupon_new - 
1)) 
cash_flow2  = par_value + coupon_amount
cash_flow   = c(cash_flow1, cash_flow2) 

ytm_effective  = ((1+ytm/100)^(1/freq_coupon_new))-1

pv = NULL

for (i in 1:(tenure*freq_coupon_new))
    {
            pv[i] = cash_flow[i] / ((1+ytm_effective)^terms_coupon_payment[i])
    }

macaulay_duration = sum(pv*terms_coupon_payment)/sum(pv)
modified_duration = macaulay_duration / (1+(ytm_effective)/freq_coupon_new)

return(data.frame(macaulay_duration, modified_duration))

}


result = Duration(par_value = 1000, coupon_rate = 10, freq_coupon = 0, tenure = 5, ytm = 12)

## ___

When I run this function, I get the values of Macaulay Duration and Modified 
Duration

> result
  macaulay_duration modified_duration
1  1423.797  1423.795


### MY PROBLEM

I have arrived at a result using only one set of observations i.e. for the 
following data -

Duration(par_value = 1000, coupon_rate = 10, freq_coupon = 0, tenure = 5, ytm = 12)

However, if I need to obtain these results for multiple records, how do I 
calculate and obtain the result in a tabular form?

e.g. suppose my input data file is 'instrument details.csv' given as

id   par_value   coupon_rate   frequency_coupon   tenure   ytm
1    1000        10            0                  5        12
2    100         7             1                  8        11

### frequency_coupon is coded such that frequency_coupon = 0 means 365 
### compoundings in a year, and frequency_coupon = 1 means 12 compoundings.


Then how do I modify the above code?

I have tried to convert in a matrix form as follows

I have added following code after the function is defined i.e. after

#return(data.frame(macaulay_duration, modified_duration))



#}


# Added code

ONS  = read.csv('instrument details.csv')

n  = length(ONS$par_value)

par_value    =  matrix(data = ONS$par_value, nrow = n, ncol = 1, byrow = TRUE)
coupon_rate  =  matrix(data = ONS$coupon_rate, nrow = n, ncol = 1, byrow = TRUE)
freq_coupon  =  matrix(data = ONS$frequency_coupon, nrow = n, ncol = 1, byrow = TRUE)
tenure       =  matrix(data = ONS$tenure, nrow = n, ncol = 1, byrow = TRUE)
ytm          =  matrix(data = ONS$ytm, nrow = n, ncol = 1, byrow = TRUE)

result  =   matrix(data = NA, nrow = n, ncol = 2, byrow = TRUE)

result = Duration(par_value, coupon_rate, freq_coupon, tenure, ytm)

## 

When I run result, besides getting 50 warnings, I get the following:

> result
  macaulay_duration modified_duration
1  826.9026  826.9019
2  826.9026  826.9019

which I know is wrong.

Is there any other way I can use the function defined above to process multiple 
records?

Thank you, and I sincerely apologize for writing such a long mail; I wanted 
to be clear in my communication.

Regards

Madhavi Bhave
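
One possible way to process multiple records, offered as a minimal sketch rather than a definitive answer: keep the calculation scalar and let mapply() call it once per row. The helper name duration_one, the lookup vector periods_per_year and the inline data frame (standing in for 'instrument details.csv') are assumptions made for illustration. The formulas follow the Duration function above, including the coding 2 = quarterly, 3 = half-yearly, 4 = yearly implied by its if() blocks.

```r
# Coded frequency (0..4) -> payments per year, as described in the post.
periods_per_year <- c("0" = 365, "1" = 12, "2" = 4, "3" = 2, "4" = 1)

# Scalar calculation for ONE instrument (hypothetical helper name).
duration_one <- function(par_value, coupon_rate, freq_coupon, tenure, ytm) {
  k         <- unname(periods_per_year[as.character(freq_coupon)])
  n_periods <- tenure * k
  periods   <- seq_len(n_periods)                 # payment index 1, 2, ..., n_periods
  coupon_amount <- coupon_rate * par_value / 100 / k
  cash_flow     <- c(rep(coupon_amount, n_periods - 1), par_value + coupon_amount)
  ytm_effective <- (1 + ytm / 100)^(1 / k) - 1    # per-period yield
  pv            <- cash_flow / (1 + ytm_effective)^periods
  macaulay      <- sum(pv * periods) / sum(pv)    # measured in periods, as in the post
  modified      <- macaulay / (1 + ytm_effective / k)
  c(macaulay_duration = macaulay, modified_duration = modified)
}

# Inline stand-in for read.csv('instrument details.csv') (assumed contents).
instruments <- data.frame(
  id               = c(1, 2),
  par_value        = c(1000, 100),
  coupon_rate      = c(10, 7),
  frequency_coupon = c(0, 1),
  tenure           = c(5, 8),
  ytm              = c(12, 11)
)

# mapply() feeds one row's values at a time to the scalar function.
res <- mapply(duration_one,
              par_value   = instruments$par_value,
              coupon_rate = instruments$coupon_rate,
              freq_coupon = instruments$frequency_coupon,
              tenure      = instruments$tenure,
              ytm         = instruments$ytm)

# res has one column per instrument; transpose to get one row per instrument.
result <- data.frame(id = instruments$id, t(res))
print(result)
```

Each row of result corresponds to one instrument. The warnings in the matrix attempt most likely arise because if() and seq() expect single values; mapply() supplies them one record at a time.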









[R] Macaulay Duration for Group

2010-01-19 Thread Madhavi Bhave

Dear R helpers
 
I have following csv file which is an input
 
id   par_value   coupon_rate   frequency_coupon   tenure   ytm
1    1000        10            1                  5        12
 
# Here frequency_coupon is coded such that 0 means daily compounding, 1 means 
monthly compounding, 2 means quarterly, 3 means half-yearly and 4 means 
only once a year. Thus frequency_coupon = 1 means the total number of 
compoundings in a year is 12.
 
 
My R code for calculating Macaulay Duration is as follows -
 
## INPUT
 
ONS           = read.csv('instrument details.csv')
par_value     = ONS$par_value
coupon        = ONS$coupon_rate*par_value/100
freq_coupon   = ONS$frequency_coupon
tenure        = ONS$tenure
ytm           = ONS$ytm
 
# 
_
 
## COMPUTATIONS

macaulay_duration =   NULL
modified_duration   =   NULL 
freq_coupon_new   =   NULL
 
if(freq_coupon <= 0)
{
    freq_coupon_new = 365
} 
 
if(freq_coupon > 0 & freq_coupon <= 1)
{
    freq_coupon_new = 12
} 
 
if(freq_coupon > 1 & freq_coupon <= 2)
{
    freq_coupon_new = 4
} 
 
if(freq_coupon > 2 & freq_coupon <= 3)
{
    freq_coupon_new = 2
} 
 
if(freq_coupon > 3 & freq_coupon <= 4)
{
    freq_coupon_new = 1
} 
 
## COMPUTATIONS
 
terms_coupon_payment  = (seq(1/freq_coupon_new, tenure, by = 1/freq_coupon_new))*freq_coupon_new
coupon_amount         = coupon/(freq_coupon_new)
cash_flow1            = rep(c(coupon_amount), (tenure*freq_coupon_new - 1))
cash_flow2            = par_value + coupon_amount
cash_flow             = c(cash_flow1, cash_flow2)
 
ytm_effective  = ((1+ytm/100)^(1/freq_coupon_new))-1
 
pv = NULL
 
for (i in 1:(tenure*freq_coupon_new))
 {
   pv[i] = cash_flow[i] / ((1+ytm_effective)^terms_coupon_payment[i])
 }
 
macaulay_duration = sum(pv*terms_coupon_payment)/sum(pv)
modified_duration = macaulay_duration / (1+(ytm_effective)/freq_coupon_new)

macaulay_duration
modified_duration
 
## _
 
# My PROBLEM
 
Here I am dealing with only one id, i.e. only one record. However, if instead 
of one record I have, say, 20 records, how do I calculate the Macaulay Duration 
for each of these 20 records? One option is to run this code 20 times, which I 
guess would be a foolish thing to do. Another is to define the above code as a 
function and then run this function for each of these records, but I don't 
understand how to write a function. A third option is to treat the input of 
these 20 records in matrix form, which I tried unsuccessfully.
 
Please guide me on how to modify the R code to calculate the Macaulay Duration 
for each of these records and store them.
 
Regards
 
Madhavi Bhave
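
One obstacle to the matrix approach mentioned above is that if() inspects a single condition: given a vector, older versions of R use only its first element (with a warning) and recent versions stop with an error. A minimal sketch of a vectorised alternative for the frequency coding is a lookup by position. The six frequency codes below are hypothetical, and the name compoundings is an assumption.

```r
# Hypothetical frequency codes for six instruments
# (0 = daily, 1 = monthly, 2 = quarterly, 3 = half-yearly, 4 = once a year).
frequency_coupon <- c(0, 1, 2, 3, 4, 1)

# Element i + 1 of this vector holds the number of compoundings for code i.
compoundings <- c(365, 12, 4, 2, 1)

# Indexing with a vector converts all codes in one step, with no if() needed.
freq_coupon_new <- compoundings[frequency_coupon + 1]
print(freq_coupon_new)   # 365 12 4 2 1 12
```

The remaining steps (seq(), rep() and the pv loop) still work on one record at a time, so they would need to sit inside a function that is called once per record.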
 




[R] Parameters of Logistic Distribution and (3 Parameter) Log Logistic Distribution

2009-07-31 Thread Madhavi Bhave
Dear R Helpers

Please guide me on how one can estimate the parameters of the Logistic 
Distribution and the 3-Parameter Log-logistic Distribution for a given data set.

data <- c(2987.43,2990.12,3023.52,2964.79,3019.60,3051.07,3080.16,2944.15,3035.19,3023.46,2985.05,2970.95,3192.36,3084.39,2926.23,2952.15,3064.15,3003.20,2980.35,2980.45,3043.12,3115.53,3006.90,2946.03,3039.97,3064.01,3000.56,3049.57,3042.54,3037.63,2982.03,2889.74,3043.83,2930.95,3020.65,3009.21,3084.16,2954.05,2991.04,3083.10,3007.26,2949.58,2995.65,3078.36,3031.64,3001.28,3103.32,3015.04,2994.45,2963.71,2932.90,3021.31,3074.72,2980.15,3002.29,3088.18,2991.39,2942.90,3057.91,3023.25,3192.67,2966.49,3049.31,2915.38,3045.27,2852.72,2999.25,2978.52,3040.07,2945.50,3047.47,2915.95,3012.24,2985.80,2971.04,3035.72,3025.40,3014.76,2979.62,3029.20,2938.38,2966.47,3017.81,3016.43,2989.60,2941.22,3038.30,3033.44,3003.77,2950.02,3053.19,3011.69,2916.34,2918.10,3049.98,3062.46,2948.55,3072.90,3113.52,2987.61)


Thanking in advance.

With regards

Madhavi
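
One possible approach, sketched under stated assumptions: both distributions can be fitted by maximum likelihood with optim() from base R. The three-parameter log-logistic is taken here to be a threshold-shifted log-logistic, so that log(X - threshold) follows a logistic distribution with location log(scale) and scale 1/shape. That parameterisation, the starting values and all object names are assumptions, and only the first 18 observations are used to keep the example short. MASS::fitdistr(x, "logistic") is a packaged alternative for the two-parameter case.

```r
# First 18 observations from the post (subset chosen only for brevity).
x <- c(2987.43, 2990.12, 3023.52, 2964.79, 3019.60, 3051.07, 3080.16,
       2944.15, 3035.19, 3023.46, 2985.05, 2970.95, 3192.36, 3084.39,
       2926.23, 2952.15, 3064.15, 3003.20)

# --- Logistic: minimise the negative log-likelihood over (location, log(scale)).
nll_logis <- function(p) {
  -sum(dlogis(x, location = p[1], scale = exp(p[2]), log = TRUE))
}
# Moment-based start: sd of a logistic is scale * pi / sqrt(3).
fit_logis <- optim(c(median(x), log(sd(x) * sqrt(3) / pi)), nll_logis)
logis_est <- c(location = fit_logis$par[1], scale = exp(fit_logis$par[2]))

# --- Three-parameter log-logistic (assumed threshold form).
# The threshold is written as min(x) - exp(p[1]) so it stays below every observation.
nll_llogis3 <- function(p) {
  z <- x - (min(x) - exp(p[1]))
  # Density of X = logistic density of log(z), divided by z (change of variable).
  -sum(dlogis(log(z), location = p[2], scale = exp(p[3]), log = TRUE) - log(z))
}
z0    <- x - (min(x) - sd(x))                     # starting guess for x - threshold
start <- c(log(sd(x)), mean(log(z0)), log(sd(log(z0)) * sqrt(3) / pi))
fit_llogis3 <- optim(start, nll_llogis3)
llogis3_est <- c(threshold = min(x) - exp(fit_llogis3$par[1]),
                 scale     = exp(fit_llogis3$par[2]),
                 shape     = 1 / exp(fit_llogis3$par[3]))

print(logis_est)
print(llogis3_est)
```

Threshold models can have a flat or irregular likelihood, so it would be worth trying several starting values and checking fit_llogis3$convergence before relying on the estimates.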





[R] Problem with Poisson - Chi Square Goodness of Fit Test - New Mail

2008-08-29 Thread Madhavi Bhave




Dear R-help,

Chi Square Test for Goodness of Fit

I have discrete data, given below (R script):

 

No_of_Frauds<-c(1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,2,1,2,2,2,1,1,2,1,1,1,1,4,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,5,1,2,1,1,1,1,1,1,1,3,2,1,1,1,2,1,1,2,1,1,1,1,1,2,1,3,1,2,1,2,14,2,1,1,38,3,3,2,44,1,4,1,4,1,2,2,1,3)

 

I am trying to fit a Poisson distribution to this data using R.

My R script is as follows:

# R script for fitting a Poisson distribution

 

No_of_Frauds<-c(1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,2,1,2,2,2,1,1,2,1,1,1,1,4,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,5,1,2,1,1,1,1,1,1,1,3,2,1,1,1,2,1,1,2,1,1,1,1,1,2,1,3,1,2,1,2,14,2,1,1,38,3,3,2,44,1,4,1,4,1,2,2,1,3)

 

N       <- length(No_of_Frauds)
Average <- mean(No_of_Frauds)
Lambda  <- Average

i   <- c(0:(N-1))
pmf <- dpois(i, Lambda, log = FALSE)

# Ho: The data follow a Poisson distribution vs H1: not Ho

# Observed frequencies (Oi)
variable.cnts     <- table(No_of_Frauds)

# Poisson probabilities for the observed values
# (the original used 'lambda'; R is case-sensitive, so it must be 'Lambda')
variable.cnts.prs <- dpois(as.numeric(names(variable.cnts)), Lambda)

# Extra class with zero observations carrying the remaining probability
variable.cnts     <- c(variable.cnts, 0)
variable.cnts.prs <- c(variable.cnts.prs, 1 - sum(variable.cnts.prs))

tst <- chisq.test(variable.cnts, p = variable.cnts.prs)

chi_squared <- as.numeric(tst$statistic)
p_value     <- as.numeric(tst$p.value)
df          <- tst$parameter

# Critical values at the 1%, 5% and 10% levels of significance (los)
cv1 <- qchisq(p = .01, df = df, lower.tail = FALSE, log.p = FALSE)
cv2 <- qchisq(p = .05, df = df, lower.tail = FALSE, log.p = FALSE)
cv3 <- qchisq(p = .1,  df = df, lower.tail = FALSE, log.p = FALSE)

# Expected values
# variable.cnts.prs * sum(variable.cnts)

# If the test statistic exceeds the critical value, reject Ho at that los

if (chi_squared > cv1) {
  Conclusion1 <- 'Sample does not come from the postulated probability distribution at 1% los'
} else {
  Conclusion1 <- 'Sample comes from postulated prob. distribution at 1% los'
}

if (chi_squared > cv2) {
  Conclusion2 <- 'Sample does not come from the postulated probability distribution at 5% los'
} else {
  Conclusion2 <- 'Sample comes from postulated prob. distribution at 5% los'
}

if (chi_squared > cv3) {
  Conclusion3 <- 'Sample does not come from the postulated probability distribution at 10% los'
} else {
  Conclusion3 <- 'Sample comes from postulated prob. distribution at 10% los'
}

# Printing RESULTS

print(chi_squared)
print(p_value)
print(df)
print(cv1)
print(cv2)
print(cv3)
print(Conclusion1)
print(Conclusion2)
print(Conclusion3)

# End of R Script


 



 

Problem faced:

When I run this script in the R console, I get a Chi-Square statistic as high
as 6.95753e+37.

When I did the same calculations in Excel, I got a Chi-Square statistic of
138.34.

Although it is clear that the sample data does not follow a Poisson
distribution, and I will have to look for another discrete distribution, my
problem is the HIGH value of the Chi-Square test statistic. When I analyzed
further, I understood the problem.

(A) By convention, if the expected frequency of a class is less than 5, we
combine such classes into a new class whose expected frequency is greater than
5, and adjust the observed frequencies accordingly.

  X        Oi     Ei    ((Oi - Ei)^2)/Ei
  0         0     10      9.96
  1        72     23    103.79
  2        17     27      3.54
  3         5     21     11.85
  4         3     12      6.71
  5         4      9      2.51
  Total   101    101    138.34

When I apply this logic in Excel, I get a reasonable result (i.e. 138.34);
however, in Excel too, if I don't apply this logic, my Chi-Square test
statistic is as high as 4.70043E+37.

My question is: how do I modify my R script so that the logic mentioned in (A),
i.e. pooling classes so that the expected frequency of every class becomes
greater than 5 (and adjusting the observed frequencies accordingly), is
applied, thereby giving a reasonable value of the Chi-Square test statistic?
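
The pooling rule in (A) could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: simulated counts from rpois() stand in for No_of_Frauds, the helper pool_tails is hypothetical, and the degrees of freedom are reduced by one more for the estimated lambda, which chisq.test() does not do by itself. Only the tails are pooled, on the assumption that small expected counts of a Poisson occur at the two ends.

```r
set.seed(42)
x <- rpois(200, lambda = 3)              # hypothetical counts, not the fraud data

lambda_hat <- mean(x)
k    <- 0:max(x)                         # classes 0, 1, ..., max(x)
obs  <- as.vector(table(factor(x, levels = k)))
p    <- dpois(k, lambda_hat)
p[length(p)] <- 1 - sum(p[-length(p)])   # last class means "max(x) or more"
expd <- length(x) * p                    # expected frequencies (Ei)

# Merge end classes into their neighbours until each end has Ei >= min_exp.
pool_tails <- function(obs, expd, min_exp = 5) {
  while (length(expd) > 2 && expd[length(expd)] < min_exp) {
    m    <- length(expd)
    obs  <- c(obs[seq_len(m - 2)],  obs[m - 1]  + obs[m])
    expd <- c(expd[seq_len(m - 2)], expd[m - 1] + expd[m])
  }
  while (length(expd) > 2 && expd[1] < min_exp) {
    obs  <- c(obs[1]  + obs[2],  obs[-(1:2)])
    expd <- c(expd[1] + expd[2], expd[-(1:2)])
  }
  list(obs = obs, expd = expd)
}

pooled      <- pool_tails(obs, expd)
chi_squared <- sum((pooled$obs - pooled$expd)^2 / pooled$expd)
df          <- length(pooled$expd) - 1 - 1   # classes - 1 - one estimated parameter
p_value     <- pchisq(chi_squared, df = df, lower.tail = FALSE)

print(data.frame(Oi = pooled$obs, Ei = round(pooled$expd, 2)))
print(c(chi_squared = chi_squared, df = df, p_value = p_value))
```

Because the pooled classes carry the whole probability mass, the expected frequencies sum to the sample size and no class has a near-zero Ei in the denominator, which is what inflates the statistic in the unpooled calculation.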

 

I am also attaching the xls file for

[R] Interpreting Logistic Regression

2008-08-20 Thread Madhavi Bhave

Hi !

This is Madhavi from Mumbai, India. Incidentally, this is my first post.

I am working on a Credit Scoring Model and have run a logistic regression 
in R. I received the output below.

I have two questions

(a) What is the significance of "family = binomial(link = logit)"? Why do I 
have to mention binomial? Is it because my dependent variable takes only two 
values, 0 and 1? Can I write the name of some other statistical distribution 
(say Poisson or negative binomial) in place of binomial? How would that affect 
my results?

(b) How do I interpret the R result given below? I know all the variables 
are significant. How do I get the log-likelihood ratio, odds ratios, etc.?

Please can anyone help me out.

With warm regards

Madhavi



R OUTPUT


Call:
glm(formula = Y ~ Age1 + Age2 + Sex + Education + Profession + SavingsAccount + 
    CurrentAccount, family = binomial(link = logit), data = ons)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-3.21142  -0.42556  -0.15911  -0.02954   3.02465  

Coefficients:
                  Estimate Std. Error z value Pr(>|z|)    
(Intercept)       2.627725   0.110752  23.726  < 2e-16 ***
Age1              0.692180   0.070410   9.831  < 2e-16 ***
Age2             -2.817883   0.080801 -34.874  < 2e-16 ***
Sex              -0.486132   0.049766  -9.768  < 2e-16 ***
Education        -0.682142   0.046507 -14.667  < 2e-16 ***
Profession       -0.690937   0.069032 -10.009  < 2e-16 ***
SavingsAccount   -1.891455   0.074906 -25.251  < 2e-16 ***
CurrentAccount   -1.367460   0.079604 -17.178  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 26932  on 24999  degrees of freedom
Residual deviance: 14615  on 24983  degrees of freedom
  (2 observations deleted due to missingness)
AIC: 14649

Number of Fisher Scoring iterations: 6
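
On question (b), a minimal sketch of how odds ratios, the log-likelihood and a likelihood-ratio test could be pulled out of a fitted glm object. The data are simulated and the variable names (age, savings, y) are hypothetical, since the ons data set is not available here. On question (a), family = binomial tells glm() that the response is a 0/1 (or proportion) outcome; a family such as poisson would instead treat the response as a count and fit a different model.

```r
set.seed(1)
n <- 500
# Simulated stand-in for the credit scoring data (hypothetical variables).
d <- data.frame(age = rnorm(n), savings = rbinom(n, 1, 0.4))
d$y <- rbinom(n, 1, plogis(-0.5 + 1.2 * d$age - 0.8 * d$savings))

fit <- glm(y ~ age + savings, family = binomial(link = "logit"), data = d)

# Coefficients are log-odds; exponentiating gives odds ratios.
odds_ratios <- exp(coef(fit))
or_ci       <- exp(confint.default(fit))      # Wald 95% intervals on the odds scale

# Log-likelihood of the fitted model.
log_lik <- logLik(fit)

# Likelihood-ratio test against the intercept-only model:
# statistic = null deviance - residual deviance.
lr_stat <- fit$null.deviance - fit$deviance
lr_df   <- fit$df.null - fit$df.residual
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

print(cbind(odds_ratio = odds_ratios, or_ci))
print(log_lik)
print(c(LR = lr_stat, df = lr_df, p = lr_p))
```

Applied to the printed output above, the same arithmetic would give a likelihood-ratio statistic of 26932 - 14615 = 12317 on 24999 - 24983 = 16 degrees of freedom, and, for example, exp(-0.486132) as the odds ratio associated with Sex.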



