Re: [RsR] [R-SIG-Finance] Outliers in the market model that's used to estimate `beta' of a stock

markleeds Thu, 18 Sep 2008 11:09:18 -0700

Hi Matias: yes, he wasn't dissing statistics for the most part. He was definitely talking about the misuses also, but I think he was claiming that models, be they statistical, physical or even non-quant models in finance, are kind of assumed to be right until they don't work.

That's true in all science, but it puts finance on quite shaky ground because there are people trading serious money based on the idea that what they are doing is valid and working correctly. This is obviously kind of relevant to what's going on right now. Thanks for your references also.




mark


On Thu, Sep 18, 2008 at  1:29 PM, Matias Salibian-Barrera wrote:

Haven't read "Fooled by Randomness", but did start reading "The Black Swan", and although in general I like provocative books that challenge my points of view, I found his main thesis to be too short to warrant so many words... I took it that his main argument was with those who misinterpret and misuse statistics (particularly when they do it for their own benefit), not with statistics itself, which is always based on assumptions etc.
[snip] In fact, he would say that a model works until it doesn't.
Which is a fair statement that also applies to science in general: "theories work until they are proved wrong", and the whole "falsifiability" argument (cf. Popper vs. Kuhn vs. Feyerabend vs...). I believe robust statistics can help you determine when your model (theory) has stopped working.
In any case, with respect to the old "data cleaning versus robust estimators" discussion, I would point the interested reader to the first chapter of Maronna, Martin and Yohai's book (http://books.google.com/books?id=YD--AAAACAAJ&dq=martin+maronna+yohai), and for some more specific inference implications, to the first chapter of my PhD dissertation. Essentially, a couple of main issues are: (a) detecting outliers using non-robust estimators does not work well in general (but even if / when it does, see my next point); (b) if you remove (or alter) observations, all subsequent probabilistic statements (p-values, standard errors, etc.) are conditional on the very non-linear cleaning operation you did, and thus both wrong at face value and not easy to correct. Robust estimators incorporate the down-weighting and its effect on the corresponding inference at once, and are thus, IMHO, to be preferred.
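Point (a) is usually called masking. A minimal sketch on simulated data (not the Infosys series; the sample size, the true slope of 0.9, the cluster of 15 planted points and the cutoff of 3 are all assumptions made for illustration) compares residual-based flagging after a least-squares fit with flagging after an MM fit from MASS:

```r
# Simulated illustration only; none of these numbers come from the thread.
library(MASS)  # rlm() ships with the recommended MASS package

set.seed(1)
n  <- 200
rM <- rnorm(n, sd = 1.5)             # hypothetical market returns
rj <- 0.9 * rM + rnorm(n, sd = 1)    # hypothetical stock returns, true slope 0.9

# Plant a cluster of identical high-leverage points far from the true line.
bad <- 1:15
rM[bad] <- 10
rj[bad] <- -10
d <- data.frame(rj = rj, rM = rM)

ls <- lm(rj ~ rM, data = d)
mm <- rlm(rj ~ rM, data = d, method = "MM")

# Standardised residuals: least squares versus the MM fit and its robust scale.
z_ls <- rstandard(ls)
z_mm <- residuals(mm) / mm$s

cutoff <- 3  # assumed flagging threshold
flagged <- c(least_squares = sum(abs(z_ls[bad]) > cutoff),
             MM            = sum(abs(z_mm[bad]) > cutoff))
print(flagged)                                  # planted points flagged by each fit
print(c(ls = coef(ls)[["rM"]], mm = coef(mm)[["rM"]]))
```

The planted points drag the least-squares line towards themselves, which shrinks their own residuals; the MM fit stays with the bulk of the data, so the same points stand out against its residual scale.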
Matias

[EMAIL PROTECTED] wrote:
Hi: I don't know if you read "Fooled by Randomness" by Nassim Taleb (spelling?), but he essentially says, using very non-statistical but nevertheless strong arguments (it's not a stat or quant finance book), that outliers in finance are not modellable, and that you shouldn't claim you can model them because you'd be lying. In fact, he would say that a model works until it doesn't.
Anyway, it's an interesting book that sort of indirectly talks (for a little too long, actually: you can get what he's saying in the first 50 pages and it's about 200 pages) about your comment below, so I figured I would just mention it in case you were interested.
On Thu, Sep 18, 2008 at 11:36 AM, Ajay Shah wrote:
In continuation of the discussion on `Winsorisation' that has taken
place on r-sig-finance today, I thought I'd present all of you with an
interesting dataset and a question.

This data is the daily stock returns of the large Indian software firm
`Infosys'. (This is the symbol `INFY' on NASDAQ). It is a large number
of observations of daily returns (i.e. percentage changes of the
adjusted stock price).

Load the data in --


    print(load(url("http://www.mayin.org/ajayshah/tmp/infosys_mm.rda")))
    str(x)
    summary(x)
    sd(x)

The name `rj' is used for returns on Infosys, and `rM' is used for
returns on the stock market index (Nifty). There are three really
weird observations in this.

    weird.rj <- c(1896,2395)
    weird.rM <- 2672
    x[weird.rj,]
    x[weird.rM,]

As you can see, these observations are quite remarkable given the
small standard deviations that we saw above. There is absolutely no
measurement error here. These things actually happened.

Now consider a typical application: using this to estimate a market
model. The goal here is to estimate the coefficient of a regression of
rj on rM.

    # A regression with all obs
    summary(lm(rj ~ rM, data=x))

    # Drop the weird rj --
    summary(lm(rj ~ rM, data=x[-weird.rj,]))

    # Drop the weird rM --
    summary(lm(rj ~ rM, data=x[-weird.rM,]))

    # Drop both kinds of weird observations --
    summary(lm(rj ~ rM, data=x[-c(weird.rM,weird.rj),]))

    # Robust regressions
    library(MASS)
    summary(rlm(rj ~ rM, data=x))
    summary(rlm(rj ~ rM, method="MM", data=x))
    library(robust)
    summary(lmRob(rj ~ rM, data=x))
    library(quantreg)
    summary(rq(rj ~ rM, tau=0.5, data=x))

So you see, we have a variety of different estimates for the slope
(which is termed `beta' in finance). What value would you trust the
most?
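To compare the candidates side by side, the slopes and their standard errors can be collected into one table. The sketch below is only an illustration: it runs on a simulated stand-in for `x' (the true slope of 0.8, the noise levels and the three planted extreme rows are assumptions, not the Infosys data), and it uses only lm() and rlm() so that nothing beyond MASS is required.

```r
library(MASS)  # for rlm()

# Hypothetical stand-in for the data frame `x`, with columns rj and rM.
set.seed(42)
n <- 1000
x <- data.frame(rM = rnorm(n, sd = 1.6))
x$rj <- 0.8 * x$rM + rnorm(n, sd = 2)
x$rj[c(100, 200)] <- c(-60, 45)   # two extreme stock returns (assumed)
x$rM[300] <- -12                  # one extreme index return (assumed)
weird <- c(100, 200, 300)

fits <- list(
  ols_all     = lm(rj ~ rM, data = x),
  ols_dropped = lm(rj ~ rM, data = x[-weird, ]),
  huber_M     = rlm(rj ~ rM, data = x),
  MM          = rlm(rj ~ rM, data = x, method = "MM")
)

# One row per fit: slope estimate and its reported standard error.
slopes <- t(sapply(fits, function(f) {
  est <- summary(f)$coefficients["rM", 1:2]
  c(beta = est[[1]], se = est[[2]])
}))
print(round(slopes, 3))
```

With the real data, the same list could be built from the fits shown above (including lmRob() and rq(), if those packages are installed); note that the standard error reported for the fit with rows dropped does not account for the dropping step.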

And, would winsorisation using either my code
(https://stat.ethz.ch/pipermail/r-sig-finance/2008q3/002921.html) or
Patrick Burns' code
(https://stat.ethz.ch/pipermail/r-sig-finance/2008q3/002923.html) be a
good idea here?
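For readers who have not followed the links, a generic quantile-based winsorisation looks roughly like the sketch below. This is a hypothetical helper written for illustration, not the code from either post; the 1% and 99% cut-offs and the simulated returns are assumptions.

```r
# Hypothetical helper: clip values to chosen sample quantiles instead of
# dropping them. Not the code from either linked post.
winsorise <- function(v, probs = c(0.01, 0.99)) {
  stopifnot(is.numeric(v), length(probs) == 2, probs[1] < probs[2])
  cut <- quantile(v, probs = probs, na.rm = TRUE, names = FALSE)
  pmin(pmax(v, cut[1]), cut[2])
}

# Simulated returns with two planted extremes (assumed values).
set.seed(7)
r <- c(rnorm(500, sd = 2), -60, 45)
w <- winsorise(r)

print(range(r))     # range before clipping
print(range(w))     # range after clipping
print(sum(r != w))  # number of observations that were altered
```

Unlike deletion, this keeps every observation in the sample, but the extreme values are still altered, so any regression run on the winsorised series is conditional on that clipping step.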

I'm instinctively unhappy with any scheme based on discarding
observations that I'm absolutely sure have no measurement error. We
have to model the weirdness of this data generating process, not
ignore it.

--
Ajay Shah http://www.mayin.org/ajayshah [EMAIL PROTECTED] http://ajayshahblog.blogspot.com
<*(:-? - wizard who doesn't know the answer.

_______________________________________________
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only.
-- If you want to post, subscribe first.
_______________________________________________
R-SIG-Robust@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-robust
--
_____________________________________________________
Matias Salibian-Barrera - Department of Statistics
The University of British Columbia
Phone: (604) 822-3410 - Fax: (604) 822-6960
"The plural of anecdote is not data" (George Stigler?)


