Hi folks!
I need to build a binary dependent variable model, but my independent
variables have some serious outliers (they are not data errors). I was
thinking of using the robustbase package because I noticed one of the
functions accepts a binary depvar. Can I evaluate this model like any
ot
On Sun, 2004-07-04 at 19:41, Richard A. O'Keefe wrote:
> Last week there was a thread on outlier detection.
> I came across an article which has a very interesting paragraph.
>
> The article is
> Missing Values, Outliers, Robust Statistics, & Non-parametric Methods
> by Shaun Burke, RHM Te
Last week there was a thread on outlier detection.
I came across an article which has a very interesting paragraph.
The article is
Missing Values, Outliers, Robust Statistics, & Non-parametric Methods
by Shaun Burke, RHM Technology Ltd, High Wycombe, Buckinghamshire, UK.
It was the fourth a
> From: Edgar Acuna [mailto:[EMAIL PROTECTED]
>
> Dear Andy,
> Thanks for your quick answer. I increased the number of trees and the
> outlyingness measure got more stable. But still I do not know if I am
> working with the raw measure or with the normalized measure mentioned
> in Breiman's W
Dear Andy,
Thanks for your quick answer. I increased the number of trees and the
outlyingness measure got more stable. But still I do not know if I am
working with the raw measure or with the normalized measure mentioned
in Breiman's Wald lecture. The normalized measure nout is
nout=(nout-med)
The thing to do is probably:
1. Use fairly large number of trees (e.g., 1000).
2. Run a few times and average the results.
The reason for the instability is twofold:
1. The random forest algorithm itself is based on randomization. That's why
it's probably a good idea to have 500-1000 t
Hello,
Does anybody know whether the outscale option of randomForest yields the
standardized version of the outlier measure for each case, or only the
raw values? Also, I have noticed that this measure shows very high
variability: if I repeat the experiment I am getting very
diffe
[EMAIL PROTECTED] wrote:
Dear all
I would like to represent the outliers in the plot. These few outliers
are much larger than the limit of 50 in the ylim-argument.
plot(daten$month~daten$no,ylim=c(0,50))
I know that it is possible to introduce the information about the
presence of outliers wi
Dear all
I would like to represent the outliers in the plot. These few outliers
are much larger than the limit of 50 in the ylim-argument.
plot(daten$month~daten$no,ylim=c(0,50))
I know that it is possible to introduce the information about the
presence of outliers without changing the range o
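
One possible way to do this (a sketch only, not taken from the thread) is to clip the large values to the top of the plotting range and draw them with a different symbol, labelled with their true value. The vectors `no` and `month` below are made-up stand-ins for `daten$no` and `daten$month`, and `ymax` is an assumed name for the upper limit of 50.

```r
# Hypothetical data standing in for daten$no and daten$month
no    <- 1:10
month <- c(5, 12, 30, 8, 120, 44, 17, 300, 25, 9)
ymax  <- 50                      # upper limit used in ylim

is_out  <- month > ymax          # TRUE for values above the plotting range
clipped <- pmin(month, ymax)     # outliers are pulled down to the top edge

# Ordinary points as open circles, clipped outliers as filled triangles
plot(no, clipped, ylim = c(0, ymax),
     pch = ifelse(is_out, 17, 1),
     xlab = "no", ylab = "month")

# Write the true value just below each clipped point
text(no[is_out], ymax, labels = month[is_out], pos = 1, cex = 0.8)
```

The range of the y-axis stays at 0 to 50, and the triangles at the top edge show where the out-of-range observations are.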
Hi,
sorry, I was wrong and that's true. The Hampel
suggestion is
outliers <- (x < medx - 3.5*madx) | (x > medx + 3.5*madx)
or to use the multiplier 5.2 with
madx <- mad(x, constant=1).
Christian
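
A minimal sketch of this rule in base R, assuming the intended test is "further than k MADs from the median". The function name `hampel_outliers` and the example vector are made up for illustration. With the default `mad()` scaling constant (1.4826) the multiplier is 3.5; with `constant = 1` it is 5.2, which gives nearly the same cutoff because 3.5 * 1.4826 is about 5.19.

```r
# Flag values lying more than k MADs away from the median.
# hampel_outliers is a hypothetical helper name, not a base R function.
hampel_outliers <- function(x, k = 3.5, constant = 1.4826) {
  medx <- median(x)
  madx <- mad(x, constant = constant)
  (x < medx - k * madx) | (x > medx + k * madx)
}

# Made-up example data with one gross outlier
x <- c(0.1, -0.2, 0.05, 0.15, -0.1, 0.0, 5.0)

flags_scaled <- hampel_outliers(x)                         # 3.5 * scaled MAD
flags_raw    <- hampel_outliers(x, k = 5.2, constant = 1)  # 5.2 * raw MAD

x[flags_scaled]   # values flagged as outliers
```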
On Fri, 21 Feb 2003, Jason Turner wrote:
> On Thu, Feb 20, 2003 at 06:54:21PM +0100, Christian Hennig wrote:
> ...
> > However,
On Thu, Feb 20, 2003 at 06:54:21PM +0100, Christian Hennig wrote:
...
> However, a simple straightforward method for outlier identification is
> median +/- 5.2*mad as suggested by Hampel, Technometrics 27 (1985) 95-107.
...
> x <- data vector
> medx <- median(x)
> madx <- mad(x)
> outliers <- (
On Thu, Feb 20, 2003 at 06:37:48PM -0500, Rado Bonk wrote:
> Dear R-users,
>
> I have two outliers related questions.
>
> I.
> I have a vector consisting of 69 values.
>
> mean = 0.00086
> SD = 0.02152
>
> The shape of EDA graphics (boxplots, density plots) is heavily distorted
> due to outlier
Hi,
the boxplot is based on the quartiles which are much less outlier sensitive
than mean and SD and should therefore not be "heavily distorted by
outliers". What you mean is presumably that you see the area of the main
bulk of the data only as a very small box on the screen because of your
outlie
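
A small illustration of this point, using made-up numbers (`x` and `x_out` are hypothetical): adding one extreme value moves the mean and SD a lot but barely changes the median and quartiles. If the problem is only that the box is squeezed on screen, `boxplot()` has an `outline` argument that suppresses drawing the outlying points.

```r
# Made-up data: a tight bulk of values, then the same data plus one outlier
x     <- c(-0.02, -0.01, 0, 0.005, 0.01, 0.015, 0.02)
x_out <- c(x, 1.5)

# Mean and SD react strongly to the single extreme value ...
c(mean = mean(x),     sd = sd(x))
c(mean = mean(x_out), sd = sd(x_out))

# ... while median and IQR, which the boxplot is built on, hardly move
c(median = median(x),     iqr = IQR(x))
c(median = median(x_out), iqr = IQR(x_out))

# Draw the box without the outlying points so the bulk is readable
boxplot(x_out, outline = FALSE)
```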
> II.
> How to extract only those values from vector which fulfill the condition
> of interval (higher than A, and lower than B)?
x[x>A & x<B]
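
A short worked example of this kind of logical indexing, with made-up values for `x`, `A` and `B`:

```r
# Made-up vector and interval limits
x <- c(-3, 0.5, 1.2, 2.8, 7)
A <- 0
B <- 3

# The logical vector is TRUE where both conditions hold ...
keep <- x > A & x < B

# ... and indexing with it returns only those elements
inside <- x[keep]
inside
```

Note that `&` works element by element on the whole vector, which is what is needed here; `&&` is meant for single logical values.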
> Rado Bonk
>
> __
> [EMAIL PROTECTED] mailing list
> http://www.stat.math.ethz.ch/mailman/listinfo/r-help
>
Dear R-users,
I have two outliers related questions.
I.
I have a vector consisting of 69 values.
mean = 0.00086
SD = 0.02152
The shape of EDA graphics (boxplots, density plots) is heavily distorted
due to outliers. How to define the interval for outliers exception? Is
<2SD - mean + 2SD> interva