Hi,
A 5-second search on the Internet brought me here:
http://en.wikibooks.org/wiki/Data_Mining_Algorithms_In_R/Clustering/K-Means
Regards,
Pascal
On 20/06/13 15:57, Safiye Celik wrote:
Hi,
Does anybody know the difference between the Lloyd and Forgy algorithms
specified for R's kmeans?
Hi,
I searched for a long time and I read this website before asking the
question. But it does not answer my issue. Thanks anyway.
I dug into the source code and realized that there is no difference
between the implementations of Lloyd and Forgy. In fact, in kmeans.R, the
method number is set
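That reading of the source matches a direct check: with identical starting centers, the two algorithm names produce identical fits (the data here are simulated purely for illustration).

```r
set.seed(1)
x <- matrix(rnorm(200), ncol = 2)
init <- x[1:3, ]  # fixed initial centers so both runs start identically

km_lloyd <- kmeans(x, centers = init, algorithm = "Lloyd")
km_forgy <- kmeans(x, centers = init, algorithm = "Forgy")

# ?kmeans documents "Lloyd" and "Forgy" as alternative names for one
# batch algorithm, so the two fits agree exactly
identical(km_lloyd$cluster, km_forgy$cluster)  # TRUE
```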
Subject: Re: [R] Difference between R and SAS in Concordance index in ordinal
logistic regression
Please define 'mean probabilities'.
To compute the C-index or Dxy you need anything that is monotonically
related to the prediction of interest, including the linear combination
For lrm fits, predict(fit, type='mean') predicts the mean Y, not a
probability.
Frank
lrm does some binning to make the calculations faster. The exact calculation
is obtained by running
f <- lrm(...)
rcorr.cens(predict(f), DA), which results in:
   C Index        Dxy       S.D.          n    missing
0.96814404 0.93628809 0.03808336         32
Subject: Re: [R] Difference between R and SAS in Concordance index in ordinal
logistic regression
lrm does some binning to make the calculations faster. The exact calculation
is obtained by running
f <- lrm(...)
rcorr.cens(predict(f), DA), which results in:
   C Index        Dxy
differently?
Thanks for your help and best regards,
OC
Date: Thu, 24 Jan 2013 05:28:13 -0800
From:
f.harrell@
To:
r-help@
Subject: Re: [R] Difference between R and SAS in Concordance index in
ordinal logistic regression
lrm does some binning to make the calculations faster. The exact
Dear list,
I am calculating the 95th percentile of a set of values with R and with SPSS
In R:
normal200 <- rnorm(200, 0, 1)
qnorm(0.95, mean = mean(normal200), sd = sd(normal200), lower.tail = TRUE)
[1] 1.84191
In SPSS, if I use the same 200 values and select Analyze > Descriptive
Statistics >
On Thu, Nov 8, 2012 at 12:17 PM, David A. dasol...@hotmail.com wrote:
In R:
normal200 <- rnorm(200, 0, 1)
You forgot set.seed(310366) so we can reproduce your random numbers exactly.
I think the main difference is that SPSS only calculates critical values
within the range of values in the
On 12-11-08 7:17 AM, David A. wrote:
Dear list,
I am calculating the 95th percentile of a set of values with R and with SPSS
In R:
normal200 <- rnorm(200, 0, 1)
qnorm(0.95, mean = mean(normal200), sd = sd(normal200), lower.tail = TRUE)
[1] 1.84191
In SPSS, if I use the same 200 values and select
Hi, David,
I think you're confusing the q-th percentile of your data, i. e., the
empirical q-th percentile, which is -- roughly -- the value x_q for which
q * 100 % of the data are less than or equal to x_q, with the q-th
percentile of a distribution (here the normal distribution) that has as
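The distinction can be checked numerically: quantile() gives the empirical percentile of the data, while the original qnorm() call gives the percentile of a fitted normal (the seed here is arbitrary, chosen only so the numbers are reproducible).

```r
set.seed(310366)
normal200 <- rnorm(200, 0, 1)

# Percentile of a normal distribution fitted to the data (the qnorm approach)
fitted_p95 <- qnorm(0.95, mean = mean(normal200), sd = sd(normal200))

# Empirical 95th percentile of the data themselves; per ?quantile, SPSS uses
# the interpolation rule that R calls type = 6, not R's default type = 7
empirical_p95 <- quantile(normal200, 0.95, type = 6)

c(fitted = fitted_p95, empirical = unname(empirical_p95))
```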
Hi!
as my subject says, I am struggling with the difference between a two-way ANOVA
and a (two-way) ANCOVA.
I found the following examples from this webpage:
http://www.statmethods.net/stats/anova.html
# One Way Anova (Completely Randomized Design)
fit <- aov(y ~ A, data = mydataframe)
# Randomized
On Jul 4, 2012, at 15:20 , syrvn wrote:
Hi!
as my subject says, I am struggling with the difference between a two-way ANOVA
and a (two-way) ANCOVA.
I found the following examples from this webpage:
http://www.statmethods.net/stats/anova.html
# One Way Anova (Completely Randomized Design)
The usual terminology uses the number of ways to mean the number of
factors (categorical
or classification variables, with more than one degree of freedom per
factor).
The term covariate is used for continuous variables, with exactly one df.
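In R's formula interface the distinction is simply which kinds of terms appear on the right-hand side; a minimal sketch with made-up data (the names A, B, x, y are placeholders):

```r
set.seed(1)
d <- data.frame(A = gl(2, 30),      # factor with 2 levels
                B = gl(3, 10, 60),  # factor with 3 levels
                x = runif(60))      # continuous covariate (1 df)
d$y <- rnorm(60)

fit_anova  <- aov(y ~ A * B, data = d)      # two-way ANOVA: two factors
fit_ancova <- aov(y ~ x + A * B, data = d)  # ANCOVA: the same plus covariate x
summary(fit_ancova)
```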
On Wed, Jul 4, 2012 at 9:20 AM, syrvn ment...@gmx.net
On 25/05/2012 12:41 PM, QAMAR MUHAMMAD UZAIR wrote:
dear all,
it will just take you a minute to tell me the difference
between qnorm and qqnorm. Are they the same, or is there any
difference between them?
They are very different: qqnorm draws a plot, qnorm does a calculation
of some of the values
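A two-line illustration of that difference:

```r
# qnorm() is the normal quantile function: a numeric calculation
qnorm(0.975)   # approximately 1.959964

# qqnorm() draws a normal Q-Q plot of a data sample: a graphic, not a number
set.seed(1)
z <- rnorm(50)
qqnorm(z)
qqline(z)
```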
Dear R Users,
I was wondering if some members of the list could shed some light on the
difference in AIC computation existing between R (2.12; gam package)
and Splus (7.0.6). Because I am not a statistician by training, I would
like to apologize in advance if I use wrong terms or do not
-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf
Of brwin338
Sent: Thursday, May 03, 2012 4:33 PM
To: r-help@r-project.org
Subject: [R] Difference between 10 and 10L
Good Evening
We have been searching through the R documentation manuals without success on
this
one.
What
On Thu, May 03, 2012 at 07:32:46PM -0400, brwin338 wrote:
Good Evening
We have been searching through the R documentation manuals without success on
this one.
What is the purpose or result of the L in the following?
n=10
and
n=10L
or
c(5,10)
versus
c(5L,10L)
Hi.
The help page
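In short, the L suffix makes an integer literal rather than a double; see ?NumericConstants and ?integer. A quick check:

```r
typeof(10)    # "double"
typeof(10L)   # "integer"

10 == 10L            # TRUE  (equal in value)
identical(10, 10L)   # FALSE (different storage modes)

# Integer vectors need about half the memory of double vectors
object.size(rep(1L, 1e5)) < object.size(rep(1, 1e5))  # TRUE
```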
On 08.04.2012 20:39, Bazman76 wrote:
Hi there,
Can someone explain what the difference between spec.pgram and spec.ar is?
I understand that they attempt to do the same thing, one using an AR
estimation of the underlying series to estimate the density, the other using
the FFT. However when
OK, so I need to understand better what it is they are trying to measure.
I understood (incorrectly, it seems) that they were simply different methods
to get the same result?
On 09.04.2012 17:01, Bazman76 wrote:
OK, so I need to understand better what it is they are trying to measure.
I understood (incorrectly, it seems) that they were simply different methods
to get the same result?
Yes. Also note this is a mailing list and you are lucky I was able to
remember
On Apr 9, 2012, at 16:55 , Uwe Ligges wrote:
On 08.04.2012 20:39, Bazman76 wrote:
Hi there,
Can someone explain what the difference between spec.pgram and spec.ar is?
I understand that they attempt to do the same thing one using an AR
estimation of the underlying series to estimate
oops sorry
On 08.04.2012 20:39, Bazman76 wrote:
Hi there,
Can someone explain what the difference between spec.pgram and spec.ar is?
I understand that they attempt to do the same thing, one using an AR
estimation of the underlying series to estimate the density, the other using
the
Yes, I agree, there may be something pathological in the way at least one of
the models handles the data. That's why I was trying to get a better handle
on how the two functions spec.pgram() and spec.ar() work.
The data has been processed by a wavelet analysis, so what you are seeing as
the raw
On Mon, Apr 9, 2012 at 9:27 AM, Bazman76 h_a_patie...@hotmail.com wrote:
Yes, I agree, there may be something pathological in the way at least one of
the models handles the data. That's why I was trying to get a better handle
on how the two functions spec.pgram() and spec.ar() work.
The data
On 09/04/2012 18:52, Bert Gunter wrote:
On Mon, Apr 9, 2012 at 9:27 AM, Bazman76h_a_patie...@hotmail.com wrote:
Yes, I agree, there may be something pathological in the way at least one of
the models handles the data. That's why I was trying to get a better handle
on how the two functions
Hi there,
Can someone explain what the difference between spec.pgram and spec.ar is?
I understand that they attempt to do the same thing, one using an AR
estimation of the underlying series to estimate the density, the other using
the FFT. However when applied to the same data set they seem to be
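A side-by-side sketch of the two estimators on simulated AR(1) data (the smoothing spans chosen here are arbitrary):

```r
set.seed(1)
x <- arima.sim(model = list(ar = 0.6), n = 500)

# Nonparametric: smoothed periodogram, computed via the FFT
sp_pgram <- spec.pgram(x, spans = c(5, 5), plot = FALSE)

# Parametric: fit an AR model (order chosen by AIC), then evaluate
# the spectral density that model implies
sp_ar <- spec.ar(x, plot = FALSE)

# Both estimate the same spectral density; spec.ar is smooth by construction,
# while spec.pgram's roughness depends on the chosen spans
str(sp_pgram$spec)
str(sp_ar$spec)
```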
= 0.0438 and SD_2 = 0.0285, and
then the 95% CI for the difference?
Thanks
John
From: Thomas Lumley tlum...@uw.edu
To: Jason Connor jcon...@alumni.cmu.edu
Cc: r-help@r-project.org
Sent: Wednesday, March 7, 2012 10:58 AM
Subject: Re: [R] Difference in Kaplan
--begin included message --
I thought this would be trivial, but I can't find a package or function
that does this.
I'm hoping someone can guide me to one.
Imagine a simple case with two survival curves (e.g. treatment vs.
control).
I just want to calculate the difference in KM estimates at a
On Fri, Mar 9, 2012 at 3:08 AM, Terry Therneau thern...@mayo.edu wrote:
A note on standard errors: S(t) +- std is a terrible confidence
interval. You will be much more accurate if you use log scale. (Some
argue for logit or log-log, in truth they work well.) If n is large
enough,
I thought this would be trivial, but I can't find a package or function
that does this.
I'm hoping someone can guide me to one.
Imagine a simple case with two survival curves (e.g. treatment vs. control).
I just want to calculate the difference in KM estimates at a specific time
point (e.g. 1
On Thu, Mar 8, 2012 at 4:50 AM, Jason Connor jcon...@alumni.cmu.edu wrote:
I thought this would be trivial, but I can't find a package or function
that does this.
I'm hoping someone can guide me to one.
Imagine a simple case with two survival curves (e.g. treatment vs. control).
I just want
Did you try the survival package?
On Wed, 07-Mar-2012 at 10:50AM -0500, Jason Connor wrote:
| I thought this would be trivial, but I can't find a package or function
| that does this.
|
| I'm hoping someone can guide me to one.
|
| Imagine a simple case with two survival curves (e.g.
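Putting the pieces of this thread together, one sketch with the survival package and its built-in aml data (the time point 12 is arbitrary):

```r
library(survival)

# Two survival curves: maintained vs. nonmaintained chemotherapy (aml data)
fit <- survfit(Surv(time, status) ~ x, data = aml)

# KM estimates and standard errors at a chosen time point
s <- summary(fit, times = 12)
est <- s$surv
se  <- s$std.err

# Difference in survival at t = 12, with a plain normal-approximation 95% CI
# (an interval built on the log or log-log scale behaves better, as noted above)
d    <- est[1] - est[2]
se_d <- sqrt(se[1]^2 + se[2]^2)
round(c(diff = d, lower = d - 1.96 * se_d, upper = d + 1.96 * se_d), 3)
```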
I have a multidimensional array - in this case with 4 dimensions of x,y,z
and time. I'd like to take the time derivative of this matrix, i.e.
perform the diff operator along dimension number 4.
In Matlab, there is a simple option to specify which dimension the
difference is to be taken across.,
On 9 January 2012 08:53, Phil Wiles philip.wi...@gmail.com wrote:
I have a multidimensional array - in this case with 4 dimensions of x,y,z
and time. I'd like to take the time derivative of this matrix, i.e.
perform the diff operator along dimension number 4.
apply can do that, but you may
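The R idiom is apply() over the margins you want to keep; a sketch for a hypothetical x-y-z-time array:

```r
# Hypothetical 4-d array: dimensions x, y, z, time
A <- array(seq_len(2 * 3 * 4 * 5), dim = c(2, 3, 4, 5))

# diff() along the 4th (time) dimension: apply over the other three margins
D <- apply(A, c(1, 2, 3), diff)

# apply() returns the function's output as the *first* dimension,
# so use aperm() to move the differenced time axis back to position 4
D <- aperm(D, c(2, 3, 4, 1))
dim(D)  # 2 3 4 4
```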
On Jan 5, 2012, at 02:10 , Yoo Jinho wrote:
Dear all,
I have found some difference of the results between multinom() function in
R and multinomial logistic regression in SPSS software.
The input data, model and parameters are below:
choles <- c(94, 158, 133, 164, 162, 182, 140, 157,
Dear all,
I have found some difference of the results between multinom() function in
R and multinomial logistic regression in SPSS software.
The input data, model and parameters are below:
choles <- c(94, 158, 133, 164, 162, 182, 140, 157, 146, 182);
sbp <- c(105, 121, 128, 149, 132, 103, 97,
On Jan 4, 2012, at 8:10 PM, Yoo Jinho wrote:
Dear all,
I have found some difference of the results between multinom()
function in
R and multinomial logistic regression in SPSS software.
The input data, model and parameters are below:
choles <- c(94, 158, 133, 164, 162, 182, 140, 157, 146,
As I said before: please dput() some working data and I'll try to work
something up.
Without it, the only thing I can reasonably suggest is that perhaps
you are looking for the window() function to be applied before
min/max. Something like:
X <- ts(1:48, start = 1, frequency = 4)
Y <- ts(1:12,
Hello Michael,
Thanks again for your reply. Actually, I am working with wind data.
I have some sample data for actual load.
scan("/home/sam/Desktop/tt.dat") -> tt  ## This is the input for the actual
## output of the generation
t <- ts(tt, start = 8, end = 24, frequency = 1)
I have another random sequence
Hello,
Can you please help me with this? I am also stuck on the same problem.
Sam
It's not clear what it means for the differences to be of increasing
order but if you simply mean the differences are increasing, perhaps
something like this will work:
library(caTools)
X = cumsum(2 * (runif(5e4) > 0.5) - 1)  # Create a random walk
Y = runmean(X, 30, endrule = "mean", align = "right")
Hello Michael,
Thanks for your reply. What I want to do is something like this? For
example, I have a continuous time series y=x(t), and another discrete time
series z=w(t).
Xdiff(i)=Max. difference between x(t) and w(t) in interval i
Ndiff(i)=Min. difference between x(t) and w(t) in interval
Can you post working examples of your data using the dput() function?
There are so many types of time series in R and so many different
things you could mean that it's just easier to work with real data.
Michael
On Tue, Nov 15, 2011 at 4:28 PM, Sarwarul Chy sarwar.sha...@gmail.com wrote:
Hello
Hallo
Can anyone tell me the difference between
foo$a[2] <- 1 and foo[2, "a"] <- 1 ?
I thought that both expressions are equivalent, but when I run the following
example, there is obviously a difference.
foo <- data.frame(a = NA, b = NA)
foo
a b
1 NA NA
foo$a[1] <- 1
foo$b[1] <- 2
foo$a[2] <- 1
Columns in data frames must all have the same number of elements. Your first
example attempts to violate that, because it works with a single column. The
second example works on the entire data frame, so it is able to lengthen the
other column to match.
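The two behaviours side by side:

```r
foo <- data.frame(a = NA, b = NA)

# Assigning through a single column cannot change the number of rows,
# so this fails with an error about the replacement length:
res <- try(foo$a[2] <- 1, silent = TRUE)
inherits(res, "try-error")  # TRUE

# Assigning through the whole data frame adds a row and pads 'b' with NA:
foo[2, "a"] <- 1
foo
#    a  b
# 1 NA NA
# 2  1 NA
```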
Hi,
I draw two Plots, one with xyplot() and one with plot(). Why is the
line with xyplot() not always in the middle of the dots like plot()?
library(lattice)
x <- c(1, 2, 3, 4, 5, 6)
y <- x
plot.new()
plot(x ~ y, main = "Sequenz 1 und Sequenz 2", xlab = "Sequenz 2",
ylab = "Sequenz 1", las = 1)
abline(a = 0,
From: Jörg Reuter
Why
is the line with xyplot() not always in the middle of the
dots like plot()?
Because you used the base graphics command abline() on a lattice plot?
They don't mix. The plot regions are different for lattice and base graphics -
notice that the second abline
Thanks,
i have it now:
library(lattice)
mein.panel <- function(x, y) {
  panel.xyplot(x, y)
  panel.abline(a = 0, b = 1, lwd = 2, col = "red")
}
x <- c(1, 2, 3, 4, 5, 6)
y <- x
xyplot(x ~ y, main = "xyplot", xlab = "Sequenz 2",
       ylab = "Sequenz 1", las = 1,
       panel = mein.panel)
2011/10/28 S Ellison s.elli...@lgcgroup.com:
Hi Max,
Thanks for the note. In your last paragraph, did you mean in
createDataPartition? I'm a little vague about what returnTrain option
does.
Bonnie
Quoting Max Kuhn mxk...@gmail.com:
Basically, createDataPartition is used when you need to make one or
more simple two-way splits of
No, it is an argument to createFolds. Type ?createFolds to see the
appropriate syntax: returnTrain a logical. When true, the values
returned are the sample positions corresponding to the data used
during training. This argument only works in conjunction with list =
TRUE
On Mon, Oct 3,
As I think it is not spam but helpful, let me repeat my stats.stackexchange.com
question here, from
http://stats.stackexchange.com/questions/16346/difference-between-lp-or-simply-in-rs-locfit
I am not sure I see the difference between different examples for local
logistic regression in the
Hello,
I'm trying to separate my dataset into 4 parts with the 4th one as the
test dataset, and the other three to fit a model.
I've been searching for the difference between these 2 functions in
Caret package, but the most I can get is this--
A series of test/training partitions are
Hi,
On Sun, Oct 2, 2011 at 2:47 PM, bby2...@columbia.edu wrote:
Hello,
I'm trying to separate my dataset into 4 parts with the 4th one as the test
dataset, and the other three to fit a model.
I've been searching for the difference between these 2 functions in Caret
package, but the most I
Hi Steve,
Thanks for the note. I did try the example and the result didn't make
sense to me. For splitting a vector, what you describe is a big
difference btw them. For splitting a dataframe, I now wonder if these
2 functions are the wrong choices. They seem to split the columns, at
Hi,
On Sun, Oct 2, 2011 at 3:54 PM, bby2...@columbia.edu wrote:
Hi Steve,
Thanks for the note. I did try the example and the result didn't make sense
to me. For splitting a vector, what you describe is a big difference btw
them. For splitting a dataframe, I now wonder if these 2 functions
Basically, createDataPartition is used when you need to make one or
more simple two-way splits of your data. For example, if you want to
make a training and test set and keep your classes balanced, this is
what you could use. It can also make multiple splits of this kind (or
leave-group-out CV aka
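For the original 4-part goal, a sketch along these lines (assuming the caret package is installed; y is a made-up outcome):

```r
library(caret)

set.seed(1)
y <- factor(rep(c("yes", "no"), each = 50))

# One stratified split, e.g. 75% training / 25% test
train_idx <- createDataPartition(y, p = 0.75, list = FALSE)

# Four stratified folds; returnTrain = TRUE returns the *training* indices
# (the complement of each held-out fold) rather than the fold itself
folds <- createFolds(y, k = 4, returnTrain = TRUE)
length(folds)  # 4
```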
Hello , I have estimated the following model, a sarima:
p=9
d=1
q=2
P=0
D=1
Q=1
S=12
In R 2.12.2
Call:
arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, Q),
period = S),
optim.control = list(reltol = tol))
Coefficients:
ar1 ar2 ar3 ar4 ar5
Luis Felipe Parra wrote:
and as you can see in the results some coefficients (for example ar2 and
ar8) are different in the different R versions. Does anybody know what
might be going on? Was there any change in the arima function between the two
versions?
You asked the same
I didn't learn about data tables until recently. (They're never covered in
any intro R books).
In any case, I'm not sure what (if any) is the difference between a data
frame and a data table.
Can anyone provide a brief explanation?
Is one preferred over another or is it just dependent on the
Google on R data table please. Read the vignettes therein.
-- Bert
On Mon, Aug 29, 2011 at 1:59 PM, Abraham Mathew abra...@thisorthat.com wrote:
I didn't learn about data tables until recently. (They're never covered in
any intro R books).
In any case, I'm not sure what (if any) is the
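A brief contrast (assuming the data.table package is installed):

```r
library(data.table)

df <- data.frame(id = c(1, 2, 2), x = c(10, 20, 30))
dt <- as.data.table(df)  # a data.table *is* a data.frame, with extra methods

# data.frame style: aggregation through a separate helper function
aggregate(x ~ id, data = df, FUN = sum)

# data.table style: subset, compute, and group all inside [ ]
res <- dt[, .(total = sum(x)), by = id]
res

# data.table also offers keyed joins and by-reference column updates (:=),
# which matter mainly for large data; for small data either works fine
```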
Hello,
I am trying to find out if R can do the following:
I have a mixture of normals say f = 0.2*Normal(2, 5) + 0.8*Normal(3,2)
How do I find the difference in the densities at any particular point of f
and at Normal(2,5)?
--
Thanks,
Jim.
Of Jim Silverton
Sent: Monday, April 04, 2011 10:01 AM
To: r-help@r-project.org
Subject: Re: [R] Difference in mixture normals and one density
Hello,
I am trying to find out if R can do the following:
I have a mixture of normals say f = 0.2*Normal(2, 5) + 0.8*Normal(3,2)
How do I find
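Evaluating both densities with dnorm() does this; note that Normal(2, 5) is ambiguous, so this sketch assumes the second argument is a variance (drop the sqrt() calls if it is a standard deviation):

```r
# Mixture density f and the single normal component g
f <- function(x) 0.2 * dnorm(x, 2, sqrt(5)) + 0.8 * dnorm(x, 3, sqrt(2))
g <- function(x) dnorm(x, 2, sqrt(5))

# Difference in the densities at any chosen point, e.g. x = 2.5
x0 <- 2.5
f(x0) - g(x0)
```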
Hello everyone !
I am currently trying to convert a program from S-plus to R, and I am
having some trouble with the S-plus function called influence(data,
statistic,...).
This function aims to calculate empirical influence values and related
quantities,
and is part of the Resample library that
A detailed description of the Excel problem as seen through the eyes of
MS can be found at
http://support.microsoft.com/kb/214326
On 3/2/2011 8:15 AM, Prof Brian Ripley wrote:
## Excel is said to use 1900-01-01 as day 1 (Windows default) or
## 1904-01-01 as day 0 (Mac default), but
On Wed, 2 Mar 2011, Erich Neuwirth wrote:
A detailed description of the Excel problem as seen through the eyes of
MS can be found at
http://support.microsoft.com/kb/214326
No, that's only half the problem. The description at
http://support.microsoft.com/kb/214330
(as cited in the
Hello. I am using some dates I read from Excel in R. I know the Excel origin
is supposed to be 1900-1-1. But when I used as.Date with origin="1900-1-1" the
dates that R reported were two days ahead of the ones I read from
Excel. I noticed that when I did in R the following:
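The two-day shift follows from Excel's conventions: Windows Excel calls 1900-01-01 day 1 (not day 0) and also, for Lotus 1-2-3 compatibility, wrongly treats 1900 as a leap year. For dates after 1900-02-28 the net effect is that the working origin is 1899-12-30:

```r
# Windows Excel serial 41000 corresponds to 2012-04-01
as.Date(41000, origin = "1899-12-30")  # "2012-04-01"

# Mac Excel's 1904 date system instead counts from 1904-01-01 as day 0
as.Date(0, origin = "1904-01-01")      # "1904-01-01"
```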
-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
project.org] On Behalf Of Luis Felipe Parra
Sent: Tuesday, March 01, 2011 3:07 PM
To: r-help
Subject: [R] Difference in numeric Dates between Excel and R
Hello. I am using some dates I read in excel
On 2/03/2011 12:31 p.m., Nordlund, Dan (DSHS/RDA) wrote:
On Wed, 2 Mar 2011, Luis Felipe Parra wrote:
Hello. I am using some dates I read from Excel in R. I know the Excel origin
is supposed to be 1900-1-1. But when I used as.Date with origin="1900-1-1" the
dates that R reported were two days ahead of the ones I read from
Excel. I noticed that when
Well, it should be difference by ID and TIME for q1:
something like:
for ID 1187
in TIME 1 q1=3
and TIME 2 (for same ID) q1=3
so diff would be 3-3=0
TIME ID q1
1 1187 3
1 1187 3
And I don't know how to make R find the pairs and calculate the diff?
2011/2/21 Dennis
On 2011-02-22 12:51, Vlatka Matkovic Puljic wrote:
Well, it should be difference by ID and TIME for q1:
something like:
for ID 1187
in TIME 1 q1=3
and TIME 2 (for same ID) q1=3
so diff would be 3-3=0
TIME ID q1
1 1187 3
1 1187 3
And I don't know how to make
Dear all,
I want to perform paired Wilcoxon signed ranks test on my data.
I have pairs defined by ID and TIME variables.
How can I calculate difference in variables q1, q2 in each pair?
TIME ID q1 q2
1 1187 3 2
1 1706 3 3
1 1741 2 4
2 1187 3 2
2 1706 3 3
2 1741 2 4
Please, any clue!
:)
--
Hi:
Assuming dd is the name of your data frame,
dd$diff <- with(dd, q2 - q1)
dd
  TIME   ID q1 q2 diff
1    1 1187  3  2   -1
2    1 1706  3  3    0
3    1 1741  2  4    2
4    2 1187  3  2   -1
5    2 1706  3  3    0
6    2 1741  2  4    2
is one way to do it.
HTH,
Dennis
On Mon, Feb 21,
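For the original paired question (the difference in q1 between TIME 1 and TIME 2 within each ID, rather than q2 - q1), one sketch:

```r
dd <- data.frame(TIME = c(1, 1, 1, 2, 2, 2),
                 ID   = c(1187, 1706, 1741, 1187, 1706, 1741),
                 q1   = c(3, 3, 2, 3, 3, 2),
                 q2   = c(2, 3, 4, 2, 3, 4))

# Sort so TIME 1 precedes TIME 2 within each ID, then diff within ID
dd <- dd[order(dd$ID, dd$TIME), ]
with(dd, tapply(q1, ID, diff))
# 1187 1706 1741
#    0    0    0
```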
Dear R-users,
I'm studing a DB, structured like this (just a little part of my dataset):
Site        Latitude  Longitude  Year  Tot-Prod  Total_Density  dmp
Dendoudi-1
Francesco,
My guess would be collinearity of the predictors. The linear model
gives you the best fit to all of the predictors at once; unless the
predictors are orthogonal (which in a case like this is certainly not
the case), there is no guarantee that the parameter estimates which
give the best
Santosh Srinivas wrote:
A fundamental question ...I'm trying to understand the differences between
loop and vectorization ... I understand that it should be a natural choice
to use apply / adply when it is needed to perform the
same function across all rows of a data frame. Any pointers on
Hello R-helpers,
A fundamental question ...I'm trying to understand the differences
between loop and vectorization ... I understand that it should be a
natural choice to use apply / adply when it is needed to perform the
same function across all rows of a data frame.
Any pointers on why this is
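A minimal illustration of the contrast: both forms compute the same result, but the vectorized one pushes the iteration down into compiled code.

```r
set.seed(1)
x <- rnorm(1e5)

# Explicit R-level loop
f_loop <- function(x) {
  out <- numeric(length(x))
  for (i in seq_along(x)) out[i] <- x[i]^2
  out
}

# Vectorized: one call; the looping happens in C
f_vec <- function(x) x^2

identical(f_loop(x), f_vec(x))  # TRUE
system.time(f_loop(x))          # typically much slower than
system.time(f_vec(x))
```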
Hi all,
I have a couple of questions that are general statistics questions rather
than being R-specific.
I'm interested in figuring out how to compute something like standard error
for difference scores (in particular, difference scores of reaction times).
Does anyone know if there is a
I just wrote up some code for differencing two .RData files or
environments (or one of each). Available from source here:
http://www.maths.lancs.ac.uk/~rowlings/R/Ediff/
In its handiest form, running:
ediff()
will tell you the difference between your working environment and the
.RData file
Hello all,
I would like to know what the difference is between chisq.test and
fisher.test when using the Monte Carlo method with simulate.p.value=TRUE?
Thank you
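Both functions accept simulate.p.value = TRUE, but they simulate different test statistics under the same fixed margins; a sketch with an arbitrary small table:

```r
tbl <- matrix(c(8, 2, 1, 5, 4, 7), nrow = 2)  # an arbitrary 2 x 3 table

set.seed(1)
# Monte Carlo null distribution of the X-squared statistic
chisq.test(tbl, simulate.p.value = TRUE, B = 10000)

set.seed(1)
# Monte Carlo version of Fisher's exact test, based on table probabilities
# (for fisher.test, simulate.p.value applies to tables larger than 2 x 2)
fisher.test(tbl, simulate.p.value = TRUE, B = 10000)
```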
This is my first post to the mailing list and I guess it's a pretty stupid
question but I can't figure it out. I hope this is the right forum for these
kind of questions.
Before I started using R I was using STATA to run a Wilcoxon signed-rank
test on two variables. See data below:
Hi,
Look at the output of the test made in R and you can see it is a
Wilcoxon rank sum test and not a Wilcoxon signed rank test.
If there are ties, I know I prefer wilcox.exact from the exactRankTests package.
Alain
On 09-Aug-10 12:43, Capasia wrote:
This is my first post to the mailing list and
On Aug 9, 2010, at 3:03 PM, Alain Guillet wrote:
Hi,
Look at the output of the test made in R and you can see it is a Wilcoxon
rank sum test and not a Wilcoxon signed rank test.
It might be helpful to add that paired=TRUE is needed in the call to get the
signed-rank test.
If there are
On Aug 9, 2010, at 9:52 AM, peter dalgaard wrote:
On Aug 9, 2010, at 3:03 PM, Alain Guillet wrote:
Hi,
Look at the output of the test made in R and you can see it is a
Wilcoxon rank sum test and not a Wilcoxon signed rank test.
It might be helpful to add that paired=TRUE is needed in
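Using the depression data from ?wilcox.test to show both calls:

```r
# Pre- and post-treatment measurements on the same subjects (?wilcox.test)
x <- c(1.83, 0.50, 1.62, 2.48, 1.68, 1.88, 1.55, 3.06, 1.30)
y <- c(0.878, 0.647, 0.598, 2.05, 1.06, 1.29, 1.06, 3.14, 1.29)

wilcox.test(x, y)                 # rank-sum (Mann-Whitney): ignores the pairing
wilcox.test(x, y, paired = TRUE)  # signed-rank: what Stata's signrank computes
```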
Dear R People:
I have a data frame with the two following date columns:
a.df[1:10,c(1,6)]
DATE DEATH
1207 2009-04-16 2009-05-06
1514 2009-04-16 2009-05-06
2548 2009-04-16 2009-05-08
3430 2009-04-16 2009-05-09
3851 2009-04-16 2009-05-09
3945 2009-04-16 2009-05-09
7274 2009-04-16
454.6529 days
generates differences between the 2 columns without NAs.
What's the output you get when you call str on your data frame?
Christos
Date: Sun, 20 Jun 2010 19:13:24 -0500
From: erinm.hodg...@gmail.com
To: r-h...@stat.math.ethz.ch
Subject: [R] difference in dates
Dear R
Are you sure your data was the Date class?
x <- read.table('clipboard', header = TRUE)
x
DATE DEATH
1207 2009-04-16 2009-05-06
1514 2009-04-16 2009-05-06
2548 2009-04-16 2009-05-08
3430 2009-04-16 2009-05-09
3851 2009-04-16 2009-05-09
3945 2009-04-16 2009-05-09
7274 2009-04-16
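If str() shows the columns are character or factor rather than Date, convert first; a sketch with two of the rows above:

```r
a.df <- data.frame(DATE  = c("2009-04-16", "2009-04-16"),
                   DEATH = c("2009-05-06", "2009-05-08"),
                   stringsAsFactors = FALSE)

# Convert to Date class; subtraction then gives a difftime in days
a.df$DATE  <- as.Date(a.df$DATE)
a.df$DEATH <- as.Date(a.df$DEATH)

a.df$DEATH - a.df$DATE
# Time differences in days
# [1] 20 22
```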
On Fri, 28 May 2010 01:17:49 -0700 (PDT)
carslaw david.cars...@kcl.ac.uk wrote:
[4] HGV-D-Euro-III HGV-D-Euro-IV EGR HGV-D-Euro-IV SCR
[7] HGV-D-Euro-IV SCRb HGV-D-Euro-V EGR HGV-D-Euro-VI
Dear R users,
I'm a bit perplexed with the effect sort has here, as it is different on
Windows vs. linux.
It makes my factor levels and subsequent plots different on the two systems.
Given:
types <- c("PC-D-Euro-0", "PC-D-Euro-1", "PC-D-Euro-2", "PC-D-Euro-3",
           "PC-D-Euro-4", "PC-D-Euro-5", "PC-D-Euro-6",
On 28-May-10 08:17:49, carslaw wrote:
Dear R users,
I'm a bit perplexed with the effect sort has here, as it is different
on Windows vs. linux.
It makes my factor levels and subsequent plots different on the two
systems.
Given:
types <- c("PC-D-Euro-0", "PC-D-Euro-1", "PC-D-Euro-2",
In my response cited below:
On 28-May-10 09:55:36, Ted Harding wrote:
I suspect the result (in Linux, I can't test this on Windows)
may be related to the following phenomenon:
sort(c("AB CD", "ABCD"))
# [1] "ABCD"  "AB CD"
sort(c("AB CD", "ABCD "))
# [1] "AB CD" "ABCD "
I.e. "ABCD" precedes "AB CD"
Thanks Ted,
Indeed, there is a difference between the systems on your much-simplified
example (thanks).
So, linux:
sort(c("AB CD", "ABCD"))
[1] "ABCD"  "AB CD"
Windows:
sort(c("AB CD", "ABCD"))
[1] "AB CD" "ABCD"
Regards,
David
carslaw wrote:
Dear R users,
I'm a bit perplexed with the effect sort has here, as it is different on
Windows vs. linux.
It makes my factor levels and subsequent plots different on the two systems.
You are using different collation orders. On Linux, your sessionInfo shows
en_GB.utf8
Pretty obvious: You use different locales (collate). What happens if you use
the same on both machines?
Cheers
Joris
On Fri, May 28, 2010 at 10:17 AM, carslaw david.cars...@kcl.ac.uk wrote:
Dear R users,
I'm a bit perplexed with the effect sort has here, as it is different on
It would seem that there is indeed a locale effect. Revisiting the
examples I used on Linux in a previous post, at which time I was
using the default LC_COLLATE=en_GB.UTF-8, I changed this to C.
Both the C and the en_GB.UTF-8 are indicated (the latter copied
from my previous post):
An experiment:
sort(c("AACD", "A CD"))
# [1] "AACD" "A CD"
sort(c("ABCD", "A CD"))
# [1] "ABCD" "A CD"
sort(c("ACCD", "A CD"))
# [1] "ACCD" "A CD"
sort(c("ADCD", "A CD"))
# [1] "A CD" "ADCD"
sort(c("AECD", "A CD"))
# [1] "A CD" "AECD"
## (with results for "AFCD", ... "AZCD" similar to the last two).
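The experiments above can be reproduced on a single machine by switching the collation locale:

```r
old <- Sys.getlocale("LC_COLLATE")

# C collation: plain byte order, so the space character sorts before letters
Sys.setlocale("LC_COLLATE", "C")
sort(c("AB CD", "ABCD"))   # "AB CD" "ABCD"

# Restore; under a locale such as en_GB.UTF-8 the collation rules typically
# give the space little or no weight, so the same call can order them the
# other way around
Sys.setlocale("LC_COLLATE", old)
```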