on 08/27/2008 09:26 AM David Cobey said the following:
I'm interested in modifying a regression tree algorithm to use the
difference between a control and a dosed sample as its dependent variable.
I was wondering if anyone knew where I could find code to implement a basic
CHAID algorithm. Is
If I understand correctly the result you are trying to achieve, I think
what you may be looking for is this:
model1.coeff <- lm(dv ~ iv1 + iv2 + iv3, data =
merged.dataset[merged.dataset$model1 == 1, ])
This will give you a regression using only the data for which model1 == 1.
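A minimal sketch of the call above on made-up data. The data frame and its columns (dv, iv1, iv2, iv3, model1) are simulated here purely for illustration. It also tries the subset argument of lm(), which should give the same fit as indexing the data frame by hand.

```r
# Simulated stand-in for merged.dataset (all values invented for illustration)
set.seed(1)
n <- 40
merged.dataset <- data.frame(
  iv1 = rnorm(n), iv2 = rnorm(n), iv3 = rnorm(n),
  model1 = rep(c(1, 0), each = n / 2)
)
merged.dataset$dv <- 1 + 2 * merged.dataset$iv1 - merged.dataset$iv2 + rnorm(n)

# Fit using only the rows flagged model1 == 1, by indexing the data frame
fit.index <- lm(dv ~ iv1 + iv2 + iv3,
                data = merged.dataset[merged.dataset$model1 == 1, ])

# Same rows selected through lm()'s own subset argument
fit.subset <- lm(dv ~ iv1 + iv2 + iv3, data = merged.dataset,
                 subset = model1 == 1)

coef(fit.index)
```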
on 08/22/2008
on 08/22/2008 10:19 AM Martin Ballaschk said the following:
how do I read files that have two header fields less than they have
columns? The easiest solution would be to insert one or two additional
header fields, but I have a lot of files and that would be quite a lot
of awful work.
Any idea
Hi,
I am having a problem with a fixed-effects regression using the "plm"
package. A regression of this form runs just fine:
plm(y ~ x*z, data=datasubset, model="within")
but when i try to add a lag in there, and run it on the same dataset,
like this:
plm(y ~ x*z + lag(x,-1)*z, data=datasu
I second that - quantile regression seems to be what you want.
on 07/23/2008 10:10 AM Ben Bolker said the following:
Firas Swidan writes:
Hi,
I am having difficulties in finding ways to analyse scatter plots and
quantitatively differentiate between them.
Try quantile regress
togram but since the audience is not
very statistically experienced I would prefer to do it this way.
Anyone have an idea?
Thanks again for your help.
Thomas Fröjd
On Wed, Jun 25, 2008 at 6:16 PM, Daniel Folkinshteyn <[EMAIL PROTECTED]> wrote:
I don't understand this. Why not
if there's nothing specific for it, you could probably do it with merge?
on 06/27/2008 02:41 PM Agustin Lobo said the following:
Hi!
Given a vector (or a factor within a df), e.g. v1 <- c(1,1,1,2,3,4,1,10,3)
and a dictionary
cbind(c(1,2,3),c(1001,1002,1003))
is there a function (on the same lin
oh, unlist - very nice function, thanks :)
on 06/27/2008 11:23 AM Jorge Ivan Velez said the following:
Hi Ramya,
Try something like this:
as.character(unlist(lapply(geneset,function(x) x[1])))
HTH,
Jorge
On Fri, Jun 27, 2008 at 10:33 AM, Rajasekaramya <[EMAIL PROTECTED]>
wrote:
Hi,
I h
try this:
firstgenes = lapply(geneset, function(x){return(x[1,1])})
firstgenes = do.call(rbind, firstgenes)
on 06/27/2008 10:33 AM Rajasekaramya said the following:
Hi,
I have a problem in assessing the list element.
i have list called geneset it contains the following elements
this should do what you want:
> myexstrings = c("*AAA.AA","BBB BB","*.CCC.","**dd- d")
> a = gsub("^\\W*","", myexstrings,perl=T)
> b = gsub("\\W.*", "", a, perl=T)
> b
[1] "AAA" "BBB" "CCC" "dd"
first one, removes any non-word characters from the beginning (as you
already figured out)
second o
not sure why it doesn't work, but try the following:
first, plot to a regular window, then run:
> dev.copy(device=png, file="yourfilename.png")
> dev.off()
see if that produces a file you want.
another note: what do you mean you can't just "copy and paste the graph"
in ubuntu? doesn't pressing
this is probably a kludge, and there may be a "neater" way to do this,
but... here's one:
> a = 0:1
> for (i in 1:9){ a= merge(unname(a), 0:1) }
> a = t(a)
after the for loop, 'a' will contain a 1024 row by 10 col dataframe.
putting it through a transpose, gives you the 10 rows by 1024 cols ma
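For comparison, one possibly neater route to the same set of 0/1 combinations is expand.grid(). This is only a sketch; the names combos and m are chosen here for illustration and do not come from the thread.

```r
# One column per binary digit; the rows enumerate all 2^10 combinations
combos <- expand.grid(rep(list(0:1), 10))
dim(combos)

# Transpose to get the 10-row by 1024-column matrix
m <- t(as.matrix(combos))
dim(m)
```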
no need for a for loop - we can vectorize this:
> dt <- data.frame(a = c(1, 2, 3), b = c(3, 2, 2), c = c(1, 3, 5))
> dt
a b c
1 1 3 1
2 2 2 3
3 3 2 5
> dt[,paste("test", 1:2, sep="")] = rep(1:2, each=3)
> dt
a b c test1 test2
1 1 3 1 1 2
2 2 2 3 1 2
3 3 2 5 1 2
on 06
If I analyze a client's data using an R script I created then I can
charge the client a $20,000 consulting fee, but, if I let the client
push the button to execute the R script and charge him 10 cents for the
privilege then I can be sued for violating the GPL? Or are my
I think you cannot be su
I don't understand this. Why not just get hist() to plot on the
density scale, thereby making its output commensurate with the output
of density()? The hist() function will plot on the density scale if you
ask it to. Set freq=FALSE (or prob=TRUE) in the call to hist.
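A small sketch of that suggestion on simulated data (the vector x is invented for illustration): with freq = FALSE the bar areas sum to 1, so a density() curve can be drawn on the same axes.

```r
# Simulated data, for illustration only
set.seed(42)
x <- rnorm(500)

# freq = FALSE puts the histogram on the density scale
h <- hist(x, freq = FALSE, main = "Histogram with density estimate")

# The kernel density estimate is on the same vertical scale
lines(density(x), lwd = 2)

# Check: bar heights times bin widths should sum to 1
sum(h$density * diff(h$breaks))
```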
ehrm
just cbind the cols in the appropriate order:
m.2 = cbind( m.1[,1:5], yourthreecolumns, m.1[,6:ncol(m.1)] )
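A toy run of the cbind line, assuming a small matrix in place of m.1 and three columns of zeros in place of yourthreecolumns (both invented here). drop = FALSE is added so the pieces stay matrices even if one of them is a single column.

```r
# Toy 2 x 8 matrix standing in for m.1
m.1 <- matrix(1:16, nrow = 2)

# Three hypothetical new columns to insert after column 5
yourthreecolumns <- matrix(0, nrow = 2, ncol = 3)

# Old columns 1-5, then the new ones, then old columns 6 onward
m.2 <- cbind(m.1[, 1:5, drop = FALSE], yourthreecolumns,
             m.1[, 6:ncol(m.1), drop = FALSE])
dim(m.2)
```

Note that 6:ncol(m.1) only behaves as intended when m.1 has at least six columns.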
on 06/24/2008 07:02 AM Daren Tan said the following:
Instead of prepending or appending new columns to a matrix, how do I insert them into a matrix? For example, I would like to insert 3 new column
on 06/23/2008 03:40 PM Thomas Fröjd said the following:
1. Shift the mean and std on the reference dataset to the mean
and std of my clinic birth weight data.
to shift the mean by any distance, just add or subtract that distance
from each observation (e.g., to move mean from m1 to m2, t
source('yourscript.R')
on 06/19/2008 03:11 PM [EMAIL PROTECTED] said the following:
Dear R-Users,
I've written a number of functions in a .R/script file. I would like to
call those functions from another script file. How can I execute all the
code in a script file so that the functions are av
install.packages("profr")
library(profr)
p <- profr(fcn_create_nonissuing_match_by_quarterssinceissue(...))
plot(p)
That should at least help you see where the slow bits are.
Hadley
so profiling reveals that '[.data.frame' and '[[.data.frame' and '[' are
the biggest timesuckers...
i suppose
on 06/06/2008 06:55 PM hadley wickham said the following:
Why not try profiling? The profr package provides an alternative
display that I find more helpful than the default tools:
install.packages("profr")
library(profr)
p <- profr(fcn_create_nonissuing_match_by_quarterssinceissue(...))
plot(p)
t those columns, convert them to a matrix, do all the matching,
and then based on some sort of row index retrieve all of the associated
columns.
-Don
At 2:09 PM -0400 6/5/08, Daniel Folkinshteyn wrote:
Hi everyone!
I have a question about data processing efficiency.
My data are as follows: I
e proper index).
Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")
Daniel Folkinshteyn wrote:
Anybody have any thoughts on this? Please? :)
on 06/05/2008 02:09 PM Daniel Folkinshteyn said the follow
another vote for ubuntu here - works for me, and quite trouble-free. add
the r-project repositories, and you're sure to always have the latest,
too. (if you don't care for the latest R, you can of course also just
get R from the distro's repos as well)
on 06/06/2008 05:22 PM Abhijit Dasgupta s
works for me:
> sub('1.00', '1', '1.00E-20')
[1] "1E-20"
remember, according to ?sub, it's sub(pattern, repl, string)
try doing it step by step. first, see what yr1bp$TreeTag[1501] is.
then, if it's the right data item, see what the output of sub("1.00",
"1", yr1bp$TreeTag[1501]) is.
that'll l
d not far from optimal.
If you pick the possibly too small route, then increasing
the size in largish chunks is much better than adding
a row at a time.
Pat
Daniel Folkinshteyn wrote:
thanks for the tip! i'll try that and see how big of a difference that
makes... if i am not sure what exactl
== end function code===
on 06/06/2008 01:35 PM Gabor Grothendieck said the following:
I think the posting guide may not be clear enough and have suggested that
it be clarified. Hopefully this better communicates what is required and why
in a shorter amount of space:
https://stat.ethz.c
just in case, uploaded it to the server, you can get the zip file i
mentioned here:
http://astro.temple.edu/~dfolkins/helplistfiles.zip
on 06/06/2008 01:25 PM Daniel Folkinshteyn said the following:
i thought since the function code (which i provided in full) was pretty
short, it would be
ci = rainbow(7)[c(4:7, 1:3)]
on 06/06/2008 01:02 PM avilella said the following:
Hi,
I want to reorder the colors given by rainbow(7) so that the last half
move to the first 4.
For example:
ci=rainbow(7)
ci
[1] "#FF0000FF" "#FFDB00FF" "#49FF00FF" "#00FF92FF" "#0092FFFF"
"#4900FFFF"
[7] "#FF
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")
Daniel Folkinshteyn wrote:
Anybody have any thoughts on this? Please? :)
on 06/05/2008 02:09 PM Daniel Folkinshteyn said the following:
Hi everyone!
I have a que
008 at 12:03 PM, Daniel Folkinshteyn <[EMAIL PROTECTED]> wrote:
i did! what did i miss?
on 06/06/2008 11:45 AM Gabor Grothendieck said the following:
Try reading the posting guide before posting.
On Fri, Jun 6, 2008 at 11:12 AM, Daniel Folkinshteyn <[EMAIL PROTECTED]>
wrote:
Anybod
well, where are you getting the filename in the first place? are you
looping over a list of filenames that comes from somewhere?
generally, for concatenating strings, look at function 'paste':
write.table(myoutput, paste(myfilename,"_out.txt", sep=''),sep="\t")
on 06/06/2008 11:51 AM DAVID ARTE
i did! what did i miss?
on 06/06/2008 11:45 AM Gabor Grothendieck said the following:
Try reading the posting guide before posting.
On Fri, Jun 6, 2008 at 11:12 AM, Daniel Folkinshteyn <[EMAIL PROTECTED]> wrote:
Anybody have any thoughts on this? Please? :)
on 06/05/2008 02:09 PM
Anybody have any thoughts on this? Please? :)
on 06/05/2008 02:09 PM Daniel Folkinshteyn said the following:
Hi everyone!
I have a question about data processing efficiency.
My data are as follows: I have a data set on quarterly institutional
ownership of equities; some of them have had
than by.y
and by.x?
I think when i was playing around i tried the all. command in that setup
as well
Mike
On Fri, Jun 6, 2008 at 2:07 PM, Daniel Folkinshteyn <[EMAIL PROTECTED]
<mailto:[EMAIL PROTECTED]>> wrote:
try this:
FullData <- merge(ETC, SURVEY, by.x = &
names(f)[which.max(f)]
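A short sketch of that one-liner. The counts below are rebuilt by hand to match the table in the question (500 zeros, 268 ones); the pima data itself is not loaded, and top.class is a name chosen here for illustration.

```r
# Class counts matching the table in the question
f <- table(rep(c(0, 1), times = c(500, 268)))

# which.max gives the position of the largest count;
# names() maps that position back to the class label
top.class <- names(f)[which.max(f)]
top.class
```

If two classes tie for the maximum, which.max returns only the first of them.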
on 06/06/2008 09:14 AM Muhammad Azam said the following:
Dear R users
I have a very basic question. I tried but could not find the required result.
using
dat <- pima
f <- table(dat[,9])
f
0 1
500 268
i want to find that class say "0" having maximum frequency i.e
try this:
FullData <- merge(ETC, SURVEY, by.x = "ord", by.y = "uid", all.x = T,
all.y = F)
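A toy illustration of what this merge call does, assuming two small made-up data frames in place of ETC and SURVEY (the score and answer columns are hypothetical). Every row of ETC is kept, and SURVEY rows without a match are dropped.

```r
# Hypothetical stand-ins for the two data sets
ETC <- data.frame(ord = c(1, 2, 3), score = c(10, 20, 30))
SURVEY <- data.frame(uid = c(2, 3, 4), answer = c("yes", "no", "yes"))

# Left join: match ETC$ord against SURVEY$uid, keep all ETC rows
FullData <- merge(ETC, SURVEY, by.x = "ord", by.y = "uid",
                  all.x = TRUE, all.y = FALSE)
FullData
```

Here ord 1 has no survey row, so its answer comes back as NA, and uid 4 does not appear at all.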
on 06/06/2008 07:30 AM Michael Pearmain said the following:
Hi All,
Newbie question for you all, but I have been looking at the archives and the
help stuff to get a rough idea of what I want to do
I wo
should work - don't even have to put them in quotes, if your field
separator is not space. why don't you just try it and see what comes out? :)
on 06/06/2008 08:43 AM stephen sefick said the following:
if I wanted to use a name for a column with two words say Dick Cheney and
George Bush
can I p
according to the helpfile, comment only takes one character, so you'll
have to do some 'magic' :)
i'd suggest to first run mydata through sed, and replace one of the
comment chars with another, then run read.table with the one comment
char that remains.
sed -e 's/^\^/!/' mydata.txt > mydata2
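If sed is not available, the same substitution can probably be done inside R before parsing. This is a sketch under assumptions: the sample lines are invented, and "^" and "!" are taken to be the two comment characters, as in the sed command. With a real file, readLines("mydata.txt") would replace the hand-written vector.

```r
# Hypothetical file contents: comment lines start with either "^" or "!"
raw <- c("^ first kind of comment",
         "! second kind of comment",
         "a\tb",
         "1\t2",
         "3\t4")

# Same substitution as the sed command: a leading "^" becomes "!"
cleaned <- sub("^\\^", "!", raw)

# A single comment character is now enough for read.table
mydata <- read.table(text = cleaned, header = TRUE, sep = "\t",
                     comment.char = "!")
mydata
```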
the '00' entries may be in a numeric column, so it gets typecast to a
number, and of course 00 == 0, numerically speaking, so they get
'condensed'.
to be sure you read everything "as is", specify colClasses='character':
data<-read.table("data.txt",sep='\t', header=T, colClasses='character')
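A small demonstration of the difference, using an invented tab-separated snippet read from a string rather than from data.txt (the id and code columns are hypothetical).

```r
# Hypothetical tab-separated input with leading-zero codes
txt <- "id\tcode\nA\t00\nB\t07\nC\t10"

# Default: the code column is converted to integer, so "00" becomes 0
as.numbers <- read.table(text = txt, sep = "\t", header = TRUE)

# colClasses = "character" keeps every field exactly as written
as.text <- read.table(text = txt, sep = "\t", header = TRUE,
                      colClasses = "character")

as.numbers$code
as.text$code
```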
looks like you don't have permission to write a file to C:\
try writing to some other directory where you have write access
(e.g., your user's home dir, or your "my documents", or something like
that).
on 06/05/2008 11:57 PM Megh Dal said the following:
Hi,
I got following error in write.tab
i know this is an R mailing list :) but... i'll recommend you try python
with the beautifulsoup module - makes html processing a cinch.
another thing to note is that wunderground provides very handy RSS feeds
for every location, so rather than parsing the html page (with its
associated bundle
oosen said the following:
Maybe you should provide a minimal, working code with data, so that we all
can give it a try.
In the meantime: take a look at the Rprof function to see where your code
can be improved.
Good luck
Bart
Daniel Folkinshteyn-2 wrote:
Hi everyone!
I have a question
would a density plot do? try
plot(density(x))
if you are specifically after the histogram tops rather than a density
estimate, then get the hist object with plot=F, then look at the counts
component:
histobj = hist(x, breaks=1000, plot=F)
plot(histobj$counts)
hope this helps.
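A variation on the same idea, sketched on simulated data (x and the choice of 30 breaks are assumptions made here): plotting the counts against the bin midpoints keeps the horizontal axis in the units of x rather than in bin index.

```r
set.seed(7)
x <- rexp(1000)   # simulated data, for illustration only

# plot = FALSE returns the histogram object without drawing it
histobj <- hist(x, breaks = 30, plot = FALSE)

# Bin counts against bin midpoints
plot(histobj$mids, histobj$counts, type = "b",
     xlab = "x", ylab = "count")
```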
o
Hi everyone!
I have a question about data processing efficiency.
My data are as follows: I have a data set on quarterly institutional
ownership of equities; some of them have had recent IPOs, some have not
(I have a binary flag set). The total dataset size is 700k+ rows.
My goal is this: For
44 matches