[R] Profiler for R ?

2010-07-05 Thread Ralf B
Hi,

is there such a thing as a profiler for R that informs about a) how
much processing time is used by particular functions and commands and
b) how much memory is used for creating how many objects (or types of
data structures)? In a way I am looking for something similar to the
java profiler (which is started by command line and provides profiling
information collected from the run of a particular program). Is there
such a tool through the R command line or RGUI? Are there profilers
available for Eclipse StatET or through another package or
extension?

Thanks,
Ralf

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fast String operations in R ? Cost of String operations

2010-07-05 Thread Ralf B
Hi experts,

I am currently developing some code that checks a large number of strings
for the existence of substrings and patterns (detecting substrings
within URLs). I wonder if there is information about how well
particular string operations perform in R, together with comparisons. Are
there recommendations (based on such information) regarding which
operations should be used and which should be avoided? Are there
libraries and functions that provide optimized string operations for
such needs, or is R simply not the right choice for that?

Best,
Ralf
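
As a starting point for the kind of checks described above, here is a minimal sketch in base R. The URLs are invented for illustration. It assumes that literal substrings can be tested with fixed = TRUE, which ?grep describes as generally faster than regular-expression matching; actual timings would need to be measured on the real data.

```r
# Hypothetical URLs, invented for illustration only.
urls <- c("http://example.com/shop/item?id=1",
          "http://example.org/blog/post",
          "https://example.com/search?q=r")

# Literal substring test: fixed = TRUE bypasses the regex engine.
has_shop <- grepl("/shop/", urls, fixed = TRUE)

# Pattern test: a regular expression, vectorized over all URLs at once.
has_query <- grepl("\\?[a-z]+=", urls)

# Position of the first literal match (-1 where the substring is absent).
pos <- regexpr("example.com", urls, fixed = TRUE)
```

All three calls are vectorized, so passing the whole character vector at once avoids an explicit loop over strings.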



Re: [R] UK map in R

2010-07-05 Thread Barry Rowlingson
On Sun, Jul 4, 2010 at 9:10 PM, happy naren narender.ku...@gmail.com wrote:
 Hi,
 i am currently working on a problem where i need to plot latitude and
 longitude data on a respective county of UK. After this i want to plot
 altitude data to make a 3d surface on which then i have to plot my
 corresponding data.

 Have you read the Spatial Task View on CRAN?

http://ftp.heanet.ie/mirrors/cran.r-project.org/web/views/Spatial.html

 Do you want to plot point data on a map of the whole of the UK with
county boundaries? These boundaries do change so you'll need to
specify a year if you need that precision. There's a set here:

http://gadm.org/

 but they might not be counties, they may be EU NUTS areas which are
close to counties at one level. Obviously Scotland (still part of the
UK) doesn't have counties at all. See:

http://en.wikipedia.org/wiki/Counties_of_the_United_Kingdom

 for all the details on UK administrative regions, historic counties etc etc.

 For your 3d plot, do you need an accurate elevation model of the UK?
At what precision? The best free elevation model I know is the SRTM
data, and you'd have to download the relevant section. To do 3d plots,
use the rgl package - do: library(rgl) ; example(terrain3d) for the
kind of thing.

 Any other questions are probably best sent to the R-spatial mailing
list, but you should probably do a bit more research first.

Barry



Re: [R] LatticeExtra Parallel

2010-07-05 Thread Tal Galili
Hi Ben,

You can also experiment with

matlines


Tal
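
A minimal sketch of the matlines suggestion, assuming the series are stored as columns of a matrix. The data below are simulated and the object names are hypothetical. matplot() draws all columns against one shared y-axis, and matlines() adds further columns on that same scale.

```r
# Hypothetical data: 5 monthly series of 24 observations each, one per column.
set.seed(1)
series <- apply(matrix(rnorm(24 * 5), nrow = 24), 2, cumsum)

pdf(NULL)  # draw to a null device so the sketch runs without a display
# matplot() sets up one shared y-axis; ylim covers the range of all columns.
matplot(series[, 1:3], type = "l", lty = 1, col = "grey40",
        ylim = range(series), xlab = "Month", ylab = "Value")
# matlines() adds further columns to the existing plot on the same scale.
matlines(series[, 4:5], lty = 2, col = "black")
dev.off()
```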


Contact Details:
Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)




On Mon, Jul 5, 2010 at 5:49 AM, Deepayan Sarkar
deepayan.sar...@gmail.com wrote:

 On Sun, Jul 4, 2010 at 12:59 PM, Ben Wilkinson bjlwilkin...@gmail.com
 wrote:
  I have put together a chart of 1,000 monthly data series using parallel and
  I really like the way it displays the data. Is there a way to achieve
  something similar in terms of display using the actual scale (consistently
  across all the data) as opposed to min/max?

 You mean like

 parallel(iris, common.scale = TRUE)

 ?

 -Deepayan



Re: [R] Profiler for R ?

2010-07-05 Thread Joshua Wiley
Perhaps ?Rprof

HTH,

Josh
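
For readers following up on ?Rprof, here is a minimal sketch of a profiling session. The workload is an arbitrary stand-in. Rprof() samples the call stack at a fixed interval and summaryRprof() tabulates the samples per function. ?Rprof also documents a memory.profiling argument, which is left out here because it depends on how R was built.

```r
prof_file <- tempfile(fileext = ".out")

Rprof(prof_file, interval = 0.01)   # start sampling the call stack
for (i in 1:200) {
  s <- sort(runif(1e5))             # illustrative workload only
}
Rprof(NULL)                         # stop profiling

prof <- summaryRprof(prof_file)
head(prof$by.self)                  # time spent inside each function itself
prof$sampling.time                  # total profiled time in seconds
```

The by.total component of the same summary attributes time to callers as well, which is often the more useful view for nested function calls.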



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



[R] timeseries

2010-07-05 Thread nuncio m
Dear useRs,
I am trying to construct a time series using the as.ts function. Surprisingly,
when I plot the data, the x axis does not show the time in years; however, if
I use ts(data), the time in years is shown on the x axis. Why is there such a
difference between the results of the two commands?
Thanks
nuncio


-- 
Nuncio.M
Research Scientist
National Center for Antarctic and Ocean research
Head land Sada
Vasco da Gamma
Goa-403804



[R] repeated measures with missing data

2010-07-05 Thread Rafael Diaz
Dear R help group, I am teaching myself linear mixed models with missing data
since I would like to analyze a stats design with this kind of model. The
textbook example is for the procedure proc MIXED in SAS, but I would like to
know if there is an equivalent in R. This example only includes two
time-measurements across subjects (a t-test with missing values), but I will
need to do this with three time-measurements (repeated measures ANOVA with
missing values):

Patient   Treatment A   Treatment B
1         20            12
2         26            24
3         16            17
4         29            21
5         22            N/A
6         N/A           12

I have tried this analysis using the instructions below with the help of
Mixed-Effects Models in S and S-Plus, but have not been able to get around the
missing data issue as follows:

tmtA <- c(20,26, 16,29,22,NA)
tmtB <- c(12,24,17,21,NA,17)
require(lme4)
dv <- c(20,12,26,24,16,17,29,21,22,17)
subject <- rep(c("s1","s2","s3","s4","s5","s6"),each=2)
subject <- subject[-c(10,11)]
myfactor <- rep(c("f1","f2"), 6)
myfactor <- myfactor[-c(10,11)]
mydata <- data.frame(dv, subject, myfactor)
am2 <- lmer(dv ~ myfactor + (1|subject)), data = mydata)
summary(am2)
anova(am2)
subject <- subject[-c(10,11)]


Any help would be greatly appreciated.  Thank you,

Rafael Diaz
Assistant Professor
Math and Stats Dept
California State University Sacramento



  


[R] Data Labels in a barchart (Lattice or otherwise)

2010-07-05 Thread RaoulD

Hi,

Can anyone please help me with how I could add labels with the value for
each bar in a barchart (similar to how data labels can be added in Excel)? I
have done a lot of searching but haven't been lucky.

Thanks,
Raoul


Re: [R] repeated measures with missing data

2010-07-05 Thread ONKELINX, Thierry
Dear Rafael,

Your lmer call had one closing bracket too many. The line below should
work.

am2 <- lmer(dv ~ myfactor + (1|subject), data = mydata)

Furthermore, I would advise changing myfactor from a character variable
to a factor.

HTH,

Thierry




ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey
  

 -----Original message-----
 From: r-help-boun...@r-project.org
 [mailto:r-help-boun...@r-project.org] On behalf of Rafael Diaz
 Sent: Monday 5 July 2010 3:37
 To: r-help@r-project.org
 Subject: [R] repeated measures with missing data
 



Re: [R] question concerning VGAM

2010-07-05 Thread Martin Maechler
 Martin Spindler martin.spind...@gmx.de
 on Mon, 5 Jul 2010 07:48:42 +0200 writes:

 Hello everyone,
 using the VGAM package and the following code

 library(VGAM)
 bp1 <- vglm(cbind(daten$anzahl_b, daten$deckung_b) ~ ., binom2.rho,
 data=daten1)
 summary(bp1)
 coef(bp1, matrix=TRUE)

 produced this error message:

 error in object$coefficients : $ operator not defined for this S4 class

 I am a bit confused because some days ago this error message did not show up
 and everything was fine.

 Thank you very much in advance for your help.

 Best,
 Martin


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
--- PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
--- and provide commented, minimal, self-contained, reproducible code.

Hmm, and which part of the two lines above did you not
understand?

example(vglm)

already contains uses of coef() which do work fine;
so it must be you, or your setup which breaks things.

Martin Maechler, ETH Zurich



[R] Command run

2010-07-05 Thread Mahdieh TAHER (NUHS)
Hi Sir/Madam,

I am calling R from an ASP.NET project. The outputs are a chart and cor(P1,P2).
The exe file is in c:\program files\R\R-2.11.1\bin\rscript.exe. When the
ASP.NET project opens cmd.exe and runs r --vanilla r_sample.r, it says r is
not recognized as a defined command. (r_sample.r is located in the bin
directory of R.) I also registered Rscript.exe as an exe file in Windows, but
it still shows an error. I would really appreciate a hint.

Regards
Mahdieh




[R] lc2 Model

2010-07-05 Thread Kathrin Schreglmann

Dear development team,

I am a graduate student trying to fit a dose response curve for my
thesis. I found one publication talking about the lc2 model in the drm
function (drc package), but I didn't find any related info on how to set up
my data.

fct = lc.2()
was not found by my R. How do I get any info on that, please?
First, I tried the LL.2 model, but it does not really fit, as my data are
not logistic but on a linear scale (values 4, 8, 12, 16, 20, 24, 28).


Trying to contact the author of the publication failed, so I couldn't
think of any other way. I hope you can help me.

Thanks a lot,
Sincerely, 
Kathrin Schreglmann




[R] R squared from cv.lm

2010-07-05 Thread Andruit

   Hello,
   I fitted a linear regression model without an intercept:
   fit <- lm(y ~ x1+x2+x3+x4+0, data=mydata)
   I tried to validate the model by performing a leave-one-out cross-validation
   procedure using the CVlm function:
   CVlm(df=mydata, fit, m=196)
   But how can I get the adjusted R² from the output of this function?
   Or is there any other function to perform a leave-one-out cross-validation
   procedure?
   Thank you very much in advance.
   A




[R] Stoch Prog in R

2010-07-05 Thread Sudhakar Achath
Can you please let me know if there are any packages for
stochastic linear programming (SLP) in R?

Thanks in advance

Sudhakar Achath



[R] r code exchange site?

2010-07-05 Thread pdb

Does there exist a site where snippets of r code examples can be deposited,
such as the one that exists for matlab?

http://www.mathworks.com/matlabcentral/fileexchange/

ps
I also noted from the main r site

http://www.r-project.org/

when you click on the nabble link under the search link, I end up here

http://e-nvf.vvvay.net/-td13672.html#a13819

which I don't think is anything to do with R as far as I can tell (but my
Russian is not that hot)

Yours Hopefully,

pb



Re: [R] r code exchange site?

2010-07-05 Thread Liviu Andronic
Hello
There is http://www.r-cookbook.com/, but I'm not sure that it is what
you're looking for.
Liviu



Re: [R] repeated measures with missing data

2010-07-05 Thread Gabor Grothendieck
On Mon, Jul 5, 2010 at 4:00 AM, ONKELINX, Thierry
thierry.onkel...@inbo.be wrote:
 Dear Rafael,

 Your lmer call had one closing bracket too many. The line below should
 work.

 am2 <- lmer(dv ~ myfactor + (1|subject), data = mydata)

 Furthermore, I would advise changing myfactor from a character variable
 to a factor.


In addition, you can simplify the data manipulation:

wide <- matrix(c(20, 26, 16, 29, 22, NA, 12, 24, 17, 21, NA, 17), nrow = 6,
  dimnames = list(subject = paste("s", 1:6, sep = ""),
                  myfactor = c("f1", "f2")))

long <- as.data.frame.table(wide, responseName = "dv")

am2 <- lmer(dv ~ myfactor + (1|subject), data = long)
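
To see what the reshaping step produces before fitting anything, here is a small base-R sketch that stops short of the lmer call (so lme4 is not needed). It shows that as.data.frame.table() turns the matrix dimnames into factor columns, which also covers the earlier advice about myfactor, and that the two missing cells stay in the long data as NA rows. The name complete is introduced here for illustration only.

```r
wide <- matrix(c(20, 26, 16, 29, 22, NA, 12, 24, 17, 21, NA, 17), nrow = 6,
               dimnames = list(subject = paste("s", 1:6, sep = ""),
                               myfactor = c("f1", "f2")))

# One row per subject/treatment cell; the dimnames become factor columns.
long <- as.data.frame.table(wide, responseName = "dv")
str(long)

# Rows with a missing response; model functions typically drop these
# under the default na.action.
long[is.na(long$dv), ]

# Explicitly keeping only the complete rows.
complete <- na.omit(long)
nrow(complete)
```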



Re: [R] lattice xyplot with bty=l

2010-07-05 Thread László Sándor
Hi all,

Back in 2007, Deepayan and Patrick had an exchange about how to modify
axes for lattice plots (pasted below). I need something similar, but I
also need to produce ticks on the axes. Deepayan quickly coded up
substitute gridlines because they needed to make the default box
transparent. The code works, but it lacks the ticks, and I could not
google up how to add them.

(I would be taken aback if you could even help me put this option into
a theme for the latticeExtra package. The grid options there do not
seem to allow left and bottom axes at the same time. Not even the
asTheEconomist command does, though I thought I had seen plots with
bty="l" in The Economist...)

I understand that panels in lattice were not intended for such use
originally, but for many advanced features of the lattice package, my
team wants to use this. The specific goal would be to produce graphs
in R like those here:
http://obs.rc.fas.harvard.edu/chetty/denmark_adjcost_slides.pdf  .

Any help would be greatly appreciated.

Thank you very much,

Laszlo

László Sándor
graduate student
Department of Economics
Harvard University

Deepayan Sarkar [EMAIL PROTECTED] writes:

 On 9/4/07, Patrick Drechsler [EMAIL PROTECTED] wrote:

 what is the correct way of removing the top and right axes
 completely from a lattice xyplot? I would like to have a plot similar
 to using the bty="l" option for traditional plots.

 There is no direct analog (and I think it would be weird in a
 multipanel plot).

I agree that this is not very useful for multipanel plots.

 Combining a few different features, you can do:

 library(grid)

 xyplot(1:10 ~ 1:10, scales = list(col = "black", tck = c(1, 0)),
        par.settings = list(axis.line = list(col = "transparent")),
        axis = function(side, ...) {
            if (side == "left")
                grid.lines(x = c(0, 0), y = c(0, 1),
                           default.units = "npc")
            else if (side == "bottom")
                grid.lines(x = c(0, 1), y = c(0, 0),
                           default.units = "npc")
            axis.default(side = side, ...)
        })

 -Deepayan

Thank you very much Deepayan, this is exactly what I was looking for!

Cheers,

Patrick



[R] Execute commands in 'R' within PERL Program

2010-07-05 Thread chakri_amateur

Hi,

I wrote a program in PERL which creates a file with .net extension (..
xyz.net). I want to call R from within my PERL program, execute 3-line
command in 'R', store the output and get back to PERL program.

RSPerl from Omegahat is good software that creates an interface between R
and PERL, but unfortunately it doesn't work on my Windows XP.

I used the following command in PERL to call R

@args = ('C:\Program Files\R\R-2.9.0\bin\Rgui.exe');
system(@args) == 0 or die "system @args failed: $!";

which opens the R console.

Now, I want to run the following commands in R, close R and get back to
the original PERL program

g <- read.graph("f://xyz.net", "pajek")
d <- degree(g, mode="in")
power.law.fit(d+1)

I am stuck at this point, could anyone help me out?

In the past, there was a discussion in this forum titled "Problem calling R
from within perl script on Windows"; from there I picked up the system
command to call R from within PERL. Thanks barrry!

I am hoping that somebody from that thread could help me.. ! 

Thanks
Chakri


[R] adding a row of names to data.frame

2010-07-05 Thread Maas James Dr (MED)
Relative noob here. I have a data.frame and simply want to add an explicit
column of names in column 1 of the form
trial_number01 for row 1, trial_number02 for row 2, etc. It is simply
for visual purposes and to explain data to others. I've tried
using row.names and other approaches but still no luck. I am sure it has been
covered but I can't find it; can you please point me in the right direction?

Thanks

Jim


===
Dr. Jim Maas
University of East Anglia
Norwich, UK




[R] Creating DataFrame of Vectors Data Structure for Classification

2010-07-05 Thread Gundala Viswanath
Dear Experts,

I have an input file that looks like this:

-0.438185,svm,1
-0.766791,svm,1
0.695282,svm,-1
0.759100,svm,-1
0.034400,svm,1
0.524807,svm,1
-0.27647800,nn,1
-0.16120810,nn,-1
0.63911350,nn,1
0.400554110,nn,1
0.429192240,nn,-1
0.454239140,nn,1

How can I create a data structure in R so that
it gives this:

 print(a_data_structure)
$hiv.svm
$hiv.svm$predictions
$hiv.svm$predictions[[1]]
  [1] -0.438185 -0.766791  0.695282
$hiv.svm$predictions[[2]]
  [1]  0.759100  0.034400  0.524807
$hiv.svm$labels
$hiv.svm$labels[[1]]
  [1]  1  1  -1
$hiv.svm$labels[[2]]
  [1]  -1  1  1
$hiv.nn
$hiv.nn$predictions
$hiv.nn$predictions[[1]]
  [1] -0.27647800 -0.16120810  0.63911350
$hiv.nn$predictions[[2]]
  [1]  0.400554110  0.429192240  0.454239140
$hiv.nn$labels
$hiv.nn$labels[[1]]
  [1]  1  -1  1
$hiv.nn$labels[[2]]
  [1]  1  -1  1

I'm new to R and truly need help.

Regards,
G. Viswanath



Re: [R] Assigning entries to categories

2010-07-05 Thread LogLord

OK, thanks for the help!

Here a more complex example:

a=c("x","y","z")
b=c(8,14,19)
c=c(200010,535388,19929)
data=data.frame(a,b,c)

d=c("cat1","cat2","cat3","cat4","cat5","cat6")
b1=c(14,5,8,20,19,1)
c_start=c(50,50,20,20,18000,60)
c_stop=c(55,55,201000,201000,2,70)
category=data.frame(d,b1,c_start,c_stop)


Again I want to create a new variable, which automatically assigns the
category to the data based on matching b = b1 and c >= c_start and
c <= c_stop.

I hope this explains my problem more explicitly.

Thanks!
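
One possible approach to the matching described above, sketched in base R: merge the two tables on b = b1, then keep the rows where c falls inside the interval. The c_start and c_stop values in the post look truncated, so the bounds below are invented for illustration and are not the poster's data.

```r
data <- data.frame(a = c("x", "y", "z"),
                   b = c(8, 14, 19),
                   c = c(200010, 535388, 19929))

# Interval bounds are hypothetical, chosen only so that each row has a match.
category <- data.frame(d = paste0("cat", 1:6),
                       b1 = c(14, 5, 8, 20, 19, 1),
                       c_start = c(500000, 500000, 200000, 200000, 18000, 600000),
                       c_stop  = c(550000, 550000, 201000, 201000, 20000, 700000))

# Step 1: pair every data row with the categories sharing its b value.
m <- merge(data, category, by.x = "b", by.y = "b1")

# Step 2: keep only pairs where c lies inside [c_start, c_stop].
m <- m[m$c >= m$c_start & m$c <= m$c_stop, ]

m[, c("a", "b", "c", "d")]
```

If several categories share the same b1 value, the merge step yields one candidate row per category and the interval filter picks among them.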


Re: [R] adding a row of names to data.frame

2010-07-05 Thread Pete B

Jim

Is this what you need?

#create data
Lines <- "Drug1 Drug2 Drug3 Drug4
  153  133  145  111
  189  177  200  170
  221  241  187  243
  215  228  201  178
  302  283  292  248
  223  255  220  202
  201  238  233  163
  173  164  172  139
  121  128  119  120
  100  200  300  400"

# read in data
d <- read.table(textConnection(Lines), header = TRUE)

#add row.names
row.names(d)=paste("trial_number",sprintf("%02d",as.numeric(row.names(d))),sep="")

# view output
print(d)
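
If an actual first column is wanted rather than row names, as the original question asked, a minimal variant might look like the following. The small data frame and the column name trial are made up for illustration.

```r
# Hypothetical small data frame standing in for the real one.
d <- data.frame(Drug1 = c(153, 189, 221), Drug2 = c(133, 177, 241))

# Build zero-padded labels, one per row.
labels <- sprintf("trial_number%02d", seq_len(nrow(d)))

# Put the labels in front as an ordinary column.
d_named <- data.frame(trial = labels, d, stringsAsFactors = FALSE)
print(d_named)
```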

HTH

Pete



[R] Linux-Windows problem

2010-07-05 Thread Ildiko Varga
Dear All,

I faced the following problem. With the same data.frame the results are 
different under Linux and Windows.
Could you help on this topic?
Thanks in advance,
Ildiko

Linux:
> d = read.csv("CRP.csv")
> d$drugCode = as.numeric(d$drug)
> cor(d, use="pairwise.complete.obs")
          PATIENT     BL.CRP   X24HR.CRP   X48HR.CRP drug   drugCode
PATIENT        NA         NA          NA          NA   NA         NA
BL.CRP         NA  1.0000000  0.84324880 -0.05699590   NA -0.3367147
X24HR.CRP      NA  0.8432488  1.00000000 -0.06162383   NA -0.3557316
X48HR.CRP      NA -0.0569959 -0.06162383  1.00000000   NA  0.1553356
drug           NA         NA          NA          NA   NA         NA
drugCode       NA -0.3367147 -0.35573159  0.15533562   NA  1.0000000
Warning message:
In cor(d, use = "pairwise.complete.obs") : NAs introduced by coercion
> str(d)
'data.frame':   41 obs. of  6 variables:
 $ PATIENT  : Factor w/ 41 levels "RV13","RV14",..: 2 3 4 6 7 12 13 14 15 17 ...
 $ BL.CRP   : num  7.3 31.2 4.2 6.7 1.6 7.7 5.3 38.9 1 7.3 ...
 $ X24HR.CRP: num  6.1 24.9 11.1 4.9 1 5 3.7 18 1 7.3 ...
 $ X48HR.CRP: num  121.5 40 28.4 34.5 33.3 ...
 $ drug     : Factor w/ 2 levels "active","placebo": 1 1 1 1 1 1 1 1 1 1 ...
 $ drugCode : num  1 1 1 1 1 1 1 1 1 1 ...

Windows:
> d = read.csv("CRP.csv")
> d$drugCode = as.numeric(d$drug)
> cor(d, use="pairwise.complete.obs")
Error in cor(d, use = "pairwise.complete.obs") : 'x' must be numeric
> str(d)
'data.frame':   41 obs. of  6 variables:
 $ PATIENT  : Factor w/ 41 levels "RV13","RV14",..: 2 3 4 6 7 12 13 14 15 17 ...
 $ BL.CRP   : num  7.3 31.2 4.2 6.7 1.6 7.7 5.3 38.9 1 7.3 ...
 $ X24HR.CRP: num  6.1 24.9 11.1 4.9 1 5 3.7 18 1 7.3 ...
 $ X48HR.CRP: num  121.5 40 28.4 34.5 33.3 ...
 $ drug     : Factor w/ 2 levels "active","placebo": 1 1 1 1 1 1 1 1 1 1 ...
 $ drugCode : num  1 1 1 1 1 1 1 1 1 1 ...



Re: [R] Linux-Windows problem

2010-07-05 Thread Uwe Ligges



On 05.07.2010 14:31, Ildiko Varga wrote:

Dear All,

I faced the following problem. With the same data.frame the results are
different under Linux and Windows.
Could you help on this topic?


I guess you read in the data differently since you have different
default encodings on both platforms (e.g. latin1 vs. UTF-8) and your data
is probably not plain ASCII.


Best,
Uwe Ligges
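
A minimal sketch of two defensive steps that follow from this: state the file encoding explicitly when reading, and pass only the numeric columns to cor(). The CSV written below is a made-up stand-in for CRP.csv, and UTF-8 is an assumption about the real file, not something known from the thread.

```r
# Made-up stand-in for CRP.csv, written to a temporary file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("PATIENT,BL.CRP,X24HR.CRP,drug",
             "RV13,7.3,6.1,active",
             "RV14,31.2,24.9,placebo",
             "RV15,4.2,,active",
             "RV16,6.7,4.9,placebo"), csv_path)

# Declare the encoding instead of relying on the platform default.
d <- read.csv(csv_path, fileEncoding = "UTF-8", stringsAsFactors = TRUE)
d$drugCode <- as.numeric(d$drug)

# Restrict the correlation to numeric columns so factors are never coerced.
num_cols <- sapply(d, is.numeric)
cm <- cor(d[, num_cols], use = "pairwise.complete.obs")
cm
```

Selecting the numeric columns first gives the same call on both platforms, whatever the cause of the difference turns out to be.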


Thanks in advance,
Ildiko

Linux:
d = read.csv(CRP.csv)
d$drugCode = as.numeric(d$drug)
cor(d, use=pairwise.complete.obs)
PATIENT BL.CRP   X24HR.CRP   X48HR.CRP drug   drugCode
PATIENTNA NA  NA  NA   NA NA
BL.CRP NA  1.000  0.84324880 -0.05699590   NA -0.3367147
X24HR.CRP  NA  0.8432488  1. -0.06162383   NA -0.3557316
X48HR.CRP  NA -0.0569959 -0.06162383  1.   NA  0.1553356
drug   NA NA  NA  NA   NA NA
drugCode   NA -0.3367147 -0.35573159  0.15533562   NA  1.000
Warning message:
In cor(d, use = pairwise.complete.obs) : NAs introduced by coercion
str(d)
'data.frame':   41 obs. of  6 variables:
   $ PATIENT  : Factor w/ 41 levels RV13,RV14,..: 2 3 4 6 7 12 13 14
15 17 ...
   $ BL.CRP   : num  7.3 31.2 4.2 6.7 1.6 7.7 5.3 38.9 1 7.3 ...
   $ X24HR.CRP: num  6.1 24.9 11.1 4.9 1 5 3.7 18 1 7.3 ...
   $ X48HR.CRP: num  121.5 40 28.4 34.5 33.3 ...
   $ drug : Factor w/ 2 levels active,placebo: 1 1 1 1 1 1 1 1 1
1 ...
   $ drugCode : num  1 1 1 1 1 1 1 1 1 1 ...

  Windows:
d = read.csv("CRP.csv")
d$drugCode = as.numeric(d$drug)
cor(d, use="pairwise.complete.obs")
Error in cor(d, use = "pairwise.complete.obs") : 'x' must be numeric
str(d)
'data.frame':   41 obs. of  6 variables:
   $ PATIENT  : Factor w/ 41 levels "RV13","RV14",..: 2 3 4 6 7 12 13 14 15 17 ...
   $ BL.CRP   : num  7.3 31.2 4.2 6.7 1.6 7.7 5.3 38.9 1 7.3 ...
   $ X24HR.CRP: num  6.1 24.9 11.1 4.9 1 5 3.7 18 1 7.3 ...
   $ X48HR.CRP: num  121.5 40 28.4 34.5 33.3 ...
   $ drug     : Factor w/ 2 levels "active","placebo": 1 1 1 1 1 1 1 1 1 1 ...
   $ drugCode : num  1 1 1 1 1 1 1 1 1 1 ...
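
Whatever the cause of the platform difference, both transcripts above pass
factor columns (PATIENT, drug) to cor(). A minimal sketch, assuming a small
made-up data frame in place of CRP.csv (the values below are illustrative,
not the poster's data), that keeps only the numeric columns before computing
the correlations:

```r
# Hypothetical stand-in for CRP.csv; values are made up for illustration
d <- data.frame(
  PATIENT   = factor(c("RV13", "RV14", "RV15", "RV16", "RV17")),
  BL.CRP    = c(7.3, 31.2, 4.2, 6.7, 1.6),
  X24HR.CRP = c(6.1, 24.9, 11.1, 4.9, 1.0),
  drug      = factor(c("active", "active", "placebo", "placebo", "active"))
)
d$drugCode <- as.numeric(d$drug)

# Keep only numeric columns so cor() never sees a factor
num_cols <- sapply(d, is.numeric)
cors <- cor(d[, num_cols], use = "pairwise.complete.obs")
print(cors)
```

Selecting columns explicitly should behave the same way on either platform,
since no factor ever reaches cor().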



[R] Counting defined character within String

2010-07-05 Thread Kunzler, Andreas
Dear list,

I'm looking for a way to count the number of | characters within an object.
The character | is used to separate ids.

Assume a data (d) structure like

Var
NA
NA
NA
NA
NA
1
1|2
1|22|45
3
4b|24789

I need to know the maximum number of ids within one object. In this case 3 
(1|22|45)


Does anybody know a better way?

Thanks

Mit freundlichen Grüßen

Andreas Kunzler

Bundeszahnärztekammer (BZÄK)
Chausseestraße 13
10115 Berlin

Tel.: 030 40005-113
Fax:  030 40005-119

E-Mail: a.kunz...@bzaek.de 



Re: [R] Counting defined character within String

2010-07-05 Thread Henrique Dallazuanna
Try this:

sapply(strsplit(as.character(Var$Var), "\\|"), length)

On Mon, Jul 5, 2010 at 11:04 AM, Kunzler, Andreas a.kunz...@bzaek.de wrote:

 Dear list,

 I'm looking for a way to count the number of | within an object.
 The character | is used to separated ids.

 Assume a data (d) structure like

 Var
 NA
 NA
 NA
 NA
 NA
 1
 1|2
 1|22|45
 3
 4b|24789

 I need to know the maximum number of ids within one object. In this case 3
 (1|22|45)


 Does anybody know a better way?

 Thanks

 Mit freundlichen Grüßen

 Andreas Kunzler
 
 Bundeszahnärztekammer (BZÄK)
 Chausseestraße 13
 10115 Berlin

 Tel.: 030 40005-113
 Fax:  030 40005-119

 E-Mail: a.kunz...@bzaek.de





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



[R] XTFEVD implementation in R

2010-07-05 Thread Suresh Singh
Is there a package in R for XTFEVD procedure (Plümper & Troeger)?

Also, are there any examples of Hausman-Taylor implementation in R?
I understand that it can be done using plm package but could not find
examples with actual data

Thank you,
Suresh Singh
Fisher College of Business
The Ohio State University



Re: [R] Counting defined character within String

2010-07-05 Thread Marc Schwartz
On Jul 5, 2010, at 9:04 AM, Kunzler, Andreas wrote:

 Dear list,
 
 I'm looking for a way to count the number of | within an object.
 The character | is used to separated ids.
 
 Assume a data (d) structure like
 
 Var
 NA
 NA
 NA
 NA
 NA
 1
 1|2
 1|22|45
 3
 4b|24789
 
 I need to know the maximum number of ids within one object. In this case 3 
 (1|22|45)
 
 
 Does anybody know a better way?
 
 Thanks


Presuming that your column is in a data frame called 'DF', where the 'Var' 
column is likely imported as a factor:

 DF
        Var
1        NA
2        NA
3        NA
4        NA
5        NA
6         1
7       1|2
8   1|22|45
9         3
10 4b|24789


 max(sapply(strsplit(as.character(DF$Var), split = "\\|"), length))
[1] 3


The above uses strsplit() to split each line using the | as the split 
character. Since | has a special meaning for regular expressions, it needs to 
be escaped using the double backslash:

 strsplit(as.character(DF$Var), split = "\\|")
[[1]]
[1] NA

[[2]]
[1] NA

[[3]]
[1] NA

[[4]]
[1] NA

[[5]]
[1] NA

[[6]]
[1] "1"

[[7]]
[1] "1" "2"

[[8]]
[1] "1"  "22" "45"

[[9]]
[1] "3"

[[10]]
[1] "4b"    "24789"


Then you just loop through each line getting the length:

 sapply(strsplit(as.character(DF$Var), split = "\\|"), length)
 [1] 1 1 1 1 1 1 2 3 1 2


and of course get the max value.
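
The steps above can also be written without the regular-expression escape. A
minimal sketch, assuming the same ten values as in the example and using
fixed = TRUE so that | is treated as a literal character. Note one deliberate
difference: strsplit() returns a length-one NA for missing entries, so this
version resets those rows to 0 ids, which may or may not be what is wanted:

```r
# Same example values as in the thread; NA rows carry no ids
DF <- data.frame(Var = c(NA, NA, NA, NA, NA,
                         "1", "1|2", "1|22|45", "3", "4b|24789"))

# fixed = TRUE splits on the literal "|" (no "\\|" escape needed)
parts <- strsplit(as.character(DF$Var), split = "|", fixed = TRUE)

n_ids <- sapply(parts, length)
n_ids[is.na(DF$Var)] <- 0L   # strsplit() gives a length-1 NA for missing rows

max_ids <- max(n_ids)
print(n_ids)
print(max_ids)
```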

HTH,

Marc Schwartz



Re: [R] Data Labels in a barchart (Lattice or otherwise)

2010-07-05 Thread David Winsemius


On Jul 4, 2010, at 11:43 PM, RaoulD wrote:



Hi,

Can anyone please help me with how I could add labels with the value for
each bar in a barchart? (similar to how data labels can be added in Excel) I
have done a lot of searching but haven't been lucky.


This is generally pretty easy with text() at least if you are using  
base graphics. If it is not clear after reading the help page then  
post an example with whatever barchart function you have chosen to  
use. If it's the lattice barchart there is an ltext example  
immediately before the barchart example that quickly can be grafted  
into the barchart code.
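
For the base graphics case mentioned above, a minimal sketch, assuming a
made-up vector of counts: barplot() returns the bar midpoints, which text()
can then use as x positions for the value labels.

```r
# Hypothetical data; any named numeric vector works the same way
counts <- c(a = 3, b = 7, c = 5)

# barplot() returns the x coordinates of the bar midpoints
mids <- barplot(counts, ylim = c(0, max(counts) * 1.2))

# pos = 3 places each label just above the top of its bar
text(x = mids, y = counts, labels = counts, pos = 3)
```

The extra headroom in ylim is only there so the label on the tallest bar is
not clipped.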


David Winsemius, MD
West Hartford, CT



Re: [R] r code exchange site?

2010-07-05 Thread David Winsemius


On Jul 5, 2010, at 5:54 AM, pdb wrote:



Does there exist a site where snippets of r code examples can be deposited,
such as the one that exists for matlab?


From the main R-project page, the Wiki link takes you here:

http://rwiki.sciviews.org/doku.php

There is also an R section on stack overflow.



--

David Winsemius, MD
West Hartford, CT



Re: [R] r code exchange site?

2010-07-05 Thread Albert-Jan Roskam
I've used www.pastebin.com  before, with C as the code.

Cheers!!
Albert-Jan

~~
All right, but apart from the sanitation, the medicine, education, wine, public 
order, irrigation, roads, a fresh water system, and public health, what have 
the Romans ever done for us?
~~

--- On Mon, 7/5/10, David Winsemius dwinsem...@comcast.net wrote:


From: David Winsemius dwinsem...@comcast.net
Subject: Re: [R] r code exchange site?
To: pdb ph...@philbrierley.com
Cc: r-help@r-project.org
Date: Monday, July 5, 2010, 4:25 PM



On Jul 5, 2010, at 5:54 AM, pdb wrote:

 
 Does there exist a site where snippets of r code examples can be deposited,
 such as the one that exists for matlab?

From the main R-project page, the Wiki link takes you here:

http://rwiki.sciviews.org/doku.php

There is also an R section on stack overflow.
 
--
David Winsemius, MD
West Hartford, CT



Re: [R] Assigning entries to categories

2010-07-05 Thread David Winsemius


On Jul 5, 2010, at 8:54 AM, LogLord wrote:



OK, thanks for the help!

Here a more complex example:

a=c("x","y","z")
b=c(8,14,19)
c=c(200010,535388,19929)
data=data.frame(a,b,c)

d=c("cat1","cat2","cat3","cat4","cat5","cat6")
b1=c(14,5,8,20,19,1)
c_start=c(50,50,20,20,18000,60)
c_stop=c(55,55,201000,201000,2,70)
category=data.frame(d,b1,c_start,c_stop)


Again I want to create a new variable, which automatically assigns the
category to the data based on matching b = b1 and c >= c_start and
c <= c_stop.




Probably not the most elegant solution. For each data row, see which  
one or more rows of category satisfy the conditions. Not tested for the  
possibility of no hit:


 for (i in 1:nrow(data)) print( category[
  which(apply(category[, -1], 1,
   function(x) {data$b[i]==x[1] & data$c[i] > x[2] & x[3] > data$c[i]})),
 1] )
[1] cat3
Levels: cat1 cat2 cat3 cat4 cat5 cat6
[1] cat1
Levels: cat1 cat2 cat3 cat4 cat5 cat6
[1] cat5
Levels: cat1 cat2 cat3 cat4 cat5 cat6
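
The same matching can be sketched with base R's merge() followed by a range
filter. This is an illustration under assumptions, not the poster's exact
data: the interval bounds in the archived message look truncated, so the
c_start and c_stop values below are hypothetical ones chosen to reproduce the
cat3/cat1/cat5 result, and the objects are named dat and pos rather than data
and c.

```r
# Data rows to classify (pos plays the role of column c in the thread)
dat <- data.frame(a = c("x", "y", "z"),
                  b = c(8, 14, 19),
                  pos = c(200010, 535388, 19929))

# Category table; c_start/c_stop are hypothetical interval bounds
category <- data.frame(d = paste0("cat", 1:6),
                       b1 = c(14, 5, 8, 20, 19, 1),
                       c_start = c(500000, 500000, 200000, 200000, 18000, 600000),
                       c_stop  = c(550000, 550000, 201000, 201000, 20000, 700000))

# Join on b == b1, then keep rows whose pos lies inside [c_start, c_stop]
m <- merge(dat, category, by.x = "b", by.y = "b1")
hits <- m[m$pos >= m$c_start & m$pos <= m$c_stop, c("a", "b", "pos", "d")]
print(hits)
```

Rows of dat with no matching category simply drop out of the result; passing
all.x = TRUE to merge() would keep them, at the cost of handling NA bounds in
the filter.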

A couple of points. It is bad practice to name variables or objects  
c, and also bad practice to name objects data. Both are common R  
function names.



I hope this explains my problem more explicit.

Thanks!
--
View this message in context: 
http://r.789695.n4.nabble.com/Assigning-entries-to-categories-tp2272697p2278334.html
Sent from the R help mailing list archive at Nabble.com.



David Winsemius, MD
West Hartford, CT



Re: [R] Assigning entries to categories

2010-07-05 Thread Gabor Grothendieck
On Mon, Jul 5, 2010 at 8:54 AM, LogLord nils.sch...@web.de wrote:

 OK, thanks for the help!

 Here a more complex example:

 a=c("x","y","z")
 b=c(8,14,19)
 c=c(200010,535388,19929)
 data=data.frame(a,b,c)

 d=c("cat1","cat2","cat3","cat4","cat5","cat6")
 b1=c(14,5,8,20,19,1)
 c_start=c(50,50,20,20,18000,60)
 c_stop=c(55,55,201000,201000,2,70)
 category=data.frame(d,b1,c_start,c_stop)


 Again I want to create a new variable, which automatically assigns the
 category to the data based on matching b = b1 and c >= c_start and
 c <= c_stop.


Try this:

 library(sqldf)

 sqldf("select data.*, d from data, category where data.b = category.b1 and c >= c_start and c <= c_stop")
  a  b      c    d
1 x  8 200010 cat3
2 y 14 535388 cat1
3 z 19  19929 cat5



Re: [R] Assigning entries to categories

2010-07-05 Thread LogLord


Gabor Grothendieck wrote:
 
 On Mon, Jul 5, 2010 at 8:54 AM, LogLord nils.sch...@web.de wrote:

 OK, thanks for the help!

 Here a more complex example:

 a=c("x","y","z")
 b=c(8,14,19)
 c=c(200010,535388,19929)
 data=data.frame(a,b,c)

 d=c("cat1","cat2","cat3","cat4","cat5","cat6")
 b1=c(14,5,8,20,19,1)
 c_start=c(50,50,20,20,18000,60)
 c_stop=c(55,55,201000,201000,2,70)
 category=data.frame(d,b1,c_start,c_stop)


 Again I want to create a new variable, which automatically assigns the
 category to the data based on matching b = b1 and c >= c_start and
 c <= c_stop.

 
 Try this:
 
 library(sqldf)

 sqldf("select data.*, d from data, category where data.b = category.b1
 and c >= c_start and c <= c_stop")
   a  b      c    d
 1 x  8 200010 cat3
 2 y 14 535388 cat1
 3 z 19  19929 cat5
 
 
 


Great! That's what I need! Seems like I need to extend my sql knowledge
urgently...

Thanks a lot!

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Assigning-entries-to-categories-tp2272697p2278524.html
Sent from the R help mailing list archive at Nabble.com.



[R] Plot with whispers

2010-07-05 Thread Ian Bentley
Hello!

I need to make a plot with whiskers that does the following.

Reads in 50 files, each file containing 200 data points.  A file looks like
this:
base100.log
Send Receive
10.5   100.3
15.0   102.4
...

There are 100 lines, each with two data points.  I need to read in the 50
files, and plot three lines

The first line is the mean of the send column with whiskers indicating
standard deviation  (Each file represents one data point)

The second line is the mean of the receive column, as above.

the final plot is the mean of the two summed, with whiskers as above.

There will be 50 data points on the final graph, one for each file.

I've done this sort of a thing before, but I really can't figure out how to
handle the different Columns.

If I use read.table:

x1 <- read.table("updateToSink1010.log")

then x1 becomes a matrix, with two columns and 101 rows.  -- including Send,
Receive.

Anyways, I'd appreciate a push in some direction - hopefully the right one
:).

-- 
Ian Bentley
M.Sc. Candidate
Queen's University
Kingston, Ontario



Re: [R] Plot with whispers

2010-07-05 Thread Matt Shotwell
It looks like read.table is reading the first line as a data value,
which is the default for read.table. Try using read.table with the
argument header=TRUE. Also, consider using a box and whiskers plot for
these data (?boxplot, ?lattice::bwplot).
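
A minimal sketch of the header = TRUE suggestion combined with the
mean-and-whisker plot the original post asks for. Everything here is an
assumption for illustration: two tiny made-up files in a temporary directory
stand in for the 50 logs, and only the Send column is summarised (Receive and
the sum of the two would follow the same pattern).

```r
# Hypothetical stand-ins for the log files, written to a temp directory
log_dir <- tempfile("logs")
dir.create(log_dir)
writeLines(c("Send Receive", "10.5 100.3", "15.0 102.4"),
           file.path(log_dir, "base100.log"))
writeLines(c("Send Receive", "11.0 101.0", "13.0 103.0"),
           file.path(log_dir, "base200.log"))

files <- list.files(log_dir, pattern = "\\.log$", full.names = TRUE)

# One row per file: mean and standard deviation of the Send column
stats <- t(sapply(files, function(f) {
  x <- read.table(f, header = TRUE)  # first line becomes the column names
  c(mean = mean(x$Send), sd = sd(x$Send))
}))

# Plot the per-file means and draw +/- one standard deviation as whiskers
idx <- seq_len(nrow(stats))
lo <- stats[, "mean"] - stats[, "sd"]
hi <- stats[, "mean"] + stats[, "sd"]
plot(idx, stats[, "mean"], type = "b", ylim = range(lo, hi),
     xlab = "file", ylab = "mean Send")
arrows(idx, lo, idx, hi, angle = 90, code = 3, length = 0.05)
```

arrows() with angle = 90 and code = 3 is a common base-graphics idiom for
error bars; it is one option among several, not the only way to draw them.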

-Matt

On Mon, 2010-07-05 at 12:08 -0400, Ian Bentley wrote:
 Hello!
 
 I need to make a plot with whiskers that does the following.
 
 Reads in 50 files, each file containing 200 data points.  A file looks like
 this:
 base100.log
 Send Receive
 10.5   100.3
 15.0   102.4
 ...
 
 There are 100 lines, each with two data points.  I need to read in the 50
 files, and plot three lines
 
 The first line is the mean of the send column with whiskers indicating
 standard deviation  (Each file represents one data point)
 
 The second line is the mean of the receive column, as above.
 
 the final plot is the mean of the two summed, with whiskers as above.
 
 There will be 50 data points on the final graph, one for each file.
 
 I've done this sort of a thing before, but I really can't figure out how to
 handle the different Columns.
 
 If I use read.table:
 
 x1 <- read.table("updateToSink1010.log")
 
 then x1 becomes a matrix, with two columns and 101 rows.  -- including Send,
 Receive.
 
 Anyways, I'd appreciate a push in some direction - hopefully the right one
 :).
 
-- 
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com



[R] Can anybody help me understand AIC and BIC and devise a new metric?

2010-07-05 Thread LosemindL

Hi all,

Could anybody please help me understand AIC and BIC, and especially why
they make sense?

Furthermore, I am trying to devise a new metric related to the model
selection in the financial asset management industry.

As you know the industry uses Sharpe Ratio as the main performance
benchmark, which is the annualized mean of returns divided by the annualized
standard deviation of returns. 

In model selection, we would like to choose a model that yields the highest
Sharpe Ratio.

However, the more parameters you use, the higher Sharpe Ratio you might
potentially get, and the higher risk that your model is overfitted. 

I am trying to think of a AIC or BIC version of the Sharpe Ratio that
facilitates the model selection...

Could anybody please give me some pointers?

Thanks a lot! 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Can-anybody-help-me-understand-AIC-and-BIC-and-devise-a-new-metric-tp2278448p2278448.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] unable to get bigglm working, ATTN: Thomas Lumley

2010-07-05 Thread stephenb

The model fails to converge after more than 3 hours (I went home, so I don't
know how long it took)

 bigglm (formula = resp ~ relage+I(relage^2)+termfac+ri , 
+   data = a, family = binomial(link='logit')); 
Large data regression model: bigglm(formula = resp ~ relage + I(relage^2) +
termfac + ri, 
data = a, family = binomial(link = "logit"))
Sample size =  12758187 
failed to converge after 8  iterations
Warning message:
In bigglm.function(formula = resp ~ relage + I(relage^2) + termfac +  :
  ran out of iterations and failed to converge

SAS converges 
NOTE: PROC LOGISTIC is modeling the probability that resp='1'.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: There were 12758187 observations read from the data set
SRVRUSER.COMMIT.
NOTE: The data set WORK.OUT3 has 11 observations and 15 variables.
NOTE: PROCEDURE LOGISTIC used (Total process time):
  real time   2:25.42
  cpu time1:16.79

I did not see a trace argument in bigglm. Is there another way to see what
is happening?
Thank you.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/unable-to-get-bigglm-working-ATTN-Thomas-Lumley-tp2276524p2278381.html
Sent from the R help mailing list archive at Nabble.com.



[R] plotting data with ellipse confidence intervals

2010-07-05 Thread web reg
Hi,
I would like to plot a set of paired means (as X Y data) with unique
confidence intervals for each (creating a set of ellipses, each with its
own centre point and shape).
Would appreciate any advice out there!
Cheers,
Ged



[R] Help reg Genome view

2010-07-05 Thread mahalakshmi sivamani
Hi,

I have a set of genes and its chromosomal physical position in a text file.
I want to view those genes in the chromosome using R package GenePlotter.
Could any one please tell how to view this.

Thanks in advance.


Yours sincerely,

S.Mahalakshmi



[R] Issue with write.table and read.table : I'm not getting out what I put in

2010-07-05 Thread Irina
Hello,

I am trying to save a large matrix of values in a file. My problem is that I am
writing 
write.table(allpos,'control_chr1.txt', dec=".")
and then I want to check it with
test2=read.table('control_chr1.txt')
sum(test2[,2]==allpos[,2])

This last number is lower than the length of the test2[,2] vector. This is
really annoying me because I can't figure out why I don't get out the same thing
that I put in.



Re: [R] Passing the parameter (file name) to png()

2010-07-05 Thread Maulik Shah
Thanks a lot!

Regards,
Maulik

On Sat, Jun 26, 2010 at 4:43 AM, jim holtman jholt...@gmail.com wrote:

 b <- paste("C:\\rphp\\", arg, sep='')
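
A short sketch of why this fix matters, assuming a made-up file name in place
of the command-line argument: c() builds a two-element character vector,
whereas png() needs a single file name, which paste() with sep = "" produces.

```r
# Hypothetical value standing in for the argument taken from commandArgs()
arg <- "icc.png"

# c() keeps the pieces separate: a character vector of length 2
wrong <- c("C:\\rphp\\", arg)

# paste() with an empty separator joins them into one file name
b <- paste("C:\\rphp\\", arg, sep = "")

print(length(wrong))
print(b)
```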

 On Sat, Jun 26, 2010 at 12:55 AM, Maulik Shah maulik.shah2...@gmail.com
 wrote:
  I am fitting a 3-parameter model to my response matrix and want to generate
  an item characteristic curve.
  I want to specify the file name for saving the item characteristic curve by
  passing it as an external parameter to the R batch script. The following is
  the code I have written for this.
 
  *R Script:*
 
  library(ltm)
  cmd_args = commandArgs();
  for (arg in cmd_args) cat("  ", arg, "\n", sep="")
  respmat <- read.table("C:\\rphp\\responsedata.txt")
  fit3pl <- tpm(respmat)
  cat("  ", arg, "\n", sep="")
  b <- c("C:\\rphp\\",arg)
  png(file=b, bg="transparent")
  plot(fit3pl,items=c,lwd=3)
  dev.off()
  rm(respmat,fit3pl,b)
  q()
 
  Could you please help me in doing so? I get an error message when R executes
  png().
 
  Thanks and Regards,
  Maulik
 
 



 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem that you are trying to solve?




Re: [R] Patch for legend.position={left,top,bottom} in ggplot2

2010-07-05 Thread Sebastian Wurster

Thank you for this nice patch!
To incorporate it you have to open the ggplot2 file in <path to your R 
packages>\ggplot2\R, search for the first line of code and replace it 
with the patch. Don't forget to delete the lines with - and the + in 
front of the new code.




[R] To detect the location of duplicate values

2010-07-05 Thread Moohwan Kim
Dear R family,

I have a question about how to detect some duplicate numeric observations.
Suppose that I have two variables dataset.

order value
1  0.52
2  0.23
3  0.43
4  0.21
5  0.32
6  0.32
7  0.32
8  0.32
9  0.32
10 0.12
11 0.46
12 0.09
13 0.32
14 0.25
;
Could you help me indicate where the duplicate observations in a row
(e.g., 0.32) are?

best,
moohwan



Re: [R] Issue with write.table and read.table : I'm not getting out what I put in

2010-07-05 Thread jim holtman
Why not use 'save' and 'load'?
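
A minimal sketch of that round trip, assuming a small made-up matrix in place
of allpos. save() stores the object in R's binary format, so the values come
back bit-for-bit rather than going through a decimal text representation:

```r
# Hypothetical stand-in for the poster's large matrix
allpos <- matrix(c(0.1 + 0.2, 1/3, 2/3, pi), ncol = 2)
original <- allpos

f <- tempfile(fileext = ".RData")
save(allpos, file = f)

rm(allpos)   # drop the object so load() demonstrably restores it
load(f)      # recreates 'allpos' in the current environment

print(identical(allpos, original))
```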

On Mon, Jul 5, 2010 at 11:49 AM, Irina irina.kr...@epfl.ch wrote:
 Hello,

 I am trying to save a large matrix of values in a file. My problem is that I 
 am
 writing
 write.table(allpos,'control_chr1.txt', dec=".")
 and then I want to check it with
 test2=read.table('control_chr1.txt')
 sum(test2[,2]==allpos[,2])

 This last number is lower than the length of the test2[,2] vector. This is
 really annoying me because I can't figure out why I don't get out the same 
 thing
 that I put in.





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] To detect the location of duplicate values

2010-07-05 Thread Henrique Dallazuanna
Try this:

DF[duplicated(DF$value),]

On Mon, Jul 5, 2010 at 1:31 PM, Moohwan Kim kmhl...@gmail.com wrote:

 Dear R family,

 I have a question about how to detect some duplicate numeric observations.
 Suppose that I have two variables dataset.

 order value
 1  0.52
 2  0.23
 3  0.43
 4  0.21
 5  0.32
 6  0.32
 7  0.32
 8  0.32
 9  0.32
 10 0.12
 11 0.46
 12 0.09
 13 0.32
 14 0.25
 ;
 Could you help me indicate where the duplicate observations in a row
 (e.g., 0.32) are?

 best,
 moohwan





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



Re: [R] To detect the location of duplicate values

2010-07-05 Thread jim holtman
try this

 x
   order value
1      1  0.52
2      2  0.23
3      3  0.43
4      4  0.21
5      5  0.32
6      6  0.32
7      7  0.32
8      8  0.32
9      9  0.32
10    10  0.12
11    11  0.46
12    12  0.09
13    13  0.32
14    14  0.25
 # go both ways to capture all duplicates
 which(duplicated(x$value) | duplicated(x$value, fromLast=TRUE))
[1]  5  6  7  8  9 13
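
If "in a row" in the question means consecutive repeats only (so that row 13
should not be reported), run-length encoding is one alternative. This is a
sketch under that reading of the question, using the values from the post:

```r
value <- c(0.52, 0.23, 0.43, 0.21, 0.32, 0.32, 0.32,
           0.32, 0.32, 0.12, 0.46, 0.09, 0.32, 0.25)

# rle() collapses consecutive equal values into (value, length) pairs
r <- rle(value)
ends <- cumsum(r$lengths)
starts <- ends - r$lengths + 1

# Keep only runs longer than one observation
runs <- data.frame(value = r$values, start = starts, end = ends)[r$lengths > 1, ]
print(runs)
```

Here the only run is 0.32 at positions 5 to 9; the isolated 0.32 at position
13 is left out, unlike with the duplicated() approach above.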



On Mon, Jul 5, 2010 at 12:31 PM, Moohwan Kim kmhl...@gmail.com wrote:
 Dear R family,

 I have a question about how to detect some duplicate numeric observations.
 Suppose that I have two variables dataset.

 order value
 1  0.52
 2  0.23
 3  0.43
 4  0.21
 5  0.32
 6  0.32
 7  0.32
 8  0.32
 9  0.32
 10 0.12
 11 0.46
 12 0.09
 13 0.32
 14 0.25
 ;
 Could you help me indicate where the duplicate observations in a row
 (e.g., 0.32) are?

 best,
 moohwan





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] To detect the location of duplicate values

2010-07-05 Thread Joshua Wiley
Hello Moohwan,

Look at ?duplicated  for example:

 x
[1] 1 1 2 2 3 3
 duplicated(x)
[1] FALSE  TRUE FALSE  TRUE FALSE  TRUE

If your end goal is to get rid of the duplicates, take a look at ?unique

 unique(x)
[1] 1 2 3

Best Regards,

Josh

On Mon, Jul 5, 2010 at 9:31 AM, Moohwan Kim kmhl...@gmail.com wrote:
 Dear R family,

 I have a question about how to detect some duplicate numeric observations.
 Suppose that I have two variables dataset.

 order value
 1  0.52
 2  0.23
 3  0.43
 4  0.21
 5  0.32
 6  0.32
 7  0.32
 8  0.32
 9  0.32
 10 0.12
 11 0.46
 12 0.09
 13 0.32
 14 0.25
 ;
 Could you help me indicate where the duplicate observations in a row
 (e.g., 0.32) are?

 best,
 moohwan





-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



Re: [R] Help reg Genome view

2010-07-05 Thread Martin Morgan
On 07/05/2010 08:51 AM, mahalakshmi sivamani wrote:
 Hi,
 
 I have a set of genes and its chromosomal physical position in a text file.
 I want to view those genes in the chromosome using R package GenePlotter.
 Could any one please tell how to view this.

Hi S. Mahalakshmi,

The package is geneplotter.

a) read the vignette

  browseVignettes('geneplotter')

b) ask on the Bioconductor mailing list

  http://bioconductor.org/docs/mailList.html

c) see additional packages, esp. GenomeGraphs at

  http://bioconductor.org/packages/release/Software.html

Martin

 
 Thanks in advance.
 
 
 Yours sincerely,
 
 S.Mahalakshmi
 


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



Re: [R] plotting data with ellipse confidence intervals

2010-07-05 Thread Peter Ehlers

On 2010-07-05 7:48, web reg wrote:

Hi,
I would like to plot a set of paired means (as X Y data) with unique
confidence intervals for each (creating a set of ellipses, each with it's
own centre point and shape).
Would appreciate any advice out there!
Cheers,
Ged



If you have only the means, then you can't
plot CIs and ellipses.

If you have the original paired data, then
have a look at the car::ellipse function and
friends.
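
For readers without the car package, the idea can be sketched in base R. This
is not car::ellipse itself but a hand-rolled construction under stated
assumptions: simulated paired data, and a 95% confidence region for the mean
based on Hotelling's T-squared, which may or may not be the kind of interval
the original poster had in mind.

```r
set.seed(1)
# Hypothetical paired observations (one group); real data would replace this
xy <- cbind(x = rnorm(30, mean = 10, sd = 2),
            y = rnorm(30, mean = 5,  sd = 1))

n <- nrow(xy)
p <- ncol(xy)
center <- colMeans(xy)
S_mean <- cov(xy) / n   # covariance matrix of the mean vector

# Squared radius of the 95% region (Hotelling T^2 scaling of an F quantile)
r2 <- p * (n - 1) / (n - p) * qf(0.95, p, n - p)

# Map a unit circle through the Cholesky factor, then shift to the centre
theta <- seq(0, 2 * pi, length.out = 200)
circle <- cbind(cos(theta), sin(theta))
ell <- sweep(sqrt(r2) * circle %*% chol(S_mean), 2, center, "+")

plot(xy, col = "grey50")
lines(ell)
points(center[1], center[2], pch = 19)
```

Repeating the same steps per group, and calling lines() once for each, would
give one ellipse per pair of means.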

  -Peter Ehlers



Re: [R] Can anybody help me understand AIC and BIC and devise a new metric?

2010-07-05 Thread David Winsemius


On Jul 5, 2010, at 10:35 AM, LosemindL wrote:



Hi all,

Could anybody please help me understand AIC and BIC and especially why do
they make sense?

Furthermore, I am trying to devise a new metric related to the model
selection in the financial asset management industry.

As you know the industry uses Sharpe Ratio as the main performance
benchmark, which is the annualized mean of returns divided by the annualized
standard deviation of returns.

In model selection, we would like to choose a model that yields the highest
Sharpe Ratio.

However, the more parameters you use, the higher Sharpe Ratio you might
potentially get, and the higher risk that your model is overfitted.

I am trying to think of a AIC or BIC version of the Sharpe Ratio that
facilitates the model selection...

Anybody could you please give me some pointers?


From: http://www.R-project.org/posting-guide.html

Basic statistics and classroom homework: R-help is not intended for  
these.


Perhaps following some link on Wikipedia, instead?

--
David Winsemius, MD
West Hartford, CT



Re: [R] Issue with write.table and read.table : I'm not getting out what I put in

2010-07-05 Thread David Winsemius


On Jul 5, 2010, at 11:49 AM, Irina wrote:


> Hello,
>
> I am trying to save a large matrix of values in a file. My problem is
> that I am writing
> write.table(allpos,'control_chr1.txt', dec=".")
> and then I want to check it with
> test2=read.table('control_chr1.txt')
> sum(test2[,2]==allpos[,2])
>
> This last number is lower than the length of the test2[,2] vector. This
> is really annoying me because I can't figure out why I don't get out the
> same thing that I put in.

Many potential problems could underlie getting FALSE for an == test.
One might be FAQ 7.31. Another might be encoding or locale issues
related to the decimal separator. Why not post a reproducible example?


--

David Winsemius, MD
West Hartford, CT



Re: [R] Patch for legend.position={left,top,bottom} in ggplot2

2010-07-05 Thread Hadley Wickham
Or wait a couple of days for the next release of ggplot2...

Hadley

On Mon, Jul 5, 2010 at 11:28 AM, Sebastian Wurster
sebastian.wurs...@gmx.de wrote:
 Thank you for this nice patch!
 To incorporate it you have to open the ggplot2 file in path to your R
 packages\ggplot2\R, search for the first line of code and replace it with
 the patch. Don't forget to delete the lines with - and the + in front of
 the new code.





-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/



Re: [R] Issue with write.table and read.table : I'm not getting out what I put in

2010-07-05 Thread Peter Ehlers

On 2010-07-05 11:30, David Winsemius wrote:


> On Jul 5, 2010, at 11:49 AM, Irina wrote:
>
>> Hello,
>>
>> I am trying to save a large matrix of values in a file. My problem is
>> that I am writing
>> write.table(allpos,'control_chr1.txt', dec=".")
>> and then I want to check it with
>> test2=read.table('control_chr1.txt')
>> sum(test2[,2]==allpos[,2])
>>
>> This last number is lower than the length of the test2[,2] vector. This
>> is really annoying me because I can't figure out why I don't get out the
>> same thing that I put in.
>
> Many potential problems could underlie getting FALSE for an == test.
> One might be FAQ 7.31. Another might be encoding or locale issues
> related to the decimal separator. Why not post a reproducible example?



David's advice is spot-on. (As is Jim's: using save/load is better.)

I have no problem replicating your 'problem' with
random data, showing once again the futility of
using == in situations such as this. Try instead:

 all.equal(test2, allpos, check.attributes = FALSE)

(why the check.attributes argument may be needed
is left as an exercise)
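
A minimal sketch of that comparison on random data, for anyone who wants
to try it. The matrix and the temporary file are made up for the
illustration, and the data frame is converted with as.matrix() before the
comparison, which is an assumption of this sketch rather than part of the
advice above.

```r
set.seed(42)
allpos <- matrix(rnorm(2000), ncol = 2)       # stand-in for the real matrix

tmp <- tempfile(fileext = ".txt")
write.table(allpos, tmp, dec = ".")           # written with limited digits
test2 <- read.table(tmp)                      # header/row names auto-detected

# Exact comparison: the count can fall short of nrow(allpos)
exact_matches <- sum(test2[, 2] == allpos[, 2])

# Tolerant comparison: ignores tiny representation differences
close_enough <- all.equal(as.matrix(test2), allpos, check.attributes = FALSE)

unlink(tmp)
```

If bit-for-bit identity matters, save()/load() avoids the text round trip
altogether.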

  -Peter Ehlers



Re: [R] Counting defined character within String

2010-07-05 Thread Charles C. Berry

On Mon, 5 Jul 2010, Kunzler, Andreas wrote:


Dear list,

I'm looking for a way to count the number of | within an object.
The character | is used to separated ids.

Assume a data (d) structure like

Var
NA
NA
NA
NA
NA
1
1|2
1|22|45
3
4b|24789

I need to know the maximum number of ids within one object. In this case 3 
(1|22|45)


Does anybody know a better way?


See

?max
?count.fields

and, if you are not using this on a text file,

?textConnection


> count.fields(textConnection(
+ "Var
+ NA
+ NA
+ NA
+ NA
+ NA
+ 1
+ 1|2
+ 1|22|45
+ 3
+ 4b|24789
+ "),sep="|")
 [1] 1 1 1 1 1 1 1 2 3 1 2
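
If the ids already sit in a data frame column, a sketch along the
following lines avoids the text connection. The data frame d is rebuilt
here from the example in the question, and treating NA as "zero ids" is
an assumption of this sketch.

```r
d <- data.frame(Var = c(NA, NA, NA, NA, NA, "1", "1|2", "1|22|45", "3",
                        "4b|24789"),
                stringsAsFactors = FALSE)

# Split each entry on a literal "|" (fixed = TRUE: no regex interpretation)
ids <- strsplit(d$Var, "|", fixed = TRUE)

# Number of ids per entry; NA entries are counted as holding no id
n_ids <- sapply(ids, length)
n_ids[is.na(d$Var)] <- 0L

max_ids <- max(n_ids)   # largest number of ids in any one entry
```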

HTH,

Chuck



Thanks

Kind regards,

Andreas Kunzler

Bundeszahnärztekammer (BZÄK)
Chausseestraße 13
10115 Berlin

Tel.: 030 40005-113
Fax:  030 40005-119

E-Mail: a.kunz...@bzaek.de




Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



Re: [R] To detect the location of duplicate values

2010-07-05 Thread Charles C. Berry

On Mon, 5 Jul 2010, Moohwan Kim wrote:


Dear R family,

I have a question about how to detect some duplicate numeric observations.
Suppose that I have two variables dataset.

order value
1  0.52
2  0.23
3  0.43
4  0.21
5  0.32
6  0.32
7  0.32
8  0.32
9  0.32
10 0.12
11 0.46
12 0.09
13 0.32
14 0.25
;
Could you help me indicate where the duplicate observations in a row
(e.g., 0.32) are?


I see you already have replies about duplicated() and unique(), which are
very handy for the 'detect' part of your query.

But to list the locations of the duplicated elements, you might also
benefit from using split() and Filter() like this:

> Filter( function(x) length(x) > 1, split(order, value) )
$`0.32`
[1]  5  6  7  8  9 13
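
A self-contained version of the same call, using the data from the
question. The vector is named ord here (rather than order) only to avoid
masking the base function, and the rle() part is an extra assumption
about what "in a row" means: runs of consecutive repeats.

```r
ord   <- 1:14
value <- c(0.52, 0.23, 0.43, 0.21, 0.32, 0.32, 0.32, 0.32, 0.32,
           0.12, 0.46, 0.09, 0.32, 0.25)

# Positions of every value that occurs more than once
dups <- Filter(function(x) length(x) > 1, split(ord, value))

# Runs of consecutive identical values, with their start and end positions
runs   <- rle(value)
ends   <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1
in_a_row <- data.frame(value = runs$values, start = starts,
                       end = ends)[runs$lengths > 1, ]
```

Here dups lists positions 5 to 9 and 13 for the value 0.32, while
in_a_row keeps only the consecutive stretch from position 5 to 9.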


HTH,

Chuck





best,
moohwan




Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



[R] export VTK from R : impossible to write data as float

2010-07-05 Thread alexandre pryet
Hello,

I've written a short code (below) to write 3D unstructured grid to binary
VTK files from R. Problem : I can only write integers with the command :

data <- as.numeric(c(3.3))
storage.mode(data) <- 'integer'
writeBin(data,bfile_celldata,endian="swap")

the function storage.mode(data) <- 'long' looks fine, but the VTK file is not
readable.

thanks for any help or comments,

#- R SCRIPT TO WRITE A CUBE IN VTK FORMAT

cat('# vtk DataFile Version 3.0\n',file=vtk_header)
cat('R Binary Export v3.0 of inversion model\nBINARY\n',file=vtk_header,
append=TRUE)
cat('DATASET UNSTRUCTURED_GRID\n',file=vtk_header, append=TRUE)
#placed here instead of top of vtk_points, since npoints is not known before
cat('POINTS', 8,'int\n',file=vtk_header,append=TRUE)

#write points
writeBin(as.integer(c(0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1,
0, 1, 1, 1, 1, 1)),bfile_points,endian="swap")

#cells header
cat('\nCELLS', 1,9,'\n',file=cells_header)
#write cells
writeBin(as.integer(c(8, 0, 1, 3, 2, 4, 5, 7, 6)),bfile_cells,endian="swap")

#cell types header
cat('\nCELL_TYPES', 1,'\n',file=celltypes_header)
#write cell types
writeBin(as.integer(c(12)),bfile_celltypes,endian="swap")

#cell data header
cat('\nCELL_DATA',1,'\n',file=celldata_header)
cat('SCALARS R float 1','\n',file=celldata_header, append=TRUE)
cat('LOOKUP_TABLE default\n',file=celldata_header, append=TRUE)
#write cell data
data <- as.numeric(c(3.3))
storage.mode(data) <- 'integer'
writeBin(data,bfile_celldata,endian="swap")

#close binary connections
close(bfile_points)
close(bfile_cells)
close(bfile_celltypes)
close(bfile_celldata)

#concatenate files to produce VTK
system(paste('cat',vtk_header,bfile_points,cells_header,bfile_cells,celltypes_header,bfile_celltypes,celldata_header,bfile_celldata,">","testb_unstructured.vtk",sep=" "))
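
One possible direction, offered as a sketch rather than a tested VTK
recipe: writeBin() can write a double vector as 4-byte single-precision
values through its size argument, so the data need not be coerced to
integer. The example writes to a raw vector instead of a file, and the
choice endian = "big" assumes the usual big-endian layout of legacy
binary VTK files.

```r
values <- c(3.3, 1.5)                      # cell data to be stored as float

# size = 4 stores each number as a 4-byte float instead of an 8-byte double
bytes <- writeBin(values, raw(), size = 4, endian = "big")

# Read the bytes back to check the round trip (single precision only)
back <- readBin(bytes, what = "numeric", n = length(values),
                size = 4, endian = "big")
```

In the script above this would correspond to passing the numeric vector
and size = 4 to the writeBin() call for the cell data, leaving the
'SCALARS R float 1' header line as it is.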



Re: [R] Plot with whispers

2010-07-05 Thread Ian Bentley
Thanks Matt,
I've been trying to get the data into a format that boxplot will accept, but
I'm having trouble.
If I read in my file directly

data <- read.table("base100.log")
plot(data)

It plots the Send data against the Receive data, using one as x, and one as
y.

That's not too surprising, so I tried to just plot the Send data

plot(data[1])

It plots all the points horizontally along the x axis, with no associated y
data.  This is very similar to what I would like to do - associate all 100
points with a y co-ordinate.

I couldn't get this step to work though, I tried a number of things

ydata <- seq(length=100, from=1, by=0)
p1 <- c(data[1], ydata)

seems to be the close to what I want, but it isn't quite right.  Can anyone
give me an idea how to associate the 100 data points with a y-coord, so that
I can then use them in a boxplot/whiskerplot?
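
A minimal sketch of the header = TRUE suggestion quoted below, assuming
each log file has a "Send Receive" header line followed by two numeric
columns. The file written here is simulated so the example runs on its
own.

```r
# Write a small stand-in for base100.log: header line plus 100 data rows
set.seed(1)
tmp <- tempfile(fileext = ".log")
writeLines(c("Send Receive",
             sprintf("%.1f %.1f", rnorm(100, 10, 5), rnorm(100, 100, 10))),
           tmp)

# header = TRUE keeps "Send" and "Receive" as column names, not data
dat <- read.table(tmp, header = TRUE)

# One box-and-whisker per column; no artificial y coordinate is needed
boxplot(dat, ylab = "value", main = "base100.log (simulated)")

unlink(tmp)
```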

Thanks
ian

On 5 July 2010 12:31, Matt Shotwell shotw...@musc.edu wrote:

 It looks like read.table is reading the first line as a data value,
 which is the default for read.table. Try using read.table with the
 argument header=TRUE. Also, consider using a box and whiskers plot for
 these data (?boxplot, ?lattice::bwplot).

 -Matt

 On Mon, 2010-07-05 at 12:08 -0400, Ian Bentley wrote:
  Hello!
 
  I need to make a plot with whiskers that does the following.
 
  Reads in 50 files, each file containing 200 data points.  A file looks
 like
  this:
  base100.log
  Send Receive
  10.5   100.3
  15.0   102.4
  ...
 
  There are 100 lines, each with two data points.  I need to read in the 50
  files, and plot three lines
 
  The first line is the mean of the send column with whiskers indicating
  standard deviation  (Each file represents one data point)
 
  The second line is the mean of the receive column, as above.
 
  the final plot is the mean of the two summed, with whiskers as above.
 
  There will be 50 data points on the final graph, one for each file.
 
  I've done this sort of a thing before, but I really can't figure out how
 to
  handle the different Columns.
 
  If I use read.table:
 
  x1 <- read.table("updateToSink1010.log")
 
  then x1 becomes a matrix, with two columns and 101 rows.  -- including
 Send,
  Receive.
 
  Anyways, I'd appreciate a push in some direction - hopefully the right
 one
  :).
 
 --
 Matthew S. Shotwell
 Graduate Student
 Division of Biostatistics and Epidemiology
 Medical University of South Carolina
 http://biostatmatt.com




-- 
Ian Bentley
M.Sc. Candidate
Queen's University
Kingston, Ontario



Re: [R] Plot with whispers

2010-07-05 Thread Dennis Murphy
Hi:

This sounds like your standard error bar plot. Here's one way to get it,
using lists, melt() from the reshape package and ggplot2.

library(reshape)   # melt()
library(plyr)      # ddply(), summarise()
library(ggplot2)

# Generate 50 fake data sets with 200 rows and variables send, receive:
for(i in seq_len(50)) assign(paste('df', i, sep = ''),
   data.frame(send = rnorm(200, 10, 5), receive = rnorm(200, 105, 10)))

# Generate a vector of data frame names
dnames <- paste('df', 1:50, sep = '')

# Create the function for processing one data frame. In this case, we
# want to melt the data first so that the variable names become factor
# levels and the data values are correspondingly stacked. We then use
# ddply() from package plyr to produce the mean and standard deviation
# from each variable.
f <- function(df) {
   u <- melt(df)
   ddply(u, .(variable), summarise, m = mean(value), s = sd(value))
 }

# Slurp the data frames into a list and then add send and receive together
# This can probably be done without a loop using sapply, but the loop
# should be about as fast.
l <- vector('list', length(dnames))
for(i in seq_along(dnames)) {
   l[[i]] <- get(dnames[[i]])
   l[[i]]$both <- with(l[[i]], send + receive)
}

# Now, apply the function f to each data frame in the list, and rbind the
# results together. Afterward, create dsn to distinguish the different
# data frames (I chose to use the numbers only as they can be used
# as the x-axis in the plots below.)

out <- do.call(rbind, lapply(l, f))
out$dsn <- rep(1:length(dnames), each = 3)

# Create the error bar plots for each of the 50 data frames by each
# variable, where the dot represents the mean and the ends of the
# segments represent a 1 SD distance from the mean.

p <- ggplot(out, aes(x = dsn, y = m, ymin = m - s, ymax = m + s))
p + geom_point(size = 2) + geom_errorbar(width = 0) +
    facet_grid(variable ~ ., scales = 'free_y') +
    xlab('Data set number')

Substitute your actual data frames for the fake ones (in particular,
redefine dnames) and you should be good to go if you like the plot.

HTH,
Dennis

On Mon, Jul 5, 2010 at 9:08 AM, Ian Bentley ian.bent...@gmail.com wrote:

 Hello!

 I need to make a plot with whiskers that does the following.

 Reads in 50 files, each file containing 200 data points.  A file looks like
 this:
 base100.log
 Send Receive
 10.5   100.3
 15.0   102.4
 ...

 There are 100 lines, each with two data points.  I need to read in the 50
 files, and plot three lines

 The first line is the mean of the send column with whiskers indicating
 standard deviation  (Each file represents one data point)

 The second line is the mean of the receive column, as above.

 the final plot is the mean of the two summed, with whiskers as above.

 There will be 50 data points on the final graph, one for each file.

 I've done this sort of a thing before, but I really can't figure out how to
 handle the different Columns.

 If I use read.table:

 x1 <- read.table("updateToSink1010.log")

 then x1 becomes a matrix, with two columns and 101 rows.  -- including
 Send,
 Receive.

 Anyways, I'd appreciate a push in some direction - hopefully the right one
 :).

 --
 Ian Bentley
 M.Sc. Candidate
 Queen's University
 Kingston, Ontario





Re: [R] Some questions about R's modelling algebra

2010-07-05 Thread Kingsford Jones
On Fri, Jul 2, 2010 at 8:16 AM, Hadley Wickham had...@rice.edu wrote:
 ?formula in R 2.9.2 says in para 2:
 The %in% operator indicates that the terms on its left are nested
 within those on the right. For example a + b %in% a expands to the
 formula a + a:b. 

 Ooops, missed that.  So b %in% a = a:b, and that's what's meant by
 different coding.

Or would this be true only if b %in% a was preceded by a?

attr(terms(y ~ B %in% A), 'term.labels')
#[1] "B:A"
attr(terms(y ~ B + B %in% A), 'term.labels')
#[1] "B"   "B:A"
attr(terms(y ~ A + B %in% A), 'term.labels')
#[1] "A"   "A:B"

suggesting a documentation buglet in Sec 11.1 of An Introduction to R,
where it states:

\begin{quote}
y ~ A*B
y ~ A + B + A:B
y ~ B %in% A
y ~ A/B
Two factor non-additive model of y on A and B. The first two
specify the same crossed classification and the second two specify the
same nested classification. In abstract terms all four specify the
same model subspace.
\end{quote}

I think y ~ B %in% A  should be changed to y ~ A + B %in% A since

attr(terms(y ~ A/B), 'term.labels')
#[1] "A"   "A:B"
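
For anyone who wants to repeat this kind of check, a small sketch of the
inspection itself. The helper labels_of() is a made-up name, and only the
crossed and the A/B forms are compared here; the %in% variants can be
passed to the same helper, though how they are labelled may depend on the
R version.

```r
# Expanded term labels of a model formula (no data are needed for this)
labels_of <- function(f) attr(terms(f), "term.labels")

crossed_short <- labels_of(y ~ A * B)          # shorthand for crossing
crossed_long  <- labels_of(y ~ A + B + A:B)    # same model written out
nested        <- labels_of(y ~ A / B)          # B nested within A

print(crossed_short)
print(nested)
```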

Or am I missing something?

Kingsford



 Hadley

 --
 Assistant Professor / Dobelman Family Junior Chair
 Department of Statistics / Rice University
 http://had.co.nz/





[R] selection of optim parameters

2010-07-05 Thread Fabian Gehring

Hi all,

I am trying to reproduce the results of a study using a different data
set. I'm using about 450 observations. The code I've written seems to
work well, but I have some trouble minimizing the negative of the
log-likelihood function, which has 5 free parameters.


As starting values I am using the results of the paper I am reproducing.
One evaluation of the function takes about 0.65 sec (system.time).
Since the free parameters should be within some boundaries, I am using
the following command:


optim(fn=calculateLogLikelyhood, c(0.4, 2, 0.4, 8, 0.8),
lower=c(0,0,0,0,0), upper=c(1, 5, Inf, Inf, 1), control=list(trace=1,
maxit=1000))


Unfortunately the result doesn't seem to be reasonable: 3 of the
optimized parameters are on the boundaries.


I don't have much experience with optimization methods, which is why
I am asking you. Do you have any hints about what should be taken into
account when doing such an optimization?


Is there a good way to implement the boundaries in the code (instead
of doing it while optimizing)? I've read about parscale in the
help section, but I don't really know how to use it. Could it help
here? What other points/controls should be taken into account?


I know that this is rather little information about my current code,
but I don't know what you need in order to give me some advice. Just
let me know what you need to know.


Thanks



[R] data.frame: adding a column that is based on ranges of values in another column

2010-07-05 Thread Abdi, Abdulhakim
Dear List,

I've been looking tirelessly for a solution to this dilemma but without 
success. Perhaps someone has an idea that will guide me in the right direction.

Suppose I have the following data.frame:

DF = data.frame(X = c(114.5508, 114.6468, 114.6596, 114.6957, 114.6828, 
114.8903, 114.9519, 114.8842,
114.8579, 114.8489), Y = c(47.14094, 46.98874, 46.91235, 46.88265, 46.80584, 
46.67022, 46.53264, 46.47727,
46.46457, 46.47032), Date = as.Date(c('2009-01-01', '2009-01-03', '2009-01-05', 
'2009-01-10', '2009-01-14',
'2009-01-15', '2009-01-16', '2009-01-17', '2009-01-22', '2009-01-29')))

DF
          X        Y       Date
1  114.5508 47.14094 2009-01-01
2  114.6468 46.98874 2009-01-03
3  114.6596 46.91235 2009-01-05
4  114.6957 46.88265 2009-01-10
5  114.6828 46.80584 2009-01-14
6  114.8903 46.67022 2009-01-15
7  114.9519 46.53264 2009-01-16
8  114.8842 46.47727 2009-01-17
9  114.8579 46.46457 2009-01-22
10 114.8489 46.47032 2009-01-29

I also have two objects that contain the dates of the first and last fortnight 
of the month of January 2009.

s.d1 = '2009-01-01'
e.d1 = '2009-01-14'
f.n1 = seq(from = as.Date(s.d1)  , to =  as.Date(e.d1), by = 1)

f.n1
[1] 2009-01-01 2009-01-02 2009-01-03 2009-01-04 2009-01-05 
2009-01-06 2009-01-07 2009-01-08 2009-01-09 2009-01-10 2009-01-11 
2009-01-12 2009-01-13 2009-01-14

s.d2 = '2009-01-15'
e.d2 = '2009-01-31'
f.n2 = seq(from = as.Date(s.d2)  , to =  as.Date(e.d2), by = 1)

f.n2
[1] 2009-01-15 2009-01-16 2009-01-17 2009-01-18 2009-01-19 
2009-01-20 2009-01-21 2009-01-22 2009-01-23 2009-01-24 2009-01-25 
2009-01-26 2009-01-27 2009-01-28 2009-01-29 2009-01-30 2009-01-31


I'm trying to add a column called Fortnight to the existing data.frame. The 
components of the new Fortnight column are based on the existing Date 
column so that if the value in Date falls within the first fortnight (f.n1) 
then the value of the new Fortnight column would be FN1, and if the value 
of the Date column falls within the second fortnight (f.n2), then the value 
of the Fortnight column would be FN2, and so on.

The end result should look like:

          X        Y       Date Fortnight
1  114.5508 47.14094 2009-01-01       FN1
2  114.6468 46.98874 2009-01-03       FN1
3  114.6596 46.91235 2009-01-05       FN1
4  114.6957 46.88265 2009-01-10       FN1
5  114.6828 46.80584 2009-01-14       FN1
6  114.8903 46.67022 2009-01-15       FN2
7  114.9519 46.53264 2009-01-16       FN2
8  114.8842 46.47727 2009-01-17       FN2
9  114.8579 46.46457 2009-01-22       FN2
10 114.8489 46.47032 2009-01-29       FN2

I manually entered the above values for the Fortnight column to illustrate my 
point, however, that would be quite tiresome for 500+ rows of data ;-)
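
One way this could be approached, sketched under the assumption that each
fortnight is fully described by its start date: findInterval() returns,
for every date, the index of the last start date that is not later than
it. The vector starts is a made-up name, and only the Date column of DF
is rebuilt here to keep the example short.

```r
DF <- data.frame(Date = as.Date(c('2009-01-01', '2009-01-03', '2009-01-05',
                                  '2009-01-10', '2009-01-14', '2009-01-15',
                                  '2009-01-16', '2009-01-17', '2009-01-22',
                                  '2009-01-29')))

# First day of each fortnight; extend this vector for further fortnights
starts <- as.Date(c('2009-01-01', '2009-01-15'))

# Index of the fortnight each date falls into (0 would mean "before FN1")
idx <- findInterval(as.numeric(DF$Date), as.numeric(starts))

DF$Fortnight <- paste('FN', idx, sep = '')
```

With more start dates in starts, the same three lines label FN3, FN4 and
so on without any manual entry.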

The only other similar issue I found on the list was 
https://stat.ethz.ch/pipermail/r-help/2008-February/153995.html but that 
particular problem is slightly different than what I'm trying to accomplish 
here.

I appreciate your time and assistance.

Thanks in advance.

Regards,


Hakim Abdi



_
Abdulhakim Abdi, M.Sc.
Research Intern

Conservation GIS/Remote Sensing Lab
Smithsonian Conservation Biology Institute
1500 Remount Road
Front Royal, VA 22630
phone: +1 540 635 6578
mobile: +1 747 224 7006
fax: +1 540 635 6506 (Attn:GIS Lab)
email: ab...@si.edu
http://nationalzoo.si.edu/SCBI/ConservationGIS/








Re: [R] Can anybody help me understand AIC and BIC and devise a new metric?

2010-07-05 Thread Dennis Murphy
Hi:

On Mon, Jul 5, 2010 at 7:35 AM, LosemindL comtech@gmail.com wrote:


 Hi all,

 Could anybody please help me understand AIC and BIC and especially why do
 they make sense?


Any good text that discusses model selection in detail will have some
discussion of AIC and BIC. Frank Harrell's book 'Regression Modeling
Strategies' comes immediately to mind, along with Hastie, Tibshirani and
Friedman (Elements of Statistical Learning) and Burnham and Anderson's
book (Model Selection and Multi-Model Inference), but there are many
other worthy texts that cover the topic. The gist is that AIC and BIC
penalize the log likelihood of a model by subtracting different functions
of its number of parameters. David's suggestion of Wikipedia is also on
target.
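
To make the penalty concrete, here is a small sketch that computes both
criteria by hand for a linear model on the built-in cars data and
compares them with R's AIC() and BIC(). The model is only an
illustration and has nothing to do with Sharpe ratios.

```r
fit <- lm(dist ~ speed, data = cars)

ll <- logLik(fit)        # maximized log likelihood
k  <- attr(ll, "df")     # number of estimated parameters (incl. sigma)
n  <- nobs(fit)          # number of observations

# Both are -2 * logLik plus a penalty that grows with k
aic_by_hand <- -2 * as.numeric(ll) + 2 * k
bic_by_hand <- -2 * as.numeric(ll) + log(n) * k

c(AIC = aic_by_hand, BIC = bic_by_hand)
```

The only difference is the weight on k: 2 for AIC, log(n) for BIC, so BIC
penalizes extra parameters more heavily once n exceeds about 7.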


 Furthermore, I am trying to devise a new metric related to the model
 selection in the financial asset management industry.

 As you know the industry uses Sharpe Ratio as the main performance
 benchmark, which is the annualized mean of returns divided by the
 annualized
 standard deviation of returns.


I didn't know, but thank you for the information. Isn't this simply a
signal-to-noise ratio quantified on an annual basis?


 In model selection, we would like to choose a model that yields the highest
 Sharpe Ratio.

 However, the more parameters you use, the higher Sharpe Ratio you might
 potentially get, and the higher risk that your model is overfitted.

 I am trying to think of a AIC or BIC version of the Sharpe Ratio that
 facilitates the model selection...


You might be able to make some progress if you can express the (penalized)
log likelihood as a function of the Sharpe ratio. But if you have several
years of data in your model and the ratio is computed annually, then isn't
it a random variable rather than a parameter? If so, it changes the nature
of the problem, no? (Being unfamiliar with the Sharpe ratio, I fully
recognize that I may be completely off-base in this suggestion, but I'll
put it out there anyway :)

BTW, you might find the R-sig-finance list to be a more productive resource
in this problem than R-help due to the specialized nature of the question.

HTH,
Dennis


 Anybody could you please give me some pointers?

 Thanks a lot!




Re: [R] Data Labels in a barchart (Lattice or otherwise)

2010-07-05 Thread RaoulD

Thank You David. Yes, I am using the lattice barchart and have managed to add
data labels, however, they tend to be on the tip of each bar and are
difficult to read as they are partially on the bar. Any help would be
greatly appreciated.

This is the code I am using:
 levels(PR_SUMMARY$Bucket)=c("0-3 months","3-9 months","9-15 months","15-18 months")
 barchart(PrimaryReason ~ cInteractions| Bucket + Type, data = PR_SUMMARY,
   layout = c(4, 2),col="lightgreen",main="COMPARISON - PRIMARY REASON",
   sub="L  R",xlab="Number of Customers",ylab="Primary Reasons",
   auto.key = list(title = "COMPARISON - PRIMARY REASON",columns=2,
     points = FALSE, rectangles = TRUE,space="right"),
   scales = list(x = list(abbreviate=TRUE,minlength=5,rot=45)),
   panel = function(x,y,subscripts,groups,...){
     panel.barchart(x,y,...)
     ltext(x,y,label=round(PR_SUMMARY$cInteractions,1),cex=.99,rot=45)
     border="transparent"})

I don't really understand the ltext part and found it with some other code,
but it works.
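
A possible way to move the labels off the bar tips, sketched on made-up
data: ltext() (also available as panel.text()) draws text at panel
coordinates, and its pos argument places the label beside the point
instead of on it. The toy data frame and the widened xlim are assumptions
of this sketch. Taking the labels from the panel's own x, rather than
from the whole PR_SUMMARY column, may also matter when there are several
panels.

```r
library(lattice)

toy <- data.frame(reason = factor(c("Billing", "Delivery", "Quality")),
                  n = c(12.4, 30.1, 7.8))

p <- barchart(reason ~ n, data = toy, origin = 0,
              xlim = c(0, max(toy$n) * 1.15),   # leave room for the labels
              panel = function(x, y, ...) {
                panel.barchart(x, y, ...)
                # pos = 4 puts each label to the right of the bar end
                panel.text(x, y, labels = round(x, 1), pos = 4, cex = 0.8)
              })
print(p)
```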

Thanks again,
Raoul


[R] How to determine if R is 64 bit compiled under Unix-alike?

2010-07-05 Thread Przemek Grabowicz
Under MacOS I had an R64 executable, so it was clear. Under Ubuntu, where I
do not have administrative rights, there is only an R executable. It
seems that I can allocate more than 3GB of memory; however, not
everything seems to work the same way as with R64 under MacOS.
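
For reference, a short sketch of how a running session can report this
itself, with no administrative rights needed. It relies only on base R;
the variable names are made up.

```r
# Size of a pointer in bytes: 8 on a 64-bit build of R, 4 on a 32-bit one
pointer_bytes <- .Machine$sizeof.pointer
is_64bit <- pointer_bytes == 8

# Architecture string of the build, e.g. "x86_64"
arch <- R.version$arch

cat("64-bit build:", is_64bit, "| arch:", arch, "\n")
```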


Pms.



Re: [R] unable to get bigglm working, ATTN: Thomas Lumley

2010-07-05 Thread stephenb

I decided to give it 1 more variable, which is strongly significant to help
the optimization and it throws:

 bigglm (formula = resp ~ relage+relage2+termfac+ri+sn , 
+   data = a, family = binomial(link='logit')); 
Error in bigglm.function(formula = resp ~ relage + relage2 + termfac +  : 
  model matrices incompatible


Re: [R] selection of optim parameters

2010-07-05 Thread Charles C. Berry

On Mon, 5 Jul 2010, Fabian Gehring wrote:


> Hi all,
>
> I am trying to reproduce the results of a study using a different data
> set. I'm using about 450 observations. The code I've written seems to
> work well, but I have some trouble minimizing the negative of the
> log-likelihood function, which has 5 free parameters.
>
> As starting values I am using the results of the paper I am reproducing.
> One evaluation of the function takes about 0.65 sec (system.time).
> Since the free parameters should be within some boundaries, I am using
> the following command:
>
> optim(fn=calculateLogLikelyhood, c(0.4, 2, 0.4, 8, 0.8),
> lower=c(0,0,0,0,0), upper=c(1, 5, Inf, Inf, 1), control=list(trace=1,
> maxit=1000))
>
> Unfortunately the result doesn't seem to be reasonable: 3 of the
> optimized parameters are on the boundaries.


You haven't said why this is unreasonable.

A suggestion: Profile the loglikelihood around the starting value and 
around the putative maximum. (This might help with the parscale issue.)


Also, you might try something like

apply(rbind(0,eps*as.matrix(expand.grid(rep(list(c(-1,0,1)),5)))),1,
function(x) calculateLogLikelyhood( x + y ) )

where y is the starting value (or the value achieved by optim()) and 'eps'
is small enough to make small changes in the function value. This might
help you see what gives. (It might be necessary to scale each column in
as.matrix(...) separately, though.)


In addition to inspecting the results by eyeball, you can fit the results 
to a quadratic form in rbind(...) using lm() and then figure out roughly 
where to go to find the maximum of your function.
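
A runnable sketch of this probing idea on a toy problem. The function
toy_nll is a made-up stand-in for calculateLogLikelyhood with a known
interior minimum, method = "L-BFGS-B" is spelled out because box
constraints need it, and the parscale values are just rough guesses at
the typical size of each parameter.

```r
target  <- c(0.4, 2, 0.4, 8, 0.8)
toy_nll <- function(p) sum((p - target)^2)   # stand-in objective function

fit <- optim(par = c(0.5, 1, 1, 5, 0.5), fn = toy_nll, method = "L-BFGS-B",
             lower = c(0, 0, 0, 0, 0), upper = c(1, 5, Inf, Inf, 1),
             control = list(maxit = 1000, parscale = c(1, 2, 1, 8, 1)))

# Evaluate the function on a small grid of offsets around the optimum:
# first row is "no offset", the rest are all combinations of -eps, 0, +eps
eps     <- 0.01
offsets <- rbind(0, eps * as.matrix(expand.grid(rep(list(c(-1, 0, 1)), 5))))
vals    <- apply(offsets, 1, function(x) toy_nll(x + fit$par))

range(vals - vals[1])   # no negative differences: nothing nearby is better
```

For a real likelihood with the optimum on a boundary, some offsets would
step outside the feasible region, so the grid would need trimming there.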


If this isn't enough to get you started, at least it might enable you to 
say more clearly what is not reasonable about your results.


HTH,

Chuck






Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



Re: [R] Can anybody help me understand AIC and BIC and devise a new metric?

2010-07-05 Thread Kjetil Halvorsen
You should have a look at:

Model Selection and Model Averaging
by Gerda Claeskens (K.U. Leuven) and Nils Lid Hjort (University of Oslo)

Among other things, this will explain that AIC and BIC really aim at different goals.

On Mon, Jul 5, 2010 at 4:20 PM, Dennis Murphy djmu...@gmail.com wrote:
 Hi:

 On Mon, Jul 5, 2010 at 7:35 AM, LosemindL comtech@gmail.com wrote:


 Hi all,

 Could anybody please help me understand AIC and BIC and especially why do
 they make sense?


 Any good text that discusses model selection in detail will have some
 discussion of AIC and BIC. Frank Harrell's book 'Regression Modeling
 Strategies' comes immediately to mind, along with Hastie, Tibshirani and
 Friedman (Elements of Statistical Learning) and Burnham and Anderson's book
 (Model Selection and Multi-Model Inference), but there are many other worthy
 texts that cover the topic. The gist is that AIC and BIC penalize the log
 likelihood of a model by subtracting different functions of its number of
 parameters. David's suggestion of Wikipedia is also on target.
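
To make the two penalties concrete, here is a small sketch that computes both criteria by hand and compares them with R's built-in AIC() and BIC(). The model (a linear regression on the mtcars data set that ships with R) is only an arbitrary illustration:

```r
# Arbitrary example model on a built-in data set.
fit <- lm(mpg ~ wt + hp, data = mtcars)

ll <- logLik(fit)
k  <- attr(ll, "df")   # number of estimated parameters (incl. error variance)
n  <- nobs(fit)        # number of observations

# Both criteria are -2 * log-likelihood plus a penalty on k;
# only the penalty differs: 2 per parameter (AIC) or log(n) per parameter (BIC).
aic_manual <- -2 * as.numeric(ll) + 2 * k
bic_manual <- -2 * as.numeric(ll) + log(n) * k

c(AIC = aic_manual, BIC = bic_manual)
```

Because log(n) exceeds 2 as soon as n is 8 or more, BIC penalizes extra parameters more heavily than AIC for all but very small samples.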


 Furthermore, I am trying to devise a new metric related to the model
 selection in the financial asset management industry.

 As you know the industry uses Sharpe Ratio as the main performance
 benchmark, which is the annualized mean of returns divided by the
 annualized
 standard deviation of returns.


 I didn't know, but thank you for the information. Isn't this simply a
 signal-to-noise
 ratio quantified on an annual basis?


 In model selection, we would like to choose a model that yields the highest
 Sharpe Ratio.

 However, the more parameters you use, the higher Sharpe Ratio you might
 potentially get, and the higher risk that your model is overfitted.

 I am trying to think of a AIC or BIC version of the Sharpe Ratio that
 facilitates the model selection...


 You might be able to make some progress if you can express the (penalized)
 log likelihood as a function of the Sharpe ratio. But if you have several
 years of data in your model and the ratio is computed annually, then isn't
 it a random variable rather than a parameter? If so, it changes the nature
 of the problem, no? (Being unfamiliar with the Sharpe ratio, I fully
 recognize that I may be completely off-base in this suggestion, but I'll
 put it out there anyway :)

 BTW, you might find the R-sig-finance list to be a more productive resource
 for this problem than R-help due to the specialized nature of the question.

 HTH,
 Dennis


 Anybody could you please give me some pointers?

 Thanks a lot!
 --
 View this message in context:
 http://r.789695.n4.nabble.com/Can-anybody-help-me-understand-AIC-and-BIC-and-devise-a-new-metric-tp2278448p2278448.html
 Sent from the R help mailing list archive at Nabble.com.



Re: [R] data.frame: adding a column that is based on ranges of values in another column

2010-07-05 Thread jim holtman
use 'merge':

 DF = data.frame(X = c(114.5508, 114.6468, 114.6596, 114.6957, 114.6828,
                       114.8903, 114.9519, 114.8842, 114.8579, 114.8489),
                 Y = c(47.14094, 46.98874, 46.91235, 46.88265, 46.80584,
                       46.67022, 46.53264, 46.47727, 46.46457, 46.47032),
                 Date = as.Date(c('2009-01-01', '2009-01-03', '2009-01-05',
                                  '2009-01-10', '2009-01-14', '2009-01-15',
                                  '2009-01-16', '2009-01-17', '2009-01-22',
                                  '2009-01-29')))

 s.d1 = '2009-01-01'
 e.d1 = '2009-01-14'
 f.n1 = seq(from = as.Date(s.d1), to = as.Date(e.d1), by = 1)

 s.d2 = '2009-01-15'
 e.d2 = '2009-01-31'
 f.n2 = seq(from = as.Date(s.d2), to = as.Date(e.d2), by = 1)

 # lookup table: one row per calendar day, labelled with its fortnight
 x.new <- data.frame(Date = c(f.n1, f.n2),
                     Fortnight = c(rep("FN1", length(f.n1)),
                                   rep("FN2", length(f.n2))))

 merge(DF, x.new, all.x = TRUE)
         Date        X        Y Fortnight
1  2009-01-01 114.5508 47.14094       FN1
2  2009-01-03 114.6468 46.98874       FN1
3  2009-01-05 114.6596 46.91235       FN1
4  2009-01-10 114.6957 46.88265       FN1
5  2009-01-14 114.6828 46.80584       FN1
6  2009-01-15 114.8903 46.67022       FN2
7  2009-01-16 114.9519 46.53264       FN2
8  2009-01-17 114.8842 46.47727       FN2
9  2009-01-22 114.8579 46.46457       FN2
10 2009-01-29 114.8489 46.47032       FN2


On Mon, Jul 5, 2010 at 4:01 PM, Abdi, Abdulhakim ab...@si.edu wrote:
 Dear List,

 I've been looking tirelessly for a solution to this dilemma but without 
 success. Perhaps someone has an idea that will guide me in the right 
 direction.

 Suppose I have the following data.frame:

 DF = data.frame(X = c(114.5508, 114.6468, 114.6596, 114.6957, 114.6828, 
 114.8903, 114.9519, 114.8842,
 114.8579, 114.8489), Y = c(47.14094, 46.98874, 46.91235, 46.88265, 46.80584, 
 46.67022, 46.53264, 46.47727,
 46.46457, 46.47032), Date = as.Date(c('2009-01-01', '2009-01-03', 
 '2009-01-05', '2009-01-10', '2009-01-14',
 '2009-01-15', '2009-01-16', '2009-01-17', '2009-01-22', '2009-01-29')))

 DF
          X        Y       Date
 1  114.5508 47.14094 2009-01-01
 2  114.6468 46.98874 2009-01-03
 3  114.6596 46.91235 2009-01-05
 4  114.6957 46.88265 2009-01-10
 5  114.6828 46.80584 2009-01-14
 6  114.8903 46.67022 2009-01-15
 7  114.9519 46.53264 2009-01-16
 8  114.8842 46.47727 2009-01-17
 9  114.8579 46.46457 2009-01-22
 10 114.8489 46.47032 2009-01-29

 I also have two objects that contain the dates of the first and last 
 fortnight of the month of January 2009.

 s.d1 = '2009-01-01'
 e.d1 = '2009-01-14'
 f.n1 = seq(from = as.Date(s.d1)  , to =  as.Date(e.d1), by = 1)

 f.n1
 [1] 2009-01-01 2009-01-02 2009-01-03 2009-01-04 2009-01-05 
 2009-01-06 2009-01-07 2009-01-08 2009-01-09 2009-01-10 2009-01-11 
 2009-01-12 2009-01-13 2009-01-14

 s.d2 = '2009-01-15'
 e.d2 = '2009-01-31'
 f.n2 = seq(from = as.Date(s.d2)  , to =  as.Date(e.d2), by = 1)

 f.n2
 [1] 2009-01-15 2009-01-16 2009-01-17 2009-01-18 2009-01-19 
 2009-01-20 2009-01-21 2009-01-22 2009-01-23 2009-01-24 2009-01-25 
 2009-01-26 2009-01-27 2009-01-28 2009-01-29 2009-01-30 2009-01-31


 I'm trying to add a column called Fortnight to the existing data.frame. The 
 components of the new Fortnight column are based on the existing Date 
 column so that if the value in Date falls within the first fortnight (f.n1) 
 then the value of the new Fortnight column would be FN1, and if the value 
 of the Date column falls within the second fortnight (f.n2), then the value 
 of the Fortnight column would be FN2, and so on.

 The end result should look like:

          X        Y       Date Fortnight
 1  114.5508 47.14094 2009-01-01       FN1
 2  114.6468 46.98874 2009-01-03       FN1
 3  114.6596 46.91235 2009-01-05       FN1
 4  114.6957 46.88265 2009-01-10       FN1
 5  114.6828 46.80584 2009-01-14       FN1
 6  114.8903 46.67022 2009-01-15       FN2
 7  114.9519 46.53264 2009-01-16       FN2
 8  114.8842 46.47727 2009-01-17       FN2
 9  114.8579 46.46457 2009-01-22       FN2
 10 114.8489 46.47032 2009-01-29       FN2

 I manually entered the above values for the Fortnight column to illustrate 
 my point, however, that would be quite tiresome for 500+ rows of data ;-)

 The only other similar issue I found on the list was 
 https://stat.ethz.ch/pipermail/r-help/2008-February/153995.html but that 
 particular problem is slightly different than what I'm trying to accomplish 
 here.

 I appreciate your time and assistance.

 Thanks in advance.

 Regards,


 Hakim Abdi



 _
 Abdulhakim Abdi, M.Sc.
 Research Intern

 Conservation GIS/Remote Sensing Lab
 Smithsonian Conservation Biology Institute
 1500 Remount Road
 Front Royal, VA 22630
 phone: +1 540 635 6578
 mobile: +1 747 224 7006
 fax: +1 540 635 6506 (Attn:GIS Lab)
 email: ab...@si.edu
 http://nationalzoo.si.edu/SCBI/ConservationGIS/







Re: [R] data.frame: adding a column that is based on ranges of values in another column

2010-07-05 Thread Dennis Murphy
Hi:

Since you've been looking tirelessly :)

For your stated problem, the following will work:

DF$Fortnight <- with(DF, ifelse(Date %in% f.n1, 'FN1',
                         ifelse(Date %in% f.n2, 'FN2', 'FN3')))

However, if you have a number of fortnights (perhaps stretching over several
years), you need a different approach. The idea is to use the last day of
2008 as the origin in your example and compute the number of days past it
for each observation. We then integer divide the number of days past the
origin by 14 and add one to get the fortnight number.

origin <- as.Date('2008-12-31')          # set origin
DF$doy <- DF$Date - origin               # days past origin
DF$FN <- 1 + as.numeric(DF$doy) %/% 14   # fortnight number past origin

 DF
          X        Y       Date Fortnight     doy FN
1  114.5508 47.14094 2009-01-01       FN1  1 days  1
2  114.6468 46.98874 2009-01-03       FN1  3 days  1
3  114.6596 46.91235 2009-01-05       FN1  5 days  1
4  114.6957 46.88265 2009-01-10       FN1 10 days  1
5  114.6828 46.80584 2009-01-14       FN1 14 days  2
6  114.8903 46.67022 2009-01-15       FN2 15 days  2
7  114.9519 46.53264 2009-01-16       FN2 16 days  2
8  114.8842 46.47727 2009-01-17       FN2 17 days  2
9  114.8579 46.46457 2009-01-22       FN2 22 days  2
10 114.8489 46.47032 2009-01-29       FN3 29 days  3

This should be less tedious than the ifelse() approach.

HTH,
Dennis
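
A note on the day counting above: with 2008-12-31 as the origin, 14 January is day 14, and 14 %/% 14 + 1 puts it in fortnight 2, whereas the original question places it in FN1. The sketch below shows two possible variants on a reduced, made-up data frame: one uses explicit start dates (the names 'starts' and 'first_day' are introduced here purely for illustration), the other counts 14-day blocks from 1 January itself.

```r
# Reduced example data: only the dates that matter for the boundary cases.
DF <- data.frame(
  Date = as.Date(c("2009-01-01", "2009-01-14", "2009-01-15", "2009-01-29"))
)

# Variant 1: explicit start dates, matching the f.n1 / f.n2 ranges in the
# question. findInterval() returns the index of the last start date that is
# less than or equal to each Date.
starts <- as.Date(c("2009-01-01", "2009-01-15"))
DF$Fortnight <- paste0("FN",
                       findInterval(as.numeric(DF$Date), as.numeric(starts)))

# Variant 2: fixed 14-day blocks counted from 1 January 2009.
# Days elapsed since the first day run 0..13 in block 1, 14..27 in block 2, ...
first_day <- as.Date("2009-01-01")
DF$FN <- as.numeric(DF$Date - first_day) %/% 14 + 1

DF
```

Variant 1 follows whatever boundaries are supplied (so 29 January stays in FN2), while variant 2 is purely arithmetic and starts a third block on 29 January.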



[R] nested for loops

2010-07-05 Thread Senay ASMA
Dear Admin,
I will appreciate if you advise me an effective way to write the following R
code including nested for loops. I cannot do it by using expand.grid
function because it results with memory allocation problems.
Thanks for your time and consideration.

for(d1 in 0:n){
for(d2 in 0:n){
for(d3 in 0:n){
for(d4 in 0:n){
for(d5 in 0:n){
for(d6 in 0:n){
for(d7 in 0:n){
for(d8 in 0:n){
for(d9 in 0:n){
for(d10 in 0:n){
for(d11 in 0:n){
for(d12 in 0:n){
for(d13 in 0:n){
for(d14 in 0:n){
for(d15 in 0:n){
for(d16 in 0:n){
for(d17 in 0:n){
for(d18 in 0:n){
for(d19 in 0:n){
for(d20 in 0:n){

list=c(d1,d2,d3,d4,d5,d6,d7,d8,d9,d10,d11,d12,d13,d14,d15,d16,d17,d18,d19,d20)




Re: [R] How to determine if R is 64 bit compiled under Unix-alike?

2010-07-05 Thread Bernardo Rangel Tura
On Mon, 2010-07-05 at 19:25 +0200, Przemek Grabowicz wrote:
 Under MacOS I had R64 executive and it was clear. Under Ubuntu, which I 
 do not have administrative rights to, there is only R executive. It 
 seems that I can allocate more than 3GB of memory, however not 
 everything seems to work the same/right as with R64 under MacOS.
 
 Pms.
 

Type .Machine$sizeof.pointer
If the response is 8, your R is 64-bit.
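
As a small sketch, the same check can be turned into a readable message. R.version$arch is added here only as a cross-check on the architecture R was built for:

```r
# Pointer size in bytes: 8 on a 64-bit build, 4 on a 32-bit build.
pointer_bytes <- .Machine$sizeof.pointer
bits <- 8 * pointer_bytes

# Architecture string recorded when R was built, e.g. "x86_64".
arch <- R.version$arch

cat("This R is a", bits, "bit build for", arch, "\n")
```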


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil



Re: [R] To detect the location of duplicate values

2010-07-05 Thread Charles Berry
Charles C. Berry cberry at tajo.ucsd.edu writes:

 
 On Mon, 5 Jul 2010, Moohwan Kim wrote:
 
  Dear R family,
 
  I have a question about how to detect some duplicate numeric observations.
  Suppose that I have two variables dataset.
 
  order value
  1  0.52
  2  0.23
  3  0.43
  4  0.21
  5  0.32
  6  0.32
  7  0.32
  8  0.32
  9  0.32
  10 0.12
  11 0.46
  12 0.09
  13 0.32
  14 0.25
  ;
  Could you help me indicate where the duplicate observations in a row
  (e.g., 0.32) are?
 
 I see you already have replies about duplicated() and unique(), which are 
 very handy for the 'detect' part of your query.
 
 But to list the locations of the duplicated elements, you might also 
 benefit from using split() and Filter() like this:
 
  Filter( function(x) length(x) > 1, split(order, value) )
 $`0.32`
 [1]  5  6  7  8  9 13
 

Mark Leeds kindly pointed out (in private correspondence) that this needs a bit
more explanation. If the above 'dataset' is in fact a data.frame called 'dat'

then either

attach(dat) 
Filter( function(x) length(x) > 1, split(order, value) )

or

Filter( function(x) length(x) > 1, split(dat$order, dat$value) )

or 

with( dat, Filter( function(x) length(x) > 1, split(order, value) ) )

should do it.

Thanks Mark!



[snip]



[R] Function to compute the multinomial beta function?

2010-07-05 Thread Gregory Gentlemen
Dear R-users,

Is there an R function to compute the multinomial beta function? That is, the 
normalizing constant that arises in a Dirichlet distribution. For example, with 
three parameters the beta function is Beta(n1,n2,n3) = 
Gamma(n1)*Gamma(n2)*Gamma(n3)/Gamma(n1+n2+n3)

Thanks in advance for any assistance.

Regards,
Greg





Re: [R] nested for loops

2010-07-05 Thread Romain Francois



Probably not what you want, but this should replicate the same effect as 
the code you posted:


list <- rep( n, 20 )

Romain

--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/98Uf7u : Rcpp 0.8.1
|- http://bit.ly/c6YnCi : graph gallery collage
`- http://bit.ly/bZ7ltC : inline 0.3.5



Re: [R] How to determine if R is 64 bit compiled under Unix-alike?

2010-07-05 Thread Stuart Luppescu
On Mon, 2010-07-05 at 19:25 +0200, Przemek Grabowicz wrote:
 Under MacOS I had R64 executive and it was clear. Under Ubuntu, which I 
 do not have administrative rights to, there is only R executive. It 
 seems that I can allocate more than 3GB of memory, however not 
 everything seems to work the same/right as with R64 under MacOS.

If you can locate the R executable, then the 'file' command will tell you
right away. On my system (Gentoo), the R command is actually a shell
script that sets a number of environment variables, etc. and then calls
the actual R executable, which is /usr/lib64/R/bin/exec/R (I don't know if
this is the same in Ubuntu). 'file' then gives:
file /usr/lib64/R/bin/exec/R
/usr/lib64/R/bin/exec/R: ELF 64-bit LSB executable, x86-64, version 1
(SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.9,
stripped

-- 
Stuart Luppescu -*-*- slu at ccsr dot uchicago dot edu
CCSR in UEI at U of C


[R] to remove duplicate values

2010-07-05 Thread Moohwan Kim
Dear R family,

Suppose I have two series.

order value
1  0.52
2  0.23
3  0.43
4  0.21
5  0.32
6  0.32
7  0.32
8  0.32
9  0.32
10 0.12
11 0.46
12 0.09
13 0.32
14 0.25

For these two series, I figured out the way to detect the locations of
duplicate values.
The next thing to do is remove the repeated values except for a value
that would not be next to each other.
In other words, while keeping the 13th value, I want to remove
observations from 6th to 9th.
That is my end goal.

Could you help me reach the goal?

best
moohwan



Re: [R] How to determine if R is 64 bit compiled under Unix-alike?

2010-07-05 Thread Przemek Grabowicz

On 07/05/2010 10:52 PM, Marcin Jaworski wrote:

Try:

.Machine$sizeof.pointer

If you get 8, you are riding 64 bit R. If you get 4, your R is 32-bit one.
   


I got 8, so should be 64 bits. But I have problems with some package, 
could it be that it is 32-bit? It was installed using:


R CMD INSTALL foobar.tar.gz

On MacOS using

R64 CMD INSTALL foobar.tar.gz

gave proper effect. But here on Ubuntu it seems that objects from that package 
are not able to load much data.



[R] Memory problem in multinomial logistic regression

2010-07-05 Thread Daniel Wiesmann
Dear All

I am trying to fit a multinomial logistic regression to a data set with a size 
of 94279 by 14 entries. The data frame has one sample column which is the 
categorical variable, and the number of different categories is 9. The size of 
the data set (as a csv file) is less than 10 MB.

I tried to fit a multinomial logistic regression, either using vglm() from the 
VGAM package or mlogit() from the mlogit package.

In both cases the estimation crashes because I do not have enough memory, 
although the free memory before starting the regression is more than 2GB. The 
regression functions eat up all of my memory.

Does anyone know why this relatively small data set leads to memory problems, 
and how I could work around my problem?

thank you for your help,

Daniel



Re: [R] Function to compute the multinomial beta function?

2010-07-05 Thread Matt Shotwell
How about this?

mbeta <- function(...) { 
    exp(sum(lgamma(c(...))) - lgamma(sum(c(...))))
}
 
 gamma(5)*gamma(6)*gamma(7)/gamma(18)
[1] 5.829838e-09
 mbeta(5,6,7)
[1] 5.829838e-09
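
One likely reason to go through lgamma() rather than gamma() is overflow: gamma() returns Inf once its argument passes roughly 171, so the direct product breaks down for larger parameters. The sketch below illustrates this with a log-scale helper, lmbeta(), a name made up for this example that takes the parameters as a single vector:

```r
# Log of the multinomial beta function for a parameter vector 'a'.
lmbeta <- function(a) sum(lgamma(a)) - lgamma(sum(a))

# Small parameters: the direct formula and the log-scale route agree.
a_small <- c(5, 6, 7)
direct  <- prod(gamma(a_small)) / gamma(sum(a_small))
via_log <- exp(lmbeta(a_small))

# Large parameters: gamma() overflows to Inf, so the direct ratio is Inf / Inf,
# while the log-scale value remains finite.
a_big     <- c(200, 300, 400)
naive     <- suppressWarnings(prod(gamma(a_big)) / gamma(sum(a_big)))
log_value <- lmbeta(a_big)
```

For large parameters it is usually more practical to keep working with log_value directly (for instance inside a log density) than to exponentiate it.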



-- 
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com



Re: [R] How to determine if R is 64 bit compiled under Unix-alike?

2010-07-05 Thread Stuart Luppescu
On Mon, 2010-07-05 at 23:05 +0200, Przemek Grabowicz wrote:
 On 07/05/2010 10:52 PM, Marcin Jaworski wrote:
  Try:
 
  .Machine$sizeof.pointer
 
  If you get 8, you are riding 64 bit R. If you get 4, your R is 32-bit one.
 
 
 I got 8, so should be 64 bits. But I have problems with some package, 
 could it be that it is 32-bit? It was installed using:
 
 R CMD INSTALL foobar.tar.gz
 
 On MacOS using
 
 R64 CMD INSTALL foobar.tar.gz
 
 gave proper effect. But here on Ubuntu it seems that objects from that 
 package are not able to load much data.

I think when you install a package, the source files are compiled using
the development tools on your system. (I don't know -- are there any
binary packages for Linux?) Do you have the necessary C and Fortran
compilers in your system? If you can find the object files, you can test
them with file as before, for example:

file /usr/lib64/R/library/MASS/libs/MASS.so 
/usr/lib64/R/library/MASS/libs/MASS.so: ELF 64-bit LSB shared object,
x86-64, version 1 (SYSV), dynamically linked, stripped

-- 
Stuart Luppescu -=- slu .at. ccsr.uchicago.edu
University of Chicago -=- CCSR 
才文と智奈美の父 -=-Kernel 2.6.31-gentoo-r6
Andrew Thomas: ...and if something goes wrong here 
 it is probably not WinBUGS since that has been
 running for more than 10 years... Peter Green
 (from the back): ... and it still hasn't
 converged!-- Andrew Thomas and Peter Green
 (during the talk about 'BRugs')   gR 2003,
 Aalborg (September 2003)



Re: [R] nested for loops

2010-07-05 Thread jim holtman
What do you want to do with the data being generated?  In the loop
you have, it will just return the last value generated.  Let me ask my
favorite question: What is the problem you are trying to solve?  If
you get a memory problem with expand.grid, then if you are trying to
store the values in the 'for' loop, you will have the same problem.
How big is 'n'?  If it is 2, so that each loop runs over the 3 values
0:2, you will have this many combinations:

 3^20
[1] 3486784401

What are you going to do with them?
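
If the combinations only need to be visited one at a time rather than stored, the twenty nested loops can be replaced by a single "odometer" counter. The following is a rough sketch under that assumption: for_each_combo() and the 'visit' callback are names invented for this example, and the demonstration uses 3 digits instead of 20 so that it finishes instantly.

```r
# Visit every combination of k digits, each in 0:n, without storing them all.
# 'visit' is a hypothetical callback standing in for the real per-combination work.
for_each_combo <- function(n, k, visit) {
  d <- integer(k)                 # current combination, starts at all zeros
  repeat {
    visit(d)
    i <- k                        # increment like an odometer, last digit fastest
    while (i >= 1 && d[i] == n) {
      d[i] <- 0L                  # roll this digit over and carry to the left
      i <- i - 1
    }
    if (i < 1) break              # every digit rolled over: all combinations seen
    d[i] <- d[i] + 1L
  }
  invisible(NULL)
}

# Small demonstration: count the combinations and remember the last one.
counter <- new.env()
counter$n <- 0
for_each_combo(n = 2, k = 3, function(d) {
  counter$n    <- counter$n + 1
  counter$last <- d
})
counter$n   # (2 + 1)^3 combinations
```

This keeps memory use constant, but the running time still grows as (n + 1)^k, so with k = 20 it is only feasible for very small n.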





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] to remove duplicate values

2010-07-05 Thread Jannis
Some further tricks will (probably) lead you to your goal. I suppose you 
use duplicated() or something similar to get an array of locations of 
the duplicated values:


pos.dup <- which(duplicated(value))

then do
diff.pos.dup <- diff(pos.dup)

and you get the indices to delete:

pos.delete   <- order[diff.pos.dup[which(diff.pos.dup==1)]]


I leave some tweaking to you as you perhaps have to adjust some indices 
slightly by adding or subtracting 1 (I am never exactly sure how this 
diff() function turns out).
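
As an alternative sketch that avoids the index tweaking, one can compare each value with the one directly before it and keep only the rows where the value changes. The data frame 'dat' below simply re-enters the numbers from the question:

```r
dat <- data.frame(
  order = 1:14,
  value = c(0.52, 0.23, 0.43, 0.21, 0.32, 0.32, 0.32,
            0.32, 0.32, 0.12, 0.46, 0.09, 0.32, 0.25)
)

# TRUE for the first row and for every row whose value differs from the
# row directly above it; repeats inside a run are FALSE.
keep <- c(TRUE, dat$value[-1] != dat$value[-nrow(dat)])

dedup <- dat[keep, ]
dedup$order
```

Rows 6 to 9 are dropped because they repeat row 5, while row 13 is kept because its neighbour above (0.09) is different.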



HTH
Jannis






Re: [R] Data Labels in a barchart (Lattice or otherwise)

2010-07-05 Thread David Winsemius


On Jul 5, 2010, at 1:14 PM, RaoulD wrote:



Thank you, David. Yes, I am using the lattice barchart and have managed to add
data labels; however, they tend to be on the tip of each bar and are
difficult to read as they are partially on the bar. Any help would be
greatly appreciated.

This is the code I am using:
levels(PR_SUMMARY$Bucket)=c("0-3 months","3-9 months","9-15 months","15-18 months")
barchart(PrimaryReason ~ cInteractions | Bucket + Type, data = PR_SUMMARY,
  layout = c(4, 2), col="lightgreen", main="COMPARISON - PRIMARY REASON",
  sub="L  R", xlab="Number of Customers", ylab="Primary Reasons",
  auto.key = list(title = "COMPARISON - PRIMARY REASON", columns=2,
                  points = FALSE, rectangles = TRUE, space = "right"),
  scales = list(x = list(abbreviate=TRUE, minlength=5, rot=45)),
  panel = function(x,y,subscripts,groups,...){
    panel.barchart(x,y,...)
    ltext(x,y,label=round(PR_SUMMARY$cInteractions,1),cex=.99,rot=45)

# if you add or subtract a small amount from y in the prior line it will move the labels up or down.

    border="transparent"})

I don't really understand the ltext part and found it with some other code,
but it works.

Thanks again,
Raoul
--
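
Another option, sketched here on made-up data (the data frame 'd' and its values are hypothetical and only stand in for PR_SUMMARY), is to let panel.text() place each label just beyond the end of its bar through its 'pos' and 'offset' arguments, and to widen the x axis a little so the labels have room:

```r
library(lattice)

# Made-up data standing in for PR_SUMMARY (names and counts are hypothetical).
d <- data.frame(
  reason = factor(c("Billing", "Coverage", "Service")),
  count  = c(12, 30, 7)
)

p <- barchart(
  reason ~ count, data = d, origin = 0,
  xlim = c(0, max(d$count) * 1.15),   # leave room to the right of the longest bar
  panel = function(x, y, ...) {
    panel.barchart(x, y, ...)
    # pos = 4 puts each label to the right of the bar end;
    # offset controls the size of the gap.
    panel.text(x, y, labels = round(x, 1), pos = 4, offset = 0.3)
  }
)
```

Calling print(p) draws the chart. Labelling with the panel's own x, rather than the full data column, should also keep labels matched to the bars when the plot is split into several panels.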



David Winsemius, MD
West Hartford, CT



Re: [R] calculation on series with different time-steps

2010-07-05 Thread Jannis
Your question does not seem precise enough for us to provide 
help. Do you need help with the if() syntax? If yes, I would advise you 
to read some introductory R tutorial (like introduction to R (pdf, 
freely available on the net)) or some decent textbook.


One quick hint for the correct syntax:
| denotes OR
& denotes AND
you need to write == if you want to use the logical equals

If you do not know how to write the results to a data frame, you should 
really invest the time to get to know R through a basic tutorial.


By the way, if your time series is ordered in the way you describe, a 
more elegant way would be to create a series consisting of the pressure 
value belonging to each entry in the stream stage vector (by repeating 
each single value of pressure 12 times; see ?rep), and then just 
subtract the two.
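
A minimal sketch of the rep() idea, assuming the two series are already aligned so that the first 12 stage readings belong to the first pressure reading, the next 12 to the second, and so on. The numbers are invented; only the column names stage.cm and pressure come from the question:

```r
# Two hours of made-up data: 24 five-minute stage readings, 2 hourly pressures.
stage <- data.frame(stage.cm = 101:124)
baro  <- data.frame(pressure = c(1, 2))

# Repeat each hourly pressure 12 times so it lines up with the stage series,
# then subtract element by element.
pressure_5min   <- rep(baro$pressure, each = 12)
stage$corrected <- stage$stage.cm - pressure_5min

head(stage, 3)
```

If the series do not start on a clean boundary, or readings are missing, this alignment no longer holds and the rows would need to be matched on their timestamps instead.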



HTH
Jannis

Jeana Lee wrote:

Hello,

I have two series, one with stream stage measurements every 5 minutes, and
the other with barometric pressure measurements every hour.  I want to
subtract each barometric pressure measurement from the 12 stage measurements
closest in time to it (6 stage measurements on either side of the hour).

I want to do something like the following, but I don't know the syntax.

If the Julian day of the stage measurement is equal to the Julian day of
the pressure measurement, AND the absolute value of the difference between
the time of the stage measurement and the hour of the pressure measurement
is less than or equal to 30 minutes, then subtract the pressure measurement
from the stage measurement (and put it in a new column in the stage data
frame).

 if ( stage$julian_day = baro$julian_day & |stage$time -
baro$hour| <= 30 )
 then (stage$stage.cm - baro$pressure)

Can you help me?

Thanks,
JL







Re: [R] timeseries

2010-07-05 Thread Jannis
Please have a look at the posting guide of the list. How can we help 
you without any idea of what you have done? Please include reproducible 
code and sample data!


nuncio m wrote:

Dear useRs,
I am trying to construct a time series using the as.ts function. Surprisingly,
when I plot the data, the x axis does not show the time in years; however,
if I use ts(data), the time in years is shown on the x axis.
Why is there such a difference between the results of the two commands?
Thanks
nuncio
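
Without the poster's code the cause cannot be confirmed, but the following sketch shows where a time axis in years comes from. It assumes a plain numeric vector of 24 monthly values starting in January 2001; the data and dates are invented:

```r
# Hypothetical monthly data
x <- c(5, 7, 6, 8, 9, 11, 12, 12, 10, 8, 6, 5,
       6, 7, 7, 9, 10, 12, 13, 12, 11, 9, 7, 6)

a <- as.ts(x)                                   # no dates known: index runs 1..24
b <- ts(x, start = c(2001, 1), frequency = 12)  # dates supplied explicitly

tsp(a)   # start, end, frequency of the plain conversion
tsp(b)   # starts at 2001 with 12 observations per year

# as.ts() leaves an object that already is a time series unchanged
identical(as.ts(b), b)
```

So plot(b) labels the x axis in years, while plot(a) uses the observation index. The years only appear when start and frequency are set somewhere, either in the ts() call or on the object being converted.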







Re: [R] Profiler for R ? (HFWUtils package)

2010-07-05 Thread Jim Callahan
 Message: 21
 Date: Mon, 5 Jul 2010 02:26:29 -0400
 From: Ralf B ralf.bie...@gmail.com
 To: r-help@r-project.org r-help@r-project.org
 Subject: [R] Profiler for R ?

 Hi,

 is there such a thing as a profiler for R that informs about a) how
 much processing time is used by particular functions and commands and
 b) how much memory is used for creating how many objects (or types of
 data structures)?

Haven't tried it; but stumbled across Profiling() function in the
HFWUtils package.
Starting at bottom of page 29-30 of HFWUtils package user manual:

profiling
plots tree of execution times

Description
determines how much time a function and its sub-functions (and
sub-functions thereof etc) take to run (‘profiling’). Also draws a
picture of this using the interrelations of functions.
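
Independently of that package, base R ships a sampling profiler, Rprof(), with summaryRprof() to tabulate the result; system.time() and object.size() give quick timing and size checks. A minimal sketch, where the function slow_mean() is a made-up workload used only for illustration:

```r
# Hypothetical workload to profile
slow_mean <- function(n) {
  total <- 0
  for (i in seq_len(n)) total <- total + sqrt(i)
  total / n
}

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file)                      # start sampling the call stack
t0 <- proc.time()[["elapsed"]]
while (proc.time()[["elapsed"]] - t0 < 0.5) slow_mean(1e5)
Rprof(NULL)                           # stop profiling

prof <- summaryRprof(prof_file)
head(prof$by.self)                    # time spent in each function itself

# Quick checks without the profiler
timing <- system.time(slow_mean(1e5))  # elapsed and CPU time of one call
sz <- object.size(numeric(1e6))        # memory used by one object, in bytes
```

Rprof() also accepts memory.profiling = TRUE to record memory use alongside the timings; see ?Rprof and ?summaryRprof for the details.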


HTH,
Jim Callahan
Orlando, FL


In a way I am looking for something similar to the
 java profiler (which is started by command line and provides profiling
 information collected from the run of a particular program). Is there
 such a tool through the R command line or RGUI ? Are there profilers
 available for the Eclipse StatET or through another package or
 extension?

 Thanks,
 Ralf



Re: [R] data.frame: adding a column that is based on ranges of values in another column

2010-07-05 Thread Bill.Venables
Here is one way

> checkList <- data.frame(Day = c(f.n1, f.n2),
+                         FN = rep(c("FN1", "FN2"),
+                                  c(length(f.n1), length(f.n2))))
> m <- match(DF$Date, checkList$Day)
> DF <- cbind(DF, Fortnight = checkList$FN[m])
> DF
          X        Y       Date Fortnight
1  114.5508 47.14094 2009-01-01   FN1
2  114.6468 46.98874 2009-01-03   FN1
3  114.6596 46.91235 2009-01-05   FN1
4  114.6957 46.88265 2009-01-10   FN1
5  114.6828 46.80584 2009-01-14   FN1
6  114.8903 46.67022 2009-01-15   FN2
7  114.9519 46.53264 2009-01-16   FN2
8  114.8842 46.47727 2009-01-17   FN2
9  114.8579 46.46457 2009-01-22   FN2
10 114.8489 46.47032 2009-01-29   FN2
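
An alternative sketch that avoids building the full list of days: findInterval() reports which interval each date falls into, given only the start date of each fortnight. The small data frame below is a hypothetical stand-in for DF, and `starts` is a name introduced here for illustration:

```r
# Hypothetical stand-in for the poster's DF (only the Date column matters)
DF <- data.frame(
  Date = as.Date(c("2009-01-01", "2009-01-14", "2009-01-15", "2009-01-29"))
)

# Start date of each fortnight, in increasing order
starts <- as.Date(c("2009-01-01", "2009-01-15"))

# Index of the last start date that is <= each Date, turned into a label
idx <- findInterval(as.numeric(DF$Date), as.numeric(starts))
DF$Fortnight <- paste0("FN", idx)
DF
```

More fortnights only need more entries in `starts`. Dates before the first start date would get the label FN0, so that case may need separate handling.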
 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Abdi, Abdulhakim
Sent: Tuesday, 6 July 2010 6:01 AM
To: r-help@r-project.org
Subject: [R] data.frame: adding a column that is based on ranges of values in 
another column

Dear List,

I've been looking tirelessly for a solution to this dilemma but without 
success. Perhaps someone has an idea that will guide me in the right direction.

Suppose I have the following data.frame:

DF = data.frame(X = c(114.5508, 114.6468, 114.6596, 114.6957, 114.6828, 
114.8903, 114.9519, 114.8842,
114.8579, 114.8489), Y = c(47.14094, 46.98874, 46.91235, 46.88265, 46.80584, 
46.67022, 46.53264, 46.47727,
46.46457, 46.47032), Date = as.Date(c('2009-01-01', '2009-01-03', '2009-01-05', 
'2009-01-10', '2009-01-14',
'2009-01-15', '2009-01-16', '2009-01-17', '2009-01-22', '2009-01-29')))

DF
          X        Y       Date
1  114.5508 47.14094 2009-01-01
2  114.6468 46.98874 2009-01-03
3  114.6596 46.91235 2009-01-05
4  114.6957 46.88265 2009-01-10
5  114.6828 46.80584 2009-01-14
6  114.8903 46.67022 2009-01-15
7  114.9519 46.53264 2009-01-16
8  114.8842 46.47727 2009-01-17
9  114.8579 46.46457 2009-01-22
10 114.8489 46.47032 2009-01-29

I also have two objects that contain the dates of the first and last fortnight 
of the month of January 2009.

s.d1 = '2009-01-01'
e.d1 = '2009-01-14'
f.n1 = seq(from = as.Date(s.d1)  , to =  as.Date(e.d1), by = 1)

f.n1
[1] 2009-01-01 2009-01-02 2009-01-03 2009-01-04 2009-01-05 
2009-01-06 2009-01-07 2009-01-08 2009-01-09 2009-01-10 2009-01-11 
2009-01-12 2009-01-13 2009-01-14

s.d2 = '2009-01-15'
e.d2 = '2009-01-31'
f.n2 = seq(from = as.Date(s.d2)  , to =  as.Date(e.d2), by = 1)

f.n2
[1] 2009-01-15 2009-01-16 2009-01-17 2009-01-18 2009-01-19 
2009-01-20 2009-01-21 2009-01-22 2009-01-23 2009-01-24 2009-01-25 
2009-01-26 2009-01-27 2009-01-28 2009-01-29 2009-01-30 2009-01-31


I'm trying to add a column called Fortnight to the existing data.frame. The 
components of the new Fortnight column are based on the existing Date 
column so that if the value in Date falls within the first fortnight (f.n1) 
then the value of the new Fortnight column would be FN1, and if the value 
of the Date column falls within the second fortnight (f.n2), then the value 
of the Fortnight column would be FN2, and so on.

The end result should look like:

          X        Y       Date Fortnight
1  114.5508 47.14094 2009-01-01   FN1
2  114.6468 46.98874 2009-01-03   FN1
3  114.6596 46.91235 2009-01-05   FN1
4  114.6957 46.88265 2009-01-10   FN1
5  114.6828 46.80584 2009-01-14   FN1
6  114.8903 46.67022 2009-01-15   FN2
7  114.9519 46.53264 2009-01-16   FN2
8  114.8842 46.47727 2009-01-17   FN2
9  114.8579 46.46457 2009-01-22   FN2
10 114.8489 46.47032 2009-01-29   FN2

I manually entered the above values for the Fortnight column to illustrate my 
point, however, that would be quite tiresome for 500+ rows of data ;-)

The only other similar issue I found on the list was 
https://stat.ethz.ch/pipermail/r-help/2008-February/153995.html but that 
particular problem is slightly different than what I'm trying to accomplish 
here.

I appreciate your time and assistance.

Thanks in advance.

Regards,


Hakim Abdi



_
Abdulhakim Abdi, M.Sc.
Research Intern

Conservation GIS/Remote Sensing Lab
Smithsonian Conservation Biology Institute
1500 Remount Road
Front Royal, VA 22630
phone: +1 540 635 6578
mobile: +1 747 224 7006
fax: +1 540 635 6506 (Attn:GIS Lab)
email: ab...@si.edu
http://nationalzoo.si.edu/SCBI/ConservationGIS/








[R] Help in the legend()

2010-07-05 Thread Shant Ch
Hi R-users,

I was plotting the differences of the variances of the three estimators T^(1), 
T^(2), T^(3), of course taking two at a time. I was using expression() in 
the legend function in order to show which line corresponds to which 
difference, but the following code didn't give the desired result. I 
would be grateful if you could help me out.

plot(n, pg, type="l", xlab="n", ylab="Differences of the variances",
     ylim=c(-0.0012,0.0023), xlim=c(0,60));
lines(gs,lty = 2)
lines(ps,lty=5)

legend(30, 0.0021, expression( c ( var(t^(3))-var(t^(2)), 
var(t^(2))-var(t^(1))), var(t^(3))-var(t^(1)) ) ), lty=c(1,2,5))

Thanks.
Shant


  


Re: [R] Function to compute the multinomial beta function?

2010-07-05 Thread Robert A LaBudde

At 05:10 PM 7/5/2010, Gregory Gentlemen wrote:

Dear R-users,

Is there an R function to compute the multinomial beta function? 
That is, the normalizing constant that arises in a Dirichlet 
distribution. For example, with three parameters the beta function 
is Beta(n1,n2,n3) = Gamma(n1)*Gamma(n2)*Gamma(n3)/Gamma(n1+n2+n3)


> beta3 <- function (n1, n2, n3)
+   exp(lgamma(n1)+lgamma(n2)+lgamma(n3)-lgamma(n1+n2+n3))

> beta3(5,3,8)
[1] 1.850002e-07
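
The same log-gamma approach extends to any number of parameters by taking a vector. A minimal sketch; the function name mbeta() is made up here and is not part of base R:

```r
# Multinomial beta function B(alpha) = prod(gamma(alpha)) / gamma(sum(alpha)),
# computed on the log scale to avoid overflow for large parameters
mbeta <- function(alpha, log = FALSE) {
  stopifnot(is.numeric(alpha), all(alpha > 0))
  lb <- sum(lgamma(alpha)) - lgamma(sum(alpha))
  if (log) lb else exp(lb)
}

mbeta(c(5, 3, 8))     # agrees with beta3(5, 3, 8): about 1.850002e-07
mbeta(c(2, 3))        # with two parameters this is the ordinary beta(2, 3)
```

For large parameters the value underflows to zero, so working with `log = TRUE` is usually the safer choice.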


Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: r...@lcfltd.com
Least Cost Formulations, Ltd.URL: http://lcfltd.com/
824 Timberlake Drive Tel: 757-467-0954
Virginia Beach, VA 23464-3239Fax: 757-467-2947

Vere scire est per causas scire



Re: [R] Help in the legend()

2010-07-05 Thread David Winsemius


On Jul 5, 2010, at 8:06 PM, Shant Ch wrote:


Hi R-users,

I was plotting the differences of the variances of the three
estimators T^(1), T^(2), T^(3), of course taking two at a time. I
was using expression() in the legend function in order to show
which line corresponds to which difference, but the following
code didn't give the desired result. I would be grateful if
you could help me out.


plot(n, pg, type="l", xlab="n", ylab="Differences of the variances",
     ylim=c(-0.0012,0.0023), xlim=c(0,60));

lines(gs,lty = 2)
lines(ps,lty=5)

legend(30, 0.0021, expression( c ( var(t^(3))-var(t^(2)), var(t^(2))- 
var(t^(1))), var(t^(3))-var(t^(1)) ) ), lty=c(1,2,5))




Have you considered offering a toy set of objects which defines "t",
"n", and "pg"?



Thanks.
Shant





David Winsemius, MD
West Hartford, CT



Re: [R] to remove duplicate values

2010-07-05 Thread Charles C. Berry

On Mon, 5 Jul 2010, Moohwan Kim wrote:


Dear R family,

Suppose I have two series.

order value
1  0.52
2  0.23
3  0.43
4  0.21
5  0.32
6  0.32
7  0.32
8  0.32
9  0.32
10 0.12
11 0.46
12 0.09
13 0.32
14 0.25

For these two series, I figured out the way to detect the locations of
duplicate values.


You _asked how_ to do it on R-help and got several answers showing how to 
do it.


That doesn't count as 'figured out how to do it'. You should give credit 
where it is warranted.




The next thing to do is remove the repeated values, except for a value
that is not adjacent to its duplicates.


Well, that is what you should have asked in the first place.

The answer is actually simpler and need not involve duplicated().

Use one each of these operations

head
tail
!=
c
[

in that order and you have a neat one-liner that returns the original 
data.frame without the adjacent duplicates.


And since I did not say exactly how to do it, you will be able to claim 
that you figured out the way albeit with assistance. ;-)
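
One possible reading of the hint, using the five operations in the order listed. This is a sketch rather than necessarily the exact one-liner Charles Berry had in mind; the data frame name `d` is an assumption:

```r
# The data from the original post
d <- data.frame(
  order = 1:14,
  value = c(0.52, 0.23, 0.43, 0.21, 0.32, 0.32, 0.32,
            0.32, 0.32, 0.12, 0.46, 0.09, 0.32, 0.25)
)

# Compare each value with its predecessor: head() drops the last element,
# tail() drops the first, and != flags positions where the value changes.
# The first row has no predecessor, so c() prepends TRUE to keep it.
d2 <- d[c(TRUE, head(d$value, -1) != tail(d$value, -1)), ]
d2
```

Rows 6 to 9 are dropped because each repeats the row before it, while row 13 stays because its neighbour in row 12 differs.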




In other words, while keeping the 13th value, I want to remove
observations from 6th to 9th.
That is my end goal.

Could you help me reach the goal?

best
moohwan




Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



Re: [R] Memory problem in multinomial logistic regression

2010-07-05 Thread Charles C. Berry

On Mon, 5 Jul 2010, Daniel Wiesmann wrote:


Dear All

I am trying to fit a multinomial logistic regression to a data set with 
a size of 94279 by 14 entries. The data frame has one sample column 
which is the categorical variable, and the number of different 
categories is 9. The size of the data set (as a csv file) is less than 
10 MB.



First, do

str( your.data.frame )

so we can be sure that you do not have a factor lurking among your 
regressors.


Then report the calls you used for vglm() and mlogit().

It might not hurt to construct the model.matrix() first and check on it 
with object.size()
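
A small sketch of why this check is worth doing: a factor with many levels among the regressors expands into one dummy column per level, which can make the model matrix far larger than the data frame suggests. The data below are simulated and the column names are hypothetical:

```r
set.seed(1)
# Simulated data: 9-category response, one numeric regressor, and an
# identifier stored as a factor with 200 levels
d <- data.frame(
  sample = factor(rep(letters[1:9], length.out = 1000)),
  x1     = rnorm(1000),
  id     = factor(rep(1:200, 5))
)

mm_num <- model.matrix(sample ~ x1, data = d)        # intercept + x1
mm_fac <- model.matrix(sample ~ x1 + id, data = d)   # plus 199 dummy columns

dim(mm_num)
dim(mm_fac)
object.size(mm_num)
object.size(mm_fac)
```

With 94279 rows the same effect is multiplied accordingly, and a multinomial fit with 9 categories needs several working copies of that matrix.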


Also try

for (i in levels(your.data.frame$sample)){
  print(
    glm(I(sample==i) ~ . , your.data.frame, family=binomial)
  )}

just to check on your data. If that loop fails all bets are off.


HTH,

Chuck



I tried to fit a multinomial logistic regression, either using vglm() 
from the VGAM package or mlogit() from the mlogit package.


In both cases the estimation crashes because I do not have enough 
memory, although the free memory before starting the regression is more 
than 2GB. The regression functions eat up all of my memory.


Does anyone know why this relatively small data set leads to memory 
problems, and how I could work around my problem?


thank you for your help,

Daniel




Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



Re: [R] Help in the legend()

2010-07-05 Thread Shant Ch
Thanks Dr. Winsemius. Here's the toy data set. 

Basically pg = var(t^(3))-var(t^(2)), gs = var(t^(2))-var(t^(1)) and
ps = var(t^(3))-var(t^(1)). The revised code and the data set are as follows:

n <- seq(4:13)
pg <- c(-1.241394e-03, -9.738079e-04, -7.158755e-04, -5.343962e-04,
        -4.088778e-04, -3.202068e-04, -2.558709e-04, -2.079914e-04,
        -1.715435e-04, -1.432430e-04)
gs <- c(0.0022520038, 0.0020060234, 0.0017601434, 0.0015519810,
        0.0013810851, 0.0012407732, 0.0011245410, 0.0010271681,
        0.0009446642, 0.0008740083)
ps <- c(0.0010106098, 0.0010322155, 0.0010442678, 0.0010175848,
        0.0009722074, 0.0009205665, 0.0008686700, 0.0008191768,
        0.0007731207, 0.0007307653)

plot(n, pg, type="l", xlab="n", ylab="Differences of the variances",
     ylim=c(-0.0012,0.0023));
lines(gs,lty = 2)
lines(ps,lty=5)
 
legend(30, 0.0021, expression( c ( var(t^(3))-var(t^(2)), 
var(t^(2))-var(t^(1))), var(t^(3))-var(t^(1)) ) ), lty=c(1,2,5)).
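
A possible fix, sketched with the toy data: expression() already returns a vector when given several comma-separated arguments, so the inner c() is not needed (and the posted call has unbalanced parentheses). Two assumptions are made here: that n was meant to run from 4 to 13 (seq(4:13) as posted returns 1:10), and that the legend may sit in the top-right corner, since x = 30 lies outside the plotted range once xlim is dropped:

```r
n  <- 4:13   # assumed intent; seq(4:13) would give 1:10
pg <- c(-1.241394e-03, -9.738079e-04, -7.158755e-04, -5.343962e-04,
        -4.088778e-04, -3.202068e-04, -2.558709e-04, -2.079914e-04,
        -1.715435e-04, -1.432430e-04)
gs <- c(0.0022520038, 0.0020060234, 0.0017601434, 0.0015519810,
        0.0013810851, 0.0012407732, 0.0011245410, 0.0010271681,
        0.0009446642, 0.0008740083)
ps <- c(0.0010106098, 0.0010322155, 0.0010442678, 0.0010175848,
        0.0009722074, 0.0009205665, 0.0008686700, 0.0008191768,
        0.0007731207, 0.0007307653)

# One plotmath expression per line, in the same order as the lty values
labs <- expression(var(t^(3)) - var(t^(2)),
                   var(t^(2)) - var(t^(1)),
                   var(t^(3)) - var(t^(1)))

plot(n, pg, type = "l", xlab = "n", ylab = "Differences of the variances",
     ylim = c(-0.0013, 0.0023))
lines(n, gs, lty = 2)
lines(n, ps, lty = 5)
legend("topright", legend = labs, lty = c(1, 2, 5))
```

Passing n explicitly to lines() keeps all three curves on the same x values instead of relying on the index.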






  


[R] nls + quasi-poisson distribution

2010-07-05 Thread Suresh Krishna


Hello R-helpers,

I would like to fit a non-linear function to data (Discrete X axis,  
over-dispersed Poisson values on the Y axis).


I found the functions gnlr in the gnlm package from Jim Lindsey: this can  
handle nonlinear regression equations for the parameters of Poisson and  
negative binomial distributions, among others. I also found the function  
nls2 in the software package accompanying the book Statistical tools for  
nonlinear regression by Huet et al: this can handle nonlinear regression  
with Poisson distributed Y-axis values.


I was wondering if there was any other option: specifically, any option  
that handled nonlinear fitting with quasi-Poisson distributions (to handle  
the overdispersion).


This is a very new area for me, and I am still trying to figure out the  
best way to do this, so I would appreciate any and all pointers.


Thanks much, Suresh



[R] How to Plot With Different Marker ( ‘x ’ and ‘o’) Based on Condition in R

2010-07-05 Thread Gundala Viswanath
Dear Expert,

I have a data that looks like this:

for_y_axis <- c(0.49534,0.80796,0.93970,0.8)
for_x_axis <- c(1,2,3,4)
count      <- c(0,33,0,4)

What I want to do is plot the graph using for_x_axis and
for_y_axis, but mark each point with "o" if the count value is
equal to 0 (zero) and with "x" if the count value is greater than zero.

Is there a simple way to achieve that in R?
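
One simple way, sketched with the vectors from the post: the pch argument of plot() accepts a vector, so ifelse() can pick a symbol per point. The variable name `marker` is introduced here for illustration:

```r
for_y_axis <- c(0.49534, 0.80796, 0.93970, 0.8)
for_x_axis <- c(1, 2, 3, 4)
count      <- c(0, 33, 0, 4)

# pch 1 is an open circle, pch 4 is a cross
marker <- ifelse(count == 0, 1, 4)

plot(for_x_axis, for_y_axis, pch = marker)
```

Using pch = ifelse(count == 0, "o", "x") instead would draw the literal letters rather than plotting symbols.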

Regards,
