date:20081209

Stavros Macrakis wrote:
 I've read in many places that R semantics are based on Scheme semantics.  As
 a long-time Lisp user and implementor, I've tried to make this more precise,
 and this is what I've found so far.  I've excluded trivial things that
 aren't basic semantic issues: support for arbitrary-precision integers;
 subscripting; general style; etc. I would appreciate corrections or
 additions from more experienced users of R -- I'm sure that some of the
 points below simply reflect my ignorance.

 ==Similarities to Scheme==

 R has first-class function closures. (i.e. correctly supports upward and
 downward funarg).

 R has a single namespace for functions and variables (Lisp-1).

 ==Important dissimilarities to Scheme (as opposed to other Lisps)==

 R is not properly tail-recursive.

 R does not have continuations or call-with-current-continuation or other
 mechanisms for implementing coroutines, general iterators, and the like.
   

there is callCC, for example, which however seems kind of obsolete.

 R supports keyword arguments.

 ==Similarities to Lisp and other dynamic languages, including Scheme==

 R is runtime-typed and garbage-collected.

 R supports nested read-eval-print loops for debugging etc.

 R expressions are represented as user-manipulable data structures.

 ==Dissimilarities to all (modern) Lisps, including Scheme==

 R has call-by-need, not call-by-object-value.

 R does not have macros.

 R objects are values, not pointers, so a-1:10; b-a; b[1]-999; a[1] =
 999.  Similarly, functions cannot modify the contents of their arguments.
   

have you actually tried this code?  even if the objects are values not
pointers, assignment causes, in cases such as the above, copying the
value with modifications applied as needed.  thus, a[1] - 1, not 999,
even though after b-a b and a are the same value object.

try the following:

system.time(x-1:(10^8))
system.time(y-x)
system.time(y[1]-0)
system.time(y[2]-0)
head(x)
head(y)


with some trickery, functions can modify the contents of their
arguments, using deparse/substitute and assign:

a - 1
f - function(x) assign(deparse(substitute(x)), 0, parent.frame())
f(a)
a


the 'cannot modify the contents' does not apply to arguments that are
environments:

e - new.env(parent=emptyenv())
l - list()
f - function(e) e$a = 0
f(e)
e$a
f(l)
l$a



 There is no equivalent to set-car!/rplaca (not even pairlists and
 expressions).  For example, r-pairlist(1,2); r[[1]]-r does not create a
 circular list. And in general there doesn't seem to be substructure sharing
 at the semantic level (though there may be in the implementation).
   

computations on environment objects seem not to be subject to the
copy-value-on-assignment semantics:

e - new.env(parent=emptyenv())
ee - e
e$a - 0
ee$a


 R does not have multiple value return in the Lisp sense.

 R assignment creates a new local variable on first assignment, dynamically.
 So static analysis is not enough to determine variable reference (R is not
 referentially transparent). Example: ff - function(a){if (a) x-1; x} ;
 x-99; ff(T) - 1; ff(F) - 99.

 In R, most data types (including numeric vectors) do not have a standard
 external representation which can be read back in without evaluation.

 R coerces logicals to numbers and numbers to strings. Lisps are stricter
 about automatic type conversion -- except that false a.k.a. NIL == () in
 Lisps other than Scheme.
   

types are not treated coherently.  in some situations, r coerces doubles
to complex (according to the hierarchy of types specified here and there
in the man pages), in others it won't:

x - as.double(-1)
y - as.complex(-1)
x == y

sqrt(x)
sqrt(y)

in certain cases, r will also do implicit inverse (downward) coercion:

is(y:y)



vQ

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R and Scheme

Stavros Macrakis wrote:
 There is no equivalent to set-car!/rplaca (not even pairlists and
 expressions).  For example, r-pairlist(1,2); r[[1]]-r does not create a
 circular list. And in general there doesn't seem to be substructure sharing
 at the semantic level (though there may be in the implementation).

   

again, you can achieve the effect with environments:

e = new.env(parent=emptyenv())
e$e = e
e
e$e$e$e


vQ

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] for loop query

2008-12-09 Thread Claudia Beleites

Hi,

 Why isn't my loop incrementing i - the outer loop to 2 and then resetting
 j=3?
It is. It runs out of bounds with j  26

 Am I missing something obvious?
   for (i in 1:25)
  + {
  +   for (j in i+1:26)
You miss parentheses.

i + 1 : 26  is i + (1 : 26) as the vector 1 :26 is calculated first

what happens is that for i = 1 j goes over 2 : 27, with i = 2 over 3 : 28, ...

what you want is (i + 1) : 26:

for (i in 1 : 25)
   for (j in (i + 1) : 26)
  cat (i, j, \n)

HTH Claudia

-- 
Claudia Beleites
Dipartimento dei Materiali e delle Risorse Naturali
Università degli Studi di Trieste
Via Alfonso Valerio 6/a
I-34127 Trieste

phone: +39 (0 40) 5 58-34 47
email: [EMAIL PROTECTED]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Pre-model Variable Reduction

2008-12-09 Thread Mark Difford


Hi Harsh,

 I looked for other R packages that allow me to do variable reduction 
 without considering a dependent variable.

Have look at package subselect. This has an implementation of the genetic
algorithm, along with some other methods. It should do what you want.

Regards, Mark.


Harsh-7 wrote:
 
 Hello All,
 I am trying to carry out variable reduction. I do not have information
 about the dependent variable, and have only the X variables as it
 were.
 In selecting variables I wish to keep, I have considered the following
 criteria.
 1) Percentage of missing value in each column/variable
 2) Variance of each variable, with a cut-off value.
 
 I recently came across Weka and found that there is an RWeka package
 which would allow me to make use of Weka through R.
 Weka provides a Genetic search variable reduction method, but I
 could not find its R code implementation in the RWeka Pdf file on
 CRAN.
 
 I looked for other R packages that allow me to do variable reduction
 without considering a dependent variable. I came across 'dprep'
 package but it does not have a Windows implementation.
 
 Moreover, I have a dataset that contains continuous and categorical
 variables, some categorical variables having 3 levels, 10 levels and
 so on, till a max 50 levels (E.g. States in the USA).
 
 Any suggestions in this regard will be much appreciated.
 
 Thank you
 
 Harsh Singhal
 Decision Systems,
 Mu Sigma, Inc.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/Pre-model-Variable-Reduction-tp20912229p20914146.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] display matrix on graphics device

2008-12-09 Thread Jim Lemon


[EMAIL PROTECTED] wrote:

Folks,

Is there a way to print a matrix in tabular form onto the graphics device? 
I want to create a display consisting of graphs and tables,

so that I can do something like:

windows()
opar = par(mfrow = c(3, 2))

library(plotrix)
library(PerformanceAnalytics)

radial.plot(...)
radial.plot(...)
chart.BarVaR(...)
chart.RollingCorrelation(...)

print.matrix.onto.graphics.device(X)

par(opar)

where X is a matrix with named rows and columns.

If this is not easily done, is there any way I can embed graphics and 
tabular data into a PDF?


  

Hi Murali,
As you are using the plotrix package, will addtable2plot do what you want?

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Sorry, I have attached the data - Here is the code that causes wavCWTPeaks error

2008-12-09 Thread mauede


-Messaggio originale-
Da: [EMAIL PROTECTED]
Inviato: mar 09/12/2008 13.52
A: stephen sefick; Francesco Masulli; Stefano Rovetta
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
Oggetto: Here is the code that causes wavCWTPeaks error
 
aats - create.signalSeries(aa, pos=list(from=0.0, by=0.033))
aa.cwt -  wavCWT(aats)   
x11 (width=10,height=12)
plot (aats,main=paste(insig, Cycle: ,j,sep=))   
aa.maxtree - wavCWTTree (aa.cwt, type=maxima) 
aa.mintree - wavCWTTree (aa.cwt, type=minima)
aa.maxpeak - wavCWTPeaks (aa.maxtree) 
aa.minpeak - wavCWTPeaks (aa.mintree)
  Error in `row.names-.data.frame`(`*tmp*`, value = c(1, 0)) : 
  invalid 'row.names' length
bb - -aa   #EXCHANGE MAXIMA WITH MINIMA
bbts - create.signalSeries(bb, pos=list(from=0.0, by=0.033))
plot (bbts,main=paste(insig, Cycle: ,j,sep=)) 
plot (bbts,main=paste(insig, Cycle: ,j,sep=))
bb.cwt -  wavCWT(bbts)
bb.maxtree - wavCWTTree (bb.cwt, type=maxima) 
bb.mintree - wavCWTTree (bb.cwt, type=minima)
bb.maxpeak - wavCWTPeaks (bb.maxtree) 
  Error in `row.names-.data.frame`(`*tmp*`, value = c(1, 0)) : 
  invalid 'row.names' length
bb.minpeak - wavCWTPeaks (bb.mintree)
  Error in `row.names-.data.frame`(`*tmp*`, value = c(1, 0)) : 
  invalid 'row.names' length

Attached are aa breathing cycle amplitude values (zip compressed)
I'd like to figure out what is wrong with my data. 
wmTSA documentation does not mention any time series constraint.

Thank you in advance.
Kind regards,
Maura 

-Messaggio originale-
Da: stephen sefick [mailto:[EMAIL PROTECTED]
Inviato: mar 09/12/2008 8.11
A: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
Oggetto: Re: [R] package wmtsa: wavCWTPeaks error
 
Are the names of the rows the same as the time series that you are
using?  I know that I am not being that helpful, but this seems like a
mismatch in the time series object.  look at
length(rowname(your.data))
length(your.data[,1])

again it is always helpful to have reproducible code.

On Tue, Dec 9, 2008 at 1:39 AM,  [EMAIL PROTECTED] wrote:
 I keep getting the following error when I look for minima in the series:

 aa.peak - wavCWTPeaks (aa.tree)
 Error in `row.names-.data.frame`(`*tmp*`, value = c(1, 0)) :
  invalid 'row.names' length

 How can I work it around ?

 Thank you.

 Regards,
 Maura


 Alice Messenger ;-) chatti anche con gli amici di Windows Live Messenger e 
 tutti i telefonini TIM!
 Vai su http://maileservizi.alice.it/alice_messenger/index.html?pmk=footer

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





-- 
Stephen Sefick

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis




Alice Messenger ;-) chatti anche con gli amici di Windows Live Messenger e 
tutti i telefonini TIM!
Vai su http://maileservizi.alice.it/alice_messenger/index.html?pmk=footer



Alice Messenger ;-) chatti anche con gli amici di Windows Live Messenger e 
tutti i telefonini TIM!
Vai su http://maileservizi.alice.it/alice_messenger/index.html?pmk=footer
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Pre-model Variable Reduction

2008-12-09 Thread Frank E Harrell Jr


Harsh wrote:

Hello All,
I am trying to carry out variable reduction. I do not have information
about the dependent variable, and have only the X variables as it
were.
In selecting variables I wish to keep, I have considered the following criteria.
1) Percentage of missing value in each column/variable
2) Variance of each variable, with a cut-off value.

I recently came across Weka and found that there is an RWeka package
which would allow me to make use of Weka through R.
Weka provides a Genetic search variable reduction method, but I
could not find its R code implementation in the RWeka Pdf file on
CRAN.

I looked for other R packages that allow me to do variable reduction
without considering a dependent variable. I came across 'dprep'
package but it does not have a Windows implementation.

Moreover, I have a dataset that contains continuous and categorical
variables, some categorical variables having 3 levels, 10 levels and
so on, till a max 50 levels (E.g. States in the USA).

Any suggestions in this regard will be much appreciated.

Thank you

Harsh Singhal
Decision Systems,
Mu Sigma, Inc.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Take a look at the the redun function in the Hmisc package, which does 
redundancy analysis.


Frank

--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] for loop query

2008-12-09 Thread Gerard M. Keogh



Hi all,

apologies if this is obvious - but I can't see it and would appreciate some
quick help!

the matrix mhouse is 26x3 and I'm computing odds ratios. The simple code
below should compute the odds vector for every pair (325) i.e. 26C2 in
cols 1 and 2.
On the first i=1 outer loop the inner j loop runs from 2 to 26 ok and then
I get the error (Error: subscript out of bounds)

Why isn't my loop incrementing i - the outer loop to 2 and then resetting
j=3?
Am I missing something obvious?

thanks

Gerard


  mhouse
   [,1] [,2]  [,3]
  [1,]  275  81949
  [2,]  593 1323   192
  [3,]  813 1181   292
  [4,] 2177 5189  1320
  [5,] 1651 2243   270
  [6,] 1061 5629 11035
  [7,] 1690 2302   589
  [8,] 1130 1203   345
  [9,]  565 1898   655
 [10,]  580  730   234
 [11,]  343 176173
 [12,]  372  53667
 [13,]  666 1713   397
 [14,]  382  918   279
 [15,]  486  921   247
 [16,] 1141  988   313
 [17,]  626 1135   666
 [18,]  438  436   168
 [19,]  425  691   101
 [20,]  609  71699
 [21,]  467  661   141
 [22,]  879 137379
 [23,]  444 1101   130
 [24,]  459  898   351
 [25,]  995 1801   398
 [26,]  396 1107   201
  # set up the odds vector by declaring it to be null
  odds=NULL
 
  # compute the odds ratios for Individual House vs Scheme House
  for (i in 1:25)
 + {
 +   for (j in i+1:26)
 +   {
 + todds = (mhouse[i,1]*mhouse[j,2])/(mhouse[j,1]*mhouse[i,2])
 + # compute the todds for row i with row j:
 ji
 + odds = c(odds,todds)
 + # append todds to the odds vector
 +   }
 + }
 Error: subscript out of bounds
  odds
  [1] 0.7491244 0.4877622 0.8003391 0.4561745 1.7814132 0.4573697
 0.3574670 1.1279674 0.4226138 1.7239078 0.4838053
 [12] 0.8636384 0.8069156 0.6363150 0.2907502 0.6087939 0.3342421
 0.5459312 0.3947703 0.4752623 0.5244818 0.8326321
 [23] 0.6569199 0.6077702 0.9386447
 


**
The information transmitted is intended only for the person or entity to which 
it is addressed and may contain confidential and/or privileged material. Any 
review, retransmission, dissemination or other use of, or taking of any action 
in reliance upon, this information by persons or entities other than the 
intended recipient is prohibited. If you received this in error, please contact 
the sender and delete the material from any computer.  It is the policy of the 
Department of Justice, Equality and Law Reform and the Agencies and Offices 
using its IT services to disallow the sending of offensive material.
Should you consider that the material contained in this message is offensive 
you should contact the sender immediately and also mailminder[at]justice.ie.

Is le haghaidh an duine nó an eintitis ar a bhfuil sí dírithe, agus le haghaidh 
an duine nó an eintitis sin amháin, a bheartaítear an fhaisnéis a tarchuireadh 
agus féadfaidh sé go bhfuil ábhar faoi rún agus/nó faoi phribhléid inti. 
Toirmisctear aon athbhreithniú, atarchur nó leathadh a dhéanamh ar an 
bhfaisnéis seo, aon úsáid eile a bhaint aisti nó aon ghníomh a dhéanamh ar a 
hiontaoibh, ag daoine nó ag eintitis seachas an faighteoir beartaithe. Má fuair 
tú é seo trí dhearmad, téigh i dteagmháil leis an seoltóir, le do thoil, agus 
scrios an t-ábhar as aon ríomhaire. Is é beartas na Roinne Dlí agus Cirt, 
Comhionannais agus Athchóirithe Dlí, agus na nOifígí agus na nGníomhaireachtaí 
a úsáideann seirbhísí TF na Roinne, seoladh ábhair cholúil a dhícheadú.
Más rud é go measann tú gur ábhar colúil atá san ábhar atá sa teachtaireacht 
seo is ceart duit dul i dteagmháil leis an seoltóir láithreach agus le 
mailminder[ag]justice.ie chomh maith. 
***



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Pre-model Variable Reduction

2008-12-09 Thread Harsh

Hello All,
I am trying to carry out variable reduction. I do not have information
about the dependent variable, and have only the X variables as it
were.
In selecting variables I wish to keep, I have considered the following criteria.
1) Percentage of missing value in each column/variable
2) Variance of each variable, with a cut-off value.

I recently came across Weka and found that there is an RWeka package
which would allow me to make use of Weka through R.
Weka provides a Genetic search variable reduction method, but I
could not find its R code implementation in the RWeka Pdf file on
CRAN.

I looked for other R packages that allow me to do variable reduction
without considering a dependent variable. I came across 'dprep'
package but it does not have a Windows implementation.

Moreover, I have a dataset that contains continuous and categorical
variables, some categorical variables having 3 levels, 10 levels and
so on, till a max 50 levels (E.g. States in the USA).

Any suggestions in this regard will be much appreciated.

Thank you

Harsh Singhal
Decision Systems,
Mu Sigma, Inc.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Pre-model Variable Reduction

See:

?prcomp
?princomp

On Tue, Dec 9, 2008 at 5:34 AM, Harsh [EMAIL PROTECTED] wrote:
 Hello All,
 I am trying to carry out variable reduction. I do not have information
 about the dependent variable, and have only the X variables as it
 were.
 In selecting variables I wish to keep, I have considered the following 
 criteria.
 1) Percentage of missing value in each column/variable
 2) Variance of each variable, with a cut-off value.

 I recently came across Weka and found that there is an RWeka package
 which would allow me to make use of Weka through R.
 Weka provides a Genetic search variable reduction method, but I
 could not find its R code implementation in the RWeka Pdf file on
 CRAN.

 I looked for other R packages that allow me to do variable reduction
 without considering a dependent variable. I came across 'dprep'
 package but it does not have a Windows implementation.

 Moreover, I have a dataset that contains continuous and categorical
 variables, some categorical variables having 3 levels, 10 levels and
 so on, till a max 50 levels (E.g. States in the USA).

 Any suggestions in this regard will be much appreciated.

 Thank you

 Harsh Singhal
 Decision Systems,
 Mu Sigma, Inc.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] display matrix on graphics device

2008-12-09 Thread murali . menon

Folks,

Is there a way to print a matrix in tabular form onto the graphics device? 
I want to create a display consisting of graphs and tables,
so that I can do something like:

windows()
opar = par(mfrow = c(3, 2))

library(plotrix)
library(PerformanceAnalytics)

radial.plot(...)
radial.plot(...)
chart.BarVaR(...)
chart.RollingCorrelation(...)

print.matrix.onto.graphics.device(X)

par(opar)

where X is a matrix with named rows and columns.

If this is not easily done, is there any way I can embed graphics and 
tabular data into a PDF?

Thanks,

Murali


---
This e-mail is confidential and may be privileged. If you have received it by 
mistake, please notify the sender by return e-mail and delete it from your 
system. You should not disclose, copy or use it for any purpose. The 
information in this e-mail is not contractual. Fortis Investments provides no 
guarantee as to the correctness of this information and accepts no 
responsibility for any action taken on the basis of it. Fortis Investments is 
the trade name for all entities within the Fortis Investment Management group.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] for loop query

2008-12-09 Thread Robin Hankin



Hi

start simple!

Work out *each* row combined with *each* row,
to give (in your case) a 26-by-26 matrix.


Only after you have got this working, start thinking about
making it run faster [eg by
only evaluating the upper triangular entries]

To do a nested loop, do


M - matrix(0,n,n)

for(i in seq_len(n)){
 for(j in seq_len(n)){
   M[i,j] -  f(i,j)
   }
}

which will fill in matrix M for you.

HTH

rksh






Gerard M. Keogh wrote:

Hi all,

apologies if this is obvious - but I can't see it and would appreciate some
quick help!

the matrix mhouse is 26x3 and I'm computing odds ratios. The simple code
below should compute the odds vector for every pair (325) i.e. 26C2 in
cols 1 and 2.
On the first i=1 outer loop the inner j loop runs from 2 to 26 ok and then
I get the error (Error: subscript out of bounds)

Why isn't my loop incrementing i - the outer loop to 2 and then resetting
j=3?
Am I missing something obvious?

thanks

Gerard


  mhouse
   [,1] [,2]  [,3]
  [1,]  275  81949
  [2,]  593 1323   192
  [3,]  813 1181   292
  [4,] 2177 5189  1320
  [5,] 1651 2243   270
  [6,] 1061 5629 11035
  [7,] 1690 2302   589
  [8,] 1130 1203   345
  [9,]  565 1898   655
 [10,]  580  730   234
 [11,]  343 176173
 [12,]  372  53667
 [13,]  666 1713   397
 [14,]  382  918   279
 [15,]  486  921   247
 [16,] 1141  988   313
 [17,]  626 1135   666
 [18,]  438  436   168
 [19,]  425  691   101
 [20,]  609  71699
 [21,]  467  661   141
 [22,]  879 137379
 [23,]  444 1101   130
 [24,]  459  898   351
 [25,]  995 1801   398
 [26,]  396 1107   201
  # set up the odds vector by declaring it to be null
  odds=NULL
 
  # compute the odds ratios for Individual House vs Scheme House
  for (i in 1:25)
 + {
 +   for (j in i+1:26)
 +   {
 + todds = (mhouse[i,1]*mhouse[j,2])/(mhouse[j,1]*mhouse[i,2])
 + # compute the todds for row i with row j:
 ji
 + odds = c(odds,todds)
 + # append todds to the odds vector
 +   }
 + }
 Error: subscript out of bounds
  odds
  [1] 0.7491244 0.4877622 0.8003391 0.4561745 1.7814132 0.4573697
 0.3574670 1.1279674 0.4226138 1.7239078 0.4838053
 [12] 0.8636384 0.8069156 0.6363150 0.2907502 0.6087939 0.3342421
 0.5459312 0.3947703 0.4752623 0.5244818 0.8326321
 [23] 0.6569199 0.6077702 0.9386447
 


**
The information transmitted is intended only for the person or entity to which 
it is addressed and may contain confidential and/or privileged material. Any 
review, retransmission, dissemination or other use of, or taking of any action 
in reliance upon, this information by persons or entities other than the 
intended recipient is prohibited. If you received this in error, please contact 
the sender and delete the material from any computer.  It is the policy of the 
Department of Justice, Equality and Law Reform and the Agencies and Offices 
using its IT services to disallow the sending of offensive material.
Should you consider that the material contained in this message is offensive 
you should contact the sender immediately and also mailminder[at]justice.ie.

Is le haghaidh an duine nó an eintitis ar a bhfuil sí dírithe, agus le haghaidh 
an duine nó an eintitis sin amháin, a bheartaítear an fhaisnéis a tarchuireadh 
agus féadfaidh sé go bhfuil ábhar faoi rún agus/nó faoi phribhléid inti. 
Toirmisctear aon athbhreithniú, atarchur nó leathadh a dhéanamh ar an 
bhfaisnéis seo, aon úsáid eile a bhaint aisti nó aon ghníomh a dhéanamh ar a 
hiontaoibh, ag daoine nó ag eintitis seachas an faighteoir beartaithe. Má fuair 
tú é seo trí dhearmad, téigh i dteagmháil leis an seoltóir, le do thoil, agus 
scrios an t-ábhar as aon ríomhaire. Is é beartas na Roinne Dlí agus Cirt, 
Comhionannais agus Athchóirithe Dlí, agus na nOifígí agus na nGníomhaireachtaí 
a úsáideann seirbhísí TF na Roinne, seoladh ábhair cholúil a dhícheadú.
Más rud é go measann tú gur ábhar colúil atá san ábhar atá sa teachtaireacht seo is ceart duit dul i dteagmháil leis an seoltóir láithreach agus le mailminder[ag]justice.ie chomh maith. 
***




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
  



--
Robin K. S. Hankin
Uncertainty Analyst
University of Cambridge
19 Silver Street
Cambridge CB3

[R] (senza oggetto)

2008-12-09 Thread Andrea Ferroni


   Dear R help,
  I use the package plm e the function plm() to analyse a panel data and
   estimate a dynamic model.
   Can I estimate a model without intercept?

   Thanks,
   Andrea Ferroni
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Pre-model Variable Reduction

2008-12-09 Thread Hans W. Borchers

Harsh singhalblr at gmail.com writes:

 
 Hello All,
 I am trying to carry out variable reduction. I do not have information
 about the dependent variable, and have only the X variables as it
 were.
 ...
 I looked for other R packages that allow me to do variable reduction
 without considering a dependent variable. I came across 'dprep'
 package but it does not have a Windows implementation.

I doubt that you will find what you are longing for, but: There is a Windows
version available at the Homepage of the drep package at
http://math.uprm.edu/~edgar/dprep.html. 
This version 2.0 can be loaded without errors into R 2.8.0 though it appears 
not to be fully compliant with the tests on CRAN.

 Moreover, I have a dataset that contains continuous and categorical
 variables, some categorical variables having 3 levels, 10 levels and
 so on, till a max 50 levels (E.g. States in the USA).
 
 Any suggestions in this regard will be much appreciated.
 
 Thank you
 
 Harsh Singhal
 Decision Systems,
 Mu Sigma, Inc.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Pre-model Variable Reduction

2008-12-09 Thread Ravi Varadhan

Principal components analysis does dimensionality reduction but NOT
variable reduction.  However, Jolliffe's 2004 book on PCA does discuss the
problem of selecting a subset of variables, with the goal of representing
the internal variation of original multivariate vector as well as possible
(see Section 6.3 of that book).  I do not think that these methods can
handle missing data.  The most important issue is to think about the goal of
variable reduction and then choose an appropriate optimality criterion for
achieving that goal.  In most instances of variable selection, the criterion
that is optimized is never explicitly considered.

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of Gabor Grothendieck
Sent: Tuesday, December 09, 2008 8:00 AM
To: Harsh
Cc: r-help@r-project.org
Subject: Re: [R] Pre-model Variable Reduction

See:

?prcomp
?princomp

On Tue, Dec 9, 2008 at 5:34 AM, Harsh [EMAIL PROTECTED] wrote:
 Hello All,
 I am trying to carry out variable reduction. I do not have information 
 about the dependent variable, and have only the X variables as it 
 were.
 In selecting variables I wish to keep, I have considered the following
criteria.
 1) Percentage of missing value in each column/variable
 2) Variance of each variable, with a cut-off value.

 I recently came across Weka and found that there is an RWeka package 
 which would allow me to make use of Weka through R.
 Weka provides a Genetic search variable reduction method, but I 
 could not find its R code implementation in the RWeka Pdf file on 
 CRAN.

 I looked for other R packages that allow me to do variable reduction 
 without considering a dependent variable. I came across 'dprep'
 package but it does not have a Windows implementation.

 Moreover, I have a dataset that contains continuous and categorical 
 variables, some categorical variables having 3 levels, 10 levels and 
 so on, till a max 50 levels (E.g. States in the USA).

 Any suggestions in this regard will be much appreciated.

 Thank you

 Harsh Singhal
 Decision Systems,
 Mu Sigma, Inc.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Selecting rows that are the same in separate data frames

2008-12-09 Thread ppaarrkk


I want to compare two matrices or data frames and select or get an index for
those rows which are the same in both. I have tried the following :






a = matrix ( 1:10, ncol = 2 )
a

b = matrix ( c ( 2,3,4,7,8,9 ), ncol = 2 )
b

a[a==b]






a = as.data.frame ( matrix ( 1:10, ncol = 2 ) )
a

b = as.data.frame ( matrix ( c ( 2,3,4,7,8,9 ), ncol = 2 ) )
b

a[a==b]








Any ideas please.


Thanks.


Simon Parker
Imperial College

-- 
View this message in context: 
http://www.nabble.com/Selecting-rows-that-are-the-same-in-separate-data-frames-tp20916243p20916243.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Need help optimizing/vectorizing nested loops

2008-12-09 Thread tyler

Hi,

I'm analyzing a large number of large simulation datasets, and I've
isolated one of the bottlenecks. Any help in speeding it up would be
appreciated.

`dat` is a dataframe of samples from a regular grid. The first two
columns are the spatial coordinates of the samples, the remaining 20
columns are the abundances of species in each cell. I need to calculate
the species richness in adjacent cells for each cell in the sample. 
For example, if I have nine cells in my dataframe (X = 1:3, Y = 1:3):

  a b c
  d e f
  g h i

I need to calculate the neighbour-richness for each cell; for a, this is
the richness of cells b, d and e combined. The neighbour richness of
cell e would be the combined richness of all the other eight cells.

The following code does what I what, but it's slow. The sample dataset
'dat', below, represents a 5x5 grid, 25 samples. It takes about 1.5
seconds on my computer. The largest samples I am working with have a 51
x 51 grid (2601 cells) and take 4.5 minutes. This is manageable, but
since I have potentially hundreds of these analyses to run, trimming
that down would be very helpful.

After loading the function and the data, the call

  system.time(tmp - time.test(dat))

Will run the code. Note that I've excised this from a larger, more
general function, after determining that for large datasets this section
is responsible for a slowdown from 10-12 seconds to ca. 250 seconds.

Thanks for your patience,

Tyler


time.test - function(dat) {

  cen - dat
  grps - 5
  n.rich - numeric(grps^2)
  n.ind - 1
  
  for(i in 1:grps)
for (j in 1:grps) {
  n.cen - numeric(ncol(cen) - 2)
  neighbours - expand.grid((j-1):(j+1), (i-1):(i+1)) 
  neighbours - neighbours[-5,] 
  neighbours - neighbours[which(neighbours[,1] %in% 1:grps 
 neighbours[,2] %in% 1:grps),]
  
  for (k in 1:nrow(neighbours))
n.cen - n.cen + cen[cen$X == neighbours[k,1] 
 cen$Y == neighbours[k,2], -c(1:2)]
  
  n.rich[n.ind] - sum(as.logical(n.cen))
  n.ind - n.ind + 1
}

return(n.rich)
}

`dat` - structure(list(
  X = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1,
  2, 3, 4, 5), Y = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4,
  4, 4, 4, 4, 5, 5, 5, 5, 5), V1 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
  0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 45L, 131L, 0L, 0L, 34L,
  481L, 1744L), V2 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
  0L, 1L, 88L, 0L, 70L, 101L, 13L, 634L, 0L, 0L, 71L, 640L, 1636L), V3
  = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 49L, 3L, 113L,
  1L, 44L, 167L, 336L, 933L, 0L, 14L, 388L, 1180L, 1709L), V4 = c(0L,
  0L, 0L, 0L, 0L, 0L, 3L, 12L, 0L, 0L, 2L, 1L, 36L, 45L, 208L, 7L,
  221L, 213L, 371L, 1440L, 26L, 211L, 389L, 1382L, 1614L), V5 = c(96L,
  7L, 0L, 0L, 0L, 10L, 17L, 0L, 5L, 0L, 0L, 11L, 151L, 127L, 160L,
  27L, 388L, 439L, 1117L, 1571L, 81L, 598L, 1107L, 1402L, 891L), V6 =
  c(16L, 30L, 13L, 0L, 0L, 10L, 195L, 60L, 29L, 29L, 1L, 107L, 698L,
  596L, 655L, 227L, 287L, 677L, 1477L, 1336L, 425L, 873L, 961L, 1360L,
  1175L), V7 = c(249L, 101L, 69L, 0L, 18L, 186L, 331L, 291L, 259L,
  248L, 336L, 404L, 642L, 632L, 775L, 455L, 801L, 697L, 1063L, 978L,
  626L, 686L, 1204L, 1138L, 627L), V8 = c(300L, 163L, 65L, 145L, 377L,
  257L, 690L, 655L, 420L, 288L, 346L, 461L, 1276L, 897L, 633L, 812L,
  1018L, 1337L, 1295L, 1163L, 550L, 1104L, 768L, 933L, 433L), V9 =
  c(555L, 478L, 374L, 349L, 357L, 360L, 905L, 954L, 552L, 438L, 703L,
  984L, 1616L, 1732L, 1234L, 1213L, 1518L, 1746L, 1191L, 967L, 1394L,
  1722L, 1706L, 610L, 169L), V10 = c(1527L, 1019L, 926L, 401L, 830L,
  833L, 931L, 816L, 1126L, 1232L, 1067L, 1169L, 1270L, 1277L, 1145L,
  1159L, 1072L, 1534L, 997L, 391L, 1328L, 1414L, 1037L, 444L, 1L), V11
  = c(1468L, 1329L, 1013L, 603L, 1096L, 1237L, 1488L, 1189L, 1064L,
  1303L, 1258L, 1479L, 1421L, 1365L, 1101L, 1415L, 1145L, 1329L,
  1325L, 236L, 1379L, 1199L, 729L, 328L, 0L), V12 = c(983L, 1459L,
  791L, 898L, 911L, 1215L, 1528L, 960L, 1172L, 1286L, 1358L, 722L,
  857L, 1478L, 1452L, 1502L, 1013L, 745L, 455L, 149L, 1686L, 917L,
  1013L, 84L, 0L), V13 = c(1326L, 1336L, 1110L, 1737L, 1062L, 1578L,
  1382L, 1537L, 1366L, 1308L, 1301L, 1357L, 746L, 622L, 934L, 1132L,
  954L, 460L, 270L, 65L, 957L, 699L, 521L, 18L, 1L), V14 = c(1047L,
  1315L, 1506L, 1562L, 1254L, 1336L, 1106L, 1213L, 1220L, 1457L, 858L,
  1606L, 590L, 726L, 598L, 945L, 732L, 258L, 45L, 6L, 937L, 436L, 43L,
  0L, 0L), V15 = c(845L, 935L, 1295L, 1077L, 1400L, 1049L, 802L,
  1247L, 1449L, 1046L, 1134L, 877L, 327L, 352L, 470L, 564L, 461L,
  166L, 0L, 0L, 230L, 110L, 29L, 0L, 0L), V16 = c(784L, 675L, 1157L,
  1488L, 1511L, 1004L, 420L, 523L, 733L, 724L, 833L, 542L, 171L, 116L,
  384L, 357L, 197L, 0L, 0L, 0L, 246L, 0L, 0L, 0L, 0L), V17 = c(444L,
  873L, 530L, 596L, 448L, 431L, 109L, 446L, 378L, 243L, 284L, 148L,
  69L, 30L, 6L, 71L, 32L, 131L, 0L, 0L, 120L, 0L, 0L, 0L, 0L), V18 =
  c(307L, 128L, 823L, 566L,

Re: [R] Pre-model Variable Reduction

2008-12-09 Thread Harsh

Thank you everyone.
The idea really is for me to get the variables themselves from a
super-set of all variables.
x1 -numeric continuous
x2 -numeric continuous
x3 - numeric Factor with 2 levels
x4 -Character Factor with 10 levels
x5 - numeric continuous
x6 - numeric integer

Variable Reduction method then, must ideally give me

keep : x1, x3 and x6
drop : x2, x4 and x5

The 'redun' function from Hmisc package seems promising since it
considers categorical variables as well. Variable to be dropped is the
variable which can be predicted by other variables. I guess its to
check for multi-colinearity.

The RWeka package, as I mentioned earlier, allows one to use Weka's
variable reduction/selection techniques  in R. I did come across an
implementation of the Genetic Search' method, but have not been able
to find relevant documentation for the same to tweak to suit my needs.

Thank you all for your time.

Harsh Singhal
Decision Systems,
Mu Sigma Inc.




On Tue, Dec 9, 2008 at 8:05 PM, Ravi Varadhan [EMAIL PROTECTED] wrote:
 Principal components analysis does dimensionality reduction but NOT
 variable reduction.  However, Jolliffe's 2004 book on PCA does discuss the
 problem of selecting a subset of variables, with the goal of representing
 the internal variation of original multivariate vector as well as possible
 (see Section 6.3 of that book).  I do not think that these methods can
 handle missing data.  The most important issue is to think about the goal of
 variable reduction and then choose an appropriate optimality criterion for
 achieving that goal.  In most instances of variable selection, the criterion
 that is optimized is never explicitly considered.

 Ravi.

 
 ---

 Ravi Varadhan, Ph.D.

 Assistant Professor, The Center on Aging and Health

 Division of Geriatric Medicine and Gerontology

 Johns Hopkins University

 Ph: (410) 502-2619

 Fax: (410) 614-9625

 Email: [EMAIL PROTECTED]

 Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html



 
 


 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
 Behalf Of Gabor Grothendieck
 Sent: Tuesday, December 09, 2008 8:00 AM
 To: Harsh
 Cc: r-help@r-project.org
 Subject: Re: [R] Pre-model Variable Reduction

 See:

 ?prcomp
 ?princomp

 On Tue, Dec 9, 2008 at 5:34 AM, Harsh [EMAIL PROTECTED] wrote:
 Hello All,
 I am trying to carry out variable reduction. I do not have information
 about the dependent variable, and have only the X variables as it
 were.
 In selecting variables I wish to keep, I have considered the following
 criteria.
 1) Percentage of missing value in each column/variable
 2) Variance of each variable, with a cut-off value.

 I recently came across Weka and found that there is an RWeka package
 which would allow me to make use of Weka through R.
 Weka provides a Genetic search variable reduction method, but I
 could not find its R code implementation in the RWeka Pdf file on
 CRAN.

 I looked for other R packages that allow me to do variable reduction
 without considering a dependent variable. I came across 'dprep'
 package but it does not have a Windows implementation.

 Moreover, I have a dataset that contains continuous and categorical
 variables, some categorical variables having 3 levels, 10 levels and
 so on, till a max 50 levels (E.g. States in the USA).

 Any suggestions in this regard will be much appreciated.

 Thank you

 Harsh Singhal
 Decision Systems,
 Mu Sigma, Inc.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R and Scheme

2008-12-09 Thread Luke Tierney


On Tue, 9 Dec 2008, Wacek Kusnierczyk wrote:


Stavros Macrakis wrote:

I've read in many places that R semantics are based on Scheme semantics.  As
a long-time Lisp user and implementor, I've tried to make this more precise,
and this is what I've found so far.  I've excluded trivial things that
aren't basic semantic issues: support for arbitrary-precision integers;
subscripting; general style; etc. I would appreciate corrections or
additions from more experienced users of R -- I'm sure that some of the
points below simply reflect my ignorance.

==Similarities to Scheme==

R has first-class function closures. (i.e. correctly supports upward and
downward funarg).

R has a single namespace for functions and variables (Lisp-1).

==Important dissimilarities to Scheme (as opposed to other Lisps)==

R is not properly tail-recursive.

R does not have continuations or call-with-current-continuation or other
mechanisms for implementing coroutines, general iterators, and the like.



there is callCC, for example, which however seems kind of obsolete.


There is nothing obsolete about it.  It supports only downward or
dynamic extent continuations and so is not useful (nor intended) for
the things Stavros mentions.  It is useful for escaping from deeply
nested function calls, for example recursive examination of tree
structures -- that is why it exists.  At some point upward (at least
one-shot) contitnuations may be added as well, but probably not soon.

luke




R supports keyword arguments.

==Similarities to Lisp and other dynamic languages, including Scheme==

R is runtime-typed and garbage-collected.

R supports nested read-eval-print loops for debugging etc.

R expressions are represented as user-manipulable data structures.

==Dissimilarities to all (modern) Lisps, including Scheme==

R has call-by-need, not call-by-object-value.

R does not have macros.

R objects are values, not pointers, so a-1:10; b-a; b[1]-999; a[1] =
999.  Similarly, functions cannot modify the contents of their arguments.



have you actually tried this code?  even if the objects are values not
pointers, assignment causes, in cases such as the above, copying the
value with modifications applied as needed.  thus, a[1] - 1, not 999,
even though after b-a b and a are the same value object.

try the following:

system.time(x-1:(10^8))
system.time(y-x)
system.time(y[1]-0)
system.time(y[2]-0)
head(x)
head(y)


with some trickery, functions can modify the contents of their
arguments, using deparse/substitute and assign:

a - 1
f - function(x) assign(deparse(substitute(x)), 0, parent.frame())
f(a)
a


the 'cannot modify the contents' does not apply to arguments that are
environments:

e - new.env(parent=emptyenv())
l - list()
f - function(e) e$a = 0
f(e)
e$a
f(l)
l$a




There is no equivalent to set-car!/rplaca (not even pairlists and
expressions).  For example, r-pairlist(1,2); r[[1]]-r does not create a
circular list. And in general there doesn't seem to be substructure sharing
at the semantic level (though there may be in the implementation).



computations on environment objects seem not to be subject to the
copy-value-on-assignment semantics:

e - new.env(parent=emptyenv())
ee - e
e$a - 0
ee$a



R does not have multiple value return in the Lisp sense.

R assignment creates a new local variable on first assignment, dynamically.
So static analysis is not enough to determine variable reference (R is not
referentially transparent). Example: ff - function(a){if (a) x-1; x} ;
x-99; ff(T) - 1; ff(F) - 99.

In R, most data types (including numeric vectors) do not have a standard
external representation which can be read back in without evaluation.

R coerces logicals to numbers and numbers to strings. Lisps are stricter
about automatic type conversion -- except that false a.k.a. NIL == () in
Lisps other than Scheme.



types are not treated coherently.  in some situations, r coerces doubles
to complex (according to the hierarchy of types specified here and there
in the man pages), in others it won't:

x - as.double(-1)
y - as.complex(-1)
x == y

sqrt(x)
sqrt(y)

in certain cases, r will also do implicit inverse (downward) coercion:

is(y:y)



vQ

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa  Phone: 319-335-3386
Department of Statistics andFax:   319-335-3017
   Actuarial Science
241 Schaeffer Hall  email:  [EMAIL PROTECTED]
Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do

[R] motif search

2008-12-09 Thread Alessia Deglincerti


Hi,

I am very new to R and wanted to know if there is a package that, given 
very long nucleotide sequences, searches and identifies short (7-10nt) 
motifs..  I would like to look for enrichment of certain motifs in 
genomic sequences.


I tried using MEME (not an R package, I know), but the online version 
only allows sequences up to MAX 6 nucleotides, and that's too short 
for my needs..


Thanks

A

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Data Analysis Functions in R

2008-12-09 Thread Dirk Eddelbuettel

On Mon, Dec 08, 2008 at 09:34:35PM -0800, Feanor22 wrote:
 
 Hi experts of R,
 
 Are there any functions in R to test a univariate series for long memory
 effects, structural breaks and time reversability?
 I've found for ARCH effects(ArchTest), for normal (Shapiro.test,
 KS.test(comparing with randn) and lillie.test) but not for the above
 mentioned.
 Where can I find a comprehensive list of functions available by type?

Please try the CRAN Task views for EmpiricalFinance, Econometrics and 
TimeSeries.

Dirk

-- 
Three out of two people have difficulties with fractions.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Pre-model Variable Reduction

2008-12-09 Thread Mark Difford


Hi All,

I beg to differ with Ravi Varadhan's perspective. While it is true that
principal component analysis does not itself do variable selection, it is an
important method for pointing the way to what to select. This is what the
methods in the subselect package rely on. (One of its authors was I believe
a student of Jolliffe's). For a modern perspective on this, see the
following paper:

Debashis Paul, Eric Bair, Trevor Hastie and Robert Tibshirani:
Preconditioning for feature selection and regression in high-dimensional
problems We show that supervised principal components followed by a variable
selection procedure is an effective approach for variable selection in very
high dimension. Annals of Statistics 36(4), 2008, 1595-1618.

http://www-stat.stanford.edu/~hastie/Papers/Preconditioning_Annals.pdf

Regards, Mark.


Ravi Varadhan wrote:
 
 Principal components analysis does dimensionality reduction but NOT
 variable reduction.  However, Jolliffe's 2004 book on PCA does discuss
 the
 problem of selecting a subset of variables, with the goal of representing
 the internal variation of original multivariate vector as well as possible
 (see Section 6.3 of that book).  I do not think that these methods can
 handle missing data.  The most important issue is to think about the goal
 of
 variable reduction and then choose an appropriate optimality criterion for
 achieving that goal.  In most instances of variable selection, the
 criterion
 that is optimized is never explicitly considered.
 
 Ravi.
 
 
 ---
 
 Ravi Varadhan, Ph.D.
 
 Assistant Professor, The Center on Aging and Health
 
 Division of Geriatric Medicine and Gerontology 
 
 Johns Hopkins University
 
 Ph: (410) 502-2619
 
 Fax: (410) 614-9625
 
 Email: [EMAIL PROTECTED]
 
 Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
 
  
 
 
 
 
 
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 On
 Behalf Of Gabor Grothendieck
 Sent: Tuesday, December 09, 2008 8:00 AM
 To: Harsh
 Cc: r-help@r-project.org
 Subject: Re: [R] Pre-model Variable Reduction
 
 See:
 
 ?prcomp
 ?princomp
 
 On Tue, Dec 9, 2008 at 5:34 AM, Harsh [EMAIL PROTECTED] wrote:
 Hello All,
 I am trying to carry out variable reduction. I do not have information 
 about the dependent variable, and have only the X variables as it 
 were.
 In selecting variables I wish to keep, I have considered the following
 criteria.
 1) Percentage of missing value in each column/variable
 2) Variance of each variable, with a cut-off value.

 I recently came across Weka and found that there is an RWeka package 
 which would allow me to make use of Weka through R.
 Weka provides a Genetic search variable reduction method, but I 
 could not find its R code implementation in the RWeka Pdf file on 
 CRAN.

 I looked for other R packages that allow me to do variable reduction 
 without considering a dependent variable. I came across 'dprep'
 package but it does not have a Windows implementation.

 Moreover, I have a dataset that contains continuous and categorical 
 variables, some categorical variables having 3 levels, 10 levels and 
 so on, till a max 50 levels (E.g. States in the USA).

 Any suggestions in this regard will be much appreciated.

 Thank you

 Harsh Singhal
 Decision Systems,
 Mu Sigma, Inc.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/Pre-model-Variable-Reduction-tp20912229p20916445.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Sorry, I have attached the data - Here is the code that causes wavCWTPeaks error

2008-12-09 Thread stephen sefick

I'm stumped.

On Tue, Dec 9, 2008 at 7:53 AM,  [EMAIL PROTECTED] wrote:

 -Messaggio originale-
 Da: [EMAIL PROTECTED]
 Inviato: mar 09/12/2008 13.52
 A: stephen sefick; Francesco Masulli; Stefano Rovetta
 Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED];
 [EMAIL PROTECTED]
 Oggetto: Here is the code that causes wavCWTPeaks error

 aats - create.signalSeries(aa, pos=list(from=0.0, by=0.033))
 aa.cwt -  wavCWT(aats)
 x11 (width=10,height=12)
 plot (aats,main=paste(insig, Cycle: ,j,sep=))
 aa.maxtree - wavCWTTree (aa.cwt, type=maxima)
 aa.mintree - wavCWTTree (aa.cwt, type=minima)
 aa.maxpeak - wavCWTPeaks (aa.maxtree)
 aa.minpeak - wavCWTPeaks (aa.mintree)
   Error in `row.names-.data.frame`(`*tmp*`, value = c(1, 0)) :
   invalid 'row.names' length
 bb - -aa   #EXCHANGE MAXIMA WITH MINIMA
 bbts - create.signalSeries(bb, pos=list(from=0.0, by=0.033))
 plot (bbts,main=paste(insig, Cycle: ,j,sep=))
 plot (bbts,main=paste(insig, Cycle: ,j,sep=))
 bb.cwt -  wavCWT(bbts)
 bb.maxtree - wavCWTTree (bb.cwt, type=maxima)
 bb.mintree - wavCWTTree (bb.cwt, type=minima)
 bb.maxpeak - wavCWTPeaks (bb.maxtree)
   Error in `row.names-.data.frame`(`*tmp*`, value = c(1, 0)) :
   invalid 'row.names' length
 bb.minpeak - wavCWTPeaks (bb.mintree)
   Error in `row.names-.data.frame`(`*tmp*`, value = c(1, 0)) :
   invalid 'row.names' length

 Attached are aa breathing cycle amplitude values (zip compressed)
 I'd like to figure out what is wrong with my data.
 wmTSA documentation does not mention any time series constraint.

 Thank you in advance.
 Kind regards,
 Maura

 -Messaggio originale-
 Da: stephen sefick [mailto:[EMAIL PROTECTED]
 Inviato: mar 09/12/2008 8.11
 A: [EMAIL PROTECTED]
 Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED];
 [EMAIL PROTECTED]
 Oggetto: Re: [R] package wmtsa: wavCWTPeaks error

 Are the names of the rows the same as the time series that you are
 using?  I know that I am not being that helpful, but this seems like a
 mismatch in the time series object.  look at
 length(rowname(your.data))
 length(your.data[,1])

 again it is always helpful to have reproducible code.

 On Tue, Dec 9, 2008 at 1:39 AM,  [EMAIL PROTECTED] wrote:
 I keep getting the following error when I look for minima in the series:

 aa.peak - wavCWTPeaks (aa.tree)
 Error in `row.names-.data.frame`(`*tmp*`, value = c(1, 0)) :
  invalid 'row.names' length

 How can I work it around ?

 Thank you.

 Regards,
 Maura


 Alice Messenger ;-) chatti anche con gli amici di Windows Live Messenger e
 tutti i telefonini TIM!
 Vai su http://maileservizi.alice.it/alice_messenger/index.html?pmk=footer

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





 --
 Stephen Sefick

 Let's not spend our time and resources thinking about things that are
 so little or so large that all they really do for us is puff us up and
 make us feel like gods.  We are mammals, and have not exhausted the
 annoying little problems of being mammals.

 -K. Mullis




 Alice Messenger ;-) chatti anche con gli amici di Windows Live Messenger e
 tutti i telefonini TIM!
 Vai su http://maileservizi.alice.it/alice_messenger/index.html?pmk=footer



 Alice Messenger ;-) chatti anche con gli amici di Windows Live Messenger e
 tutti i telefonini TIM!
 Vai su http://maileservizi.alice.it/alice_messenger/index.html?pmk=footer



-- 
Stephen Sefick

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Selecting rows that are the same in separate data frames

2008-12-09 Thread bartjoosen


I'm not sure what you want, but take a look at ?merge and %in%


ppaarrkk wrote:
 
 I want to compare two matrices or data frames and select or get an index
 for those rows which are the same in both. I have tried the following :
 
 
 
 
 
 
 a = matrix ( 1:10, ncol = 2 )
 a
 
 b = matrix ( c ( 2,3,4,7,8,9 ), ncol = 2 )
 b
 
 a[a==b]
 
 
 
 
 
 
 a = as.data.frame ( matrix ( 1:10, ncol = 2 ) )
 a
 
 b = as.data.frame ( matrix ( c ( 2,3,4,7,8,9 ), ncol = 2 ) )
 b
 
 a[a==b]
 
 
 
 
 
 
 
 
 Any ideas please.
 
 
 Thanks.
 
 
 Simon Parker
 Imperial College
 
 

-- 
View this message in context: 
http://www.nabble.com/Selecting-rows-that-are-the-same-in-separate-data-frames-tp20916243p20916727.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R: Sorry, I have attached the data - Here is the code that causes wavCWTPeaks error

2008-12-09 Thread mauede

Maxima are found if I prolong the 1-cycle time series on both sides, appaently 
no matter which values I use.
The following sequence works fine:

 aal  - rep (aa[1], length(aa))
 aar  - rep (aa[length(aa)], length(aa))  
 aa3 - c(aal,aa,aar)   
#APPEND CYCLE TO ITSELF TWICE 
 aa3ts - create.signalSeries(aa3, pos=list(from=0.0, by=0.033))  #CONVERT 
AMPLITUDE INTO TIME-SERIES 
 plot (aa3ts, main=paste(insig, Cycle: ,j,sep=))
 aa3.cwt -  wavCWT(aa3ts) 
#CWT
 aa3.maxtree - wavCWTTree (aa3.cwt, type=maxima)# GENERATE CWT 
MAXIMA TREE
 aa3.maxpeak - wavCWTPeaks (aa3.maxtree)#GET MAXIMUM 
PEAK

Since the minima of  aa are the maxima of -aa I reverse the sign and then 
repeat the above procedure.
The following sequence works fine:

## EXCHANGE MAXIMA WITH MINIMA
  bb  -  -aa   
  #REVERSE SIGN
  bbl  - rep (bb[1], length(bb))\
  bbr  - rep (bb[length(bb)], length(bb))  
  bb3 - c(bbl,bb,bbr)  
#APPEND CYCLE TO ITSELF TWICE 
  bb3ts - create.signalSeries(bb3, pos=list(from=0.0, by=0.033))  #CONVERT 
AMPLITUDE INTO TIME-SERIES 
  plot (bb3ts, main=paste(insig, Cycle: ,j,sep=))
  bb3.cwt -  wavCWT(bb3ts)
#CWT
  bb3.maxtree - wavCWTTree (bb3.cwt, type=maxima)   #GENERATE CWT MAXIMA 
TREE
  bb3.maxpeak - wavCWTPeaks (bb3.maxtree)#GET MAXIMUM 
PEAKS 

It looks like wmTSA functions work like the Moving Average algorithm  which 
uses a portion of the time series 
for kind of self-training 

Best regards,
Maura 

-Messaggio originale-
Da: stephen sefick [mailto:[EMAIL PROTECTED]
Inviato: mar 09/12/2008 16.15
A: [EMAIL PROTECTED]
Cc: Francesco Masulli; Stefano Rovetta; [EMAIL PROTECTED]; [EMAIL PROTECTED]; 
[EMAIL PROTECTED]
Oggetto: Re: Sorry, I have attached the data - Here is the code that causes 
wavCWTPeaks error
 
I'm stumped.

On Tue, Dec 9, 2008 at 7:53 AM,  [EMAIL PROTECTED] wrote:

 -Messaggio originale-
 Da: [EMAIL PROTECTED]
 Inviato: mar 09/12/2008 13.52
 A: stephen sefick; Francesco Masulli; Stefano Rovetta
 Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED];
 [EMAIL PROTECTED]
 Oggetto: Here is the code that causes wavCWTPeaks error

 aats - create.signalSeries(aa, pos=list(from=0.0, by=0.033))
 aa.cwt -  wavCWT(aats)
 x11 (width=10,height=12)
 plot (aats,main=paste(insig, Cycle: ,j,sep=))
 aa.maxtree - wavCWTTree (aa.cwt, type=maxima)
 aa.mintree - wavCWTTree (aa.cwt, type=minima)
 aa.maxpeak - wavCWTPeaks (aa.maxtree)
 aa.minpeak - wavCWTPeaks (aa.mintree)
   Error in `row.names-.data.frame`(`*tmp*`, value = c(1, 0)) :
   invalid 'row.names' length
 bb - -aa   #EXCHANGE MAXIMA WITH MINIMA
 bbts - create.signalSeries(bb, pos=list(from=0.0, by=0.033))
 plot (bbts,main=paste(insig, Cycle: ,j,sep=))
 plot (bbts,main=paste(insig, Cycle: ,j,sep=))
 bb.cwt -  wavCWT(bbts)
 bb.maxtree - wavCWTTree (bb.cwt, type=maxima)
 bb.mintree - wavCWTTree (bb.cwt, type=minima)
 bb.maxpeak - wavCWTPeaks (bb.maxtree)
   Error in `row.names-.data.frame`(`*tmp*`, value = c(1, 0)) :
   invalid 'row.names' length
 bb.minpeak - wavCWTPeaks (bb.mintree)
   Error in `row.names-.data.frame`(`*tmp*`, value = c(1, 0)) :
   invalid 'row.names' length

 Attached are aa breathing cycle amplitude values (zip compressed)
 I'd like to figure out what is wrong with my data.
 wmTSA documentation does not mention any time series constraint.

 Thank you in advance.
 Kind regards,
 Maura

 -Messaggio originale-
 Da: stephen sefick [mailto:[EMAIL PROTECTED]
 Inviato: mar 09/12/2008 8.11
 A: [EMAIL PROTECTED]
 Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED];
 [EMAIL PROTECTED]
 Oggetto: Re: [R] package wmtsa: wavCWTPeaks error

 Are the names of the rows the same as the time series that you are
 using?  I know that I am not being that helpful, but this seems like a
 mismatch in the time series object.  look at
 length(rowname(your.data))
 length(your.data[,1])

 again it is always helpful to have reproducible code.

 On Tue, Dec 9, 2008 at 1:39 AM,  [EMAIL PROTECTED] wrote:
 I keep getting the following error when I look for minima in the series:

 aa.peak - wavCWTPeaks (aa.tree)
 Error in `row.names-.data.frame`(`*tmp*`, value = c(1, 0)) :
  invalid 'row.names' length

 How can I work it around ?

 Thank you.

 Regards,
 Maura


 Alice Messenger ;-) chatti anche con gli amici di Windows Live Messenger e
 tutti i telefonini TIM!
 Vai su http://maileservizi.alice.it/alice_messenger/index.html?pmk=footer

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide

[R] problem with Vista

2008-12-09 Thread vincenzo landi

Hello,
I want to import a txt table into R but the software give me this message. I 
have windows vista.


Errore in file(file, r) : cannot open this connection
Besides: Warning message:
In file(file, r) :
Â cannot open file 'C:/Users/Vincenzo/Desktop/prova/prova.txt': No such file or 
directory
thank you


** 
Vincenzo Landi 
Post Doctorate student
Animal genomic and breeding
 cell:0039/339538871@ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½Xï¿½ï¿½ï¿½ï¿½@
ï¿½ï¿½ï¿½ï¿½le=font-family: comic sans ms;Fax. 075-5857122 



  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Logical inconsistency

2008-12-09 Thread emma jane

Many thanks for your help, perhaps I should have set my query in context  !

I'm simply calculating an indicator variable [0,1] based on the whether the 
difference between two measured variables is  1 or =1.

I understand the FAQ about floating point arithmetic, but am still puzzled that 
it only apparently applies to certain elements, as follows:

8.8 - 7.8  1
 TRUE

8.3 - 7.3  1
 TRUE
Â 
However,
Â 
10.2 - 9.2  1
FALSE
Â 
11.3 - 10.31
Â FALSE

Emma Jane





From: Bernardo Rangel Tura [EMAIL PROTECTED]
To: Wacek Kusnierczyk [EMAIL PROTECTED]
Cc: R help [EMAIL PROTECTED]
Sent: Saturday, 6 December, 2008 10:00:48
Subject: Re: [R] Logical inconsistency

On Fri, 2008-12-05 at 14:18 +0100, Wacek Kusnierczyk wrote:
 Berwin A Turlach wrote:
  Dear Emma,
 
  On Fri, 5 Dec 2008 04:23:53 -0800 (PST)

 Â  
 Â  
  Please could someone kindly explain the following inconsistencies
  I've discovered__when performing logical calculations in R:
 
  8.8 - 7.8  1
 Â  Â  
  TRUE
 Â  Â  Â  
  8.3 - 7.3  1
 Â  Â  
  TRUE
 Â  Â  Â  
 
  Gladly:Â  FAQ 7.31
  http://cran.at.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
 
 Â  
 
 well, this answer the question only partially.Â  this explains why a
 system with finite precision arithmetic, such as r, will fail to be
 logically correct in certain cases.Â  it does not explain why r, a
 language said to isolate a user from the underlying implementational
 choices, would have to fail this way. 
 
 there is, in principle, no problem in having a high-level language
 perform the computation in a logically consistent way.Â  for example, bc
 is an arbitrary precision calculator language, and has no problem with
 examples as the above:
 
 bc  8.8 - 7.8  1
 # 0, meaning 'no'
 
 bc  8.3 - 7.3  1
 # 0, meaning 'no'
 
 bc  8.8 - 7.8 == 1
 # 1, meaning 'yes'
 
 
 the fact that r (and many others, including matlab and sage, perhaps not
 mathematica) does not perform logically here is a consequence of its
 implementation of floating point arithmetic. 
 
 the faq you were pointed to, and its referring to the goldberg's
 article, show that r does not successfully isolate a user from details
 of the lower-level implementation.
 
 vQ

Well, first of all for 8.-7.3 is not equal to 1 [for computers]

 8.3-7.3-1
[1] 8.881784e-16

But if you use only one digit precision 

 round(8.3-7.3,1)-1
[1] 0
 round(8.3-7.3,1)-10
[1] FALSE
 round(8.3-7.3,1)==1
[1] TRUE


So the problem is the code write and no the software

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ANCOVA

ANCOVA usually is used when you have a numerical covariate that you want to 
adjust for, your description does not include any, so a regular anova (aov 
function) can be used.  If you do have 1 or more covariates to adjust for, then 
it can be thought of as a regression problem (ANCOVA is a special case of 
linear models/regression) and you can fit it using either the aov or lm 
function.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
801.408.8111


 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 project.org] On Behalf Of Samuel Okoye
 Sent: Tuesday, December 09, 2008 2:03 AM
 To: [EMAIL PROTECTED]
 Subject: [R] ANCOVA


 Hello,

 Could you please help me in the following question:
 I have 16 persons 6 take 0.5 mg, 6 take 0.75 mg and 4 take placebo! Can
 I use the ANCOVA and t-test in this case? Is it possible in R?

 Thank you in advance,
 Samuel



 [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to add accuracy, sensitivity, specificity to logistic regression output?

2008-12-09 Thread Jorge Ivan Velez

Dear Pufft,
You might interested in the lroc function in the epicalc [1] package and the
ROCR [2] package
itself.

[1] http://cran.r-project.org/web/packages/epicalc/index.html
[2] http://cran.r-project.org/web/packages/ROCR/index.html


HTH,

Jorge



On Mon, Dec 8, 2008 at 11:39 PM, pufftissue pufftissue [EMAIL PROTECTED]
 wrote:

 Hi,

 Is there a way when doing logistic regression for the output to spit out
 accuracy, sensitivity, and specificity?

 I would like to know these basic measures for my model.

 Thanks!

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Selecting rows that are the same in separate data frames

2008-12-09 Thread Jorge Ivan Velez

Dear Simon,

Try this:

# Index -- FALSE, TRUE
sapply(1:nrow(a),function(x) all(a[x,]%in%b))

#  Rows of a that are in b
which(sapply(1:nrow(a),function(x) all(a[x,]%in%b)))

# Reporting
a[sapply(1:nrow(a),function(x) all(a[x,]%in%b)),]


HTH,

Jorge


On Tue, Dec 9, 2008 at 9:57 AM, ppaarrkk [EMAIL PROTECTED] wrote:


 I want to compare two matrices or data frames and select or get an index
 for
 those rows which are the same in both. I have tried the following :






 a = matrix ( 1:10, ncol = 2 )
 a

 b = matrix ( c ( 2,3,4,7,8,9 ), ncol = 2 )
 b

 a[a==b]






 a = as.data.frame ( matrix ( 1:10, ncol = 2 ) )
 a

 b = as.data.frame ( matrix ( c ( 2,3,4,7,8,9 ), ncol = 2 ) )
 b

 a[a==b]








 Any ideas please.


 Thanks.


 Simon Parker
 Imperial College

 --
 View this message in context:
 http://www.nabble.com/Selecting-rows-that-are-the-same-in-separate-data-frames-tp20916243p20916243.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Bootstrap a GLM model with Poisson family

2008-12-09 Thread Joana Vicente


Hello!
Anyone can help me to know how to do a bootstrap evaluation analysis  
to a GLM with family=Poisson?

I have some R codes but they were coded only for family=binomial...

thanks a lot
Joana Vicente .·.

CIBIO - Centro de Investigação em Biodiversidade e Recursos Genéticos.
Faculdade de Ciências do Porto
Departamento de Botânica -Edifício FC4
Rua Do Campo Alegre, S/N
4169-007 Porto - Portugal

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Selecting rows that are the same in separate data frames

2008-12-09 Thread ppaarrkk


Thanks for reply.

What I want is the equivalent of this :

xxx = 1:10
which(xxx %in% c(2,5))


...but where there is more than one criterion for matching.


which (b %in% a) in the code I included does nothing (not surprisingly).

I'm not sure that I can use merge, because I want the whole of a, but to
mark those rows which are also in b. If I do merge ( a,b ),  I just get b.
If I do merge ( a,b, all.x =TRUE), I get a.




bartjoosen wrote:
 
 I'm not sure what you want, but take a look at ?merge and %in%
 
 
 ppaarrkk wrote:
 
 I want to compare two matrices or data frames and select or get an index
 for those rows which are the same in both. I have tried the following :
 
 
 
 
 
 
 a = matrix ( 1:10, ncol = 2 )
 a
 
 b = matrix ( c ( 2,3,4,7,8,9 ), ncol = 2 )
 b
 
 a[a==b]
 
 
 
 
 
 
 a = as.data.frame ( matrix ( 1:10, ncol = 2 ) )
 a
 
 b = as.data.frame ( matrix ( c ( 2,3,4,7,8,9 ), ncol = 2 ) )
 b
 
 a[a==b]
 
 
 
 
 
 
 
 
 Any ideas please.
 
 
 Thanks.
 
 
 Simon Parker
 Imperial College
 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Selecting-rows-that-are-the-same-in-separate-data-frames-tp20916243p20917838.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] creating standard curves for ELISA analysis

2008-12-09 Thread 1Rnwb


Hello R guru's

I am a newbie to R, In my research work I usually generate a lot of ELISA
data in form of absorbance values. I ususally use Excel to calculate the
concentrations of unknown, but it is too tedious and manual especially when
I have 100's of files to process. I would appreciate some help  in creating
a R script to do this with minimal manual input. s A1-G1 and A2-G2 are
standards serially diluted H1 and H2 are Blanks. A3 to H12 are serum
samples. I am pasting the structure of my data below:



A1  14821
B1  11577
C1  5781
D1  2580
E1  902
F1  264
G1  98
H1  4
A2  14569.5
B2  11060
C2  5612
D2  2535
E2  872
F2  285
G2  85
H2  3
A3  1016
B3  2951.5
C3  547
D3  1145
E3  4393
F3  4694
G3  1126
H3  1278
A4  974.5
B4  3112.5
C4  696.5
D4  2664.5
E4  184.5
F4  1908
G4  108.5
H4  1511
A5  463.5
B5  1365
C5  816
D5  806
E5  1341
F5  1157
G5  542.5
H5  749

-- 
View this message in context: 
http://www.nabble.com/creating-standard-curves-for-ELISA-analysis-tp20917182p20917182.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Logical inconsistency

Some (possibly all) of those numbers cannot be represented exactly, so there is 
a chance of round off error whenever you do some arithmetic, sometimes the 
errors cancel out, sometimes they don't.  Consider:

 print(8.3-7.3, digits=20)
[1] 1.001
 print(11.3-10.3, digits=20)
[1] 1

So in the first case the rounding error gives a value that is slightly greater 
than 1, so the greater than test returns true (if you round the result before 
comparing to 1, then it will return false).  In the second case the 
uncertainties cancelled out so that you get exactly 1 which is not greater than 
1 an so the comparison returns false.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
801.408.8111


 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 project.org] On Behalf Of emma jane
 Sent: Tuesday, December 09, 2008 7:02 AM
 To: Bernardo Rangel Tura; Wacek Kusnierczyk; Chuck Cleland
 Cc: R help
 Subject: Re: [R] Logical inconsistency

 Many thanks for your help, perhaps I should have set my query in
 context  !

 I'm simply calculating an indicator variable [0,1] based on the whether
 the difference between two measured variables is  1 or =1.

 I understand the FAQ about floating point arithmetic, but am still
 puzzled that it only apparently applies to certain elements, as
 follows:

 8.8 - 7.8  1
  TRUE

 8.3 - 7.3  1
  TRUE

 However,

 10.2 - 9.2  1
 FALSE

 11.3 - 10.31
  FALSE

 Emma Jane




 
 From: Bernardo Rangel Tura [EMAIL PROTECTED]
 To: Wacek Kusnierczyk [EMAIL PROTECTED]
 Cc: R help [EMAIL PROTECTED]
 Sent: Saturday, 6 December, 2008 10:00:48
 Subject: Re: [R] Logical inconsistency

 On Fri, 2008-12-05 at 14:18 +0100, Wacek Kusnierczyk wrote:
  Berwin A Turlach wrote:
   Dear Emma,
  
   On Fri, 5 Dec 2008 04:23:53 -0800 (PST)

  
  
   Please could someone kindly explain the following inconsistencies
   I've discovered__when performing logical calculations in R:
  
   8.8 - 7.8  1
  
   TRUE
  
   8.3 - 7.3  1
  
   TRUE
  
  
   Gladly:  FAQ 7.31
   http://cran.at.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-
 th
   ink-these-numbers-are-equal_003f
  
  
 
  well, this answer the question only partially.  this explains why a
  system with finite precision arithmetic, such as r, will fail to be
  logically correct in certain cases.  it does not explain why r, a
  language said to isolate a user from the underlying implementational
  choices, would have to fail this way.
 
  there is, in principle, no problem in having a high-level language
  perform the computation in a logically consistent way.  for example,
  bc is an arbitrary precision calculator language, and has no
 problem
  with examples as the above:
 
  bc  8.8 - 7.8  1
  # 0, meaning 'no'
 
  bc  8.3 - 7.3  1
  # 0, meaning 'no'
 
  bc  8.8 - 7.8 == 1
  # 1, meaning 'yes'
 
 
  the fact that r (and many others, including matlab and sage, perhaps
  not
  mathematica) does not perform logically here is a consequence of its
  implementation of floating point arithmetic.
 
  the faq you were pointed to, and its referring to the goldberg's
  article, show that r does not successfully isolate a user from
 details
  of the lower-level implementation.
 
  vQ

 Well, first of all for 8.-7.3 is not equal to 1 [for computers]

  8.3-7.3-1
 [1] 8.881784e-16

 But if you use only one digit precision

  round(8.3-7.3,1)-1
 [1] 0
  round(8.3-7.3,1)-10
 [1] FALSE
  round(8.3-7.3,1)==1
 [1] TRUE


 So the problem is the code write and no the software

 --
 Bernardo Rangel Tura, M.D,MPH,Ph.D
 National Institute of Cardiology
 Brazil



 [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] motif search

2008-12-09 Thread anna freni sterrantino

Hi Alessia,
you may want to post this kind of question on Bioc mailing list, 
is more appropriate.


 http://www.bioconductor.org/docs/mailList.html
About your question ,
I'm not 100% sure, but check  if Biostrings  pkg
can do what you need to do.

http://www.bioconductor.org/workshops/2008/SeattleNov08/MatchAlign/

Best Regards

Anna


Anna Freni Sterrantino
Ph.D Student 
Department of Statistics
University of Bologna, Italy
via Belle Arti 41, 40124 BO.





Da: Alessia Deglincerti [EMAIL PROTECTED]
A: r-help@r-project.org
Inviato: MartedÃ¬ 9 dicembre 2008, 16:03:55
Oggetto: [R] motif search

Hi,

I am very new to R and wanted to know if there is a package that, given very 
long nucleotide sequences, searches and identifies short (7-10nt) motifs..  I 
would like to look for enrichment of certain motifs in genomic sequences.

I tried using MEME (not an R package, I know), but the online version only 
allows sequences up to MAX 6 nucleotides, and that's too short for my 
needs..

Thanks

A

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] [R-pkgs] SFA tools moved from micEcon to frontier

2008-12-09 Thread Arne Henningsen

Dear R users,

I would like to inform you that everything of the micEcon package that is 
related to Stochastic Frontier Analysis (SFA) has been moved to the frontier 
package, because this is a more appropriate place for the functions 
front41WriteInput, front41ReadOutput, and front41Est, and the 
corresponding (S3) methods. The data sets riceProdPhil and Coelli have 
been removed from the micEcon package, because they were already included in 
the frontier package (the latter is named front41Data in the frontier 
package). The new frontier package (with the front41... functions) is 
available on CRAN now and the new micEcon package (without the front41... 
functions) will be available on CRAN in some time. Please use the meantime to 
update your code. 

Please don't hesitate to contact me if you have any questions 
or comments,
Arne

-- 
Arne Henningsen
http://www.arne-henningsen.name/

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Replacing tabs with appropriate number of spaces

2008-12-09 Thread Dennis Fisher


Colleagues,

Platform: OS X (but issue applies to all platforms)
Version: 2.8.0

I have a mixture of text and data that I am outputting via R to a pdf  
document (using a fixed-width font).  The text contains tabs that  
align columns properly with a fixed-width font in a terminal window.   
However, when the PDF document is created, the concept of a tab is not  
invoked properly and columns do not align.


I could use brute force as follows:
1.  identify lines of text containing tabs
2.  strsplit on tabs
3.  count characters preceding the tab, then replace the tab with the  
appropriate number of spaces (e.g., if the string preceding the tab  
has 29 characters, add 3 spaces), then paste(..., sep=)


However, I am sure a more elegant approach exists.  Can anyone offer  
one?


Dennis


Dennis Fisher MD
P  (The P Less Than Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-415-564-2220
www.PLessThan.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Applying min to numeric vectors

2008-12-09 Thread Brigid Mooney

I was surprised this morning, that it seems as though the min() function
does not work as *I* anticipated when given vector arguments.

For example:
a - 1:10
b - c(rep(1, times=5), rep(10, times=5))

Result:
 min(a,b)
 1

What I actually wanted was a term by term minimum, i.e.:
ifelse(a=b, a, b)
 1  1  1  1  1  6  7  8  9 10
Am I losing much in terms of computation power if I use the ifelse?

I'm a little worried, because in implementation my vectors are quite long,
and I will be computing the min of many of them, min(a,b,c,d,e,f)  where a
through f are all vectors of the same length.

Any insight that can be provided is much appreciated.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Applying min to numeric vectors

Try this:

pmin(a, b)

On Tue, Dec 9, 2008 at 2:58 PM, Brigid Mooney [EMAIL PROTECTED] wrote:

 I was surprised this morning, that it seems as though the min() function
 does not work as *I* anticipated when given vector arguments.

 For example:
 a - 1:10
 b - c(rep(1, times=5), rep(10, times=5))

 Result:
  min(a,b)
  1

 What I actually wanted was a term by term minimum, i.e.:
 ifelse(a=b, a, b)
  1  1  1  1  1  6  7  8  9 10
 Am I losing much in terms of computation power if I use the ifelse?

 I'm a little worried, because in implementation my vectors are quite long,
 and I will be computing the min of many of them, min(a,b,c,d,e,f)  where a
 through f are all vectors of the same length.

 Any insight that can be provided is much appreciated.

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Bootstrap a GLM model with Poisson family

2008-12-09 Thread Daniel Malter

Hi, if you already have code for the binomial, can you not just replace
family=binomial by family=Poisson in the code? 

Cheers,
Daniel


-
cuncta stricte discussurus
-

-Ursprüngliche Nachricht-
Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im
Auftrag von Joana Vicente
Gesendet: Tuesday, December 09, 2008 11:03 AM
An: r-help@r-project.org
Betreff: [R] Bootstrap a GLM model with Poisson family

Hello!
Anyone can help me to know how to do a bootstrap evaluation analysis to a
GLM with family=Poisson?
I have some R codes but they were coded only for family=binomial...

thanks a lot
Joana Vicente .·.

CIBIO - Centro de Investigação em Biodiversidade e Recursos Genéticos.
Faculdade de Ciências do Porto
Departamento de Botânica -Edifício FC4
Rua Do Campo Alegre, S/N
4169-007 Porto - Portugal

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] ARMA

2008-12-09 Thread Raphael Saldanha

Hi!

Is there any package or function on R to ARMA models (Box  Jenkins, without
sazonality and trend) with resources to automatic identification for p and q
?

Regards,

Raphael Saldanha
Brazil

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Replacing tabs with appropriate number of spaces

2008-12-09 Thread hadley wickham

On Tue, Dec 9, 2008 at 10:39 AM, Dennis Fisher [EMAIL PROTECTED] wrote:
 Colleagues,

 Platform: OS X (but issue applies to all platforms)
 Version: 2.8.0

 I have a mixture of text and data that I am outputting via R to a pdf
 document (using a fixed-width font).  The text contains tabs that align
 columns properly with a fixed-width font in a terminal window.  However,
 when the PDF document is created, the concept of a tab is not invoked
 properly and columns do not align.

 I could use brute force as follows:
 1.  identify lines of text containing tabs
 2.  strsplit on tabs
 3.  count characters preceding the tab, then replace the tab with the
 appropriate number of spaces (e.g., if the string preceding the tab has 29
 characters, add 3 spaces), then paste(..., sep=)

Tabs usually just expand to a fixed number of spaces - so gsub(\t,
  , text) would do the job.

Hadley

-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to get Greenhouse-Geisser epsilons from anova?

2008-12-09 Thread Skotara


Dear John and Peter,

thank you both very much for your help!
Everything works fine now!

John, Anova also works very fine. Thank you very much!
However, if I had more than 2 levels for the between factor the same 
thing as mentioned occured.
The degrees of freedom showed that Anova calculated it as if all 
subjects came from the same group, for example for main effect A the dfs 
are 1 and 35.

Since I can get those values using anova that causes no problem.

I saw that the x$G to get the greenhouse-geisser epsilon do work for:
x- anova(mlmfitD, X=~C+B, M=~A+C+B, test = Spherical)
but does not work for y$G:
y - anova(mlmfit, mlmfit0, X= ~C+B, M = ~A+C+B, idata = 
dd,test=Spherical)


Finally, the Greenhouse-Geisser epsilons are identical using both 
methods and to the SPSS output.

The Huynh-Feldt are not the same as them of SPSS. I will use GG instead.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Pre-model Variable Reduction

2008-12-09 Thread Frank E Harrell Jr


Mark Difford wrote:

Hi All,

I beg to differ with Ravi Varadhan's perspective. While it is true that
principal component analysis does not itself do variable selection, it is an
important method for pointing the way to what to select. This is what the
methods in the subselect package rely on. (One of its authors was I believe
a student of Jolliffe's). For a modern perspective on this, see the
following paper:

Debashis Paul, Eric Bair, Trevor Hastie and Robert Tibshirani:
Preconditioning for feature selection and regression in high-dimensional
problems We show that supervised principal components followed by a variable
selection procedure is an effective approach for variable selection in very
high dimension. Annals of Statistics 36(4), 2008, 1595-1618.

http://www-stat.stanford.edu/~hastie/Papers/Preconditioning_Annals.pdf

Regards, Mark.


Mark,

Slightly more relevant is the unsupervised sparse principal component 
methods described in the following references.  If anyone knows of 
better references for this please let me know.  -Frank



@Article{zou06spa,
  author =   {Zhou, Hui and Hastie, Trevor and Tibshirani, Robert},
  title ={Sparse principal component analysis},
  journal =  J Comp Graph Stat,
  year = 2006,
  volume =   15,
  pages ={265-286},
  annote =   {gene microarray;lasso/elastic net;multivariate
analysis;data reduction;singular value
decomposition;thresholding;principal components analysis that shrinks
some loadings to zero}
}
@Article{wit08tes,
  author =   {Witten, Daniela M. and Tibshirani, Robert},
  title = 		 {Testing significance of features by lassoed principal 
components},

  journal =  Annals Appl Stat,
  year = 2008,
  volume =   2,
  number =   3,
  pages ={986-1012},
  annote = 	 {reduction in false discovery rates over using a vector of 
t-statistics;borrowing strength across genes;``one would not expect a 
single gene to be associated with the outcome, since, in practice, many 
genes work together to effect a particular phenotype.  LPC effectively 
down-weights individual genes that are associated with the outcome but 
that do not share an expression pattern with a larger group of genes, 
and instead favors large groups of genes that appear to be 
differentially-expressed.'';regress principal components on outcome}

}




Ravi Varadhan wrote:

Principal components analysis does dimensionality reduction but NOT
variable reduction.  However, Jolliffe's 2004 book on PCA does discuss
the
problem of selecting a subset of variables, with the goal of representing
the internal variation of original multivariate vector as well as possible
(see Section 6.3 of that book).  I do not think that these methods can
handle missing data.  The most important issue is to think about the goal
of
variable reduction and then choose an appropriate optimality criterion for
achieving that goal.  In most instances of variable selection, the
criterion
that is optimized is never explicitly considered.

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 


Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 






-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On
Behalf Of Gabor Grothendieck
Sent: Tuesday, December 09, 2008 8:00 AM
To: Harsh
Cc: r-help@r-project.org
Subject: Re: [R] Pre-model Variable Reduction

See:

?prcomp
?princomp

On Tue, Dec 9, 2008 at 5:34 AM, Harsh [EMAIL PROTECTED] wrote:

Hello All,
I am trying to carry out variable reduction. I do not have information 
about the dependent variable, and have only the X variables as it 
were.

In selecting variables I wish to keep, I have considered the following

criteria.

1) Percentage of missing value in each column/variable
2) Variance of each variable, with a cut-off value.

I recently came across Weka and found that there is an RWeka package 
which would allow me to make use of Weka through R.
Weka provides a Genetic search variable reduction method, but I 
could not find its R code implementation in the RWeka Pdf file on 
CRAN.


I looked for other R packages that allow me to do variable reduction 
without considering a dependent variable. I came across 'dprep'

package but it does not have a Windows implementation.

Moreover, I have a dataset that contains continuous and categorical 
variables, some categorical variables having 3 levels, 10 levels and 
so on, till a max 50 levels (E.g. States in the USA).


Any suggestions in this regard will be much appreciated.

Thank you

Harsh Singhal
Decision Systems,

Re: [R] Applying min to numeric vectors

2008-12-09 Thread Jorge Ivan Velez

Dear Brigid,
Try:

pmin(a,b)


HTH,

Jorge



On Tue, Dec 9, 2008 at 11:58 AM, Brigid Mooney [EMAIL PROTECTED] wrote:

 I was surprised this morning, that it seems as though the min() function
 does not work as *I* anticipated when given vector arguments.

 For example:
 a - 1:10
 b - c(rep(1, times=5), rep(10, times=5))

 Result:
  min(a,b)
  1

 What I actually wanted was a term by term minimum, i.e.:
 ifelse(a=b, a, b)
  1  1  1  1  1  6  7  8  9 10
 Am I losing much in terms of computation power if I use the ifelse?

 I'm a little worried, because in implementation my vectors are quite long,
 and I will be computing the min of many of them, min(a,b,c,d,e,f)  where a
 through f are all vectors of the same length.

 Any insight that can be provided is much appreciated.

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Pre-model Variable Reduction

2008-12-09 Thread Mark Difford


Hi Frank,

 If anyone knows of better references for this please let me know.

Many thanks: I was not aware of the Witten paper. If I turn up anything else
I will be sure to let you know.

Best Regards, Mark.



Frank E Harrell Jr wrote:
 
 Mark Difford wrote:
 Hi All,
 
 I beg to differ with Ravi Varadhan's perspective. While it is true that
 principal component analysis does not itself do variable selection, it is
 an
 important method for pointing the way to what to select. This is what the
 methods in the subselect package rely on. (One of its authors was I
 believe
 a student of Jolliffe's). For a modern perspective on this, see the
 following paper:
 
 Debashis Paul, Eric Bair, Trevor Hastie and Robert Tibshirani:
 Preconditioning for feature selection and regression in
 high-dimensional
 problems We show that supervised principal components followed by a
 variable
 selection procedure is an effective approach for variable selection in
 very
 high dimension. Annals of Statistics 36(4), 2008, 1595-1618.
 
 http://www-stat.stanford.edu/~hastie/Papers/Preconditioning_Annals.pdf
 
 Regards, Mark.
 
 Mark,
 
 Slightly more relevant is the unsupervised sparse principal component 
 methods described in the following references.  If anyone knows of 
 better references for this please let me know.  -Frank
 
 
 @Article{zou06spa,
author ={Zhou, Hui and Hastie, Trevor and Tibshirani, Robert},
title = {Sparse principal component analysis},
journal =   J Comp Graph Stat,
year =  2006,
volume =15,
pages = {265-286},
annote ={gene microarray;lasso/elastic net;multivariate
 analysis;data reduction;singular value
 decomposition;thresholding;principal components analysis that shrinks
 some loadings to zero}
 }
 @Article{wit08tes,
author ={Witten, Daniela M. and Tibshirani, Robert},
title = {Testing significance of features by lassoed principal 
 components},
journal =   Annals Appl Stat,
year =  2008,
volume =2,
number =3,
pages = {986-1012},
annote ={reduction in false discovery rates over using a vector of 
 t-statistics;borrowing strength across genes;``one would not expect a 
 single gene to be associated with the outcome, since, in practice, many 
 genes work together to effect a particular phenotype.  LPC effectively 
 down-weights individual genes that are associated with the outcome but 
 that do not share an expression pattern with a larger group of genes, 
 and instead favors large groups of genes that appear to be 
 differentially-expressed.'';regress principal components on outcome}
 }
 
 
 
 Ravi Varadhan wrote:
 Principal components analysis does dimensionality reduction but NOT
 variable reduction.  However, Jolliffe's 2004 book on PCA does discuss
 the
 problem of selecting a subset of variables, with the goal of
 representing
 the internal variation of original multivariate vector as well as
 possible
 (see Section 6.3 of that book).  I do not think that these methods can
 handle missing data.  The most important issue is to think about the
 goal
 of
 variable reduction and then choose an appropriate optimality criterion
 for
 achieving that goal.  In most instances of variable selection, the
 criterion
 that is optimized is never explicitly considered.

 Ravi.

 
 ---

 Ravi Varadhan, Ph.D.

 Assistant Professor, The Center on Aging and Health

 Division of Geriatric Medicine and Gerontology 

 Johns Hopkins University

 Ph: (410) 502-2619

 Fax: (410) 614-9625

 Email: [EMAIL PROTECTED]

 Webpage: 
 http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

  

 
 


 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 On
 Behalf Of Gabor Grothendieck
 Sent: Tuesday, December 09, 2008 8:00 AM
 To: Harsh
 Cc: r-help@r-project.org
 Subject: Re: [R] Pre-model Variable Reduction

 See:

 ?prcomp
 ?princomp

 On Tue, Dec 9, 2008 at 5:34 AM, Harsh [EMAIL PROTECTED] wrote:
 Hello All,
 I am trying to carry out variable reduction. I do not have information 
 about the dependent variable, and have only the X variables as it 
 were.
 In selecting variables I wish to keep, I have considered the following
 criteria.
 1) Percentage of missing value in each column/variable
 2) Variance of each variable, with a cut-off value.

 I recently came across Weka and found that there is an RWeka package 
 which would allow me to make use of Weka through R.
 Weka provides a Genetic search variable reduction method, but I 
 could not find its R code implementation in the RWeka Pdf file on 
 CRAN.

 I looked for other R packages that allow me to do variable reduction 
 without considering a dependent variable. I came across 'dprep'
 package but it

[R] Was Logical inconsistency - algorithm portability

2008-12-09 Thread John C Nash

The logical inconsistency thread has wandered a bit and discussion now veers towards portability of algorithms. On that subject, I have off-list been sharing ideas about the existence of functions to ensure optimizing compilers do not corrupt logical tests in numerical methods. Apparently the volatile adjective may have a role here (I am generally a visiting programmer in C, doing just enough to get things to work, and cribbing other folks' code). 

One need is for a function that ensures the stored representation of numbers is returned, call it STORED, so that one can test if STORED(x) == STORED(y). Various hacks can do this. One time we used an array of length 2. But hacks are generally ugly and vulnerable to updates in compilers, so a properly written and documented function would be helpful. And, of course, may already exist. 


As this is not a short-term help request but a long term need, I suggest 
contacting me off list about this subject.

Cheers,JN

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Need help optimizing/vectorizing nested loops

2008-12-09 Thread Charles C. Berry


On Tue, 9 Dec 2008, tyler wrote:


Hi,

I'm analyzing a large number of large simulation datasets, and I've
isolated one of the bottlenecks. Any help in speeding it up would be
appreciated.



Cast the neighborhoods as an indicator matrix, then use matrix 
multiplications:



 system.time(tmp - time.test(dat))

   user  system elapsed
   1.180.001.20

system.time({

+ mn - with(dat,outer(1:25,1:25, function(i,j) abs(X[i]-X[j])2  abs(Y[i]-Y[j])2 
 i!=j ))
+ print(all.equal(rowSums(mn%*%as.matrix(dat[,-(1:2)])0),tmp))})
[1] TRUE
   user  system elapsed
  0   0   0


HTH,

Chuck





`dat` is a dataframe of samples from a regular grid. The first two
columns are the spatial coordinates of the samples, the remaining 20
columns are the abundances of species in each cell. I need to calculate
the species richness in adjacent cells for each cell in the sample.
For example, if I have nine cells in my dataframe (X = 1:3, Y = 1:3):

 a b c
 d e f
 g h i

I need to calculate the neighbour-richness for each cell; for a, this is
the richness of cells b, d and e combined. The neighbour richness of
cell e would be the combined richness of all the other eight cells.

The following code does what I what, but it's slow. The sample dataset
'dat', below, represents a 5x5 grid, 25 samples. It takes about 1.5
seconds on my computer. The largest samples I am working with have a 51
x 51 grid (2601 cells) and take 4.5 minutes. This is manageable, but
since I have potentially hundreds of these analyses to run, trimming
that down would be very helpful.

After loading the function and the data, the call

 system.time(tmp - time.test(dat))

Will run the code. Note that I've excised this from a larger, more
general function, after determining that for large datasets this section
is responsible for a slowdown from 10-12 seconds to ca. 250 seconds.

Thanks for your patience,

Tyler


time.test - function(dat) {

 cen - dat
 grps - 5
 n.rich - numeric(grps^2)
 n.ind - 1

 for(i in 1:grps)
   for (j in 1:grps) {
 n.cen - numeric(ncol(cen) - 2)
 neighbours - expand.grid((j-1):(j+1), (i-1):(i+1))
 neighbours - neighbours[-5,]
 neighbours - neighbours[which(neighbours[,1] %in% 1:grps 
neighbours[,2] %in% 1:grps),]

 for (k in 1:nrow(neighbours))
   n.cen - n.cen + cen[cen$X == neighbours[k,1] 
cen$Y == neighbours[k,2], -c(1:2)]

 n.rich[n.ind] - sum(as.logical(n.cen))
 n.ind - n.ind + 1
   }

   return(n.rich)
}

`dat` - structure(list(
 X = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1,
 2, 3, 4, 5), Y = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4,
 4, 4, 4, 4, 5, 5, 5, 5, 5), V1 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 45L, 131L, 0L, 0L, 34L,
 481L, 1744L), V2 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
 0L, 1L, 88L, 0L, 70L, 101L, 13L, 634L, 0L, 0L, 71L, 640L, 1636L), V3
 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 49L, 3L, 113L,
 1L, 44L, 167L, 336L, 933L, 0L, 14L, 388L, 1180L, 1709L), V4 = c(0L,
 0L, 0L, 0L, 0L, 0L, 3L, 12L, 0L, 0L, 2L, 1L, 36L, 45L, 208L, 7L,
 221L, 213L, 371L, 1440L, 26L, 211L, 389L, 1382L, 1614L), V5 = c(96L,
 7L, 0L, 0L, 0L, 10L, 17L, 0L, 5L, 0L, 0L, 11L, 151L, 127L, 160L,
 27L, 388L, 439L, 1117L, 1571L, 81L, 598L, 1107L, 1402L, 891L), V6 =
 c(16L, 30L, 13L, 0L, 0L, 10L, 195L, 60L, 29L, 29L, 1L, 107L, 698L,
 596L, 655L, 227L, 287L, 677L, 1477L, 1336L, 425L, 873L, 961L, 1360L,
 1175L), V7 = c(249L, 101L, 69L, 0L, 18L, 186L, 331L, 291L, 259L,
 248L, 336L, 404L, 642L, 632L, 775L, 455L, 801L, 697L, 1063L, 978L,
 626L, 686L, 1204L, 1138L, 627L), V8 = c(300L, 163L, 65L, 145L, 377L,
 257L, 690L, 655L, 420L, 288L, 346L, 461L, 1276L, 897L, 633L, 812L,
 1018L, 1337L, 1295L, 1163L, 550L, 1104L, 768L, 933L, 433L), V9 =
 c(555L, 478L, 374L, 349L, 357L, 360L, 905L, 954L, 552L, 438L, 703L,
 984L, 1616L, 1732L, 1234L, 1213L, 1518L, 1746L, 1191L, 967L, 1394L,
 1722L, 1706L, 610L, 169L), V10 = c(1527L, 1019L, 926L, 401L, 830L,
 833L, 931L, 816L, 1126L, 1232L, 1067L, 1169L, 1270L, 1277L, 1145L,
 1159L, 1072L, 1534L, 997L, 391L, 1328L, 1414L, 1037L, 444L, 1L), V11
 = c(1468L, 1329L, 1013L, 603L, 1096L, 1237L, 1488L, 1189L, 1064L,
 1303L, 1258L, 1479L, 1421L, 1365L, 1101L, 1415L, 1145L, 1329L,
 1325L, 236L, 1379L, 1199L, 729L, 328L, 0L), V12 = c(983L, 1459L,
 791L, 898L, 911L, 1215L, 1528L, 960L, 1172L, 1286L, 1358L, 722L,
 857L, 1478L, 1452L, 1502L, 1013L, 745L, 455L, 149L, 1686L, 917L,
 1013L, 84L, 0L), V13 = c(1326L, 1336L, 1110L, 1737L, 1062L, 1578L,
 1382L, 1537L, 1366L, 1308L, 1301L, 1357L, 746L, 622L, 934L, 1132L,
 954L, 460L, 270L, 65L, 957L, 699L, 521L, 18L, 1L), V14 = c(1047L,
 1315L, 1506L, 1562L, 1254L, 1336L, 1106L, 1213L, 1220L, 1457L, 858L,
 1606L, 590L, 726L, 598L, 945L, 732L, 258L, 45L, 6L, 937L, 436L, 43L,
 0L, 0L), V15 = c(845L, 935L, 1295L, 1077L, 1400L, 1049L, 802L,
 1247L, 1449L, 1046L, 1134L, 877L, 327L, 352L, 470L, 564L, 461L,

[R] difftime

2008-12-09 Thread eric lee

Hi.  I'm trying to take the difference in days between two times.  Can
you point out what's wrong, or suggest a different function?  When I
try the following code,  The following code works fine:

a - strptime(1911100807,format=%Y%m%d%H,tz=GMT)
b - strptime(1911102718,format=%Y%m%d%H,tz=GMT)
x - difftime(b, a, units=days)
x


But when I change the year, the following code returns 'NA' for the
time between a and b.  Thanks.

a - strptime(1999100807,format=%Y%m%d%H,tz=GMT)
b - strptime(1999102718,format=%Y%m%d%H,tz=GMT)
x - difftime(b, a, units=days)
x

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Need help optimizing/vectorizing nested loops

2008-12-09 Thread bartjoosen


Hello,

how about changing the last loop to an apply?

time.test2 - function(dat) { 
  cen - dat 
  grps - 5 
  n.rich - numeric(grps^2) 
  n.ind - 1 

  for(i in 1:grps) 
for (j in 1:grps) { 
  n.cen - numeric(ncol(cen) - 2) 
  neighbours - expand.grid(X=(j-1):(j+1), Y=(i-1):(i+1)) 
  neighbours - neighbours[-5,] 
  neighbours - neighbours[which(neighbours[,1] %in% 1:grps  
 neighbours[,2] %in% 1:grps),] 
n.rich[n.ind] -
sum(as.logical(apply(merge(neighbours,cen)[,-c(1,2)],2,sum)))
  n.ind - n.ind + 1 


} 

return(n.rich) 
} 

The timings on my system:
 system.time(time.test(dat))
   user  system elapsed 
   1.650.031.71 
 system.time(time.test2(dat))
   user  system elapsed 
   0.270.000.27 


I'm still thinking about the optimisation for the selection of the cells,
but for the moment I have no clue about any optimisation for this problem.


Kind Regards

Bart


Tyler Smith wrote:
 
 Hi,
 
 I'm analyzing a large number of large simulation datasets, and I've
 isolated one of the bottlenecks. Any help in speeding it up would be
 appreciated.
 
 `dat` is a dataframe of samples from a regular grid. The first two
 columns are the spatial coordinates of the samples, the remaining 20
 columns are the abundances of species in each cell. I need to calculate
 the species richness in adjacent cells for each cell in the sample. 
 For example, if I have nine cells in my dataframe (X = 1:3, Y = 1:3):
 
   a b c
   d e f
   g h i
 
 I need to calculate the neighbour-richness for each cell; for a, this is
 the richness of cells b, d and e combined. The neighbour richness of
 cell e would be the combined richness of all the other eight cells.
 
 The following code does what I what, but it's slow. The sample dataset
 'dat', below, represents a 5x5 grid, 25 samples. It takes about 1.5
 seconds on my computer. The largest samples I am working with have a 51
 x 51 grid (2601 cells) and take 4.5 minutes. This is manageable, but
 since I have potentially hundreds of these analyses to run, trimming
 that down would be very helpful.
 
 After loading the function and the data, the call
 
   system.time(tmp - time.test(dat))
 
 Will run the code. Note that I've excised this from a larger, more
 general function, after determining that for large datasets this section
 is responsible for a slowdown from 10-12 seconds to ca. 250 seconds.
 
 Thanks for your patience,
 
 Tyler
 
 
 time.test - function(dat) {
 
   cen - dat
   grps - 5
   n.rich - numeric(grps^2)
   n.ind - 1
   
   for(i in 1:grps)
 for (j in 1:grps) {
   n.cen - numeric(ncol(cen) - 2)
   neighbours - expand.grid((j-1):(j+1), (i-1):(i+1)) 
   neighbours - neighbours[-5,] 
   neighbours - neighbours[which(neighbours[,1] %in% 1:grps 
  neighbours[,2] %in% 1:grps),]
   
   for (k in 1:nrow(neighbours))
 n.cen - n.cen + cen[cen$X == neighbours[k,1] 
  cen$Y == neighbours[k,2], -c(1:2)]
   
   n.rich[n.ind] - sum(as.logical(n.cen))
   n.ind - n.ind + 1
 }
 
 return(n.rich)
 }
 
 `dat` - structure(list(
   X = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1,
   2, 3, 4, 5), Y = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4,
   4, 4, 4, 4, 5, 5, 5, 5, 5), V1 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
   0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 45L, 131L, 0L, 0L, 34L,
   481L, 1744L), V2 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
   0L, 1L, 88L, 0L, 70L, 101L, 13L, 634L, 0L, 0L, 71L, 640L, 1636L), V3
   = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 49L, 3L, 113L,
   1L, 44L, 167L, 336L, 933L, 0L, 14L, 388L, 1180L, 1709L), V4 = c(0L,
   0L, 0L, 0L, 0L, 0L, 3L, 12L, 0L, 0L, 2L, 1L, 36L, 45L, 208L, 7L,
   221L, 213L, 371L, 1440L, 26L, 211L, 389L, 1382L, 1614L), V5 = c(96L,
   7L, 0L, 0L, 0L, 10L, 17L, 0L, 5L, 0L, 0L, 11L, 151L, 127L, 160L,
   27L, 388L, 439L, 1117L, 1571L, 81L, 598L, 1107L, 1402L, 891L), V6 =
   c(16L, 30L, 13L, 0L, 0L, 10L, 195L, 60L, 29L, 29L, 1L, 107L, 698L,
   596L, 655L, 227L, 287L, 677L, 1477L, 1336L, 425L, 873L, 961L, 1360L,
   1175L), V7 = c(249L, 101L, 69L, 0L, 18L, 186L, 331L, 291L, 259L,
   248L, 336L, 404L, 642L, 632L, 775L, 455L, 801L, 697L, 1063L, 978L,
   626L, 686L, 1204L, 1138L, 627L), V8 = c(300L, 163L, 65L, 145L, 377L,
   257L, 690L, 655L, 420L, 288L, 346L, 461L, 1276L, 897L, 633L, 812L,
   1018L, 1337L, 1295L, 1163L, 550L, 1104L, 768L, 933L, 433L), V9 =
   c(555L, 478L, 374L, 349L, 357L, 360L, 905L, 954L, 552L, 438L, 703L,
   984L, 1616L, 1732L, 1234L, 1213L, 1518L, 1746L, 1191L, 967L, 1394L,
   1722L, 1706L, 610L, 169L), V10 = c(1527L, 1019L, 926L, 401L, 830L,
   833L, 931L, 816L, 1126L, 1232L, 1067L, 1169L, 1270L, 1277L, 1145L,
   1159L, 1072L, 1534L, 997L, 391L, 1328L, 1414L, 1037L, 444L, 1L), V11
   = c(1468L, 1329L, 1013L, 603L, 1096L, 1237L, 1488L, 1189L, 1064L,
   1303L, 1258L, 1479L, 1421L, 1365L,

[R] extract the digits of a number

2008-12-09 Thread Gustavo Carvalho

Hello,

Anyone knows how can I do this in a cleaner way?

mynumber = 1001
as.numeric(unlist(strsplit(as.character(mynumber),)))
[1] 1 0 0 1

Thanks in advance,

Gustavo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] difftime

Perhaps you are getting warning messages from difftime like this:

Warning messages:
1: In structure(.Internal(as.POSIXct(x, tz)), class = c(POSIXt,
POSIXct),  :
  unable to identify current timezone 'V':
please set environment variable 'TZ'
2: In structure(.Internal(as.POSIXct(x, tz)), class = c(POSIXt,
POSIXct),  :
  unknwon timezone 'localtime'

Try execute the code two times.

On Tue, Dec 9, 2008 at 3:33 PM, eric lee [EMAIL PROTECTED] wrote:

 Hi.  I'm trying to take the difference in days between two times.  Can
 you point out what's wrong, or suggest a different function?  When I
 try the following code,  The following code works fine:

 a - strptime(1911100807,format=%Y%m%d%H,tz=GMT)
 b - strptime(1911102718,format=%Y%m%d%H,tz=GMT)
 x - difftime(b, a, units=days)
 x


 But when I change the year, the following code returns 'NA' for the
 time between a and b.  Thanks.

 a - strptime(1999100807,format=%Y%m%d%H,tz=GMT)
 b - strptime(1999102718,format=%Y%m%d%H,tz=GMT)
 x - difftime(b, a, units=days)
 x

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to display y-axis labels in Multcomp plot

2008-12-09 Thread Metconnection


Thanks Kingsford... that does the trick perfectly!

Simon
-- 
View this message in context: 
http://www.nabble.com/How-to-display-y-axis-labels-in-Multcomp-plot-tp20904977p20919920.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Replacing tabs with appropriate number of spaces

This is basically your approach, but automated a bit more than you describe:


library(gsubfn)

tmp - 
strsplit('one\ttwo\nthree\tfour\n12345678\t910\na\tbc\tdef\tghi\n','\n')[[1]]

tmp2 - gsubfn('([^\t]+)\t', function(x) {
  ln - nchar(x)
  nsp - 8-(ln %% 8)
  sp - paste( rep(' ', nsp), collapse='' )
  paste(x,sp, sep='')
}, tmp )

tmp2
cat(tmp2, sep='\n')

This is based on the assumption of tab stops every 8 columns, change the 2 8's 
above if you want something different.

Hope this helps,


--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
801.408.8111


 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 project.org] On Behalf Of Dennis Fisher
 Sent: Tuesday, December 09, 2008 9:40 AM
 To: [EMAIL PROTECTED]
 Subject: [R] Replacing tabs with appropriate number of spaces

 Colleagues,

 Platform: OS X (but issue applies to all platforms)
 Version: 2.8.0

 I have a mixture of text and data that I am outputting via R to a pdf
 document (using a fixed-width font).  The text contains tabs that
 align columns properly with a fixed-width font in a terminal window.
 However, when the PDF document is created, the concept of a tab is not
 invoked properly and columns do not align.

 I could use brute force as follows:
 1.  identify lines of text containing tabs
 2.  strsplit on tabs
 3.  count characters preceding the tab, then replace the tab with the
 appropriate number of spaces (e.g., if the string preceding the tab
 has 29 characters, add 3 spaces), then paste(..., sep=)

 However, I am sure a more elegant approach exists.  Can anyone offer
 one?

 Dennis


 Dennis Fisher MD
 P  (The P Less Than Company)
 Phone: 1-866-PLessThan (1-866-753-7784)
 Fax: 1-415-564-2220
 www.PLessThan.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] no implicit call of the print function within loops?

2008-12-09 Thread Mark Heckmann

Dear R-users,

I wonder why some functions produce output when they are called (I suppose
due to an implicit call of the print function) but within a loop they do
not:

attach(anscomce)
exp - parse(text= lm(x1 ~ y1))
eval(exp)

Here the print() function seems to be called implicitly.
If I do the same within a for-loop, it is not.

for (i in c(1)){
   eval(exp)
}

I know that I have to wrap it into a print function so it would work. 
But why is that so? In the eval() help I don't find any clues. 
As this happens with other functions as well, I would like to understand the
causes and thus
avoid some future mistakes.

TIA,
Mark

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] keep function in stepAIC

2008-12-09 Thread Xin Shi

Dear:

 

Does anyone use keep function in stepAIC command? If so, could you give an 
example? I try to use this function to choose some variables in all of possible 
models.

 

Many Thanks!

 

Xin


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Re sampling with index

2008-12-09 Thread legendy


Hallo, All,

I have a question needs your help.
I just want get a sample with the original index. For example, I have a
dataset,

ind  x
139
224
315
475
561

After resample, I want to get a new dataset like this (with the original
index)

ind  x
315
561
139

Thank you in advance.

Legendy





-- 
View this message in context: 
http://www.nabble.com/Resampling-with-index-tp20919265p20919265.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] XYPLOT Help

2008-12-09 Thread Zhou, James

This may be an easy question for most of you, but any prompt
clues/hints/examples would be really appreciated.

 

My dataset includes a various number of records (by time points) for a
variable by subject.  I use xyplot to plot line/symbol graphs of the
variable at each available time point by each subject on the same plot
(see attached as an example).  How to suppress those symbols with no
available time points (e.g. symbols for subject 1001 study days 5, 6,
7)?  Below is a quick example of data:

 

Subject Study Day% CD8 response

1001 1  0.23

1001 2  0.25

1001 3  0.27

1001 4  0.29

1001 8  0.29

1002 1  0.23

1002 2  0.25

1002 3  0.27

1002 4  0.29

1002 5  0.23

1002 6  0.25

1002 7  0.27

1002 8  0.29

1003 1  0.23

1003 2  0.25

1003 3  0.27

1003 4  0.29

1003 5  0.23

1003 6  0.25

1003 7  0.27

1003 8  0.29

1004 1  0.23

1004 2  0.25

1004 3  0.27

1004 4  0.29

1004 8  0.29

 

Thanks a lot in advance!

 

James

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R Graphics Device margins

2008-12-09 Thread Metcalfe, John

Hello,
I am relatively new to R and am using it to run Classification and Regression 
Tree analysis. My only issue at this point is that numbers are always cut off 
on the lower nodes. I've tried changing the margins with 

mai=c(0, 0.5, 0.5, 0) but this has not so far worked.

Any suggestions would be appreciated.

Best,

 

John Z. Metcalfe, M.D., M.P.H. 
Division of Pulmonary and Critical Care Medicine 
University of California, San Francisco
San Francisco General Hospital

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] extract the digits of a number

2008-12-09 Thread Christos Hatzis

An alternative that works for any base (other than 10) is the following:

 library(sfsmisc)
 digitsBase(1001, 10)
Class 'basedInt'(base = 10) [1:1]
 [,1]
[1,]1
[2,]0
[3,]0
[4,]1

-Christos

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Gustavo Carvalho
 Sent: Tuesday, December 09, 2008 12:49 PM
 To: r-help@r-project.org
 Subject: [R] extract the digits of a number
 
 Hello,
 
 Anyone knows how can I do this in a cleaner way?
 
 mynumber = 1001
 as.numeric(unlist(strsplit(as.character(mynumber),)))
 [1] 1 0 0 1
 
 Thanks in advance,
 
 Gustavo
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] extract the digits of a number

Try this also:

library(gsubfn)
strapply(as.character(mynumber), [0-9], simplify = as.numeric)

On Tue, Dec 9, 2008 at 3:48 PM, Gustavo Carvalho
[EMAIL PROTECTED][EMAIL PROTECTED]
 wrote:

 Hello,

 Anyone knows how can I do this in a cleaner way?

 mynumber = 1001
 as.numeric(unlist(strsplit(as.character(mynumber),)))
 [1] 1 0 0 1

 Thanks in advance,

 Gustavo

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] extract the digits of a number

I don't know if this is cleaner or not, but here is another way:

 mynumber - 1001
 floor( mynumber/(10^(nchar(mynumber):1 -1))) %% 10
[1] 1 0 0 1
 mynumber - 12345678
 floor( mynumber/(10^(nchar(mynumber):1 -1))) %% 10
[1] 1 2 3 4 5 6 7 8
 mynumber - 9753086421
 floor( mynumber/(10^(nchar(mynumber):1 -1))) %% 10
 [1] 9 7 5 3 0 8 6 4 2 1


--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
801.408.8111


 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 project.org] On Behalf Of Gustavo Carvalho
 Sent: Tuesday, December 09, 2008 10:49 AM
 To: r-help@r-project.org
 Subject: [R] extract the digits of a number

 Hello,

 Anyone knows how can I do this in a cleaner way?

 mynumber = 1001
 as.numeric(unlist(strsplit(as.character(mynumber),)))
 [1] 1 0 0 1

 Thanks in advance,

 Gustavo

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Re sampling with index

Try this:

x[x$ind %in% sample(x$ind, 3),]

On Tue, Dec 9, 2008 at 3:19 PM, legendy [EMAIL PROTECTED] wrote:


 Hallo, All,

 I have a question needs your help.
 I just want get a sample with the original index. For example, I have a
 dataset,

 ind  x
 139
 224
 315
 475
 561

 After resample, I want to get a new dataset like this (with the original
 index)

 ind  x
 315
 561
 139

 Thank you in advance.

 Legendy





 --
 View this message in context:
 http://www.nabble.com/Resampling-with-index-tp20919265p20919265.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Can elastic net do binary classification?

2008-12-09 Thread Jack Luo

Hi, List

The elastic net package (by Hastie and Zou at Stanford) is used to do
regularization and variable selection, it can also do regression. I am
wondering if it can perform binary classification (discrete outcome).
Anybody having similar experience?

Many thanks,

-Jack

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] package plm

2008-12-09 Thread Andrea Ferroni


   Dear R help,
  I use the package plm e the function plm() to analyse a panel data and
   estimate a dynamic model.
   Can I estimate a model without intercept?

   Thanks,
   Andrea Ferroni
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] no implicit call of the print function within loops?

The general rule is that when you type something at the command line and that 
something returns a value, if you do not tell it what to do with the value, and 
the value has not been made invisible, then the value is printed.

If you do:

 tmp - eval(exp)

You will not see the output printed since you tell it what to do with the 
return value from eval, but without the assignment, it is not told what to do 
and so does the print.  This is not unique to the eval function, but true for 
anything that returns a visible value, compare:

 3 + 4
 x - 3 + 4
 (x - 3 + 4)

The implicit print does not happen inside of loops, batch processing, inside of 
functions, and probably a few other places.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
801.408.8111


 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 project.org] On Behalf Of Mark Heckmann
 Sent: Tuesday, December 09, 2008 10:44 AM
 To: r-help@r-project.org
 Subject: [R] no implicit call of the print function within loops?

 Dear R-users,

 I wonder why some functions produce output when they are called (I
 suppose
 due to an implicit call of the print function) but within a loop they
 do
 not:

 attach(anscomce)
 exp - parse(text= lm(x1 ~ y1))
 eval(exp)

 Here the print() function seems to be called implicitly.
 If I do the same within a for-loop, it is not.

 for (i in c(1)){
eval(exp)
 }

 I know that I have to wrap it into a print function so it would work.
 But why is that so? In the eval() help I don't find any clues.
 As this happens with other functions as well, I would like to
 understand the
 causes and thus
 avoid some future mistakes.

 TIA,
 Mark

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] package plm

2008-12-09 Thread Giovanni Petris


If nobody answered to your question before, you probably have to
formulate it differently, trying to give some context and possibly
providing a small example, as requested by the posting
guides. Resending your question unchanged multiple times is not going
to help.

Questions about contributed packages should be directed to the package
maintainer. 

Best,
Giovanni

 Date: Tue, 09 Dec 2008 19:04:36 +0100
 From: Andrea Ferroni [EMAIL PROTECTED]
 Sender: [EMAIL PROTECTED]
 Precedence: list
 
 
Dear R help,
   I use the package plm e the function plm() to analyse a panel data and
estimate a dynamic model.
Can I estimate a model without intercept?
 
Thanks,
Andrea Ferroni
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 

Giovanni Petris  [EMAIL PROTECTED]
Associate Professor
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R Graphics Device margins

2008-12-09 Thread Prof Brian Ripley


Without the reproducible example asked for it is hard to tell, but
?text.rpart did say

 ...: Graphical parameters may also be supplied as arguments to
  this function (see 'par').  As labels often extend outside
  the plot region it can be helpful to specify 'xpd = TRUE'.

Might that be the issue (rather than your subject line)?

On Tue, 9 Dec 2008, Metcalfe, John wrote:


Hello,


I am relatively new to R and am using it to run Classification and 
Regression Tree analysis. My only issue at this point is that numbers 
are always cut off on the lower nodes. I've tried changing the margins 
with


mai=c(0, 0.5, 0.5, 0) but this has not so far worked.

Any suggestions would be appreciated.

Best,

John Z. Metcalfe, M.D., M.P.H.
Division of Pulmonary and Critical Care Medicine
University of California, San Francisco
San Francisco General Hospital

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] extract the digits of a number

Actually what you have is not so bad.  Here is a variation
which has one fewer nested (...)

 as.numeric(strsplit(as.character(mynumber),)[[1]])
[1] 1 0 0 1

 # or
 # Try strapply in gsubfn:

 library(gsubfn)
 mynmber - 1001
 strapply(as.character(mynumber), ., as.numeric)[[1]]
[1] 1 0 0 1

On Tue, Dec 9, 2008 at 12:48 PM, Gustavo Carvalho
[EMAIL PROTECTED] wrote:
 Hello,

 Anyone knows how can I do this in a cleaner way?

 mynumber = 1001
 as.numeric(unlist(strsplit(as.character(mynumber),)))
 [1] 1 0 0 1

 Thanks in advance,

 Gustavo

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Need help optimizing/vectorizing nested loops

2008-12-09 Thread tyler

Charles C. Berry writes:
  On Tue, 9 Dec 2008, tyler wrote:
  
   I'm analyzing a large number of large simulation datasets, and I've
   isolated one of the bottlenecks. Any help in speeding it up would be
   appreciated.
  
  Cast the neighborhoods as an indicator matrix, then use matrix 
  multiplications:
  
   system.time({ mn - with(dat,outer(1:25,1:25, function(i,j)
  abs(X[i]-X[j])2  abs(Y[i]-Y[j])2  i!=j ))


Wow, that's fantastic! On my biggest matrix, your code cost me 3.854
seconds, compared to 207.769 seconds with my original function. I
might be able to use `outer' to solve some other slowdowns. 

Thanks,

Tyler

  HTH,
  
  Chuck
  
  
  
  
   `dat` is a dataframe of samples from a regular grid. The first two
   columns are the spatial coordinates of the samples, the remaining 20
   columns are the abundances of species in each cell. I need to calculate
   the species richness in adjacent cells for each cell in the sample.
   For example, if I have nine cells in my dataframe (X = 1:3, Y = 1:3):
  
a b c
d e f
g h i
  
   I need to calculate the neighbour-richness for each cell; for a, this is
   the richness of cells b, d and e combined. The neighbour richness of
   cell e would be the combined richness of all the other eight cells.
  
   The following code does what I what, but it's slow. The sample dataset
   'dat', below, represents a 5x5 grid, 25 samples. It takes about 1.5
   seconds on my computer. The largest samples I am working with have a 51
   x 51 grid (2601 cells) and take 4.5 minutes. This is manageable, but
   since I have potentially hundreds of these analyses to run, trimming
   that down would be very helpful.
  
   After loading the function and the data, the call
  
system.time(tmp - time.test(dat))
  
   Will run the code. Note that I've excised this from a larger, more
   general function, after determining that for large datasets this section
   is responsible for a slowdown from 10-12 seconds to ca. 250 seconds.
  
   Thanks for your patience,
  
   Tyler
  
  
   time.test - function(dat) {
  
cen - dat
grps - 5
n.rich - numeric(grps^2)
n.ind - 1
  
for(i in 1:grps)
  for (j in 1:grps) {
n.cen - numeric(ncol(cen) - 2)
neighbours - expand.grid((j-1):(j+1), (i-1):(i+1))
neighbours - neighbours[-5,]
neighbours - neighbours[which(neighbours[,1] %in% 1:grps 
   neighbours[,2] %in% 1:grps),]
  
for (k in 1:nrow(neighbours))
  n.cen - n.cen + cen[cen$X == neighbours[k,1] 
   cen$Y == neighbours[k,2], -c(1:2)]
  
n.rich[n.ind] - sum(as.logical(n.cen))
n.ind - n.ind + 1
  }
  
  return(n.rich)
   }
  
   `dat` - structure(list(
X = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1,
2, 3, 4, 5), Y = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4,
4, 4, 4, 4, 5, 5, 5, 5, 5), V1 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 45L, 131L, 0L, 0L, 34L,
481L, 1744L), V2 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 1L, 88L, 0L, 70L, 101L, 13L, 634L, 0L, 0L, 71L, 640L, 1636L), V3
= c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 49L, 3L, 113L,
1L, 44L, 167L, 336L, 933L, 0L, 14L, 388L, 1180L, 1709L), V4 = c(0L,
0L, 0L, 0L, 0L, 0L, 3L, 12L, 0L, 0L, 2L, 1L, 36L, 45L, 208L, 7L,
221L, 213L, 371L, 1440L, 26L, 211L, 389L, 1382L, 1614L), V5 = c(96L,
7L, 0L, 0L, 0L, 10L, 17L, 0L, 5L, 0L, 0L, 11L, 151L, 127L, 160L,
27L, 388L, 439L, 1117L, 1571L, 81L, 598L, 1107L, 1402L, 891L), V6 =
c(16L, 30L, 13L, 0L, 0L, 10L, 195L, 60L, 29L, 29L, 1L, 107L, 698L,
596L, 655L, 227L, 287L, 677L, 1477L, 1336L, 425L, 873L, 961L, 1360L,
1175L), V7 = c(249L, 101L, 69L, 0L, 18L, 186L, 331L, 291L, 259L,
248L, 336L, 404L, 642L, 632L, 775L, 455L, 801L, 697L, 1063L, 978L,
626L, 686L, 1204L, 1138L, 627L), V8 = c(300L, 163L, 65L, 145L, 377L,
257L, 690L, 655L, 420L, 288L, 346L, 461L, 1276L, 897L, 633L, 812L,
1018L, 1337L, 1295L, 1163L, 550L, 1104L, 768L, 933L, 433L), V9 =
c(555L, 478L, 374L, 349L, 357L, 360L, 905L, 954L, 552L, 438L, 703L,
984L, 1616L, 1732L, 1234L, 1213L, 1518L, 1746L, 1191L, 967L, 1394L,
1722L, 1706L, 610L, 169L), V10 = c(1527L, 1019L, 926L, 401L, 830L,
833L, 931L, 816L, 1126L, 1232L, 1067L, 1169L, 1270L, 1277L, 1145L,
1159L, 1072L, 1534L, 997L, 391L, 1328L, 1414L, 1037L, 444L, 1L), V11
= c(1468L, 1329L, 1013L, 603L, 1096L, 1237L, 1488L, 1189L, 1064L,
1303L, 1258L, 1479L, 1421L, 1365L, 1101L, 1415L, 1145L, 1329L,
1325L, 236L, 1379L, 1199L, 729L, 328L, 0L), V12 = c(983L, 1459L,
791L, 898L, 911L, 1215L, 1528L, 960L, 1172L, 1286L, 1358L, 722L,
857L, 1478L, 1452L, 1502L, 1013L, 745L, 455L, 149L, 1686L, 917L,
1013L, 84L, 0L), V13 = c(1326L, 1336L, 1110L, 1737L, 1062L, 1578L,
1382L, 1537L, 1366L, 1308L, 1301L, 1357L, 746L, 622L, 934L, 1132L,
954L, 460L,

[R] glm error message when using family Gamma(link=inverse)

2008-12-09 Thread John Sorkin

R 2.5
windows XP

I am getting an error from glm() that I don't understand. Any help or 
suggestions would be appreciated. N.B. 1=AAMTCAREJ=327900

 summary(data$AAMTCAREJ)
Min.  1st Qu.   Median Mean  3rd Qu. Max. 
 1.0404.3   1430.0   6567.0   5457.0 327900.0 


 fitglm-glm(AAMTCAREJ~sexcat+H_AGE+SmokeCat+InsuranceCat+MedicadeCat+
+ incomegrp+racecat+MARSTATJS+EdCat+bmiNewjohn,data=data,family=Gamma(link = 
inverse))
Error: no valid set of coefficients has been found: please supply starting 
values
In addition: Warning message:
NaNs produced in: log(x) 

Thanks
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Confidentiality Statement:
This email message, including any attachments, is for th...{{dropped:6}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Replacing tabs with appropriate number of spaces

That's a clever use of gsubfn.  Here is a very minor simplification using
the same code but representing it in formula notation and sprintf:

gsubfn('([^\t]+)\t', ~ sprintf(%s%*s, x, 8-nchar(x)%%8,  ), tmp)


On Tue, Dec 9, 2008 at 12:51 PM, Greg Snow [EMAIL PROTECTED] wrote:
 This is basically your approach, but automated a bit more than you describe:


 library(gsubfn)

 tmp - 
 strsplit('one\ttwo\nthree\tfour\n12345678\t910\na\tbc\tdef\tghi\n','\n')[[1]]

 tmp2 - gsubfn('([^\t]+)\t', function(x) {
  ln - nchar(x)
  nsp - 8-(ln %% 8)
  sp - paste( rep(' ', nsp), collapse='' )
  paste(x,sp, sep='')
 }, tmp )

 tmp2
 cat(tmp2, sep='\n')

 This is based on the assumption of tab stops every 8 columns, change the 2 
 8's above if you want something different.

 Hope this helps,


 --
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 [EMAIL PROTECTED]
 801.408.8111


 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 project.org] On Behalf Of Dennis Fisher
 Sent: Tuesday, December 09, 2008 9:40 AM
 To: [EMAIL PROTECTED]
 Subject: [R] Replacing tabs with appropriate number of spaces

 Colleagues,

 Platform: OS X (but issue applies to all platforms)
 Version: 2.8.0

 I have a mixture of text and data that I am outputting via R to a pdf
 document (using a fixed-width font).  The text contains tabs that
 align columns properly with a fixed-width font in a terminal window.
 However, when the PDF document is created, the concept of a tab is not
 invoked properly and columns do not align.

 I could use brute force as follows:
 1.  identify lines of text containing tabs
 2.  strsplit on tabs
 3.  count characters preceding the tab, then replace the tab with the
 appropriate number of spaces (e.g., if the string preceding the tab
 has 29 characters, add 3 spaces), then paste(..., sep=)

 However, I am sure a more elegant approach exists.  Can anyone offer
 one?

 Dennis


 Dennis Fisher MD
 P  (The P Less Than Company)
 Phone: 1-866-PLessThan (1-866-753-7784)
 Fax: 1-415-564-2220
 www.PLessThan.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] French IRC channel and mailing list ?

2008-12-09 Thread baptiste auguie

A few personal thoughts on this:

I recently joined a newly created R user group on google http://groups.google.co.uk/group/gur-ugr
that started with a similar impulse.

In my personal opinion, I see little overall benefit from such an
approach. For one thing, a major strength of the R mailing list is the
large number of very knowledgeable persons. A mailing list with only a
few 10s of users will never provide as good a support as you can find
in the main list. The advice you get could very easily be biased or
even plain wrong without much of a peer-review, so to say. Another
thing to consider is whether people who can help and understand french
actually want to answer a question in french, thereby limiting their
advice to a much narrower audience (people facing a similar problem
subsequently may be unable to get help from an answer in this
language). Perhaps even more likely is the opposite situation where
the question has been solved many times in the main mailing list: it
can be quite tempting to just send the link and say, well, here is
the solution, let me know what you don't understand rather than doing
a translator's job. Solving an R problem and translating somebody's
text have very unequal appeal.

I don't know what the exact policy is for this mailing list (a search
for english in the posting guide didn't return anything). Perhaps it
is OK to send the occasional question in french, or franglais. I
know I don't mind seeing a few of these and answering them if I can,
while I would not join a new list for the reasons stated above. This
would have the advantage of keeping the knowledge together. Maybe a
special tag could be used so that people not interested can filter out
all questions posted in non-english languages.

Best wishes,

Baptiste

On 8 Dec 2008, at 21:10, Julien Barnier wrote:

Dear all,

For some time now, R is becomming more and more popular in more and
more countries. France is for sure one of them, but french people
being french one of the obstacle they might tackle is the lack of
documentation and support in their native language.

To offer this support in french an IRC channel (#Rfr on
irc.freenode.net) was created some months ago, beside the official
english channel #R. We (me (~juba) and Pierre-Yves Chibon (~pingou))
have to recognize that it has a very low activity right now but it's
related to our lack of promotion about it.

Another tool that could be useful to bring support in french (and
other languages) would be dedicated mailing-lists. I've searched the
archives to see if similar requests have already been made but
couldn't manage to find one. So I would like to ask here the question,
has there been any thought on the creation of dedicated R-help
mailing-lists for the major languages such as Spanish, French, Chinese
and others ?

We had some thought about it and we actually think that it would be
something useful for users to be able to receive some help (especially
when their english is not really fluent).

Thanks in advance for your answers,

Sincerely,

Pierre-Yves Chibon
Julien Barnier

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag

Re: [R] glm error message when using family Gamma(link=inverse)

2008-12-09 Thread Prof Brian Ripley


On Tue, 9 Dec 2008, John Sorkin wrote:


R 2.5


Please

1) do as the posting guide asks, and quote version numbers accurately.
2) do as the posting guide asks, and update *before* posting.

That's too old a version to support here.


windows XP

I am getting an error from glm() that I don't understand. Any help or suggestions 
would be appreciated. N.B. 1=AAMTCAREJ=327900


summary(data$AAMTCAREJ)

   Min.  1st Qu.   Median Mean  3rd Qu. Max.
1.0404.3   1430.0   6567.0   5457.0 327900.0



fitglm-glm(AAMTCAREJ~sexcat+H_AGE+SmokeCat+InsuranceCat+MedicadeCat+

+ incomegrp+racecat+MARSTATJS+EdCat+bmiNewjohn,data=data,family=Gamma(link = 
inverse))
Error: no valid set of coefficients has been found: please supply starting 
values
In addition: Warning message:
NaNs produced in: log(x)


That model is not necessarily valid: the linear predictor has to be 
strictly positive.  If you really know why it is applicable you will be 
able to give starting values (e.g. maybe all the columns of the design 
matrix are positive, in which case you will be able to find suitable 
positive initial coefficients).



Thanks
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Confidentiality Statement:
This email message, including any attachments, is for ...{{dropped:19}}


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] difftime

I guess its the same problem as this (run after your
code):

 as.POSIXct(a, tz = )
[1] NA
 as.POSIXct(b, tz = )
[1] NA
 difftime(b, a, units=days)
Time difference of NA days

If you explicitly specify the tz as GMT then it works as expected:

 as.POSIXct(a, tz = GMT)
[1] 1999-10-08 07:00:00 GMT
 as.POSIXct(b, tz = GMT)
[1] 1999-10-27 18:00:00 GMT
 difftime(b, a, tz = GMT, units=days)
Time difference of 19.45833 days

Since the problem is confusion between the  and GMT
time zones the easiest thing to do is to simply set the
default time zone for the remainder of the session to GMT

Sys.setenv(TZ = GMT)

in which case tz =  and tz = GMT are the same and
you should get the expected answer either way.  Alternately
use chron which does not have time zones in the first place
so problem cannot arise.

See R News 4/1.

The above was done under:
 R.version.string # Vista
[1] R version 2.8.0 Patched (2008-11-10 r46884)


On Tue, Dec 9, 2008 at 12:33 PM, eric lee [EMAIL PROTECTED] wrote:
 Hi.  I'm trying to take the difference in days between two times.  Can
 you point out what's wrong, or suggest a different function?  When I
 try the following code,  The following code works fine:

 a - strptime(1911100807,format=%Y%m%d%H,tz=GMT)
 b - strptime(1911102718,format=%Y%m%d%H,tz=GMT)
 x - difftime(b, a, units=days)
 x


 But when I change the year, the following code returns 'NA' for the
 time between a and b.  Thanks.

 a - strptime(1999100807,format=%Y%m%d%H,tz=GMT)
 b - strptime(1999102718,format=%Y%m%d%H,tz=GMT)
 x - difftime(b, a, units=days)
 x

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to get Greenhouse-Geisser epsilons from anova?

2008-12-09 Thread John Fox

Dear Nils,

 -Original Message-
 From: Skotara [mailto:[EMAIL PROTECTED]
 Sent: December-09-08 12:21 PM
 To: John Fox
 Cc: 'Peter Dalgaard'; r-help@r-project.org
 Subject: Re: [R] How to get Greenhouse-Geisser epsilons from anova?
 
 Dear John and Peter,
 
 thank you both very much for your help!
 Everything works fine now!
 
 John, Anova also works very fine. Thank you very much!
 However, if I had more than 2 levels for the between factor the same
 thing as mentioned occured.
 The degrees of freedom showed that Anova calculated it as if all
 subjects came from the same group, for example for main effect A the dfs
 are 1 and 35.

That is odd -- the example that I sent had a factor with 3 levels, producing
2 df; here's a simplified version with s single between-subject factor
(treatment):

 OBrienKaiser$treatment
 [1] control control control control control A   A   A   A  
[10] B   B   B   B   B   B   B  
attr(,contrasts)
[,1] [,2]
control   -20
A  1   -1
B  11
Levels: control A B

 mod.ok - lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5, 
+   post.1, post.2, post.3, post.4, post.5, 
+   fup.1, fup.2, fup.3, fup.4, fup.5) ~  treatment, 
+data=OBrienKaiser)

 av.ok - Anova(mod.ok, idata=idata, idesign=~phase*hour)

 summary(av.ok, multivariate=FALSE)

Univariate Type II Repeated-Measures ANOVA Assuming Sphericity

 SS num Df Error SS den Df   FPr(F)
treatment186.75  2   416.58 13  2.9139  0.090041 .  
phase167.50  292.17 26 23.6257 1.419e-06 ***
treatment:phase   77.00  492.17 26  5.4304  0.002578 ** 
hour 106.29  472.81 52 18.9769 1.128e-09 ***
treatment:hour 0.89  872.81 52  0.0798  0.999597
phase:hour11.08  8   116.96104  1.2319  0.287947
treatment:phase:hour   5.96 16   116.96104  0.3312  0.992574
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 


Mauchly Tests for Sphericity

 Test statistic p-value
phase   0.85151 0.38119
treatment:phase 0.85151 0.38119
hour0.09859 0.00194
treatment:hour  0.09859 0.00194
phase:hour  0.00837 0.09038
treatment:phase:hour0.00837 0.09038


Greenhouse-Geisser and Huynh-Feldt Corrections
 for Departure from Sphericity

  GG eps Pr(F[GG])
phase0.87071  5.665e-06 ***
treatment:phase  0.87071   0.004301 ** 
hour 0.48867  1.016e-05 ***
treatment:hour   0.48867   0.986835
phase:hour   0.50283   0.308683
treatment:phase:hour 0.50283   0.950677
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

  HF eps Pr(F[HF])
phase0.99390  1.515e-06 ***
treatment:phase  0.99390   0.002641 ** 
hour 0.57413  2.190e-06 ***
treatment:hour   0.57413   0.992771
phase:hour   0.75630   0.298926
treatment:phase:hour 0.75630   0.981607
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

I didn't save your earlier postings, so can't check your model and data. If
you like, please feel free to send them to me privately.

Regards,
 John

 Since I can get those values using anova that causes no problem.
 
 I saw that the x$G to get the greenhouse-geisser epsilon do work for:
 x- anova(mlmfitD, X=~C+B, M=~A+C+B, test = Spherical)
 but does not work for y$G:
 y - anova(mlmfit, mlmfit0, X= ~C+B, M = ~A+C+B, idata =
 dd,test=Spherical)
 
 Finally, the Greenhouse-Geisser epsilons are identical using both
 methods and to the SPSS output.
 The Huynh-Feldt are not the same as them of SPSS. I will use GG instead.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] opening multiple connections at once

2008-12-09 Thread Rout, Subrat



Best Regards,
Subrat Rout
CNSI, Gaithers Dr.
(240) 399 2472
[EMAIL PROTECTED]mailto:[EMAIL PROTECTED]


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] RCurl::postForm() -- how does one determine what the names are of each form element in an online html form?

2008-12-09 Thread Tony Breyal

Dear R-Help,

I am looking into using the Open Calais web service (http://
sws.clearforest.com/calaisViewer/) for text mining purposes. I would
like to use R to post text into one of the forms on their website.

In package RCurl, there is a function called postForm(). This sounds
like it would do the job. Unfortunately the URL used in the example is
no longer valid (i have emailed the maintainer about this).

Question: How does one determine the name of the form elements to use?
is there an R function which will print out the names of these
elements perhaps?

[i am still learning, so please forgive me if i used the wrong
terminology.]

### Example from ?postForm ###
library(RCurl)
# Now looking at POST method for forms.
postForm(http://www.speakeasy.org/~cgires/perl_form.cgi;,
some_text = Duncan,
choice = Ho,
radbut = eep,
box = box1, box2
  )
### Example ends ###

So in the above code, i believe the form elements are: some_text,
choice, redbut and box. But how does one find out the names of
these form elements if one is not given them previously?

I hope that the above made sense, and thank you kindly in advance for
any help.
Tony Breyal.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] TIFF capability in R 2.8.0

2008-12-09 Thread Michael Rhead Enion


Hi,

I installed R 2.8.0 (binary - dmg) for Mac 10.5.5 (Macbook Pro).  It  
seems to have broken tiff output.  From the R console, capabilities()  
reports TIFF is not installed.  And when I run CairoTIFF(), it throws  
an error -- TIFF not compiled.  CarioTIFF() worked in R 2.7.2.


Is this a bug or am I doing something wrong?

Thanks,

Rhead

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] chron - when seconds data not included

2008-12-09 Thread Tubin


Works like a charm!  Thanks.


Gabor Grothendieck wrote:
 
 On Mon, Dec 8, 2008 at 11:52 PM, Tubin [EMAIL PROTECTED] wrote:

 I have date and time data which looks like this:

  [,1] [,2]
  [1,] 7/1/08 9:19
  [2,] 7/1/08 9:58
  [3,] 7/7/08 15:47
  [4,] 7/8/08 10:03
  [5,] 7/8/08 10:32
  [6,] 7/8/08 15:22
  [7,] 7/8/08 15:27
  [8,] 7/8/08 15:40
  [9,] 7/9/08 10:25
 [10,] 7/9/08 10:27

 I would like to use chron on it, so that I can calculate intervals in
 time.

 I can't seem to get chron to accept the time format that doesn't include
 seconds.  Do I have to go through and append :00 on every line in order
 to
 use chron?
 
 
 
 That's one way:
 
 m - matrix( c(7/1/08,9:19,
   7/1/08,9:58,
   7/7/08,15:47,
   7/8/08,10:03,
   7/8/08,10:32,
   7/8/08,15:22,
   7/8/08,15:27,
   7/8/08,15:40,
   7/9/08,10:25,
   7/9/08,10:27), nc = 2, byrow = TRUE)
 
 chron(m[,1], paste(m[,2], 0, sep = :))
 
 # another is to use as.chron
 
 as.chron(apply(m, 1, paste, collapse =  ), %m/%d/%y %H:%M)
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/chron---when-seconds-data-not-included-tp20908803p20922004.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R and Scheme

Luke Tierney wrote:
 On Tue, 9 Dec 2008, Wacek Kusnierczyk wrote:

 Stavros Macrakis wrote:

 R does not have continuations or call-with-current-continuation or
 other
 mechanisms for implementing coroutines, general iterators, and the
 like.


 there is callCC, for example, which however seems kind of obsolete.

 There is nothing obsolete about it.  It supports only downward or
 dynamic extent continuations and so is not useful (nor intended) for
 the things Stavros mentions.  It is useful for escaping from deeply
 nested function calls, for example recursive examination of tree
 structures -- that is why it exists.  At some point upward (at least
 one-shot) contitnuations may be added as well, but probably not soon.

 luke

ok, thanks for the explanation.

vQ

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] How can I draw bars

2008-12-09 Thread Raymond Balise

I need to make a graphic to show problems on different parts of
chromosomes (think of a graphic showing the number of frayed threads
as colors along different parts of a worn out rope).  I want to draw
bars going from left to right across a page and color different parts
of the bars in different shades.  Each graphic will need to have
several bars of different lenghts corresponding to the different
lenghts of the chromsomes.

Is there a package around that can help me draw custom bars of
different colors?  I am a hardcore SAS guy who is just learning R so I
am mostly clueless.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] glm error message when using family Gamma(link=inverse)

2008-12-09 Thread John Sorkin

It appears that I have sinned and for this I surely will wear sack cloth and 
grovel until my period of penitence is fulfilled.

I update R and my problem remains. Please see code snippet (as per posting 
guidelines) below

R 2.8.0
windows XP

 summary(data$AAMTCAREJ)
Min.  1st Qu.   Median Mean  3rd Qu. Max. 
 1.0404.3   1430.0   6567.0   5457.0 327900.0 
 fitglm-glm(AAMTCAREJ~sexcat+H_AGE+SmokeCat+InsuranceCat+MedicadeCat+
+ incomegrp+racecat+MARSTATJS+EdCat+bmiNewjohn,data=data,family=Gamma(link = 
inverse))
Error: no valid set of coefficients has been found: please supply starting 
values
In addition: Warning message:
In log(ifelse(y == 0, 1, y/mu)) : NaNs produced










John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

 Prof Brian Ripley [EMAIL PROTECTED] 12/9/2008 2:11 PM 
On Tue, 9 Dec 2008, John Sorkin wrote:

 R 2.5

Please

1) do as the posting guide asks, and quote version numbers accurately.
2) do as the posting guide asks, and update *before* posting.

That's too old a version to support here.

 windows XP

 I am getting an error from glm() that I don't understand. Any help or 
 suggestions would be appreciated. N.B. 1=AAMTCAREJ=327900

 summary(data$AAMTCAREJ)
Min.  1st Qu.   Median Mean  3rd Qu. Max.
 1.0404.3   1430.0   6567.0   5457.0 327900.0


 fitglm-glm(AAMTCAREJ~sexcat+H_AGE+SmokeCat+InsuranceCat+MedicadeCat+
 + incomegrp+racecat+MARSTATJS+EdCat+bmiNewjohn,data=data,family=Gamma(link = 
 inverse))
 Error: no valid set of coefficients has been found: please supply starting 
 values
 In addition: Warning message:
 NaNs produced in: log(x)

That model is not necessarily valid: the linear predictor has to be 
strictly positive.  If you really know why it is applicable you will be 
able to give starting values (e.g. maybe all the columns of the design 
matrix are positive, in which case you will be able to find suitable 
positive initial coefficients).

 Thanks
 John

 John David Sorkin M.D., Ph.D.
 Chief, Biostatistics and Informatics
 University of Maryland School of Medicine Division of Gerontology
 Baltimore VA Medical Center
 10 North Greene Street
 GRECC (BT/18/GR)
 Baltimore, MD 21201-1524
 (Phone) 410-605-7119
 (Fax) 410-605-7913 (Please call phone number above prior to faxing)

 Confidentiality Statement:
 This email message, including any attachments, is for ...{{dropped:16}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Logical inconsistency