date:20131108

[R] Different output from lm() and lmPerm lmp() if categorical variables are included in the analysis

2013-11-08 Thread Agustin Lobo

I've found a problem when using categorical variables in lmp() from
package lmPerm.

According to help(lmp): "This function will behave identically to lm()
if the following parameters are set: perm="", seq=TRUE, center=FALSE."
But this does not hold when categorical variables are included:

require(lmPerm)
set.seed(42)
testx1 <- rnorm(100,10,5)
testx2 <- c(rep("a",50),rep("b",50))
testy <- 5*testx1 + 3 + runif(100,-20,20)
test <- data.frame(x1=testx1,x2=testx2,y=testy)
atest <- lm(y ~ x1*x2,data=test)
aptest <- lmp(y ~ x1*x2,data=test,perm = "", seqs = TRUE, center = FALSE)
summary(atest)

Call:
lm(formula = y ~ x1 * x2, data = test)
Residuals:
     Min       1Q   Median       3Q      Max
-17.1777  -9.5306  -0.9733   7.6840  22.2728

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -2.0036     3.2488  -0.617    0.539
x1            5.3346     0.2861  18.646   <2e-16 ***
x2b           2.4952     5.2160   0.478    0.633
x1:x2b       -0.3833     0.4568  -0.839    0.404

summary(aptest)

Call:
lmp(formula = y ~ x1 * x2, data = test, perm = "", seqs = TRUE,
center = FALSE)

Residuals:
     Min       1Q   Median       3Q      Max
-17.1777  -9.5306  -0.9733   7.6840  22.2728

Coefficients:
       Estimate Std. Error t value Pr(>|t|)
x1       5.1429     0.2284  22.516   <2e-16 ***
x21     -1.2476     2.6080  -0.478    0.633
x1:x21   0.1917     0.2284   0.839    0.404

It looks like lmp() is internally coding dummy variables in a different way, so
that the lmp results are for "a" (named "1" by lmp) while the lm results are
for "b"?

 Agus
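
The two coefficient tables above are consistent with a difference in contrast
coding rather than a difference in the fitted model: the lmp() estimate for
x21 (-1.2476) is minus one half of the lm() estimate for x2b (2.4952). A
minimal sketch of that check follows. It uses only lm(), and it assumes (this
is not confirmed anywhere in this message) that lmp() codes factors with
sum-to-zero contrasts (contr.sum); the names fit_trt and fit_sum are
illustrative.

```r
set.seed(42)
x1 <- rnorm(100, 10, 5)
x2 <- factor(rep(c("a", "b"), each = 50))
y  <- 5 * x1 + 3 + runif(100, -20, 20)
d  <- data.frame(x1 = x1, x2 = x2, y = y)

# Default coding: treatment contrasts, level "a" is the reference
fit_trt <- lm(y ~ x1 * x2, data = d)
# Same model, but with sum-to-zero coding (a = +1, b = -1)
fit_sum <- lm(y ~ x1 * x2, data = d, contrasts = list(x2 = "contr.sum"))

b_trt <- coef(fit_trt)
b_sum <- coef(fit_sum)

# With two levels the codings are related by simple linear identities
all.equal(unname(b_sum["x21"]),    unname(-b_trt["x2b"] / 2))
all.equal(unname(b_sum["x1"]),     unname(b_trt["x1"] + b_trt["x1:x2b"] / 2))
all.equal(unname(b_sum["x1:x21"]), unname(-b_trt["x1:x2b"] / 2))

# The fitted values are identical: only the parameterisation differs
all.equal(fitted(fit_trt), fitted(fit_sum))
```

If the assumption holds, both functions fit the same model and the t values
and p-values for the factor terms agree up to sign, as they do in the output
shown above.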

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] [R-pkgs] ndl 0.2.13 released today.

2013-11-08 Thread Cyrus Shaoul

Dear R-package Folk,

I am pleased to announce the release of a new version of the ndl package (
http://cran.r-project.org/web/packages/ndl/)

What is NDL? It is a simple learning model based on the Rescorla-Wagner
model of discrimination learning.

I have become the new maintainer, replacing Dr. Antti Arppe (thank you for
all your hard work, Antti!)

This release is a major update. In particular the following items are new
since the last CRAN version (v 0.1.6 from Dec. 2012)

* improved speed in counting co-occurrence counts through the use of Rcpp
and C++ functions.
* improved scalability: it can process many millions of events, with much
larger numbers of cues and outcomes.
* support for Unicode text
* new ability to count background rates (optional)
* new method for converting counts to probabilities
-- and many other small improvements.

For access to the development branch, to join development,  or to submit
issues, please go to: https://bitbucket.org/cyrusshaoul/ndl/

Best regards,

Cyrus

-- 
Cyrus Shaoul, PhD
http://www.sfs.uni-tuebingen.de/~cshaoul/


___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages


Re: [R] problem with interaction in lmer even after creating an interaction variable

2013-11-08 Thread a_lampei

Thank you very much,
First, sorry for posting on the wrong mailing list; I did not know that
a special one exists for lmer.
Yes, there are collinearities in the data.
Still, I would like to have the variables in one model to compare
explained variability. Is there some option, or is it simply impossible?
Thank you, Anna

On 07.11.2013 16:03, bbolker [via R] wrote:
 a_lampei anna.lampei-bucharova at uni-tuebingen.de writes:

 
  Dear all,
  I have a problem with interactions in lmer. I have 2 factors (garden
  and gebiet) which interact, plus one other variable (home), in the
  dataframe arr. When I put:

  lmer(biomass ~ home + garden:gebiet + (1|Block), data = arr)

  it writes:

  Error in lme4::lFormula(formula = biomass ~ home + garden:gebiet + (1 |  :
    rank of X = 28 < ncol(X) = 30

  In the lmer help I found out that if not all combinations of the
  interaction are realized, lmer has problems and one should create a new
  variable using interaction(), which I did:

  arr$agg <- interaction(arr$gebiet, arr$garden, drop = TRUE)

  When I fit the interaction term now:

  lmer(biomass ~ home + agg + (1|Block), data = arr)

  the error does not change:

  Error in lme4::lFormula(formula = biomass ~ home + agg + (1 | Block),  :
    rank of X = 28 < ncol(X) = 29

  There are no NAs in the given variables in the dataset.

  Interestingly, it works when I include only the interaction, like

  lmer(biomass ~ agg + (1|Block), data = arr)

  Even the following models work:

  lmer(biomass ~ gebiet*garden + (1|Block), data = arr)
  lmer(biomass ~ garden + garden:gebiet + (1|Block), data = arr)

  But if I add the interaction term in the new format of the new
  variable, it reports the same error again:

  lmer(biomass ~ garden + agg + (1|Block), data = arr)

  If I add any other variable from the very same dataframe (not only the
  variable home), the error is reported again.

  I do not understand it: the new variable is just another factor now,
  isn't it? And it is in the same dataframe and has the same length.

  Does anyone have any idea?

  Thanks a lot, Anna
 

   This probably belongs on r-sig-mixed-models.

   Presumably 'home' is still correlated with one of the
 columns of 'garden:gebiet'.

   Here's an example of how you can use svd() to find out which
 of your columns are collinear:

 set.seed(101)
 d <- data.frame(x=runif(100),y=1:100,z=2:101)
 m <- model.matrix(~x+y+z,data=d)
 s <- svd(m)
 zapsmall(s$d)
 ## [1] 828.8452   6.6989   2.6735   0.0000
 ## this tells us there is one collinear component
 zapsmall(s$v)
 ##            [,1]       [,2]       [,3]       [,4]
 ## [1,] -0.0105005 -0.7187260  0.3872847  0.5773503
 ## [2,] -0.0054954 -0.4742947 -0.8803490  0.0000000
 ## [3,] -0.7017874  0.3692117 -0.1945349  0.5773503
 ## [4,] -0.7122879 -0.3495142  0.1927498 -0.5773503
 ## this tells us that the first (intercept), third (y),
 ## and fourth (z) column of the model matrix are
 ## involved in the collinear term, i.e.
 ## 1+y-z is zero
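
The "rank of X < ncol(X)" message quoted earlier can be reproduced on the same
toy data with base R's qr(). This is a small sketch that complements the svd()
approach above; it is not what lme4 does internally, and the names q, kept and
dropped are illustrative.

```r
set.seed(101)
d <- data.frame(x = runif(100), y = 1:100, z = 2:101)
m <- model.matrix(~ x + y + z, data = d)

q <- qr(m)   # pivoted QR decomposition of the model matrix
q$rank       # numerical rank of the model matrix
ncol(m)      # number of columns; larger than the rank here

# Columns that qr() keeps, and columns it sets aside as linearly dependent
kept    <- colnames(m)[q$pivot[seq_len(q$rank)]]
dropped <- colnames(m)[q$pivot[-seq_len(q$rank)]]
kept
dropped
```

Any column listed in dropped is a linear combination of the kept columns, so
removing it (or one of its partners) from the formula should make the design
matrix full rank again.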


[R] Graph dashboard

2013-11-08 Thread mohan . radhakrishnan

Hi,
  I have been exploring graph dashboards like
http://sematext.com/img/products/spm/spm-solr-overview.png. I use R but
haven't attempted to create a dashboard like this. I am thinking of parsing
logs and showing dynamic logs - logs that fit into a small window but move
left or right with new data - in a browser or just an R graphics window.
R shiny could be useful, but that is a browser view, isn't it? So the dynamic
view works differently.

What are the recommendations? The logs that will be parsed are all static.

Thanks,
Mohan



Re: [R] Inserting 17M entries into env took 18h, inserting 34M entries taking 5+ days

2013-11-08 Thread Magnus Thor Torfason


Thanks to Thomas, Martin, Jim and William,

Your input was very informative, and thanks for the reference to Sedgewick.

In the end, it does seem to me that all these algorithms require fast 
lookup by ID of nodes to access data, and that conditional on such fast 
lookup, algorithms are possible with efficiency O(n) or O(n*log(n)) 
(depending on whether lookup time is constant or logarithmic). I believe 
my original algorithm achieves that.


We come back to the fact that I assumed that R environments, implemented 
as hash tables, would give me that fast lookup. But on my systems, their 
efficiency (for insert and lookup) seems to degrade fast at several 
million entries. Certainly much faster than either O(1) or O(log(n)). I 
believe this does not have to do with disk access time. For example, I 
tested this on my desktop computer, running a pure hash insert loop. I 
observe 100% processor use but no disk access, as the size of the hash 
table approaches millions of entries.
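
A pure hash-insert loop of the kind described above could be timed along the
following lines. This is only a sketch: the function name time_env_inserts,
the key format and the small sizes are assumptions chosen so that it runs
quickly, and it makes no claim about what the timings will look like at
millions of entries.

```r
# Time n insertions of distinct string keys into a hashed environment
time_env_inserts <- function(n) {
  e <- new.env(hash = TRUE)
  keys <- sprintf("key%09d", seq_len(n))
  elapsed <- system.time(
    for (k in keys) assign(k, TRUE, envir = e)
  )[["elapsed"]]
  list(n = n, elapsed = elapsed, stored = length(ls(e)))
}

# Doubling n each time; roughly constant elapsed/n would indicate O(1) inserts
res <- lapply(c(1e4, 2e4, 4e4), time_env_inserts)
do.call(rbind, lapply(res, as.data.frame))
```

Increasing the sizes step by step and watching elapsed/n is one way to see
where, on a given system, the per-insert cost starts to grow.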


I have tested this on two systems, but have not gone into the
implementation of the hashed environments to look at this in detail. If
others have the same (or different) experiences with using hashed
environments with millions of entries, it would be very useful to know.


Barring a solution to the hashed environment speed, it seems the way to 
speed this algorithm up (within the confines of R) would be to move away 
from hash tables and towards a numerically indexed array.


Thanks again for all of the help,
Magnus

On 11/4/2013 8:20 PM, Thomas Lumley wrote:

On Sat, Nov 2, 2013 at 11:12 AM, Martin Morgan mtmor...@fhcrc.org wrote:

On 11/01/2013 08:22 AM, Magnus Thor Torfason wrote:

Sure,

I was attempting to be concise and boiling it down to what I saw as the root
issue, but you are right, I could have taken it a step further. So here goes.

I have a set of around 20M string pairs. A given string (say, A) can either
be equivalent to another string (B) or not. If A and B occur together in the
same pair, they are equivalent. But equivalence is transitive, so if A and B
occur together in one pair, and A and C occur together in another pair, then
A and C are also equivalent. I need a way to quickly determine if any two
strings from my data set are equivalent or not.


Do you mean that if A,B occur together and B,C occur together, then
A,B and A,C are equivalent?

Here's a function that returns a unique identifier (not well
tested!), allowing for transitive relations but not circularity.

  uid <- function(x, y)
  {
      i <- seq_along(x)                   # global index
      xy <- paste0(x, y)                  # make unique identifiers
      idx <- match(xy, xy)

      repeat {
          ## transitive look-up
          y_idx <- match(y[idx], x)       # look up 'y' in 'x'
          keep <- !is.na(y_idx)
          if (!any(keep))                 # no transitive relations, done!
              break
          x[idx[keep]] <- x[y_idx[keep]]
          y[idx[keep]] <- y[y_idx[keep]]

          ## create new index of values
          xy <- paste0(x, y)
          idx <- match(xy, xy)
      }
      idx
  }

Values with the same index are identical. Some tests:

  x <- c(1, 2, 3, 4)
  y <- c(2, 3, 5, 6)
  uid(x, y)
  [1] 1 1 1 4
  i <- sample(x); uid(x[i], y[i])
  [1] 1 1 3 1
  uid(as.character(x), as.character(y))  ## character() ok
  [1] 1 1 1 4
  uid(1:10, 1 + 1:10)
  [1] 1 1 1 1 1 1 1 1 1 1
  uid(integer(), integer())
  integer(0)
  x <- c(1, 2, 3)
  y <- c(2, 3, 1)
  uid(x, y)  ## circular!
    C-c C-c

I think this will scale well enough, but the worst-case scenario can
be made to be log(longest chain) and copying can be reduced by using
an index i and subsetting the original vector on each iteration. I
think you could test for circularity by checking that the updated x
are not a permutation of the kept x, all(x[y_idx[keep]] %in% x[keep])

Martin



This problem (union-find) is discussed in Chapter 1 of Sedgewick's
Algorithms.  There's an algorithm given that takes linear time to
build the structure, worst-case logarithmic time to query, and
effectively constant average time to query (inverse-Ackermann amortized
complexity).

-thomas

--
Thomas Lumley
Professor of Biostatistics
University of Auckland
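
For readers who want to see the union-find idea mentioned above in R, here is
a minimal sketch using integer-indexed vectors rather than environments. It is
not the code from Sedgewick's book and not anyone's code from this thread; the
function name equiv_classes and its arguments are illustrative. It is written
for clarity and has not been tuned or tested at the 20M-pair scale discussed
here.

```r
# Union-find (disjoint sets) with path compression and union by size.
# a[k] and b[k] form the k-th pair; the result maps every distinct value
# to the integer id of its equivalence class.
equiv_classes <- function(a, b) {
  stopifnot(length(a) == length(b))
  keys <- unique(c(a, b))
  ia <- match(a, keys)              # map values to integer ids once
  ib <- match(b, keys)
  parent <- seq_along(keys)         # every element starts as its own root
  size <- rep.int(1L, length(keys))

  find <- function(i) {
    root <- i
    while (parent[root] != root) root <- parent[root]
    while (parent[i] != root) {     # path compression
      nxt <- parent[i]
      parent[i] <<- root
      i <- nxt
    }
    root
  }

  for (k in seq_along(ia)) {
    ra <- find(ia[k])
    rb <- find(ib[k])
    if (ra != rb) {                 # union by size: attach smaller tree
      if (size[ra] < size[rb]) { tmp <- ra; ra <- rb; rb <- tmp }
      parent[rb] <- ra
      size[ra] <- size[ra] + size[rb]
    }
  }
  cls <- vapply(seq_along(keys), find, integer(1))
  setNames(cls, keys)
}

cls <- equiv_classes(c("A", "A", "D"), c("B", "C", "E"))
cls[["B"]] == cls[["C"]]   # B and C are linked through A
cls[["A"]] == cls[["D"]]   # A and D are in different classes

# Circular pairs are not a problem for this approach
circ <- equiv_classes(c(1, 2, 3), c(2, 3, 1))
```

Two strings are equivalent exactly when they map to the same class id, so a
query after the build step is a pair of vector look-ups.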



[R] Remove a column of a matrix with unnamed column header

2013-11-08 Thread Kuma Raj

I have a matrix named test which I want to convert to a data frame. When I
use the command test2 <- as.data.frame(test) it executes without a problem.
But when I try to browse the content I receive the error message "Error in
data.frame(outcome = c("cardva", "respir", "cereb", "neoplasm",  :
 duplicate row.names: Estimate". The problem is clearly due to duplicated
row names, but I am unable to remove this column. I need help on how to
remove this specific column, which essentially has no column header name.
dput of the matrix is here:

 dput(test)
structure(c("cardva", "respir", "cereb", "neoplasm", "ami", "ischem",
"heartf", "pneumo", "copd", "asthma", "dysrhy", "diabet",
"0.00259492159959046",
"0.00979775441709427", "0.00103414632535868", "0.00486468139227382",
"0.0164825543879707", "0.0116647168053943", "-0.0012137908515233",
"0.00730433232907741", "0.00355583994565985", "0.000712387285735019",
"-0.00103763671307935", "0.00981500221106926", "0.00325476724733837",
"0.0049232113728293", "0.00520118026087645", "0.00386848394426742",
"0.00688121694253705", "0.00585772614064902", "0.00564983058883797",
"0.0061328202328586", "0.0108212194251692", "0.0173804438930357",
"0.00867931407250442", "0.0106638104533486", "0.425323120845664",
"0.0466180768654915", "0.842402292743715", "0.208609687427072",
"0.0166336682608816", "0.0464833846710956", "0.8299010611324",
"0.233685747699204", "0.742469001175026", "0.967306766450795",
"0.904840885401235", "0.357394700741248"), .Dim = c(12L, 4L), .Dimnames =
list(
c("Estimate", "Estimate", "Estimate", "Estimate", "Estimate",
"Estimate", "Estimate", "Estimate", "Estimate", "Estimate",
"Estimate", "Estimate"), c("outcome", "beta", "se", "pval"
)))

 test2 <- as.data.frame(test)
 test2
Error in data.frame(outcome = c("cardva", "respir", "cereb", "neoplasm",  :
  duplicate row.names: Estimate


[R] Crime hotspot maps (kernel density)

2013-11-08 Thread David Studer

Hi everybody,

does any of you know how to create a (crime) hotspot map using R?
Are there any packages, or do you know of any resources?

It should be something like this:
http://www.caliper.com/Maptitude/Crime/MotorVehicleTheft2.png
(but it doesn't necessarily have to be a map)

Many thanks, David
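
One possible starting point for the kernel-density picture asked about above
is kde2d() from the MASS package, which is distributed with standard R
installations. This is a sketch only: the coordinates below are simulated and
purely illustrative (they are not crime data), and the grid size and limits
are arbitrary assumptions.

```r
library(MASS)   # provides kde2d()

set.seed(1)
# Simulated incident locations: two clusters plus uniform background noise
x <- c(rnorm(200, 2, 0.3), rnorm(150, 5, 0.5), runif(100, 0, 8))
y <- c(rnorm(200, 3, 0.3), rnorm(150, 6, 0.5), runif(100, 0, 8))

# Two-dimensional kernel density estimate on a 100 x 100 grid
dens <- kde2d(x, y, n = 100, lims = c(0, 8, 0, 8))

# Heat-map style display with contour lines and the raw points on top
image(dens, col = heat.colors(32), xlab = "x", ylab = "y")
contour(dens, add = TRUE)
points(x, y, pch = ".", col = "grey30")
```

With real data, x and y would be projected incident coordinates, and the
density surface could be overlaid on a base map instead of plotted on its own.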


Re: [R] prod and F90 product

2013-11-08 Thread Jim Holtman

sounds like FAQ 7.31
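
FAQ 7.31 concerns the limits of floating-point arithmetic. One possible
reading in this case (an assumption, not confirmed in the thread) is that
intermediate products underflow: the reported result, 1.069678e-307, sits near
the lower edge of the normalized double range. The sketch below uses a made-up
vector x, not the poster's attachment, to show how a strict double-precision
running product can reach zero even though the exact product is representable.

```r
# Illustrative values only: the exact product is 1e-200
x <- c(1e-200, 1e-200, 1e200)

# Running product where every intermediate result is stored as a double:
# 1e-200 * 1e-200 underflows to 0, and 0 * 1e200 stays 0
running <- Reduce(`*`, x)
running

# Working on the log scale avoids the intermediate underflow
via_logs <- exp(sum(log(x)))
via_logs

# prod() may differ between platforms, because R can accumulate the product
# in extended precision where the platform provides it
prod(x)
```

If this is what happens with the attached vector, then the difference between
prod() and the Fortran routine would come from the precision used for the
intermediate products, not from the smallest representable number as such.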

Sent from my iPad

On Nov 7, 2013, at 18:39, Filippo ingf...@gmail.com wrote:

 Hi,
 I'm seeing strange differences between the R function prod and the F90
 function product.
 Processing the same vector C (see attachment), I get 2 different results:
 prod(C) = 1.069678e-307
 testProd(C) = 0

 where testProd is the following wrapper function:

 testProd <- function(x) {
     return(.Fortran('testProd', as.double(x), as.double(0), as.double(0),
                     as.integer(length(x))))
 }

 subroutine testProd(x, p, q, n)
     implicit none
     integer, intent (in) :: n
     double precision, intent (in) :: x(n)
     double precision, intent (out) :: p
     double precision, intent (out) :: q
     integer :: i

     p = product(x)
     q = 1
     do i = 1, n
         q = q*x(i)
     end do
 end subroutine testProd

 I checked the lowest representable number and it seems to be the same for
 both R and F90.
 Can anyone help me understand this behaviour?
 Thank you in advance
 Regards,
 Filippo
 
 C

Re: [R] Remove a column of a matrix with unnamed column header

2013-11-08 Thread Berend Hasselman



rownames(test) <- NULL

Berend


Re: [R] Remove a column of a matrix with unnamed column header

2013-11-08 Thread Kuma Raj

Berend, thanks for that. Conversion of test to a data frame resulted in
factors. Is there a possibility to selectively convert to numeric? I have
tried this code and it has not produced the intended result:
test[, c(2:4)] <- sapply(test[, c(2:4)], as.numeric)




Re: [R] Error running MuMIn dredge function using glmer models

2013-11-08 Thread Kamil Bartoń


There is indeed a glitch in 'dredge' that prevents you from seeing the
actual error message. The cause is explained in ?dredge, in the section
"Missing values". (The glitch has now been corrected in 1.9.14, on R-forge.)

kamil
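
The global model quoted below is fitted with na.action=na.omit. To my
understanding (an assumption about what the "Missing values" section of
?dredge asks for, so please check the help page), the global model should be
fitted with na.action = na.fail, so that missing values cause an error up
front and do not silently change the data set between sub-models. The sketch
below only illustrates the difference between the two settings with base R's
lm(); the toy data frame d is made up and the example does not call MuMIn.

```r
# Toy data with one missing response value
d <- data.frame(y = c(1.2, 2.3, NA, 4.1, 5.0), x = 1:5)

# na.omit silently drops the incomplete row
fit_omit <- lm(y ~ x, data = d, na.action = na.omit)
nobs(fit_omit)

# na.fail refuses to fit when any value is missing
res <- try(lm(y ~ x, data = d, na.action = na.fail), silent = TRUE)
inherits(res, "try-error")

# Setting the option globally before fitting the global model
old <- options(na.action = "na.fail")
options(old)   # restore the previous setting
```

With na.fail in place, rows with missing values have to be removed explicitly
(for example with complete.cases()) before the global model is fitted.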



On 2013-11-08 11:00, r-help-requ...@r-project.org wrote:

--

Message: 26
Date: Thu, 7 Nov 2013 11:55:50 -0500
From: Martin Turcotte mart.turco...@gmail.com
To: r-help@r-project.org
Subject: [R] Error running MuMIn dredge function using glmer models
Message-ID: 1e4f5497-ccb4-4e8b-a23a-8aa5e1136...@gmail.com
Content-Type: text/plain

Dear list,
I am trying to use MuMIn to compare all possible mixed models using the dredge
function on binomial data, but I am getting an error message that I cannot
decode. This error only occurs when I use glmer. When I use an lmer analysis on
a different response variable, everything works great.

Example using a simplified glmer model
global model:
mod <- glmer(cbind(st$X2.REP.LIVE, st$X2.REP.DEAD) ~ DOMESTICATION*GLUC +
(1|PAIR), data=st, na.action=na.omit , family=binomial)

The response variables are the numbers of surviving and dead insects (successes
and failures).
DOMESTICATION is a 2 level factor.
GLUC is a continuous variable.
PAIR is coded as a factor or character (both ways fail).

This model functions correctly but when I try it with dredge() I get an error.

g <- dredge(mod, beta=F, evaluate=F, rank='AIC')
Error in sprintf(gettext(fmt, domain = domain), ...) :
   invalid type of argument[1]: 'symbol'

When I try with another rank the same thing happens:
chat <- deviance(mod)/58
g <- dredge(mod, beta=F, evaluate=F, rank='QAIC', chat=chat)
Error in sprintf(gettext(fmt, domain = domain), ...) :
   invalid type of argument[1]: 'symbol'

Any suggestions would be greatly appreciated

thanks

Martin Turcotte, Ph. D.
mart.turco...@gmail.com







[R] how to derive true surface area from `computeContour3d' (misc3d package)

2013-11-08 Thread j. van den hoff

I want to compute the total surface area of an isosurface in 3D space as
approximated with `computeContour3d' by adding up the areas of the triangles
making up the contour surface.
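
The summation described above can be sketched in base R as follows. The sketch
assumes a three-column vertex matrix in which each group of three consecutive
rows is one triangle, which is the layout described in the manpage text quoted
later in this message. The function name triangle_area_total and the test
square sq are illustrative and are not part of misc3d.

```r
# Total area of a triangle mesh given as an (3k x 3) vertex matrix,
# where rows 1-3 form the first triangle, rows 4-6 the second, and so on
triangle_area_total <- function(tri) {
  stopifnot(is.matrix(tri), ncol(tri) == 3,
            nrow(tri) > 0, nrow(tri) %% 3 == 0)
  i  <- seq(1, nrow(tri), by = 3)
  a  <- tri[i, , drop = FALSE]
  e1 <- tri[i + 1, , drop = FALSE] - a    # first edge vector of each triangle
  e2 <- tri[i + 2, , drop = FALSE] - a    # second edge vector
  # Row-wise cross product e1 x e2; its length is twice the triangle area
  cp <- cbind(e1[, 2] * e2[, 3] - e1[, 3] * e2[, 2],
              e1[, 3] * e2[, 1] - e1[, 1] * e2[, 3],
              e1[, 1] * e2[, 2] - e1[, 2] * e2[, 1])
  sum(sqrt(rowSums(cp^2))) / 2
}

# Check on a unit square in the plane z = 0, split into two triangles
sq <- rbind(c(0, 0, 0), c(1, 0, 0), c(1, 1, 0),
            c(0, 0, 0), c(1, 1, 0), c(0, 1, 0))
triangle_area_total(sq)   # 1
```

Applied to the matrix returned by computeContour3d(), the result is in squared
units of whatever coordinate frame the vertices are expressed in.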

problem:
the vertex matrix returned by `computeContour3d' in general seems to provide
the vertices not in the frame of reference in which the original data are
given but apparently rather after some linear transformation (scaling +
translation (+ rotation?)) -- or I am having some fundamental misconception
of what is going on.

I'm interested in the simplest case where the input data are provided as a
3D array on an equidistant grid (i.e. leaving the x,y,z arguments at their
defaults).


e.g. (slight modification of `example(computeContour3d)'):

library(misc3d)
x <- seq(-1,1,len=11)
g <- expand.grid(x = x, y = x, z = x)
v <- array(g$x^4 + g$y^4 + g$z^4, rep(length(x),3))
con <- computeContour3d(v, max(v), 1)
drawScene(makeTriangles(con))

this is (approximately) a cube with edge length 10 (taking the grid spacing
as the unit of length), so the expected (approximate) surface area is 600.

indeed,

apply(con, 2, range) yields

     [,1] [,2] [,3]
[1,]    1    1    1
[2,]   11   11   11

which might be interpreted as providing the vertices in coordinates
where the grid spacing is used as unit of length. however,
I get an area of only about 430 instead of approx. 600, which is already a
much larger deviation from the ideal cube surface than I would have expected
given the small amount of smoothing at the box edges and corners (but I have
to double-check whether my triangle area computation is right, although I
believe it is).


choosing instead

x <- seq(-2,2,len=50)

however, the corresponding range of `con' is

       [,1]   [,2]   [,3]
[1,] 13.274 13.274 13.274
[2,] 37.726 37.726 37.726

which cannot be the grid coordinates (which should be in the range [1,50]).
adopting this interpretation nevertheless (vertices are given in grid
coordinates), the sum of the triangle areas only amounts to about 2600
instead of the expected approx. 49^2*6 = 14406.


question 1:
am I making a stupid error (if so which one...)?

if not so:

question 2:
is there a linear transformation from the original grid coordinates (with
range from 1 to dim(v)[n], n=1:3) involved which yields the reported vertex
coordinates?

question 3:
could someone please explain where to find this information (even if
hidden in the source code of the package): how to convert the vertex
coordinates as delivered by `computeContour3d' to 'grid coordinates' (or
true world coordinates in general, if the x,y,z arguments are specified,
too)?

for the wishlist: it would of course be nice if `computeContour3d' would
indeed return the total surface area itself, e.g. as an attribute of the
triangles matrix.

for the devs: there is a typo in the manpage of this function:
Value:

 A matrix of three columns representing the triangles making up the
 contour surface. Each row represents a vertex and goups of three
 rows represent a triangle.

(should be `groups' instead of `goups')

--


[R] how to derive true surface area from `computeContour3d' (misc3d package) -- follow up

2013-11-08 Thread j. van den hoff

regarding my previous mail for this topic, I have in the meantime
identified my misconception.

actually, `computeContour3d' returns the vertices just fine in the correct
coordinate frame.

the misconception was caused basically by assuming that the `level'
argument was a fractional threshold relative to the maximum of the array.
so I believed that the rendered cube actually is the outer surface of the
defined object in the example provided in the manpage.

I now understand it's an absolute level and `example(computeContour3d)'
consequently displays some interior isocontour. this explains all my
apparent errors.

I believe the manpage would benefit from a slight clarification that
`level' actually is an absolute, not a relative/fractional threshold.
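
A quick base-R check of this reading, without calling misc3d: the example
array has its maximum at the corners of the grid, while the surface at the
absolute level 1 crosses each coordinate axis at -1 and +1. Converting those
crossings to grid indices for the 50-point grid on [-2, 2] lands close to the
vertex range reported in the previous mail. The helper to_grid is
illustrative; it is not a misc3d function.

```r
# World coordinate -> grid index for an equidistant grid of n points on [lo, hi]
to_grid <- function(w, lo, hi, n) 1 + (w - lo) / (hi - lo) * (n - 1)

# The array from the example: v = x^4 + y^4 + z^4 on an 11^3 grid over [-1, 1]
x <- seq(-1, 1, length.out = 11)
g <- expand.grid(x = x, y = x, z = x)
v <- array(g$x^4 + g$y^4 + g$z^4, rep(length(x), 3))
max(v)   # 3, reached at the corners, so level 1 is not the outer surface

# For the 50-point grid on [-2, 2], the level-1 surface spans [-1, 1]
# along each axis, which corresponds to these grid indices
to_grid(c(-1, 1), lo = -2, hi = 2, n = 50)   # 13.25 37.75
```

The reported range of 13.274 to 37.726 is close to these values, which is
consistent with the vertices being given in grid coordinates of an interior
isosurface.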

apologies for the noise.

j.

ps: it of course would still be nice, if the surface area (or a vector
containing the individual triangle areas) were returned to the caller as
well ...


Re: [R] how to derive true surface area from `computeContour3d' (misc3d package) -- follow up

2013-11-08 Thread Barry Rowlingson

On Fri, Nov 8, 2013 at 1:01 PM, j. van den hoff
veedeeh...@googlemail.com wrote:


 ps: it of course would still be nice, if the surface area (or a vector
 containing the individual triangle areas)
 were returned to the caller as well ...

 Does the 'surfaceArea' function in the sp package do what you want?
It's Edzer's integration of an R function that I wrote that calls some
C code that someone else wrote that implements an algorithm from 2004.

 You just need to coerce your grid data into the right form.

 Barry


Re: [R] how to derive true surface area from `computeContour3d' (misc3d package) -- follow up

2013-11-08 Thread j. van den hoff

On Fri, 08 Nov 2013 15:32:00 +0100, Barry Rowlingson  
b.rowling...@lancaster.ac.uk wrote:



On Fri, Nov 8, 2013 at 1:01 PM, j. van den hoff
veedeeh...@googlemail.com wrote:



ps: it of course would still be nice, if the surface area (or a vector
containing the individual triangle areas)
were returned to the caller as well ...


 Does the 'surfaceArea' function in the sp package do what you want?
It's Edzer's integration of an R function that I wrote that calls some
C code that someone else wrote that implements an algorithm from 2004.

 You just need to coerce your grid data into the right form.


not quite, I believe: I need to compute the area of a (closed) iso-surface
of a 3D object defined by samples on a discrete 3D grid.
if I understand correctly from a quick look at `sp', `surfaceArea' instead
computes the surface integral of some function z = f(x,y).

but thanks for the pointer anyway, this might still be useful in a
different context.


joerg




 Barry



--


Re: [R] problem with interaction in lmer even after creating an interaction variable

2013-11-08 Thread Ben Bolker

a_lampei anna.lampei-bucharova at uni-tuebingen.de writes:

 
 Thank you very much,
 First, sorry for posting on wrong mailing list, I did not know that 
 there exists a special one for lmer.
 Yes, there are collinearities in the data.
 Still, I would like to have the variables in one model to compare 
 explained variability. Is there some option, or it is simply impossible?
 Thank you, Anna

  The development version of lme4 has an (experimental) feature
that automatically removes collinear columns of the model matrix;
you could try that.
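
Independently of lme4, collinear columns can be spotted beforehand in the fixed-effects model matrix. The sketch below uses base R's qr() on a toy data set; the variables a, b and ab are invented, and this only illustrates the idea, it is not a description of what the lme4 development version does internally:

```r
set.seed(1)
d <- data.frame(a = rnorm(20), b = rnorm(20))
d$ab <- d$a + d$b                      # exactly collinear with a and b

X <- model.matrix(~ a + b + ab, data = d)
qrX <- qr(X)                           # pivoted QR decomposition
rank_X <- qrX$rank                     # numerical rank of the model matrix

# Columns pivoted past the rank are the ones that add no new information.
dropped <- colnames(X)[qrX$pivot[seq_len(ncol(X)) > rank_X]]
```

Refitting without the columns listed in dropped gives a full-rank design, at the cost of not being able to separate the effects of the collinear variables.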

  Further discussion on r-sig-mixed-models ...

  Ben Bolker


Re: [R] Remove a column of a matrix with unnamed column header

2013-11-08 Thread arun

Hi,
Try:
test2 <- as.data.frame(test, stringsAsFactors=FALSE)
test2[,c(2:4)] <- lapply(test2[,c(2:4)], as.numeric)
A.K.
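
Put together with Berend's rownames() suggestion, the whole conversion can be sketched on a cut-down stand-in for the matrix (three rows with shortened numbers, invented here for brevity). Assigning the converted values back into the character matrix itself cannot work, because a matrix holds a single type and the numbers are coerced straight back to character; the conversion has to happen on data frame columns:

```r
# Small stand-in for the character matrix in the original post.
test <- matrix(c("cardva", "respir", "cereb",
                 "0.0026", "0.0098", "0.0010",
                 "0.0033", "0.0049", "0.0052",
                 "0.43",   "0.05",   "0.84"),
               nrow = 3,
               dimnames = list(rep("Estimate", 3),
                               c("outcome", "beta", "se", "pval")))

rownames(test) <- NULL                                  # drop duplicated row names
test2 <- as.data.frame(test, stringsAsFactors = FALSE)  # keep strings as character
test2[2:4] <- lapply(test2[2:4], as.numeric)            # convert numeric columns only
```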




On Friday, November 8, 2013 6:24 AM, Kuma Raj pollar...@gmail.com wrote:
Berend, thanks for that. Converting test to a data frame resulted in
factors. Is it possible to selectively convert columns to numeric? I
tried this code, but it did not produce the intended result:
test[, c(2:4)] <- sapply(test[, c(2:4)], as.numeric)



On 8 November 2013 11:31, Berend Hasselman b...@xs4all.nl wrote:


 On 08-11-2013, at 10:40, Kuma Raj pollar...@gmail.com wrote:

  I have a matrix named test which I want to convert to a data frame. When I
  use the command test2 <- as.data.frame(test) it is executed without a
  problem. But when I want to browse the content I receive the error message
  "Error in data.frame(outcome = c("cardva", "respir", "cereb", "neoplasm", :
  duplicate row.names: Estimate". The problem is clearly due to the
  duplicated row name, but I am unable to remove this column. I need help on
  how to remove this specific column that has essentially no column header
  name. dput of the matrix is here:

  dput(test)
  structure(c("cardva", "respir", "cereb", "neoplasm", "ami", "ischem",
  "heartf", "pneumo", "copd", "asthma", "dysrhy", "diabet",
  "0.00259492159959046",
  "0.00979775441709427", "0.00103414632535868", "0.00486468139227382",
  "0.0164825543879707", "0.0116647168053943", "-0.0012137908515233",
  "0.00730433232907741", "0.00355583994565985", "0.000712387285735019",
  "-0.00103763671307935", "0.00981500221106926", "0.00325476724733837",
  "0.0049232113728293", "0.00520118026087645", "0.00386848394426742",
  "0.00688121694253705", "0.00585772614064902", "0.00564983058883797",
  "0.0061328202328586", "0.0108212194251692", "0.0173804438930357",
  "0.00867931407250442", "0.0106638104533486", "0.425323120845664",
  "0.0466180768654915", "0.842402292743715", "0.208609687427072",
  "0.0166336682608816", "0.0464833846710956", "0.8299010611324",
  "0.233685747699204", "0.742469001175026", "0.967306766450795",
  "0.904840885401235", "0.357394700741248"), .Dim = c(12L, 4L), .Dimnames =
  list(
     c("Estimate", "Estimate", "Estimate", "Estimate", "Estimate",
     "Estimate", "Estimate", "Estimate", "Estimate", "Estimate",
     "Estimate", "Estimate"), c("outcome", "beta", "se", "pval"
     )))

  test2 <- as.data.frame(test)
  test2
  Error in data.frame(outcome = c("cardva", "respir", "cereb", "neoplasm", :
   duplicate row.names: Estimate

 rownames(test) <- NULL

 Berend




[R] SNPRelate: Plink conversion

2013-11-08 Thread Danica Fabrigar

Hi,

Following my earlier posts about having problems performing a PCA, I have
worked out what the problem is. The problem lies within the PLINK to gds
conversion. 

It seems as though the SNPs are imported as samples and in turn, the
samples are recognised as SNPs:

snpgdsSummary(chr2L)
Some values of snp.position are invalid (should be > 0)!
Some values of snp.chromosome are invalid (should be finite and >= 1)!
Some of snp.allele are not standard! E.g, 2/-9
The file name: chr2L
The total number of samples: 2638506
The total number of SNPs: 67
SNP genotypes are stored in SNP-major mode.
The number of valid samples: 2638506
The number of valid SNPs: 0


Anyone have any ideas on how to fix this?

Thanks,
Danica

Re: [R] SNPRelate: Plink conversion

2013-11-08 Thread Bert Gunter

Doesn't this belong on Bioconductor rather than here?

-- Bert




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

(650) 467-7374


Re: [R] Error running MuMIn dredge function using glmer models

2013-11-08 Thread Martin Turcotte

Removing na.action=na.omit solved the problem.

Thanks for the help, and thanks to Dr. Bartoń for making such a useful package.


Mart


Martin Turcotte, Ph. D.
mart.turco...@gmail.com



Re: [R] Uploading Google Spreadsheet data into R

2013-11-08 Thread David Carlson

Stripping down to the bare essentials seems to get it. In
particular, making the query just "select *" instead of "select *
where B!=''" works. You don't need the processing that the more
complicated Guardian web page requires. After loading the RCurl
package and creating the gsqAPI function:

> tmp=gsqAPI("0AkvLBhzbLcz5dHljNGhUdmNJZ0dOdGJLTVRjTkRhTkE","select *", 0)
> str(tmp)
'data.frame':   9 obs. of  3 variables:
 $ COL1: chr  "25/10/2013" "25/10/2013" "31/10/2013" "31/10/2013" ...
 $ COL2: int  50 10 16 18 25 34 56 47 50
 $ COL3: chr  "TEXT" "TEXT TEXT" "TEXT" "TEXT" ...
> tmp
        COL1 COL2      COL3
1 25/10/2013   50      TEXT
2 25/10/2013   10 TEXT TEXT
3 31/10/2013   16      TEXT
4 31/10/2013   18      TEXT
5 31/10/2013   25 TEXT TEXT
6 31/10/2013   34      TEXT
7 31/10/2013   56      TEXT
8 31/10/2013   47      TEXT
9 31/10/2013   50      TEXT

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
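
The two things at play here can be sketched without any network access: how the request URL is assembled from the key, query and sheet id, and why a query that matches no rows leads to the reported error. This is only a sketch; build_query_url is a made-up helper, "KEY" is a placeholder, and base R's URLencode() is used as a stand-in for RCurl's curlEscape():

```r
# Assemble the query URL from its parts, joined with "&" separators.
build_query_url <- function(key, query, gid = 0) {
  paste0("https://spreadsheets.google.com/tq?",
         "tqx=out:csv",
         "&tq=",  utils::URLencode(query, reserved = TRUE),
         "&key=", key,
         "&gid=", gid)
}

url_all      <- build_query_url("KEY", "select *")
url_filtered <- build_query_url("KEY", "select * where B!=''")

# If the filtered query returns no rows, adding a column with a single
# value fails, because a one-element replacement cannot fill zero rows.
empty  <- data.frame(COL1 = character(0), stringsAsFactors = FALSE)
result <- try(empty$subject <- "COL1", silent = TRUE)
```

Checking nrow() on the returned data frame before adding columns would make this kind of failure easier to diagnose.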

-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of Luca Meyer
Sent: Friday, November 8, 2013 1:33 AM
To: r-help@r-project.org
Subject: [R] Uploading Google Spreadsheet data into R

Hello,

I am trying to load data I have in a Google Spreadsheet into R to
perform some analysis. I regularly update the data and need to perform
the analysis in the quickest possible way, i.e. without needing to publish
the data, so I was wondering how to make this piece of code work (source:
http://www.r-bloggers.com/datagrabbing-commonly-formatted-sheets-from-a-google-spreadsheet-guardian-2014-university-guide-data/)
with my dataset (see
https://docs.google.com/spreadsheet/ccc?key=0AkvLBhzbLcz5dHljNGhUdmNJZ0dOdGJLTVRjTkRhTkE#gid=0):

library(RCurl)
# Fetch one sheet (gid) of a spreadsheet (key) as CSV, filtered by query
gsqAPI = function(key,query,gid=0){
  tmp=getURL( paste( sep="",'https://spreadsheets.google.com/tq?',
                     'tqx=out:csv','&tq=', curlEscape(query),
                     '&key=', key, '&gid=', gid),
              ssl.verifypeer = FALSE )
  return( read.csv( textConnection( tmp ),  stringsAsFactors=F ) )
}
# Post-processing specific to the Guardian university guide sheets
handler=function(key,i){
  tmp=gsqAPI(key,"select * where B!=''", i)
  subject=sub(".Rank",'',colnames(tmp)[1])
  colnames(tmp)[1]="Subject.Rank"
  tmp$subject=subject
  tmp
}
key='0AkvLBhzbLcz5dHljNGhUdmNJZ0dOdGJLTVRjTkRhTkE'
gdata=handler(key,0)

The code currently returns the following:

Error in `$<-.data.frame`(`*tmp*`, "subject", value = "COL1") :
  replacement has 1 row, data has 0

Thank you in advance,
Luca


Re: [R] Uploading Google Spreadsheet data into R

2013-11-08 Thread Luca Meyer

It does indeed.
Thank you David,
Luca



[R] select .txt from .txt in a directory

2013-11-08 Thread Zilefac Elvis

Hi,
I have 300 .txt files in a directory. Of these 300, I need just 100.
I have the names of the 100 .txt files, all of which are among the 300.
How can I extract only those 100 .txt files from the 300 .txt files?

E.g. given d1.txt, ds.txt, dx.txt, df.txt, ..., d300.txt, how can I select only
d1.txt and df.txt? Remember, I have 300 such files and want to extract 100 of
them, whose names are known.

Thanks for your great help.
Atem.

Re: [R] Nonnormal Residuals and GAMs

2013-11-08 Thread Robert Rigby

Hi Collin,

The GAMLSS package allows modelling of the response variable distribution
using either Exponential family or non-Exponential family distributions.
It also allows modelling of the scale parameter (and hence the dispersion
parameter for Exponential family distributions) using explanatory variables.
This can be important for selecting mean model terms and is particularly
important when interest lies in the variance and/or quantiles of the
response variable.

Robert Rigby


On 06/11/13 21:46, Collin Lynch wrote:
 Greetings, My question is more algorithmic than practical.  What I am
 trying to determine is, are the GAM algorithms used in the mgcv package
 affected by nonnormally-distributed residuals?

 As I understand the theory of linear models the Gauss-Markov theorem
 guarantees that least-squares regression is optimal over all unbiased
 estimators iff the data meet the conditions linearity, homoscedasticity,
 independence, and normally-distributed residuals.  Absent the last
 requirement it is optimal but only over unbiased linear estimators.

 What I am trying to determine is whether or not it is necessary to check
 for normally-distributed errors in a GAM from mgcv.  I know that the
 unsmoothed terms, if any, will be fitted by ordinary least-squares but I
 am unsure whether the default Penalized Iteratively Reweighted Least
 Squares method used in the package is also based upon this assumption or
 falls under any analogue to the Gauss-Markov Theorem.

 Thank you in advance for any help.

   Sincerely,
   Collin Lynch


Re: [R] select .txt from .txt in a directory

2013-11-08 Thread Sarah Goslee

How do you decide which ones you need?

Is there some pattern that lets you distinguish needing df.txt from
not needing ds.txt?

You say you have the names - how do you have them? In a text file?

What are you trying to do with the text files?

Sarah


-- 
Sarah Goslee
http://www.functionaldiversity.org


Re: [R] select .txt from .txt in a directory

2013-11-08 Thread Simon Zehnder

I do not understand the question. If you already know the names what is the 
problem to select the files by names? 

If you have the names but not inside of R you have to find a name pattern to 
avoid typing them in. Is there a pattern, e.g. da.txt, db.txt, dc.txt? 



Re: [R] select .txt from .txt in a directory

2013-11-08 Thread Bert Gunter

1. Please don't post in HTML (see posting guide).

2. What do you mean by "extract"?

3. Your question sounds very basic. Have you read "An Introduction to
R" or another online R tutorial? If not, please do so before posting
further. All of R's file input functions allow you to specify the
directory path and/or filename, so if I understand you correctly, it's
just a matter of giving them to the appropriate function in some sort
of loop, e.g. something like

alldat <- lapply(filenameList, function(x)InputFunction(x,...))

4. If you need something fancier than is described in the tutorials,
please consult the "R Data Import/Export" manual.
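
As a concrete version of the lapply() idea, here is a minimal self-contained sketch. It creates a throwaway directory with four tiny files standing in for the 300, and assumes the wanted names are already available as a character vector; the file contents, the vector wanted, and the choice of read.table() as the input function are all assumptions for illustration:

```r
# Create a temporary directory with a few small text files.
dir <- tempfile("txtdemo")
dir.create(dir)
for (nm in c("d1.txt", "ds.txt", "dx.txt", "df.txt")) {
  write.table(data.frame(id = 1:2, source = nm), file.path(dir, nm),
              row.names = FALSE)
}

# Names of the files that are actually wanted (hypothetical selection).
wanted <- c("d1.txt", "df.txt")

# Keep only the wanted names that exist in the directory, then read each one.
present <- intersect(wanted, list.files(dir, pattern = "\\.txt$"))
alldat  <- lapply(file.path(dir, present), read.table, header = TRUE)
names(alldat) <- present
```

The result is a named list with one data frame per selected file, so alldat[["df.txt"]] retrieves the contents of that file.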

-- Bert




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

(650) 467-7374


Re: [R] select .txt from .txt in a directory

2013-11-08 Thread Zilefac Elvis

All files are text files. They are in a folder on my computer.
Assume that I know the names of some of the files I want to select from the 300
.txt files.
How can I do this in R?
Atem.




Re: [R] select .txt from .txt in a directory

2013-11-08 Thread Simon Zehnder

If you want to type in the names by hand, you can simply use read.table to load 
them into R … I still don’t get the aim of your text file handling



[R] Adding Proxy information in 'R' application

2013-11-08 Thread Batista, Jose

R-Help Mailing List,

I'm currently working with a user who is trying to download & install
libraries for 'R' on her office PC. While using install.packages(packageName,
dependencies = TRUE) works without a problem on our home PCs, we use a proxy at
the firm, and therefore the application cannot go directly out of the
network on port 80. Is there a way to manually set proxy information within
the application so that it can reach the internet when we're trying to
download and install libraries (and necessary dependencies) from within the
application? I've gone through some of the options but there's nothing there
for it.

Regards,
José Emmanuel Batista



 
 


Re: [R] How to show the second abline ?

2013-11-08 Thread Domokos Péter

Hi,

I have the following script in R:

x=c(8.0,17.5,23.5,32.0,38.5,48.5,58.5,68.5)
y=c(267,246,290,294,302,301,301,298)

gap.plot(x,y,ylim=c(8,310),pch=8,cex=0.5,
xlab=c('Time'),ylab=c('uS'),
gap=c(30,240),gap.axis='y',
ytics=c(10,20,30,270,280,290,300))
abline(h=31,col='white',lwd=20)
axis.break(axis=2,31)
axis.break(axis=4,31)

abline(coef(lm(x~y)),col=1) # Why doesn't this line show?

Thanks for your help,
Péter

--
Domokos Péter

BSc student
Babes-Bolyai University
Biology and Geology Faculty
Hungarian Department of Biology and Ecology
Cluj Napoca
Romania


Re: [R] SNPRelate: Plink conversion

2013-11-08 Thread Danica Fabrigar

Hi Bert,
I thought it was suitable to post the question on the R mailing list first 
seeing as the problem/question is related to an R package.

Danica


Re: [R] How to show the second abline ?

2013-11-08 Thread Sarah Goslee

I have no idea where gap.plot() came from, so I can't reproduce this,
but you almost certainly need

y ~ x

in your formula.

abline(coef(lm(y ~ x)),col=1)
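
To see why the direction of the formula matters, the two sets of coefficients can be compared on the posted data using base R only (so without gap.plot(), whose handling of the axis gap is not checked here). abline() always interprets its coefficients as y = intercept + slope * x, so the coefficients of lm(x ~ y) describe a line that lies far outside the plotted y range. The object names below are chosen for illustration:

```r
x <- c(8.0, 17.5, 23.5, 32.0, 38.5, 48.5, 58.5, 68.5)
y <- c(267, 246, 290, 294, 302, 301, 301, 298)

fit_yx <- coef(lm(y ~ x))   # intercept and slope of y as a function of x
fit_xy <- coef(lm(x ~ y))   # intercept and slope of x as a function of y

# Heights at which abline() would draw each line over the observed x values.
line_yx <- fit_yx[1] + fit_yx[2] * x
line_xy <- fit_xy[1] + fit_xy[2] * x
```

Comparing range(line_xy) with range(y) shows that the second line never enters the plotting region, which is consistent with nothing being drawn.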

Sarah





-- 
Sarah Goslee
http://www.functionaldiversity.org


Re: [R] Fw: select .txt from .txt in a directory

2013-11-08 Thread Sarah Goslee

Hi,

I'm not particularly interested in opening unsolicited binary attachments.

Why don't you use dput() to provide part of your data to the R-help
list (copied on this email; emailing just me not being that useful).

You still haven't told us what you want to do with the named text
files - read them into R?

In general, you would read the file with the list of names into R,
then use a loop or a *apply construct to import each of those named
files. Based on what you've said, the fact that your desired list has
only 100 of the 300 total files is a red herring.
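
A sketch of that workflow, under loudly stated assumptions: the real files were not posted to the list, so the station ids, the inventory layout (ids in the third column) and the naming rule "<id>.txt" are all invented here, and everything is created in a temporary directory so the example runs on its own:

```r
# Set up a toy directory: three station files, of which two are wanted.
dir <- tempfile("stations")
dir.create(dir)
for (id in c("S001", "S002", "S003")) {
  writeLines(c("year temp", "2001 1.5", "2002 2.5"),
             file.path(dir, paste0(id, ".txt")))
}

# Toy inventory file; the third column holds the ids of the wanted stations.
inv_file <- file.path(dir, "inventory.csv")
write.csv(data.frame(name = c("A", "C"), region = c("N", "S"),
                     id = c("S001", "S003")),
          inv_file, row.names = FALSE)

# Read the list of names, build the file paths, and import each file.
inventory    <- read.csv(inv_file, stringsAsFactors = FALSE)
wanted_files <- file.path(dir, paste0(inventory[[3]], ".txt"))
found        <- file.exists(wanted_files)
station_data <- lapply(wanted_files[found], read.table, header = TRUE)
names(station_data) <- inventory[[3]][found]
```

Any wanted file that is absent from the directory is skipped rather than causing an error, and wanted_files[!found] lists such cases.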

Sarah


On Fri, Nov 8, 2013 at 1:30 PM, Zilefac Elvis zilefacel...@yahoo.com wrote:
 Hi Sarah
 Attached are my data files.
 Btemperature_Stations is my main file.
 Temperature inventory is my 'wanted' file and is a subset of
 Btemperature_Stations.
 Using column 3 in both files, select the files in Temperature inventory from
 Btemperature_Stations.
 The .zip file contains the .txt files which you will extract to a folder and
 do the selection in R.

 Thanks,
 Atem.

-- 
Sarah Goslee
http://www.functionaldiversity.org


[R] Revolutions blog: October roundup

2013-11-08 Thread David Smith

Revolution Analytics staff write about R every weekday at the Revolutions blog:
 http://blog.revolutionanalytics.com
and every month I post a summary of articles from the previous month
of particular interest to readers of r-help.

In case you missed them, here are some articles related to R from the
month of October:

Joe Rickert recounts the R presence at the Strata + Hadoop World
conference, including slides from the R and Hadoop tutorial:
http://bit.ly/1acPavl

Hadley Wickham's favorite tools, gadgets and software (including of
course R): http://bit.ly/1acPavm

Revolution R Enterprise 7 is announced, including R 3.0.2: http://bit.ly/1acP8DM

I was interviewed on camera by theCUBE about R, data science, and
Revolution R Enterprise 7: http://bit.ly/1acP8DO

Patrick Burns shares some good reasons for switching from spreadsheets
to R for data analysis: http://bit.ly/1acP8DN

R is used for several sports-related analyses at the New England
Symposium of Statistics in Sport: http://bit.ly/1acPavk

Some tips for using the .Rprofile file to customize your R session at
startup: http://bit.ly/1acP8DP

Quandl’s introduction to econometrics using R: http://bit.ly/1acPavr

Video replay of a recent webinar by DataSong on implementing
time-to-event models with Revolution R Enterprise:
http://bit.ly/1acP8DQ

Revolution R Enterprise is now integrated with Alteryx to provide a
drag-and-drop GUI workflow for R: http://bit.ly/1acPavt

An article in Forbes discusses using R from the Alteryx drag-and-drop
workflow interface: http://bit.ly/1acPavs

Joe Rickert reviews the sessions at the ACM 2013 Big Data Camp:
http://bit.ly/1acP8DR

The New York Times published an article on fantasy football analysis
with R: http://bit.ly/1acPaLG

The latest Rexer poll shows the use of R continues to skyrocket. It’s
the most popular data mining tool by a wide margin:
http://bit.ly/1acPaLH

Replays of two recent webinar presentations on using R on Hadoop,
presented by Cloudera http://bit.ly/1acPaLI and Hortonworks
http://bit.ly/1acP8U8 in conjunction with Revolution Analytics.

Tips and resources for using R for signal processing and time series
analysis: http://bit.ly/1acP8U7

The popular data-visualization software Tableau adds integration with
R: http://bit.ly/1acP8U5

An interactive web tool explains Simpson’s paradox: http://bit.ly/1acPaLJ

R-related presentations from the DataWeek 2013 conference, including
how an IBM division replaced SAS with R: http://bit.ly/1acP8U9

Some non-R stories in the past month included: remembering video
stores (http://bit.ly/1acP8Ub), some optical illusion trickery
(http://bit.ly/1acPaLK), better voting systems
(http://bit.ly/1acPaLL), a funny interpretation of air safety videos
(http://bit.ly/1acP8Ua) and a discussion on how to get ROT from
analytics (http://bit.ly/1acPaLM).

Meeting times for local R user groups (http://bit.ly/eC5YQe) can be
found on the updated R Community Calendar at: http://bit.ly/bb3naW

If you're looking for more articles about R, you can find summaries
from previous months at http://blog.revolutionanalytics.com/roundups/.
You can receive daily blog posts via email using services like
blogtrottr.com, or join the Revolution Analytics mailing list at
http://bit.ly/MH2I2Q to be alerted to new articles on a monthly basis.

As always, thanks for the comments and please keep sending suggestions
to me at da...@revolutionanalytics.com . Don't forget you can also
follow the blog using an RSS reader, or by following me on Twitter
(I'm @revodavid).

Cheers,
# David

-- 
David M Smith da...@revolutionanalytics.com
VP of Marketing, Revolution Analytics  http://blog.revolutionanalytics.com
Tel: +1 (650) 646-9523 (Seattle WA, USA)
Twitter: @revodavid
We're hiring! www.revolutionanalytics.com/careers


Re: [R] select .txt from .txt in a directory

2013-11-08 Thread Simon Zehnder

Elvis,

first, keep things on the list - so others can learn and comment. Second, as 
Sarah already commented: We do not like to open unsolicited binary attachments 
on the list. Sarah gives a good hint how to post data to the list.

What I would do so far is use the matching columns to get the names you need 
from BTemperature: 

temp_inv <- read.table("Temperature Inventory", … ) (here I would change the 
.xlsx to a .csv and use read.csv instead of read.table)
btemp <- read.table("BTemperature_Stations.txt", … ) (again, think about 
converting via Excel to .csv; it makes things much easier) 

Check ?read.table for options; you are going to need them.

Then match (keep the rows of btemp whose column 3 also appears in temp_inv)
mynames <- btemp[btemp[, 3] %in% temp_inv[, 3], 2]

Now you have the names of the stations and if your .txt files are named by the 
stations you can do something like:

for (name in mynames) {
  tmp.table <- read.table(paste("path/to/your/Homog_daily_min_temp/", name, 
                                ".txt", sep = ""), … )
  # … do things with the data
}
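
A minimal, self-contained sketch of the loop above. It is not the poster's actual data: the temporary folder, the file names (d1, ds, dx, df) and the two columns written into each file are assumptions made only so the example runs on its own. It builds the paths with file.path(), skips names that have no file, and reads the rest into a named list.

```r
# Hypothetical setup: a temporary folder holding a few small station files.
data_dir <- file.path(tempdir(), "Homog_daily_min_temp")
dir.create(data_dir, showWarnings = FALSE)
for (nm in c("d1", "ds", "dx", "df")) {
  write.table(data.frame(day = 1:3, tmin = c(-2.5, 0, 1.5)),
              file.path(data_dir, paste0(nm, ".txt")), row.names = FALSE)
}

# Names of the wanted stations (in practice taken from the inventory file).
mynames <- c("d1", "df", "missing_station")

# Build the full paths and keep only those that exist on disk.
paths <- file.path(data_dir, paste0(mynames, ".txt"))
found <- file.exists(paths)

# Read every existing file into a list of data frames named by station.
tables <- lapply(paths[found], read.table, header = TRUE)
names(tables) <- mynames[found]
```

Keeping the tables in a named list, rather than overwriting tmp.table on each pass, means each station can be picked out afterwards as tables[["d1"]].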



Best

Simon
 
On 08 Nov 2013, at 19:26, Zilefac Elvis zilefacel...@yahoo.com wrote:

 Hi Simon,
 Attached are my data files.
 Btemperature_Stations is my main file.
 Temperature inventory is my 'wanted' file and is a subset of 
 Btemperature_Stations.
 Using column 3 in both files, select the files in Temperature inventory from 
 Btemperature_Stations.
 The .zip file contains the .txt files which you will extract to a folder and 
 do the selection in R.
 
 Thanks,
 Atem.
  
 
 
 
 
 On Friday, November 8, 2013 11:54 AM, Simon Zehnder szehn...@uni-bonn.de 
 wrote:
 If you want to type in the names by hand, you can simply use read.table to 
 load them into R … I still don’t get the aim of your text file handling
 
 
 On 08 Nov 2013, at 18:51, Zilefac Elvis zilefacel...@yahoo.com wrote:
 
  All files are text files. They are found in a folder on my computer. 
  Assume that I know the names of some of the files I want to select from the 
  300 txt files.
  How can I do this in R.
  Atem.
  
  
  On Friday, November 8, 2013 11:44 AM, Simon Zehnder szehn...@uni-bonn.de 
  wrote:
  I do not understand the question. If you already know the names what is the 
  problem to select the files by names? 
  
  If you have the names but not inside of R you have to find a name pattern 
  to avoid typing them in. Is there a pattern, e.g. da.txt, db.txt, dc.txt? 
  
  
  On 08 Nov 2013, at 18:33, Zilefac Elvis zilefacel...@yahoo.com wrote:
  
   Hi,
   I have 300 .txt files in a directory. Out of this 300, I need just 100 of 
   the files.
   I have the names of the 100 .txt files which are also found in the 300 
   .txt files.
   How can I extract only the 100 .txt files from the 300 .txt files?
   
   e.g given d1.txt, ds.txt, dx.txt, df.txt...d300.txt, how can I select 
   only d1.txt and df.txt? Remember, I have 300 of such and want to extract 
   100 of them with names known.
   
   Thanks for your great help.
   Atem.
  
  
  
  
 
 
 BTemperature_Stations.txt, Tempearture inventory.xlsx,
 Homog_daily_min_temp.zip


[R] Date handling in R is hard to understand

2013-11-08 Thread Alemu Tadesse

Dear All,

I usually work with time series data. The data may come in AM/PM date
format or on a 24-hour basis. R cannot tell the two apart
automatically - at least for me. I have to tell R explicitly which
time format the data is in. It seems that Pandas knows how to handle dates
without being told the format. The problem arises when I try to shift the time
by a certain amount. Say, to add 3600 seconds to shift it forward, I have to
use something like:
Measured_data$Date <- as.POSIXct(as.character(Measured_data$Date),
tz="", format = "%m/%d/%Y %I:%M %p")+3600
or
Measured_data$Date <- as.POSIXct(as.character(Measured_data$Date),
tz="", format = "%m/%d/%Y %H:%M")+3600
depending on the format. The date
also gets MDT or MST attached, and so on. Merging two data frames with
dates in different formats may create a problem (I think). When I get
data from Excel it could be in any/random format and I need to convert
the date to one of the above formats to use it in R. Any TIPS for automatic
processing with no need to specify the date format?

Another problem I saw was that when using rbind to bind data frames, if
one column of one of the data frames is character data (say, for example,
"none" coming from MySQL), R doesn't know how to concatenate the numeric
column from the other data frame to it. I needed to change the numeric column to
character and later, after binding, re-convert it to
numeric. But this causes problems in an automated environment. Any
suggestions?

Thanks
Mihretu


[R] C50 Node Assignment

2013-11-08 Thread David Winsemius

In my role as a moderator I am attempting to bypass the automatic mail filters 
that are blocking this posting. Please reply to the list and to:
=
Kevin Shaney kevin.sha...@rosetta.com

C50 Node Assignment

I am using C50 to classify individuals into 5 groups / categories (factor 
variable).  The tree / set of rules has 10 rules for classification.  I am 
trying to extract the RULE for which each individual qualifies (a number 
between 1 and 10), and cannot figure out how to do so.  I can extract the 
predicted group and predicted group probability, but not the RULE to which an 
individual qualifies.  Please let me know if you can help!

Kevin
=


-- 
David Winsemius
Alameda, CA, USA


Re: [R] Adding Proxy information in 'R' application

2013-11-08 Thread Carina Salt

Hi Jose

I faced the same problem at my workplace too - the solution (at least for
us) was to insert the following function into the Rprofile.site file in the
etc folder in the R install folder - or, if the .First function already
exists, you could just insert the line beginning Sys.setenv(...) into that
function.  The bit in the speech marks needs to be a suitable web proxy
address and port, and the 'http_proxy_user=ask' bit is telling R to ask for
a username and password.

.First <- function() {
    Sys.setenv(http_proxy="http://webproxy:8080", http_proxy_user="ask")
}
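
For a quick check before editing Rprofile.site, the same variables can be set for the current session only. This is only a sketch: the address http://webproxy:8080 is the placeholder from the example above, not a real proxy, and must be replaced with your own.

```r
# Placeholder proxy address and port; substitute your site's values.
Sys.setenv(http_proxy = "http://webproxy:8080", http_proxy_user = "ask")

# Show what R will see when it next opens an http connection.
proxy_settings <- Sys.getenv(c("http_proxy", "http_proxy_user"))
print(proxy_settings)
```

Settings made this way are lost when the session ends, which is why the .First approach is the one to keep once the values are known to work.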

Hope this works for you!

Cheers,
Carina


On 8 November 2013 17:10, Batista, Jose jose.bati...@nb.com wrote:

 R-Help Mailing List,

 I'm currently working with a user who is actively trying to download 
 install libraries for 'R' on her office PC. While using
 install.packages(packageName, dependencies = TRUE) works without a
 problem on our home PCs, we use a proxy at the firm and therefore it
 doesn't let the application go directly out of the network on port 80.  Is
 there a way to manually set proxy information within the application so
 that it can, indeed, reach the internet when we're trying to download and
 install libraries (and necessary dependencies) from within the application?
  I've gone through some of the options but there's nothing there for it.

 Regards,
 José Emmanuel Batista





 



Re: [R] Date handling in R is hard to understand

2013-11-08 Thread Bert Gunter

Have a look at the lubridate package. It claims to try to make
dealing with dates easier.


-- Bert

On Fri, Nov 8, 2013 at 11:41 AM, Alemu Tadesse alemu.tade...@gmail.com wrote:
 Dear All,

 I usually work with time series data. The data may come in AM/PM date
 format or on 24 hour time basis. R can not recognize the two differences
 automatically - at least for me. I have to specifically tell R in which
 time format the data is. It seems that Pandas knows how to handle date
 without being told the format. The problem arises when I try to shift time
 by a certain time. Say adding 3600 to shift it forward, that case I have to
 use something like:
 Measured_data$Date <- as.POSIXct(as.character(Measured_data$Date),
 tz="", format = "%m/%d/%Y %I:%M %p")+3600
 or Measured_data$Date <- as.POSIXct(as.character(Measured_data$Date),
 tz="", format = "%m/%d/%Y %H:%M")+3600  depending on the format. The date
 also attaches MDT or MST and so on. When merging two data frames  with
 dates of different format that may create a problem (I think). When I get
 data from excel it could be in any/random format and I needed to customize
 the date to use in R in one of the above formats. Any TIPS - for automatic
 processing with no need to specifically tell the data format ?

 Another problem I saw was that when using r bind to bind data frames, if
 one column of one of the data frames is a character data (say for example
 none - coming from mysql) format R doesn't know how to concatenate numeric
 column from the other data frame to it. I needed to change the numeric to
 character and later after binding takes place I had to re-convert it to
 numeric. But, this causes problem in an automated environment. Any
 suggestion ?

 Thanks
 Mihretu




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

(650) 467-7374


Re: [R] Crime hotspot maps (kernel density)

2013-11-08 Thread Rolf Turner




It is not clear to me what you want/need to do, but it is possible that
the spatstat package (in particular the function density.ppp()) might
help you.

cheers,

Rolf Turner

On 11/08/13 23:10, David Studer wrote:

Hi everybody,

does any of you know how to create a (crime) hotspot map using R?
Are there any packages, or do you know of any resources?

It should be something like this:
http://www.caliper.com/Maptitude/Crime/MotorVehicleTheft2.png
(but it doesn't necessarily have to be a map)



Re: [R] How to show the second abline ?

2013-11-08 Thread Jim Lemon


On 11/09/2013 03:04 AM, Domokos Péter wrote:

Hi,

I have the next script in R:

x=c(8.0,17.5,23.5,32.0,38.5,48.5,58.5,68.5)
y=c(267,246,290,294,302,301,301,298)

gap.plot(x,y,ylim=c(8,310),pch=8,cex=0.5,
xlab=c('Time'),ylab=c('uS'),
gap=c(30,240),gap.axis='y',
ytics=c(10,20,30,270,280,290,300))
abline(h=31,col='white',lwd=20)
axis.break(axis=2,31)
axis.break(axis=4,31)

abline(coef(lm(x~y)),col=1)#Why don't show this???


Hi Peter,
Perhaps because both of these numbers:

coef(lm(x~y))
 (Intercept)            y
-176.5047160    0.7425131

are off the scale of your plot. Do you really want:

lmcoef <- coef(lm(y~x))
abline(lmcoef[1]-210,lmcoef[2],col=1)
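
To see why the first line is invisible and the second is not, the two fits can be compared directly. This is a sketch using only base R and the data from the question; the subtraction of 210 assumes, as in the reply above, that points above the gap are drawn shifted down by the gap width (240 - 30).

```r
x <- c(8.0, 17.5, 23.5, 32.0, 38.5, 48.5, 58.5, 68.5)
y <- c(267, 246, 290, 294, 302, 301, 301, 298)

# Regressing x on y gives an intercept far below the plotted y range.
coef_xy <- coef(lm(x ~ y))

# Regressing y on x gives a line in the units of the vertical axis.
coef_yx <- coef(lm(y ~ x))

# Assumed gap handling: values above the gap are drawn gap_width lower.
gap_width <- 240 - 30
shifted_intercept <- coef_yx[["(Intercept)"]] - gap_width

print(coef_xy)
print(coef_yx)
print(shifted_intercept)
```

The slope is left unchanged because shifting the upper section down by a constant moves the line without tilting it.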

Jim


Re: [R] Date handling in R is hard to understand

2013-11-08 Thread Jim Lemon


Hi Mihretu,
Can you grep for AM or PM? If so build your format string depending 
upon whether one of these exists in the date string.
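
A minimal sketch of that idea in base R. The function name parse_mixed_time and the sample strings are made up for illustration, and the two formats are the ones from the original question. Note that %p relies on the locale's AM/PM strings, so this assumes an English or C locale.

```r
# Choose the format per element, depending on whether AM/PM is present.
parse_mixed_time <- function(x, tz = "UTC") {
  x <- as.character(x)
  has_ampm <- grepl("[AP]M", x, ignore.case = TRUE)
  # Start from an all-NA POSIXct vector and fill in both groups.
  out <- as.POSIXct(rep(NA_real_, length(x)), origin = "1970-01-01", tz = tz)
  out[has_ampm]  <- as.POSIXct(x[has_ampm],  format = "%m/%d/%Y %I:%M %p", tz = tz)
  out[!has_ampm] <- as.POSIXct(x[!has_ampm], format = "%m/%d/%Y %H:%M",    tz = tz)
  out
}

# Hypothetical input mixing 12-hour and 24-hour strings.
stamps  <- c("11/08/2013 01:30 PM", "11/08/2013 13:30")
parsed  <- parse_mixed_time(stamps)
shifted <- parsed + 3600   # shift forward by one hour
print(shifted)
```

Passing an explicit tz, rather than tz="", keeps the result from picking up the local MDT/MST labels mentioned in the question.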


Jim

On 11/09/2013 06:41 AM, Alemu Tadesse wrote:

Dear All,

I usually work with time series data. The data may come in AM/PM date
format or on 24 hour time basis. R can not recognize the two differences
automatically - at least for me. I have to specifically tell R in which
time format the data is. It seems that Pandas knows how to handle date
without being told the format. The problem arises when I try to shift time
by a certain time. Say adding 3600 to shift it forward, that case I have to
use something like:
Measured_data$Date <- as.POSIXct(as.character(Measured_data$Date),
tz="", format = "%m/%d/%Y %I:%M %p")+3600
or Measured_data$Date <- as.POSIXct(as.character(Measured_data$Date),
tz="", format = "%m/%d/%Y %H:%M")+3600  depending on the format. The date
also attaches MDT or MST and so on. When merging two data frames  with
dates of different format that may create a problem (I think). When I get
data from excel it could be in any/random format and I needed to customize
the date to use in R in one of the above formats. Any TIPS - for automatic
processing with no need to specifically tell the data format ?

Another problem I saw was that when using r bind to bind data frames, if
one column of one of the data frames is a character data (say for example
none - coming from mysql) format R doesn't know how to concatenate numeric
column from the other data frame to it. I needed to change the numeric to
character and later after binding takes place I had to re-convert it to
numeric. But, this causes problem in an automated environment. Any
suggestion ?

Thanks
Mihretu




[R] Geweke Diagnostic in CODA package

2013-11-08 Thread Marino David

Hi all:

The CODA package provides the Geweke diagnostic for convergence
checking. geweke.diag in CODA returns Z-scores but does not give a
conclusion as to whether the chain has converged. So I'd like to know how
small/big the magnitude of the Z-score should be for a chain to count as
converged. That is, does a Z-score smaller or larger than some *threshold* determine
convergence? If so, how big is the *threshold* value?

See as follows:
 data(line)
 geweke.diag(line)
$line1

Fraction in 1st window = 0.1
Fraction in 2nd window = 0.5

  alpha    beta   sigma
 1.1726 -0.7537  1.0182


$line2

Fraction in 1st window = 0.1
Fraction in 2nd window = 0.5

  alpha    beta   sigma
-0.1307 -1.7929 -0.6381

Can anyone tell which chain (line1 or line2) has better
convergence properties based on the returned Z-scores?
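
One common reading, offered here only as a sketch and not as a rule from the coda authors: under stationarity each Z-score is approximately standard normal, so values with magnitude above about 1.96 would be flagged at the two-sided 5% level. The numbers below are copied from the output above; the 5% level is an assumption.

```r
# Z-scores copied from the geweke.diag(line) output above.
z <- rbind(line1 = c(alpha =  1.1726, beta = -0.7537, sigma =  1.0182),
           line2 = c(alpha = -0.1307, beta = -1.7929, sigma = -0.6381))

# Two-sided p-values under a standard normal reference distribution.
p_two_sided <- 2 * pnorm(-abs(z))

# Flag parameters whose |Z| exceeds the assumed 5% cutoff (about 1.96).
flagged <- abs(z) > qnorm(0.975)

print(round(p_two_sided, 3))
print(flagged)
```

A Z-score inside the cutoff is a failure to detect non-stationarity rather than proof of convergence, so it is usually read alongside trace plots and other diagnostics.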

Thank you!

David


[R] making chains from pairs

2013-11-08 Thread Hermann Norpois

Hello,

having a data frame like test with pairs of characters I would like to
create chains. For instance from the pairs A/B and B/I you get the vector A
B I. It is like jumping from one pair to the next related pair. So for my
example test you should get:
A B F G H I
C F I K
D L M N O P


 test
   V1 V2
1   A  B
2   A  F
3   A  G
4   A  H
5   B  F
6   B  I
7   C  F
8   C  I
9   C  K
10  D  L
11  D  M
12  D  N
13  L  O
14  L  P

Thanks
Hermann

 dput (test)
structure(list(V1 = c("A", "A", "A", "A", "B", "B", "C", "C",
"C", "D", "D", "D", "L", "L"), V2 = c("B", "F", "G", "H", "F",
"I", "F", "I", "K", "L", "M", "N", "O", "P")), .Names = c("V1",
"V2"), row.names = c(NA, -14L), class = "data.frame")
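
One possible reading of the "jumping from pair to pair" rule, sketched in base R: start a chain at every value that never appears in V2, then keep following V1 -> V2 links until nothing new is reached. The helper name chain_from is made up for this example, and the interpretation itself is an assumption, although it does reproduce the three chains listed above.

```r
test <- data.frame(
  V1 = c("A", "A", "A", "A", "B", "B", "C", "C", "C", "D", "D", "D", "L", "L"),
  V2 = c("B", "F", "G", "H", "F", "I", "F", "I", "K", "L", "M", "N", "O", "P"),
  stringsAsFactors = FALSE
)

# Collect every value reachable from `start` by following V1 -> V2 links.
chain_from <- function(start, pairs) {
  seen <- start
  todo <- start
  while (length(todo) > 0) {
    nxt  <- unique(pairs$V2[pairs$V1 %in% todo])
    todo <- setdiff(nxt, seen)   # only values not visited yet
    seen <- c(seen, todo)
  }
  sort(seen)
}

# Chains start at values that are never the second member of a pair.
starts <- setdiff(unique(test$V1), test$V2)
chains <- lapply(starts, chain_from, pairs = test)
names(chains) <- starts
print(chains)
```

Tracking the visited values in seen also keeps the loop from running forever if the pairs happen to contain a cycle.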



Re: [R] SNPRelate: Plink conversion

2013-11-08 Thread Hermann Norpois

You might try to import your data in GenABEL,
 use
as.numeric (gtdata (data))
to get a matrix that delivers you 0,1 or 2 for each snp and id (observation)
and then try prcomp.

Also check this
http://gettinggeneticsdone.blogspot.de/2011/10/new-dimension-to-principal-components_27.html
http://www.hsph.harvard.edu/alkes-price/software/

Hope this helps.

Hermann


2013/11/8 Danica Fabrigar danica_...@hotmail.com

 Hi Bert,
 I thought it was suitable to post the question on the R mailing list first
 seeing as the problem/question is related to an R package.

 Danica



  Date: Fri, 8 Nov 2013 08:14:03 -0800
  Subject: Re: [R] SNPRelate: Plink conversion
  From: gunter.ber...@gene.com
  To: danica_...@hotmail.com
  CC: r-help@r-project.org
 
  Doesn't this belong on Bioconductor rather than here?
 
  -- Bert
 
  On Fri, Nov 8, 2013 at 6:04 AM, Danica Fabrigar danica_...@hotmail.com
 wrote:
   Hi,
  
   Following my earlier posts about having problems performing a PCA, I
 have
   worked out what the problem is. The problem lies within the PLINK to
 gds
   conversion.
  
   It seems as though the SNPs are imported as samples and in turn, the
   samples are recognised as SNPs:
  
   snpgdsSummary(chr2L)
   Some values of snp.position are invalid (should be > 0)!
   Some values of snp.chromosome are invalid (should be finite and >= 1)!
   Some of snp.allele are not standard! E.g, 2/-9
   The file name: chr2L
   The total number of samples: 2638506
   The total number of SNPs: 67
   SNP genotypes are stored in SNP-major mode.
   The number of valid samples: 2638506
   The number of valid SNPs: 0
  
  
   Anyone have any ideas on how to fix this?
  
   Thanks,
   Danica
 
 
 
  --
 
  Bert Gunter
  Genentech Nonclinical Biostatistics
 
  (650) 467-7374




Re: [R] select .txt from .txt in a directory

2013-11-08 Thread arun



Hi Atem,

It is not clear what you wanted to do.  If you want to transfer the subset of 
files from the main folder to a new location, then you may try: (make sure you 
create a copy of the original .txt folder before doing this)
I created three sub folders and two files (BTemperature_Stations.txt and 
Tempearture inventory.csv) in my working directory.


list.files()
#[1] "BTemperature_Stations.txt" "Files1"
#[3] "FilesCopy"                 "SubsetFiles1"
#[5] "Tempearture inventory.csv"
# Files1: folder that contains all the .txt files
# FilesCopy: a copy of the Files1 folder
# SubsetFiles1: created to hold the files that match the condition

list.files(pattern="\\.")
#[1] "BTemperature_Stations.txt" "Tempearture inventory.csv"
fl1 <- list.files(pattern="\\.")
dat1 <- read.table(fl1[1],header=TRUE,sep="",stringsAsFactors=FALSE,fill=TRUE,check.names=FALSE)
dat2 <- read.csv(fl1[2],header=TRUE,sep=",",stringsAsFactors=FALSE,check.names=FALSE)
vec1 <- dat1[,3][dat1[,3]%in% dat2[,3]]
vec2 <- list.files(path="/home/arunksa111/Zl/Files1",recursive=TRUE)
sum(gsub(".txt","",vec2) %in% vec1)
#[1] 98
vec3 <- vec2[gsub(".txt","",vec2) %in% vec1]
# move the matching files; change the paths accordingly
lapply(vec3, function(x) 
file.rename(paste("/home/arunksa111/Zl/Files1",x,sep="/"), 
paste("/home/arunksa111/Zl/SubsetFiles1",x,sep="/")))
length(list.files(path="/home/arunksa111/Zl/SubsetFiles1"))
#[1] 98

fileDim <- sapply(vec3,function(x) {x1 <- 
read.delim(paste("/home/arunksa111/Zl/SubsetFiles1",x,sep="/"),header=TRUE,stringsAsFactors=FALSE,sep=",",check.names=FALSE);
 dim(x1)})
fileDim[,1:3]
#     dn3011120.txt dn3011240.txt dn3011887.txt
#[1,]          1151           791          1054
#[2,]             7             7             7
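
If moving the files feels risky, a variant that copies them instead leaves the original folder untouched. This is a self-contained sketch, not the code run above: the folders live under tempdir(), the files hold placeholder text, and the station id dn9999999 is invented so that there is something to exclude.

```r
# Hypothetical source and destination folders under the session temp dir.
src_dir <- file.path(tempdir(), "Files1")
dst_dir <- file.path(tempdir(), "SubsetFiles1")
dir.create(src_dir, showWarnings = FALSE)
dir.create(dst_dir, showWarnings = FALSE)
for (id in c("dn3011120", "dn3011240", "dn3011887", "dn9999999")) {
  writeLines("placeholder", file.path(src_dir, paste0(id, ".txt")))
}

# Station ids to keep (in practice, column 3 of the inventory file).
wanted <- c("dn3011120", "dn3011887")

# Compare file names without their extension against the wanted ids.
all_files <- list.files(src_dir, pattern = "\\.txt$")
keep <- all_files[tools::file_path_sans_ext(all_files) %in% wanted]

# Copy rather than rename, so the source folder stays complete.
copied <- file.copy(file.path(src_dir, keep), file.path(dst_dir, keep),
                    overwrite = TRUE)
print(list.files(dst_dir))
```

file.copy() returns one logical per file, so all(copied) is a quick way to confirm that nothing was skipped.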


A.K.





On Friday, November 8, 2013 1:41 PM, Zilefac Elvis zilefacel...@yahoo.com 
wrote:

Hi AK,


I want to select some files from a list of files. All are text files. The index 
for selection is found in column 3 of both files.


Attached are my data files.
Btemperature_Stations is my main file.
Temperature inventory is my 'wanted' file and is a subset of 
Btemperature_Stations.
Using column 3 in both files, select the files in Temperature inventory from 
Btemperature_Stations.
The .zip file contains the .txt files which you will extract to a folder and do 
the selection in R.

Thanks,
Atem.


Re: [R] Please help me to short my code

2013-11-08 Thread arun

Hi,
Try either:
Ceramic <- read.table("ceramic.dat",header=TRUE)
Ceramic1 <- Ceramic
Ceramic$indx <- ((seq_len(nrow(Ceramic))-1)%/%60)+1
library(plyr)
DF1 <- data.frame(M=as.vector(t(ddply(Ceramic,.(indx), colwise(mean))[,-1])), 
S=as.vector(t(ddply(Ceramic,.(indx),colwise(sd))[,-1])),Rep = 60)
# DF is the result of your original code (quoted below)
colnames(DF)[3] <- colnames(DF1)[3]
identical(DF,DF1)
#[1] TRUE


#or
indx <- ((seq_len(nrow(Ceramic))-1)%/%60)+1
Ceramic2 <- do.call(data.frame, c(aggregate(.~indx,data=Ceramic1,function(x) 
c(mean(x),sd(x))), check.names=FALSE))[,-1]
DF2 <- data.frame(M= as.vector(t(Ceramic2[,seq(1,ncol(Ceramic2),by=2)])), S= 
as.vector(t(Ceramic2[,seq(2,ncol(Ceramic2),by=2)])),Rep =60)
identical(DF,DF2)
#[1] TRUE
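
The same grouping trick can be tried without ceramic.dat or plyr. This is a small base R sketch on made-up data: the data frame dat, its columns a and b, and the block size of 3 stand in for the real 480 rows in blocks of 60.

```r
# Hypothetical stand-in for ceramic.dat: 6 rows in 2 blocks of 3 rows.
dat <- data.frame(a = c(1, 2, 3, 10, 20, 30), b = c(2, 4, 6, 5, 5, 5))
block_size <- 3

# Rows 1-3 get index 1, rows 4-6 get index 2, and so on.
indx <- (seq_len(nrow(dat)) - 1) %/% block_size + 1

# Per-block mean and standard deviation of every column.
means <- aggregate(dat, by = list(block = indx), FUN = mean)
sds   <- aggregate(dat, by = list(block = indx), FUN = sd)

# Same ordering as c(M1, M2, ...): all columns of block 1, then block 2.
DF <- data.frame(M   = as.vector(t(means[, -1])),
                 S   = as.vector(t(sds[, -1])),
                 Rep = block_size)
print(DF)
```

The integer division is what replaces the eight hand-written row ranges, so a different number of labs only changes the number of rows in the input.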



A.K.


Please help me to shorten the code.

#To import data into R 
Ceramic<-read.table("D:/ceramic.dat",header=T) 
#to obtain mean, standard deviation and number of observations 
LAB1<-Ceramic[1:60,] 
LAB2<-Ceramic[61:120,] 
LAB3<-Ceramic[121:180,] 
LAB4<-Ceramic[181:240,] 
LAB5<-Ceramic[241:300,] 
LAB6<-Ceramic[301:360,] 
LAB7<-Ceramic[361:420,] 
LAB8<-Ceramic[421:480,] 
M1<-sapply(LAB1,mean) 
M2<-sapply(LAB2,mean) 
M3<-sapply(LAB3,mean) 
M4<-sapply(LAB4,mean) 
M5<-sapply(LAB5,mean) 
M6<-sapply(LAB6,mean) 
M7<-sapply(LAB7,mean) 
M8<-sapply(LAB8,mean) 
S1<-sapply(LAB1,sd) 
S2<-sapply(LAB2,sd) 
S3<-sapply(LAB3,sd) 
S4<-sapply(LAB4,sd) 
S5<-sapply(LAB5,sd) 
S6<-sapply(LAB6,sd) 
S7<-sapply(LAB7,sd) 
S8<-sapply(LAB8,sd) 
#tabulating results 
M<-c(M1,M2,M3,M4,M5,M6,M7,M8) 
S<-c(S1,S2,S3,S4,S5,S6,S7,S8) 
DF<-data.frame(M,S,c(rep(60)))
