2014-12-18 Thread email

I have one million city names with their populations and want to build
a hash so that, given key = city, the lookup returns its population.

I use the hash package in R as follows:
h <- hash(c(as.vector(df$city)), c(as.vector(df$population)))

But I get the following error:
Error in assign(keys[[i]], values[[i]], envir = hash@.Data) :
  variable names are limited to 1 bytes

How can it be solved?
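
One base-R route to key-based lookup, without the hash package, is an environment, which R stores as a hash table. The following is only a minimal sketch on a made-up three-row data frame: the city names and populations are illustrative assumptions, and it does not diagnose the error quoted above. Environment keys must be non-empty strings, so empty or missing city names would need cleaning first.

```r
# Toy data standing in for the real one-million-row data frame (values are made up).
df <- data.frame(city = c("Springfield", "Shelbyville", "Ogdenville"),
                 population = c(30000, 25000, 12000),
                 stringsAsFactors = FALSE)

# list2env() fills a hashed environment from a named list in one call.
pop <- list2env(setNames(as.list(df$population), df$city),
                envir = new.env(hash = TRUE))

# Look up a single city by name; inherits = FALSE keeps the search inside pop.
get("Shelbyville", envir = pop, inherits = FALSE)

# Look up several cities at once.
unlist(mget(c("Springfield", "Ogdenville"), envir = pop))
```

For moderate sizes, a plain named vector (`setNames(df$population, df$city)["Shelbyville"]`) gives the same lookup with less machinery.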



[R] n-gram error with packages tau, tm, RTextTools

2014-10-05 Thread email

I am trying to compute n-grams using the tm and tau packages with the following code:

tokenize_ngrams <- function(x, n=3)
texts <- c("This is the first document.", "This is the second file.",
"This is the third text.")
corpus <- Corpus(VectorSource(texts))
matrix <- DocumentTermMatrix(corpus,control=list(tokenize=tokenize_ngrams))

and I get the following error:

 Error in FUN(X[[2L]], ...) : non-character argument

I get the same error using the RTextTools package.

Any solution?

Best regards:


[R] access scopus data

2014-06-05 Thread email

The Scopus bibliographic database allows one to manually download
publications. Is there any R package for accessing scopus data ? How
can it be accessed in R?


[R] eutils query not working

2014-02-10 Thread email

I am running the following code to query the PubMed database, but I get
the error "couldn't connect to host".


url <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?"
q   <- "db=pubmed&term=saunders+nf[au]&usehistory=y"
esearch <- xmlTreeParse(getURL(paste(url, q, sep="")), useInternal = T)

Error in function (type, msg, asError = TRUE)  : couldn't connect to host

How can this problem be solved?


[R] convert real valued matrix to binary matrix

2014-01-09 Thread email

I am trying to analyze yeast gene expression data.


I need to convert the real-valued data matrix to a binary (0,1)
matrix. Is there any package available? How can it be done?
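
No package is needed for simple thresholding: a logical comparison on the matrix, multiplied by 1L, gives a 0/1 matrix. The sketch below uses a random toy matrix, and the cutoff of 0 and the per-gene median rule are illustrative assumptions, not a recommendation for this particular data set.

```r
set.seed(1)
# Toy 4 x 3 expression matrix (random values, for illustration only).
expr <- matrix(rnorm(12), nrow = 4,
               dimnames = list(paste0("gene", 1:4), paste0("cond", 1:3)))

# Global threshold: 1 where the value exceeds the cutoff, 0 otherwise.
cutoff <- 0
bin_global <- (expr > cutoff) * 1L

# Per-gene threshold: compare each row with its own median.
# The row medians are recycled down the columns, so row i is compared with med[i].
med <- apply(expr, 1, median)
bin_row <- (expr > med) * 1L
```

Which threshold is appropriate depends on how the expression values were normalized.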



[R] locate pattern in matrix

2014-01-09 Thread email
Dear all,

I have a binary matrix

0 0 0 0 0
1 1 1 0 0
1 1 1 0 0
0 0 0 0 0
0 0 0 1 1
0 0 0 1 1

I want to find the location of all the square and rectangular 1 blocks, like

First block in row=2, col=1 to row=3, col=3.
Second block in row=5, col=4, to row=6, col=5.

How can I find such blocks of 1?
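
One possible approach is to label connected regions of 1s with a flood fill and report each region's bounding box. This is a base-R sketch, not a packaged solution: the function name `find_blocks` is made up here, and it assumes blocks are 4-connected (touching by an edge) and do not touch each other, so that the bounding box equals the block.

```r
m <- matrix(c(0,0,0,0,0,
              1,1,1,0,0,
              1,1,1,0,0,
              0,0,0,0,0,
              0,0,0,1,1,
              0,0,0,1,1), nrow = 6, byrow = TRUE)

find_blocks <- function(m) {
  seen <- matrix(FALSE, nrow(m), ncol(m))
  blocks <- list()
  for (i in seq_len(nrow(m))) {
    for (j in seq_len(ncol(m))) {
      if (m[i, j] != 1 || seen[i, j]) next
      # Flood-fill the 4-connected component that contains (i, j).
      stack <- list(c(i, j))
      seen[i, j] <- TRUE
      rows <- integer(0); cols <- integer(0)
      while (length(stack) > 0) {
        cell <- stack[[length(stack)]]
        stack[[length(stack)]] <- NULL          # pop the last cell
        rows <- c(rows, cell[1]); cols <- c(cols, cell[2])
        for (d in list(c(-1, 0), c(1, 0), c(0, -1), c(0, 1))) {
          r <- cell[1] + d[1]; k <- cell[2] + d[2]
          if (r >= 1 && r <= nrow(m) && k >= 1 && k <= ncol(m) &&
              m[r, k] == 1 && !seen[r, k]) {
            seen[r, k] <- TRUE
            stack[[length(stack) + 1]] <- c(r, k)
          }
        }
      }
      # Record the bounding box of the component just visited.
      blocks[[length(blocks) + 1]] <- c(row_from = min(rows), col_from = min(cols),
                                        row_to = max(rows), col_to = max(cols))
    }
  }
  do.call(rbind, blocks)   # one row per block; NULL if there are no 1s
}

b <- find_blocks(m)
b
```

For the matrix above, the two rows of `b` correspond to the two blocks listed in the question.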


[R] GA optimization in two dimensions

2014-01-09 Thread email

I am trying to implement a bandwidth reduction algorithm for a (M x
N) binary matrix using the GA package in R.

I am using the following code to get the optimal permutation for rows
and columns (optimization in two dimensions).

M <- matrix(c(0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0,
1),nrow=5, ncol=4)

BW <- function(patt, origMatrix) {
M1 <- origMatrix[patt[1], patt[2]]
temp2 <- 0;
temp3 <- 0;
for(i in 1:nrow(M1))
for(j in 1:ncol(M1))
if(M1[i,j] > 0)
temp1 <- abs(i - j)
temp2 <- append(temp2, temp1)
temp3 <- append(temp3, max(temp2))
temp2 <- 0

bwFit <- function(patt, ...) 1/BW(patt[1], patt[2],...)

GA <- ga(type = "permutation", fitness = bwFit, origMatrix = M, min =
c(1,1), max = c(5,4), popSize = 100, maxiter = 5000, run = 500,
pmutation = 0.2)

I get this error message: Error in BW(patt[1], patt[2], ...) : unused
argument (patt[2])

In summary, I am unable to implement the optimization in two
dimensions. In the solution, I need two vectors containing the optimal row
and column permutations. Can you suggest a solution?


[R] find variation of a binary matrix

2013-11-18 Thread email

I want to calculate how much the values in a binary matrix vary, and
for that I apply the sd() function.

mat <- matrix(c(1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1), nrow=4, ncol=3)
stddev <- sd(dist(mat,  method="binary"))

And I get the following answer:

[1] 0.3442652

Is this correct? Or is there a better way?



[R] selecting optimal cluster validation score

2013-11-16 Thread email

I have calculated the Silhouette score and Dunn score after
hierarchical clustering for 3 clusters:

#Distance measure
d <- dist(USArrests, method = "euclidean")
#Hierarchical clustering
hc <- hclust(dist(USArrests), "ave")
#calculating silhouette value for 3 clusters
sil <- silhouette(cutree(hc, k=3), d)
#calculating Dunn index for 3 clusters
clus <- cutree(hc, 3)
dun <- dunn(d, clus)

How can the better of the two scores be obtained? Is there any package to
automatically obtain the optimal (or best) of the Silhouette and the
Dunn scores?


[R] polygon circling a graph

2013-11-14 Thread email

I want to create a polygon encircling a graph. For this I use the convex
hull to get the coordinate points for the polygon.

g <- barabasi.game(10)
temp1 <- chull(L)
temp1 <- c(temp1, temp1[1])
plot(g, layout=layout.fruchterman.reingold)

But when I plot the polygon with the code below, the polygon doesn't
encircle the graph.

polygon(L[temp1, ], col = #FFAA)

How can I plot a polygon circling a graph?


[R] cannot load MagAct96-98 - Extracurricular affiliation data

2013-11-06 Thread email

I have installed the NetData package and want to use the MagAct96-98
- Extracurricular affiliation data. But loading the data gives
an error. Any help?

data('studentnets.magact96.97.98', package = "NetData")

Warning message:
In data(studentnets.magact96.97.98, package = NetData) :
  data set ‘studentnets.magact96.97.98’ not found


[R] lm regression query

2013-02-14 Thread email

I have a 4-column dataset: Crime, Education, Urbanization, Age. I want to
construct a multiple linear regression to find the effect of Education,
Urbanization, and Age on Crime

lm(Crime ~ Education + Urbanization + Age)

If I use + in above statement, does it mean it will build a model to find
the relationship between Crime and Education when Urbanization and Age are
held constant?

What would be the difference if I drop the term Urbanization + Age ?

lm(Crime ~ Education)
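
In an lm formula, + adds a predictor, and each coefficient is then the estimated effect of that predictor with the other predictors in the model held constant. Dropping Urbanization and Age leaves the Education coefficient to absorb whatever part of their effect is correlated with Education. The sketch below illustrates this on simulated data; all numbers (the 0.5, 2, -1 effects and the 0.7 correlation) are made-up assumptions, not estimates from any real crime data.

```r
set.seed(42)
n <- 200
Education <- rnorm(n)
Urbanization <- 0.7 * Education + rnorm(n)   # deliberately correlated with Education
Age <- rnorm(n)
Crime <- 1 + 0.5 * Education + 2 * Urbanization - 1 * Age + rnorm(n)

full <- lm(Crime ~ Education + Urbanization + Age)
simple <- lm(Crime ~ Education)

coef(full)["Education"]    # near the simulated partial effect of 0.5
coef(simple)["Education"]  # near 0.5 + 2 * 0.7 = 1.9, since it also picks up Urbanization
```

The two coefficients answer different questions: the first is the association with Education at fixed Urbanization and Age, the second is the overall association ignoring them.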


[R] Conjunction and disjunction in pubmed query

2012-12-27 Thread email

I am trying to query pubmed abstracts using the following syntax:

url= "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?"

search = paste(url, "db=pubmed&term=", queryTerm1, "+AND+",
queryTerm2, "+OR+", queryTerm3, "+OR+", queryTerm4,
"[abstract]&retmax=100&usehistory=y", sep="")

docId <- xmlTreeParse(getURL(paste(url, search, sep="")),

I want to fetch abstracts containing queryTerm1 AND queryTerm2
OR queryTerm3 OR queryTerm4. The code runs without error, but from the
result I find that the conjunction and disjunction are not working. Can anyone
suggest the correct syntax for an AND/OR PubMed query?


[R] query multiple terms in PubMed abstract

2012-12-11 Thread email

I am trying to search for PubMed abstracts that contain BOTH of two terms:
COL4A1 AND Ocular. I am using the following code:

url= "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?"
search = paste(url,
"db=pubmed&term=COL4A1+AND+Ocular[abstract]&retmax=300", sep="")
docId <- xmlTreeParse(getURL(paste(url, search, sep="")),

I want to get results where BOTH terms occur in the abstract. But it is
not doing that now. Any idea?
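
Two things may be worth checking in a query like this: a field tag only applies to the term it is attached to, and parameters in the URL are separated by "&". The sketch below only builds the URL string and makes no network call. It assumes the `[tiab]` (Title/Abstract) field tag is the field meant by `[abstract]`; the helper `tag()` and the placeholder terms t1..t4 are made up for illustration.

```r
base_url <- "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Attach the field tag to every term; a tag on the last term alone does not
# restrict the earlier terms to that field.
tag <- function(term, field = "tiab") paste0(term, "[", field, "]")

term <- paste(tag("COL4A1"), tag("Ocular"), sep = "+AND+")

# Parameters are joined with "&"; the first one follows "?".
search_url <- paste0(base_url, "?db=pubmed", "&term=", term, "&retmax=300")
search_url

# Parentheses make the AND/OR grouping explicit.
# t1..t4 are placeholder terms, not taken from the thread.
grouped <- paste0("(", tag("t1"), "+AND+", tag("t2"), ")+OR+",
                  tag("t3"), "+OR+", tag("t4"))
grouped
```

The resulting string can then be passed to getURL() as in the original code.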


[R] KMP String search

2012-12-08 Thread email

Is there any Package in R which implements the KMP String search algorithm ?
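
Whether a package exposes KMP directly is not settled here, but the algorithm is short enough to write in plain R. The following is a sketch of Knuth-Morris-Pratt on single characters; the function name `kmp_search` is made up, and it returns the 1-based start positions of all matches, overlapping ones included.

```r
kmp_search <- function(text, pattern) {
  txt <- strsplit(text, "")[[1]]
  p <- strsplit(pattern, "")[[1]]
  n <- length(txt); m <- length(p)
  if (m == 0 || n < m) return(integer(0))

  # Failure table: fail[i] is the length of the longest proper prefix of
  # p[1:i] that is also a suffix of p[1:i].
  fail <- integer(m)
  k <- 0
  if (m > 1) for (i in 2:m) {
    while (k > 0 && p[k + 1] != p[i]) k <- fail[k]
    if (p[k + 1] == p[i]) k <- k + 1
    fail[i] <- k
  }

  # Scan the text once; on a mismatch fall back in the pattern, never in the text.
  hits <- integer(0)
  k <- 0
  for (i in seq_len(n)) {
    while (k > 0 && p[k + 1] != txt[i]) k <- fail[k]
    if (p[k + 1] == txt[i]) k <- k + 1
    if (k == m) {
      hits <- c(hits, i - m + 1)   # a full match ends at position i
      k <- fail[k]                 # keep going to find overlapping matches
    }
  }
  hits
}

kmp_search("abababa", "aba")
```

For everyday use, base R's `gregexpr(pattern, text, fixed = TRUE)` finds literal matches too, although it reports non-overlapping matches only.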


Re: [R] Cannot write a dataframe to xls or csv Windows 7

2012-09-14 Thread research email
Thank you so much,
Yes I need to look up how to make a reproducible example,
William I will try your advice, I believe this will be my salvation here, once 
I get my computer.


[R] Tendonitis and R users

2012-09-04 Thread research email

This request asks something beyond the technicalities of the R language, I 
would like to ask you wonderful people if you have ever suffered as programmers 
( or de facto programmers like myself though I am a 'research assistant') from 
tendonitis and how you coped with it, i have golfer's elbow on both sides. Any 

Pancho Mulongeni

[R] Exporting data to spss

2012-08-22 Thread research email
I have a dataframe of 80+ columns and over 700 rows. I use 
write.foreign(data, "C:/filename.dat", "codefile.sps") and it does write out the 
.dat file and the code file.
Problem is that when I open the codefile in SPSS 20, I get an error message 
saying there are too many variables and something about the formatting (this is 
not an SPSS list so the details of the error are not germane).
Question: is there another way of reading data into SPSS without having to 
first create a text file or Excel file that is then opened in SPSS manually?
Is there any other function apart from write.foreign?
Thank you

[R] bayesian gene network construction

2012-04-11 Thread email mail

I have looked at the bnlearn and deal packages for inferring bayesian
network. Can anyone suggest any other suitable package for constructing
bayesian gene regulatory network using gene expression data?


[R] how to map microarray probe to gene, homology

2012-04-03 Thread email mail

I have clustered microarray gene expression data and trying to map between
microarray probe, gene, pathway, gene ontology, and homology for a set of
(affy) microarray probes. Is there any package in R which facilitates this?
I am looking at bioconductor, but till now could not find a solution. A
link to some worked example would be appreciated.

Thanks and regards.


[R] setting persistence upper limit in garchFit()

2009-08-18 Thread wc90024-email
I'm using garchFit() on a volatile time series.   I'd like to set a limit such 
that the SUM(alpha, beta) < 1.   Is there a way to configure that by passing a 
parameter into garchFit()?   Or is there another way to do it?  Thanks.

[R] par(mfrow = ) resets par('cex'), not reduces it

2008-12-03 Thread [EMAIL PROTECTED]
Hello group!

I use R 2.8.0 . I've just found out that par(mfrow =) *resets* par
('cex'), not reduces it as documented. To reproduce:

par(cex = 0.5)
par(mfrow = c(2, 2))

It outputs 0.83, not 0.415 as expected.

Particularly such a behavior makes plot.acf effectively ignore par
('cex') value for multivariate case. I guess there are more situations
where the documented behavior would be more appropriate.

I think it is a bug in the par implementation, not in the documentation. Could
anyone comment on this?


R-help@r-project.org mailing list
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] specifications windows pc

2008-11-28 Thread [EMAIL PROTECTED]
Hello, I am about to order a new workstation at my university that will be used 
for R (and other research related tasks). I would appreciate any feedback on 
the specifications of a very fast machine. The machine should run windows (XP 
probably better than vista). Which chip, memory size and specification, etc 
should I be looking for? Thanks, Ruud

Re: [R] [Obo-relations] Discussion summary on original biological parts

2008-11-20 Thread [EMAIL PROTECTED]

On Thu, 20 Nov 2008, Wacek Kusnierczyk wrote:


So what exactly is your understanding of a canonical entity and
perhaps that'd clarify your point?

when i studied medicine, i used to think about 'canonical' anatomy
(well, we'd just speak of human anatomy, with no canonicity referred to)
as a model that corresponds to our expectations wrt. a human's body when
we do not have any additional information.  that is, the book
descriptions reflect, in most cases, the most frequently occurring
structural variants.

I agree.

there does not have to be a single human that is

exactly as the book descriptions want (and there is *no* single canon in
this respect, besides perhaps that a human has one heart and such gross
stuff), but there is nothing in the way of there being a human that is
accurately described by a particular canonical anatomy.  that's why i
got shot by your "There is no instance of a canonical human body or a
canonical heart.", which does not make sense to me if it amounts to
saying there cannot be.

as to "A canonical human body will have canonical parts and those
canonical parts will have canonical subparts and so on", the problem
with the insisted canonicity here is that the more you go into details,
the less canonicity to be found.  if hardly anyone has more than one
heart (and who has none, except for patients under surgery?), hardly any
two people will have the same pattern of capillary vessels in a
particular location in their body.  that is, on the frequentist reading
of 'canonicity', the gross level canonical descriptions correspond to
strong accuracy of expectations, the more detailed levels correspond to
weaker accuracy of expectations.

I agree and that is why I shy away from canonical reference in that context. 
This is why I think it is important to clarify what canonical means. Does it 
pertain only to a particular level (gross level) or does it entail including 
all levels of granularity? Is it useful only for theoretical discourse or does 
it have a place in an ontology that primarily deals with portions of reality?

i have read the article on canonicity written by fabian et al., and
besides its logical clarity, i found the definitions completely
useless.  it might be that i read them too quickly, and had no time to
examine them more carefully, but it struck me that 'canonicity' was
defined there in complete dissociation from the frequentist view.
basically, a canonical entity is one that is canonical.

Can you direct me to its url, if there is any? I'd be interested in reading it.

But let's assume the possibility that 1 or 2 out of 6 billion people
fit the idealized, canonical type. I don't see much utility in that.

of course.  what is useful is that if the canonical anatomy says that
there is one heart in a human body, and that it is located here or
there, then when a patients comes to me, without additional evidence i
assume he/she has one heart and it is here or there.  but i have less
confidence when it comes to more detailed descriptions; these are more
exemplary than canonical.

I want a system that can accommodate and provide the information that
your heart, my heart and anyone's heart are instances of some type
Heart which defines the necessary and sufficient conditions that
establish the identity of the heart structure  in you, me and everyone

wait.  if you define necessary conditions for an entity to be heart and
call this canonical anatomy, then either there are no hearts (because
none fulfils the necessary conditions) or there are hearts that
instantiate canonical anatomy.  i must be wrong, but where.

No, I want to stay away from calling it canonical because it will not 
accommodate as many instantiated hearts as there are or were.

This is why I'm re-evaluating my position with regards to any
canonical reference to the FMA, unless of course we redefine
canonicity to mean some general or generic description.

my feeling is that the term 'canonical' is virtually meaningless the way
you seem to use it.

Not meaningless but rather useless in terms of how it is currently perceived. 
And again that's why I'm questioning the current use of the term  'canonical'.



Re: [R] readPDF() -- unsure how to install xpdf to make this work?

2008-11-16 Thread [EMAIL PROTECTED]
I never said it *should* work.

I was simply trying something out that works on other types of files
I've needed in the past (eg: html, csv, dat, etc.). I don't know the
details of the pdf format, but I thought it was worth a try, certainly
no harm in experimenting, and hence I learned that pdfs aren't stored
in the same way that other files i've used in the past are. that's
fine, good to learn new things.

As for trying the readPDF() function, yes, I have downloaded and used
xpdf to convert pdfs into plain text since reading the OP email.
However, how can you make xpdf available to the system so that readPDF()
works in R? I don't know, which is why I posted in this thread.

You clearly seem to have a solution, fancy sharing?

Clair Crossupton xx

Re: [R] readPDF() -- unsure how to install xpdf to make this work?

2008-11-15 Thread [EMAIL PROTECTED]
Hello, I was just wondering if you had found a solution? I am having
the same difficulty of converting pdf's into plain text documents in
R. I originally thought I could use the readLines() function, but as
you can see below that did not work.

R> my.destfile <- "C:\\Documents and Settings\\clair\\Desktop\\test\\r-
R> my.url <- "http://cran.r-project.org/doc/manuals/R-intro.pdf"
R> download.file(url = my.url, destfile=my.destfile, mode='wb')
R> txt <- readLines(my.destfile)
R> txt
[3] 1 0 obj

[4] /Length 587

[5] /Filter /

[8] [EMAIL PROTECTED]ÎÁ±?\024tBL\020$ñ°ãd4›½*´.‰\002\001øï·_•èÌf

Warm Regards,

On 13 Nov, 15:10, Tony Breyal [EMAIL PROTECTED] wrote:
 Dear R-Help,

 I need to convert a set of '.pdf' files into an equivalent set of
 '.txt' files. This is so that i can do some text mining on the

 In the latest R-News letter (http://cran.r-project.org/doc/Rnews/
 Rnews_2008-2.pdf), the package 'tm' for text mining is mentioned. In
 that lovely package, there is a function called 'readPDF()'. In order
 to use this, ?readPDF says

     Note that this PDF reader needs both the tools pdftotext and
 pdfinfo installed and accessable on your system.

 These tools are available fromhttp://www.foolabs.com/xpdf/download.html

 I am able to download this and use it easily from a dos window to
 convert a pdf file into a txt file.

 Question: how do i make these tools available to R, so that i can use
 the readPDF() function?

 Thank you in advance for any help, and I hope the above made sense.
 Tony Breyal

 ###OS = Windows Vista Ultimate sessionInfo()

 R version 2.8.0 (2008-10-20)

 LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.
 1252;LC_MONETARY=English_United Kingdom.
 1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252

 attached base packages:
 [1] grid      stats     graphics  grDevices utils     datasets
 methods   base

 other attached packages:
 [1] tm_0.3-1           XML_1.98-1         Snowball_0.0-3
 RWeka_0.3-14       rJava_0.6-0        Matrix_0.999375-16
 lattice_0.17-15    filehash_2.0

 loaded via a namespace (and not attached):
 [1] proxy_0.4-1

 [EMAIL PROTECTED] mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

[R] Using n+1 instances of R to utilise n processors on one machine - something like R with tabbed browsing?

2008-11-15 Thread [EMAIL PROTECTED]
Dear R-help,

Please forgive me if any of the following sounds naive/confused, I've
just got back from a mini-pub-crawl, slightly tipsy, and am feeling
brave enough to ask a possibly silly question... also, I'm not too shiny on the
technical side of things.

Problem - I need to text mine a collection of 10,000 plain text
documents, all of which are sitting in a single folder. i don't have
any money to buy a database package, and even if i did i have no idea
how that would speed things up if i want to do all the processing in

Assumption - It is my understanding that R can only use one processor
on a machine when handling calculations. If you wanted to use 4
processors, then you would have to open up 4 separate instances of R
and share the work between them, e.g. give each instance of R 25% of the
documents you want processed.

Question - Is it possible to have one instance of R divide the
workload, and then have that instance open up 4 other instances of R to do
the processing?
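
In current R versions this pattern is available through the parallel package, which ships with R (it did not exist in R 2.8.0, so this is a sketch for a later setup, not for the one described above). One master session starts worker R processes and hands each a share of the documents. The documents and the `count_words` task below are placeholders made up for illustration.

```r
library(parallel)

# Hypothetical per-document task: count the words in one text string.
count_words <- function(txt) length(strsplit(txt, "[[:space:]]+")[[1]])

# Stand-ins for the contents of the real text files.
docs <- c("one two three", "four five", "six", "seven eight nine ten")

cl <- makeCluster(2)                         # start two worker R processes
counts <- parLapply(cl, docs, count_words)   # split docs across the workers
stopCluster(cl)                              # shut the workers down again

unlist(counts)
```

With real files, `docs` would typically be a vector of file paths and the task function would read and process one file.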

Or, is something akin to tabbed browsing, where you have one main
window and several tabs, each corresponding to a different instance of

apologies if none of the above made sense  :o)

Clair xx

O/S: Windows Vista
R 2.8.0

[R] TIme Series AR to MA and (viceversa)

2008-11-04 Thread [EMAIL PROTECTED]

I am new to using R for Time series analysis. I was wondering if there are any 
functions that can convert ARMA or ARIMA time series into their corresponding 
AR or MA time series representations (by calculating the corresponding AR or MA 

Thanks a lot


[R] Sweave Error

2008-10-28 Thread [EMAIL PROTECTED]

dear R users,

I am using Sweave to generate reports for my data analysis.
I recently updated R to 2.8.0, and now I get the following output when I 
compile the tex file generated from R. 

This is pdfTeXk, Version 3.141592-1.40.3 (Web2C 7.5.6)
 %-line parsing enabled.
entering extended mode
LaTeX2e 2005/12/01
Babel v3.8h and hyphenation patterns for english, usenglishmax, dumylang, noh
yphenation, croatian, ukrainian, russian, bulgarian, czech, slovak, danish, dut
ch, finnish, basque, french, german, ngerman, ibycus, greek, monogreek, ancient
greek, hungarian, italian, latin, mongolian, norsk, icelandic, interlingua, tur
kish, coptic, romanian, welsh, serbian, slovenian, estonian, esperanto, upperso
rbian, indonesian, polish, portuguese, spanish, catalan, galician, swedish, loa
Document Class: report 2005/09/16 v1.4f Standard LaTeX document class
For additional information on amsmath, use the `?' option.

! LaTeX Error: File `Sweave.sty' not found.

Type X to quit or RETURN to proceed,
or enter new name. (Default extension: sty)

How can I solve the problem?


[R] TINN-R's R Explores - Available for other editors?

2008-10-21 Thread [EMAIL PROTECTED]

I am using TINN-R for working with R and for that purpose it is a very
handy editor, in particular the R-Explorer that shows the existing
objects and their properties is worth money.
But I want to move to a more flexible editor (in particular for Latex)
and was thinking of WinEdt (or maybe Eclipse, because of Java). I know
they have capabilities to work directly with R, but has any other
editor the same capabilities when it comes down to the R-Explorer?



[R] Multi matrix row-wise mapply?

2008-10-21 Thread [EMAIL PROTECTED]
Hi group!

Suppose I have 2 matrices A and B of equal dimensions.
I want to apply a function f to all corresponding pairs of rows from A
and B in an efficient manner.
Basically, I want

mapply(f, data.frame(A), data.frame(B))

but for rows.

How do I do it?
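
Two ways to pair up rows are sketched below, on small made-up matrices and with a placeholder `f`: indexing the rows directly, or transposing so that the rows become columns and the original mapply call applies.

```r
A <- matrix(1:6, nrow = 3)
B <- matrix(6:1, nrow = 3)
f <- function(a, b) sum(a * b)   # placeholder for the real row function

# Index-based: call f on row i of A and row i of B.
res1 <- sapply(seq_len(nrow(A)), function(i) f(A[i, ], B[i, ]))

# mapply over the columns of the transposed matrices, i.e. over the original rows.
res2 <- mapply(f, as.data.frame(t(A)), as.data.frame(t(B)))

res1
res2
```

The index-based form avoids the copies made by t() and as.data.frame(), which may matter for large matrices.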



Re: [R] TINN-R's R Explores - Available for other editors?

2008-10-21 Thread [EMAIL PROTECTED]

> I am using TINN-R for working with R and for that purpose it is a very
> handy editor, in particular the R-Explorer that shows the existing
> objects and their properties is worth money.
> But I want to move to a more flexible editor (in particular for Latex)
> and was thinking of WinEdt (or maybe Eclipse, because of Java). I know
> they have capabilities to work directly with R, but has any other
> editor the same capabilities when it comes down to the R-Explorer?


You should definitely try Emacs with ESS package.



[R] Problem loading package created with package.skeleton

2008-10-06 Thread [EMAIL PROTECTED]

i'm trying to build an r-package (Windows Vista, R 2.7.2), so i created
one with the package.skeleton() command.
After that i zipped it and tried to load it in R.

Attaching the package via the Windows R-console menu 'Packages/Install
package(s) from local zip-files'  failed:
> utils:::menuInstallLocal()
Error in gzfile(file, "r") : cannot open the connection
In addition: Warning messages:
1: In zip.unpack(pkg, tmpDir) : error -1 in extracting from zip file
2: In gzfile(file, "r") :
 cannot open compressed file 'MeinPacket/DESCRIPTION', probable reason
'No such file or directory'

The library(package) command failed too:
> library('MyPackage.zip')
Error in library(MyPackage.zip) :
 there is no package called 'MyPackage.zip'

I checked the current working directory and made sure that the
package-zip was in it, but R couldn't find it anyway.

Does anyone know how to get the package loaded?
Thank you and best regards,


[R] warning message while using mice Multivariate Imputation by Chained Equations

2008-08-27 Thread [EMAIL PROTECTED]
 Hi there,

I am a beginner with R. Could anyone explain what the following
warning message means?

Warning messages:
1: In any(predictorMatrix[j, ]) :
  coercing argument of type 'double' to logical

I got 6 of these, although further calculations were done. Because of this
warning I am not sure whether the results are reliable, since I have no idea
what exactly the warning is about. Could anyone please help with an explanation?

I used mice, Multivariate Imputation by Chained Equations (from the
mice package), with the default method, predictive mean matching.


Instead of the default method (which I was using when I got the warning),
mice makes it possible to supply your own function that tells how the
imputed data should be created. I would like to create a function that
imputes a Bayesian bootstrap of the known data for the missing data.
Any idea how that could be done?


[R] lapply, sapply

2008-08-02 Thread [EMAIL PROTECTED]
Hello everybody,
I have a problem with an lapply command, which rather proves that I don't
fully understand it.

I want to extract from a list that consists of dataframes, the length
of the first sequences from a given variable (its part of a simulation

Below is code which does the job, but I think it should be possible to
make it more compact.

### Example Data
dat <- list()
dat[[1]] <- data.frame(matrix( rbinom(40, 1, .8),nrow=5))
dat[[2]] <- data.frame(matrix( rbinom(40, 1, .8),nrow=5))
x <- sapply(dat, "[", 3)   # Extracting the vector
y <- lapply(x, rle)        # Counting the sequences, which is returned as a list
z <- sapply(y, "[", 1)     # Extracting the first element of the list of sequence counts
final <- sapply(z, "[", 1) # Extracting the first number, which gives the
                           # length of the first sequence, which I want
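
A more compact form is possible with a single anonymous function per data frame. This is a sketch on the same kind of example data (the seed is added here only for reproducibility): `rle()` returns a list with `lengths` and `values`, so the first run length can be taken directly.

```r
set.seed(1)
dat <- list(
  data.frame(matrix(rbinom(40, 1, .8), nrow = 5)),
  data.frame(matrix(rbinom(40, 1, .8), nrow = 5))
)

# For each data frame: run-length encode column 3 and keep the first run's length.
first_run <- sapply(dat, function(d) rle(d[[3]])$lengths[1])
first_run
```

Using `d[[3]]` rather than `d[3]` extracts the column as a plain vector, which is what rle() expects.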

Thanks for your help,

[R] Eclipse/Statet: How to set breakpoints

2008-07-30 Thread [EMAIL PROTECTED]

I'm using the IDE Eclipse with the Statet-Plugin to develop R-code.
Is there a possibility to set breakpoints for debugging in R-script files
(just like in Java-code)?

Re: [R] function to transform response of a formula

2008-07-30 Thread email

It's OK.

I've just read about update.formula in another message.
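
For reference, a minimal sketch of the update() approach mentioned here, plus a programmatic variant. The `respapply` below is a hypothetical rewrite that takes a function name as a string; it builds an unevaluated call around the response instead of evaluating it, which is what made the terms invalid in the original attempt.

```r
fm <- y ~ a + b

# "." on each side stands for the corresponding part of the old formula.
new_fm <- update(fm, log(.) ~ .)
new_fm
# log(y) ~ a + b

# Programmatic variant: wrap the response in a call to the named function.
respapply <- function(fm, fname) {
  fm[[2]] <- as.call(list(as.name(fname), fm[[2]]))
  fm
}
respapply(fm, "sqrt")
# sqrt(y) ~ a + b
```

Both versions leave the right-hand side untouched and keep the result a formula object.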


Paul Emberson wrote:


I am trying to write a function which takes a formula as input and 
outputs a new formula with a different response while keeping the rest 
of the formula the same.


respapply : ( y ~ a+b ) -> ( f(y) ~ a + b )

I have tried the following but it doesn't work.  The terms become 
invalid as shown below.

respapply <- function(fm, f) {

fm[[2]] <- f(eval(fm[[2]]))


> fm <- formula(y ~ a + b)
> y <- runif(5)
> newfm <- respapply(fm,identity)

c(0.552921097259969, 0.939932722365484, 0.62522904924117, 

0.877736972644925) ~ a + b

Error in terms.formula(fmapply(fm, identity)) :
  invalid term in model formula

Could someone put me in the right direction of how to correctly write a 
respapply function as described.


Paul Emberson


[R] enscript states file for R scripts?

2008-07-15 Thread [EMAIL PROTECTED]
Hi group!

GNU enscript is a free (as in freedom) text file decorator, which
among other features can highlight source code files. The language
syntax descriptions are provided via special states files. The
standard distribution contains states files for the most popular
languages (C, C++, Pascal, LaTeX, etc), but sadly there is no states
file for R.

Does anyone know if there is a place where I can obtain enscript
states file for decorating R scripts? Many thanks!

Andrey Paramonov

Re: [R] enscript states file for R scripts?

2008-07-15 Thread [EMAIL PROTECTED]
On 15 Jul, 13:23, hadley wickham [EMAIL PROTECTED] wrote:
> An alternative to enscript is 
> highlight, http://www.andre-simon.de/doku/highlight/en/highlight.html, which 
> comes with R highlighting built in.


Thanks for pointing this out.

But I think I actually need enscript highlighting. My ultimate goal is
to enable R scripts highlighting in WebSVN, and it uses enscript. Or
is highlight compatible with enscript?

Andrey Paramonov

[R] non-sample standard-deviation

2008-07-09 Thread [EMAIL PROTECTED]

R seems to use the 1/(n-1) factor when calculating the standard deviation with sd().

If I want to get the non-sample standard deviation I use 

Is there a parameter to make the sd() function use the 1/n factor 

Or is there any other function to do so?
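
sd() has no argument for switching the denominator, but the 1/n version is a one-liner. The sketch below shows two equivalent forms on a small made-up vector.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(x)

# Population standard deviation: divide by n instead of n - 1.
sd_pop <- sqrt(mean((x - mean(x))^2))

# Equivalent: rescale R's sample standard deviation.
sd_pop2 <- sd(x) * sqrt((n - 1) / n)

sd_pop
sd_pop2
```

The rescaling form is handy when sd() has already been computed or when na.rm handling is needed.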

Thank you in advance :-)

[R] WIERD: Basic computing in R

2008-07-01 Thread [EMAIL PROTECTED]
Can someone please enlighten me as to why the following happens?
> -2.7^8.6
[1] -5125.407

> p <- -2.7
> q <- 8.6
> p^q
[1] NaN
R seems perfectly able to calculate -2.7^8.6, but fails when the exact same 
values are assigned to variables and then the computation is repeated. 
Thanks in advance for any suggestions.
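
The difference comes from operator precedence: in R, ^ binds more tightly than unary minus, so the two computations are not the same expression. A short sketch:

```r
-2.7^8.6       # parsed as -(2.7^8.6): a positive power, then negated
(-2.7)^8.6     # negative base with a non-integer exponent: NaN

p <- -2.7
q <- 8.6
p^q            # same as (-2.7)^8.6, hence NaN
-(abs(p)^q)    # reproduces the first result
```

A negative number raised to a non-integer power has no real value, which is why R returns NaN once the sign is part of the base.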

Re: [R] median of grouped data

2008-06-30 Thread [EMAIL PROTECTED]
On 28 Jun, 03:19, Bricklemyer, Ross S [EMAIL PROTECTED] wrote:
> I am having difficulty calculating the median of grouped data.  I have 8 to 
> 10 repeated measures per sample and I have successfully used the following 
> code to calculate the average for each sample.


You might want to create a wrapper around ave like this:

myave <- function(..., ave.FUN = mean)
  ave(..., FUN = ave.FUN)

and pass needed ave.FUN parameter to apply.


[R] Alternative of Cairo

2008-06-27 Thread [EMAIL PROTECTED]
Hi All,

I am a new member to R programming.

I am generating some visuals using the Cairo library. But Cairo is not
compatible with all compilers (box plots, histograms and RNA degradation
plots; I would prefer to use a library rather than R's native
functions). Can anyone suggest an alternative to this library?

Thanks in Advance

Re: [R] running R-code outside of R

2008-06-25 Thread [EMAIL PROTECTED]
On 25 Jun, 09:28, Jim Porzak [EMAIL PROTECTED] wrote:
> The user of your R script sees only the outputs you create. The R source is

But you may expose R sources via the web server if you wish.


Re: [R] Question about copula-GARCH model

2008-06-23 Thread [EMAIL PROTECTED]
A simple approach is to assume that dependence structure between
variables (which is characterized by copula) is constant throughout
the process. In this case, you may apply log-likelihood estimation of
copula parameters to ranked AR-GARCH process residuals.

A more complicated approach is to invent a model of how copula
parameters depend on the previous values of the process. This *might*
provide a better description of some empirical effects. But in
reality, these models are not reliable enough, because you are likely
to deal with many more parameters than in univariate case, and there
is no general way to estimate the parameters of copula evolution.
However, attempts are still made (see for example package mgarchBEKK).


Re: [R] Programming Concepts and Philosophy

2008-06-20 Thread [EMAIL PROTECTED]
On 20 Jun, 11:06, Wacek Kusnierczyk
> the result may be that the more beautiful the code, the more the performance

Sad but true.


Re: [R] combining two data frames (different question)

2008-06-18 Thread [EMAIL PROTECTED]



[R] qt with ncp>37.62

2008-06-14 Thread [EMAIL PROTECTED]
help(qt) states that: 
ncp: non-centrality parameter delta; currently except for rt(), only for
abs(ncp) <= 37.62

so I would expect that calling qt with non-centrality parameter exceeding
37.62 should fail, instead e.g. calling

 mapply(function(x) qt(p = 0.9, df = 55, ncp = x),35:45)


 [1] 40.21448 41.35293 42.49164 43.68862 44.82945 45.97048 47.11170 48.25310
 [9] 49.39467 50.53639 51.67826
Warning messages:
1: In qt(p = 0.9, df = 55, ncp = x) :
  full precision was not achieved in 'pnt'
2: In qt(p = 0.9, df = 55, ncp = x) :
  full precision was not achieved in 'pnt'
3: In qt(p = 0.9, df = 55, ncp = x) :
  full precision was not achieved in 'pnt'

So it seems that for the values of ncp allowed by the documentation (here
35, 36 and 37) the calculation is done and its precision is checked, whereas
the results for the remaining values may be completely incorrect?
Or was the code (in pnt.c?) updated to allow calculation of pt with
higher ncp, without a corresponding documentation update?

Thank you,
Nikola Kaspříková
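
One way to probe the question above is a round trip: feed each quantile
returned by qt() back into pt() and see how close the result is to the
requested probability. The sketch below does this for the same df and ncp
values as in the post. Note the limitation: qt() is computed by inverting
pt(), so a small round-trip error only shows that the two functions are
mutually consistent, not that either matches the true non-central t
distribution. The variable names are hypothetical.

```r
ncp_values <- 35:45
target_p <- 0.9

# Quantiles for each non-centrality parameter (warnings silenced here;
# remove suppressWarnings() to see the 'pnt' precision warnings)
q <- suppressWarnings(
  vapply(ncp_values,
         function(d) qt(p = target_p, df = 55, ncp = d),
         numeric(1))
)

# Feed each quantile back through pt() with the matching ncp
p_back <- suppressWarnings(
  mapply(function(x, d) pt(x, df = 55, ncp = d), q, ncp_values)
)

round_trip <- data.frame(
  ncp       = ncp_values,
  quantile  = q,
  p_back    = p_back,
  abs_error = abs(p_back - target_p)
)
print(round_trip)
```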

[R] Random Forests regression by strata

2008-06-02 Thread [EMAIL PROTECTED]
I'm trying to sample in Random Forests by a factor, but it is a regression 
problem and I can't figure out how to do this (I can only see how to sample by 
strata in classification).
Jesse Lasky

[R] plot 7 * 3 matrix on DIN A4 pdf

2008-05-22 Thread [EMAIL PROTECTED]

I want to plot 21 scatter plots in a 7*3 matrix on a single A4 page
(using mfcol). Below is an example, which by and large looks very
close to the desired result, but I cannot get it to fill the whole
page. If I specify a different page size, the plots still do not
fill the whole page.
Does anybody have an idea what to do? Maybe adjusting the individual plot
size (if so, how?). Or is there a neater command to create a matrix of
individual plots?


par(pty="m") # "s" makes a square plot

par(mar=c(3, 5, 0, 0) + 0.1)
plot(x,xlim=c(0,50),ylim=c(0,100),axes = FALSE)
plot(x,xlim=c(0,50), ylim=c(0,100),axes = FALSE)
plot(x, xlim=c(0,50),ylim=c(0,100),axes = FALSE)
plot(x,xlim=c(0,50), ylim=c(0,100),axes = FALSE)
plot(x,xlim=c(0,50), ylim=c(0,100),axes = FALSE)
plot(x,xlim=c(0,50), ylim=c(0,100),axes = FALSE)
par(mar=c(3, 5 ,0, 0) + 0.1)
plot(x,xlim=c(0,50), ylim=c(0,100),axes = FALSE)

par(mar=c(3, 0, 0, 0) + 0.1)
plot(as.integer(Sex),GER_TOTAL,ylim=c(0,100),xlim=c(0,3),axes = FALSE,pch=12)
plot(x,ylim=c(0,100),xlim=c(0,3),axes = FALSE,pch=12)
plot(x,ylim=c(0,100),xlim=c(0,3),axes = FALSE,pch=12)
plot(x,ylim=c(0,100),xlim=c(0,3),axes = FALSE)
plot(x,ylim=c(0,100),xlim=c(0,3),axes = FALSE,pch=12)
plot(x,ylim=c(0,100),xlim=c(0,3),axes = FALSE,pch=12)
par(mar=c(3, 0, 0, 0) + 0.1)
plot(as.integer(Sex),GER_F6,ylim=c(0,100),xlim=c(0,3),axes = FALSE)

par(mar=c(3, 0, 0, 0) + 0.1)
plot(x,xlim=c(0,100),ylim=c(0,100),axes = FALSE,pch=12)
plot(x,xlim=c(0,100), ylim=c(0,100),axes = FALSE)
plot(x, xlim=c(0,100),ylim=c(0,100),axes = FALSE)
plot(x, xlim=c(0,100),ylim=c(0,100),axes = FALSE)
plot(x, xlim=c(0,100),ylim=c(0,100),axes = FALSE,pch=12)
plot(x, xlim=c(0,100),ylim=c(0,100),axes = FALSE)
par(mar=c(3, 0, 0, 0) + 0.1)
plot(x,xlim=c(0,100),ylim=c(0,100),axes = FALSE,pch=12)
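
A minimal sketch of one way to make a 7*3 grid fill an A4 page: open the
pdf device with the A4 dimensions in inches, then shrink the inner and
outer margins. The data, the file name and the margin values are
assumptions chosen for illustration, not taken from the original post.

```r
# Hypothetical data standing in for the 21 variables of the post
set.seed(1)
panels <- replicate(21, runif(30, 0, 100), simplify = FALSE)

out_file <- file.path(tempdir(), "matrix_7x3_a4.pdf")

# A4 portrait is 8.27 x 11.69 inches; the default paper = "special"
# makes the page exactly width x height
pdf(out_file, width = 8.27, height = 11.69)

par(mfcol = c(7, 3),          # 7 rows, 3 columns, filled column by column
    mar = c(2, 2, 0.5, 0.5),  # small margins around each panel
    oma = c(1, 1, 1, 1),      # thin outer margin around the page
    pty = "m")                # each panel uses its maximal plotting region

for (y in panels) {
  plot(y, ylim = c(0, 100), axes = FALSE, xlab = "", ylab = "")
  box()
}

dev.off()
```

Looping over a list keeps the 21 calls identical; per-column settings such
as different xlim values could be looked up inside the loop.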

[R] Finding dependencies and clusters in live survey data with a mix of independent variable types

2008-05-08 Thread [EMAIL PROTECTED]
I have a set of live data about customer satisfaction and desires of a
live ecommerce site. There are only 311 survey responses. There were
approximately 154 questions. A large fraction of these questions were
questions with numerical answers (e.g. on a scale of 1 to 10 how
satisfied are you with our service,  how many months have you been a
customer for, how old are you, how many computers do you own). A
second large fraction of the questions had binary answers (e.g. do you
own an ipod,  do you think blogging will be more or less popular in 5
years time than it is now, do you use online video sites). The
remaining data were multinomial answers (e.g. from which of these
sources did you first find out about this site,  which of these most
closely describes the industry you are in).

I am mostly interested in finding subsets of customers for whom some
subset of survey answers best correlates with their answer to the
question "On a scale of 1 to 10, how would you rate our overall
service?" I am also interested in identifying market segments of
like-minded individuals with similar interests and views, and in finding
out what they, as a group, most want from the service in the future.

I am aware of how to perform multiple linear regression using R but I
am not sure how to
1. handle the binary variables and multinomial variables as
independent variables
2. find a set of canonical independent variables which most closely
correlate in combination to the overall service rating data
3. find market segments among the data by looking for clusters of like
interests and views

Are any of the above suitable for analysis in R? If so, are there
example programs that achieve similar things which I can study as
guides?

Thanks in advance for your contemplation.
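
A rough sketch of how the three points above might be approached in base R,
using simulated data. Everything here is an assumption for illustration:
the column names are hypothetical, stepwise selection by AIC is swapped in
as one simple way to pick predictors, and hierarchical clustering on
dummy-coded, scaled columns is only a crude way to segment mixed data.

```r
set.seed(123)
n <- 311

# Simulated survey with numeric, binary and multinomial answers
survey <- data.frame(
  months_customer = rpois(n, 12),
  owns_ipod = factor(sample(c("no", "yes"), n, replace = TRUE)),
  source = factor(sample(c("search", "friend", "advert"), n, replace = TRUE))
)
survey$rating <- pmin(10, pmax(1, round(
  5 + 0.1 * survey$months_customer +
    (survey$owns_ipod == "yes") + rnorm(n))))

# 1. Factors are dummy-coded automatically by the model formula
fit <- lm(rating ~ months_customer + owns_ipod + source, data = survey)
print(coef(fit))

# 2. Stepwise selection by AIC as one simple predictor-selection scheme
reduced <- step(fit, trace = 0)
print(formula(reduced))

# 3. Segments: dummy-code, scale, then hierarchical clustering
X <- scale(model.matrix(~ months_customer + owns_ipod + source,
                        data = survey)[, -1])
segments <- cutree(hclust(dist(X), method = "ward.D2"), k = 3)
print(table(segments))
```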


[R] creating a matrix subset based on a threshold cutoff

2008-03-04 Thread [EMAIL PROTECTED]


I have a table of x rows and y columns. The table is huge, so I'd like
to create a subset of the data containing rows where any of the y values are
below a threshold, say 1e-4. Is there a simple way of doing this in R?


Re: [R] creating a matrix subset based on a threshold cutoff

2008-03-04 Thread [EMAIL PROTECTED]

Sorry, it would probably be better described as a table.


On Tue Mar 4 11:28, Henrique Dallazuanna sent:
Do you have a table or a matrix?
If it is a matrix:
x <- matrix(rnorm(100), 10, 10)
cutoff <- -1.5
do.call(rbind, apply(x, 1, function(.x) .x[any(.x < cutoff)]))
 Henrique Dallazuanna
 25° 25' 40 S 49° 16' 22 O
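
The same row selection can also be sketched with a logical row index, which
keeps the result a matrix even when every row or no row qualifies. The
example data and names below are assumptions for illustration.

```r
set.seed(1)
x <- matrix(rnorm(100), 10, 10)
cutoff <- -1.5

# TRUE for every row that has at least one value below the cutoff
keep <- apply(x, 1, function(row) any(row < cutoff))

# drop = FALSE keeps a matrix even if only one (or no) row is selected
subset_rows <- x[keep, , drop = FALSE]

# Equivalent vectorized test, usually faster on large matrices
keep_fast <- rowSums(x < cutoff) > 0

print(dim(subset_rows))
```

For a data frame with only numeric columns, the rowSums() form applies in
the same way.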


Re: [R] creating a matrix subset based on a threshold cutoff

2008-03-04 Thread [EMAIL PROTECTED]

That's great, thanks.
On Tue Mar 4 12:29, Henrique Dallazuanna sent:
I think that this should work:
do.call(rbind, apply(as.table(x), 1, function(.x) .x[any(.x < cutoff)]))
Replace as.table(x) with your table.
 Henrique Dallazuanna
 25° 25' 40 S 49° 16' 22 O


[R] R on a computer cluster

2008-02-16 Thread [EMAIL PROTECTED]
Dear all,

I usually run R on my laptop with Windows XP Professional.
Now I really want to run R on a computer cluster (4 processors) with
Suse Linux Enterprise ver. 10, but I am new to computer clusters.

Should I modify my functions in order to use the greater 
and availability than that provided by my laptop?

Is there any R manual on parallel computation on multiple processors?
Any suggestion for a basic tutorial on this topic?

Thank you.

Gabriele Accetta
O. Biostatistics  (C.S.P.O)
Via Cosimo il Vecchio, 2
50139 FLORENCE - 
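
As a rough illustration of what "modifying functions" can mean here: work
that is already written as lapply() over independent tasks can be spread
over several processors with little change. The sketch below assumes the
parallel package that ships with current versions of R (it postdates this
thread) and a Linux machine, since mclapply() relies on forking. The task
function and the worker count are hypothetical.

```r
library(parallel)

# Hypothetical independent task: deterministic, so results are comparable
slow_task <- function(i) sum(sqrt(seq_len(i * 1e5)))

tasks <- 1:8
n_workers <- 2L  # assumption; a 4-processor machine could use up to 4

# Sequential version
res_seq <- lapply(tasks, slow_task)

# Parallel version: same call shape, work split across forked processes
res_par <- mclapply(tasks, slow_task, mc.cores = n_workers)

print(all.equal(unlist(res_seq), unlist(res_par)))
```

Code that updates shared state inside a loop would need restructuring into
independent tasks before this pattern applies.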

[R] data manipulation for plotting

2008-02-14 Thread [EMAIL PROTECTED]


I'd like to plot some data that I have, with the value on the x axis and
the frequency on the y axis.

So I need to calculate how often each value occurs in my data vector.

For example, say I have a vector of data


   I want



in order to enable me to plot this. Sorry, I'm new to R. What is the
standard procedure here for plotting the data vector?
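
A minimal sketch of counting and plotting frequencies with table(). The
data vector below is a made-up example, since the vector from the original
message is not available.

```r
# Hypothetical data vector
values <- c(2, 3, 3, 5, 2, 3, 7)

# Count how often each distinct value occurs
freq <- table(values)
print(freq)

# Value on the x axis, frequency on the y axis, drawn as vertical lines
plot(as.numeric(names(freq)), as.vector(freq),
     type = "h", xlab = "value", ylab = "frequency")
```

Calling plot(freq) directly gives a similar picture; the explicit form
above shows where the x and y coordinates come from.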



[R] axis help

2008-02-06 Thread [EMAIL PROTECTED]

Hi, I'm having trouble with my x and y axes. The commands I'm using are
below. The problem is that the y axis starts at coordinate (0,1) and the x
axis starts at coordinate (0,0). As far as I know, the y axis can't start at 0
(because it's log scaled), so I would like to position the x axis at (0,1) but
don't know how to do this. Could anyone advise?




ylim=c(1,4000), at=c(0.5,0.6,0.7), boxwex=0.05, axes=FALSE, col =

[R] boxplot axis labelling

2008-01-24 Thread [EMAIL PROTECTED]


I'm very new to R, so sorry for what I'm sure is a very basic question. I'm
producing a boxplot with the data below:

   boxplot(dfnew,log='y', ylim=c(1,4000))

This produces x axis labels 'kcvd3,kcan3,knoncan3,k3', one for each plot as
you might expect. However, I would like all the plots to sit next to each
other with a single label. Could anybody help?


[R] integration

2007-12-18 Thread [EMAIL PROTECTED]
Dear All,
I need to perform numerical integration of one-dimensional
functions. The limits of integration are both finite, and the functions
I'm working on are quite complicated. I have already tried both area()
and integrate(), but they do not perform well: area() is very slow and
integrate() does not converge. Are there other functions in R for
numerical integration of one-dimensional functions?

Thanks in advance 
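
Before switching functions, it may be worth trying integrate() with more
subdivisions, or splitting the interval into pieces and summing the
results. The sketch below shows both; the integrand is a simple
hypothetical stand-in, and whether either step helps depends on the actual
function. Note that integrate() expects the integrand to be vectorized.

```r
# Hypothetical stand-in for the complicated integrand; must accept a vector
f <- function(x) sin(x)

# Whole interval, with more subdivisions and a tighter tolerance
whole <- integrate(f, lower = 0, upper = pi,
                   subdivisions = 1000L, rel.tol = 1e-10)

# Split the interval at chosen break points and sum the pieces
breaks <- seq(0, pi, length.out = 5)
pieces <- mapply(
  function(a, b) integrate(f, a, b, subdivisions = 1000L)$value,
  breaks[-length(breaks)], breaks[-1]
)
total <- sum(pieces)

print(c(whole = whole$value, split = total))
```

Placing break points at known kinks or sharp peaks of the integrand is the
usual reason for splitting.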



[R] importing Expressions

2007-10-26 Thread [EMAIL PROTECTED]
I need to import some expressions from Mathematica into the R editor for
coding purposes. If I just copy the expression from Mathematica into the
editor, I obtain a useless string. For instance, if I want to copy the
expression beta/sigma from Mathematica, I get \!\(β\/σ\) in the R
editor. Is there a way to import this kind of expression from
Mathematica into R?

many thanks



