Re: [R] Using multicores in R

2012-12-04 Thread moriah
Thanks for the help.

Perhaps I should elaborate a bit. I am working on a bioinformatics project in
which I am trying to run a forward selection algorithm for machine learning
classification of two biological conditions.
At each iteration I want to find the gene that, in addition to those I have
already found, gives the best classification.

It looks something like this:

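library(e1071)  # provides naiveBayes()

# Assumed starting values; the initialisation was not shown in the post.
idx <- integer(0)    # column indices of the genes selected so far
res <- data.frame()  # one row per iteration

for (j in 1:5030) {
  tp <- 0
  for (i in 1:5030) {
    if (!(i %in% idx)) {
      # Train on the candidate gene plus the genes already selected.
      # drop = FALSE keeps the column name when only one column is picked.
      classifier <- naiveBayes(trn[, c(i, idx), drop = FALSE], trn[, 20118])
      tbl <- table(predict(classifier, trn[, -20118]), trn[, 20118])
      # Fraction of correct predictions: diagonal of the confusion table.
      success <- sum(diag(tbl)) / sum(tbl)

      # Remember the best candidate of this iteration.
      if (success > tp) {
        tp <- success
        ind <- i
        gene <- names(trn)[i]
      }
    }
  }
  idx <- c(idx, ind)
  res <- rbind(res, data.frame(Iteration = j, Success = tp * 100, Gene = gene))
}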

I am no expert when it comes to programming, so I am not sure how I can
best optimize my relatively primitive code...
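
One possible direction, as a rough sketch rather than a tested solution: within
one iteration the candidate genes are scored independently of each other, so the
inner loop is the natural place to use several cores. In the sketch below,
forward_select() and toy_score() are hypothetical names, the R-squared of a
linear fit on simulated data stands in for the naiveBayes accuracy so that the
example runs without e1071 or the real data, and it assumes that mclapply() from
the "parallel" package is usable (it forks, so on Windows the sketch falls back
to one core; parLapply() from the same package is the non-forking alternative).

```r
library(parallel)  # ships with R

# Hypothetical helper: greedy forward selection in which the candidates of
# one round are scored in parallel. `score` is any function that takes a
# vector of column indices and returns one number (higher is better).
forward_select <- function(score, n_features, n_steps, cores = 1L) {
  idx <- integer(0)
  res <- vector("list", n_steps)
  for (j in seq_len(n_steps)) {
    candidates <- setdiff(seq_len(n_features), idx)
    # Candidates within a round do not depend on each other.
    scores <- unlist(mclapply(candidates,
                              function(i) score(c(i, idx)),
                              mc.cores = cores))
    best <- which.max(scores)  # first maximum, like `success > tp`
    idx <- c(idx, candidates[best])
    res[[j]] <- data.frame(Iteration = j,
                           Success = scores[best],
                           Feature = candidates[best])
  }
  do.call(rbind, res)  # bind once instead of growing with rbind() in the loop
}

# Toy stand-in for the classifier accuracy (an assumption for illustration).
set.seed(1)
x <- matrix(rnorm(200 * 6), ncol = 6)
y <- 3 * x[, 2] + x[, 5] + rnorm(200, sd = 0.1)
toy_score <- function(cols) {
  summary(lm(y ~ x[, cols, drop = FALSE]))$r.squared
}

# Forking is not available on Windows, so use a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
out <- forward_select(toy_score, n_features = 6, n_steps = 2, cores = cores)
print(out)
```

For the real problem, `score` would wrap the three lines that fit the
classifier, build the table and compute the success rate for a given set of
columns.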

Thanks,
Moriah





--
View this message in context: 
http://r.789695.n4.nabble.com/Using-multicores-in-R-tp4651808p4652034.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using multicores in R

2012-12-03 Thread Jim Porzak
Moriah,

Since you are doing nested loops, Rcpp may be an easy speed-up. Follow
all the links here
http://blog.revolutionanalytics.com/2012/11/hadleys-guide-to-high-performance-r-with-rcpp.html
for details.

HTH,
Jim Porzak
Minted.com
San Francisco, CA
www.linkedin.com/in/jimporzak
use R! Group SF: www.meetup.com/R-Users/


On Mon, Dec 3, 2012 at 2:14 AM, moriah wrote:
> Hi,
>
> I have an R script that is time-consuming because it contains two nested
> loops of at least 5000 iterations each. I have tried to use the multicore
> package, but it doesn't seem to improve the elapsed time of the script (a
> shorter script, for example), and I can't use mclapply for technical
> reasons.
>
> I was wondering how I can make my script use more cores and memory, because
> I am running it on a server and it is a shame that it uses only one core.
>
> Thanks!
> Moriah


Re: [R] Using multicores in R

2012-12-03 Thread Spencer Graves
  1.  Have you looked at the CRAN Task View "High-Performance and
Parallel Computing with R"
(http://cran.r-project.org/web/views/HighPerformanceComputing.html)?

  2.  Have you tried the "compiler" package?  If I understand
correctly, R is a two-stage interpreter, first translating what we know
as R into byte code, which is then interpreted by a byte code
interpreter.  If my memory is correct, this approach can cut the compute
time by a factor of 100.

  3.  Have you reviewed the section on "Profiling R code for speed"
in the "Writing R Extensions" manual that becomes available after
help.start()?  The profiling tools discussed there help identify the
portion of more complex code that takes the most time.  The standard
advice then is to experiment with writing the most time-consuming
portion several different ways.  I've seen many examples where writing
what appears to be the same thing in R several different ways identifies
one that is easily 10 and maybe 100 or 1000 times faster than the
slowest alternative tried.

  4.  Have you tried using the "sos" package to search for other
functions and packages in R that may already have good code doing some
of the things you want to do?  The "findFn" function in "sos" searches
the "functions" subset of the "RSiteSearch" database and returns the
result sorted by package.  There are also "union" and
"writeFindFn2xls" functions to make it easy to manipulate and evaluate
the results, described in a vignette.  It's the best literature search I
know for anything statistical:  If I don't find it there, it's OK to look
someplace else.  [Caveat:  I'm the lead author of "sos", so I'm biased.]
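
Points 2 and 3 can be tried together on a small scale. The following is only a
sketch under assumptions: grow_rbind() and fill_vectors() are hypothetical
functions that build the same result table in two ways (growing a data frame
with rbind() in a loop, as in a typical accumulation loop, versus filling a
preallocated vector), and system.time() is used as the simplest timing tool;
Rprof() and summaryRprof() are the profiling tools the manual section covers.
Timings depend on the machine, so none are quoted here. Note also that current
R versions byte-compile functions automatically, so the explicit cmpfun() call
may make little difference there.

```r
library(compiler)  # ships with R; provides cmpfun()

# Variant 1: grow a data frame row by row (copies the table every time).
grow_rbind <- function(n) {
  res <- data.frame()
  for (j in seq_len(n)) {
    res <- rbind(res, data.frame(Iteration = j, Success = j / n))
  }
  res
}

# Variant 2: fill a preallocated vector, build the data frame once.
fill_vectors <- function(n) {
  success <- numeric(n)
  for (j in seq_len(n)) success[j] <- j / n
  data.frame(Iteration = seq_len(n), Success = success)
}

n <- 2000
t_grow <- system.time(a <- grow_rbind(n))[["elapsed"]]
t_fill <- system.time(b <- fill_vectors(n))[["elapsed"]]

# Explicit byte compilation of the second variant.
fill_cmp <- cmpfun(fill_vectors)
t_cmp <- system.time(b2 <- fill_cmp(n))[["elapsed"]]

# Elapsed seconds for each variant; compare them on your own machine.
print(c(grow = t_grow, fill = t_fill, compiled = t_cmp))
```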



  Best Wishes,
  Spencer


On 12/3/2012 6:24 AM, Steve Lianoglou wrote:
> And also:
>
> On Monday, December 3, 2012, Uwe Ligges wrote:
>>
>> On 03.12.2012 11:14, moriah wrote:
>>> Hi,
>>>
>>> I have an R script that is time-consuming because it contains two nested
>>> loops of at least 5000 iterations each. I have tried to use the multicore
>>> package, but it doesn't seem to improve the elapsed time of the script (a
>>> shorter script, for example), and I can't use mclapply for technical
>>> reasons.
>>
>> Errr, but otherwise multicore does not have an effect ...
>>
>> See package "parallel" that offers various functions for parallel
>> computations. We cannot help much more if you do not tell us what the
>> technical reasons are why mclapply() does not work.
>
> If the work you are doing within each iteration of the loop is trivial, you
> will likely even see a decrease in performance if you try to parallelize it.
>
> Without more info from you regarding your problem, there's little we can do
> to help, tho.
>
>   -Steve


--
Spencer Graves, PE, PhD
President and Chief Technology Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567
web:  www.structuremonitoring.com




Re: [R] Using multicores in R

2012-12-03 Thread Steve Lianoglou
And also:

On Monday, December 3, 2012, Uwe Ligges wrote:

>
>
> On 03.12.2012 11:14, moriah wrote:
>
>> Hi,
>>
>> I have an R script that is time-consuming because it contains two nested
>> loops of at least 5000 iterations each. I have tried to use the multicore
>> package, but it doesn't seem to improve the elapsed time of the script (a
>> shorter script, for example), and I can't use mclapply for technical
>> reasons.
>>
>
> Errr, but otherwise multicore does not have an effect ...
>
> See package "parallel" that offers various functions for parallel
> computations. We cannot help much more if you do not tell us what the
> technical reasons are why mclapply() does not work.


If the work you are doing within each iteration of the loop is trivial, you
will likely even see a decrease in performance if you try to parallelize it.
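
A quick way to check this on a given machine, sketched under assumptions:
cheap() is a hypothetical stand-in for a trivial loop body, and mclapply() from
the "parallel" package is used with two cores (one on Windows, where forking is
not available). The timings vary from machine to machine, so none are quoted
here; the point is only to compare the two numbers before parallelizing.

```r
library(parallel)  # ships with R

cheap <- function(i) sqrt(i)  # trivial work per iteration
x <- 1:5000

# Forking is not available on Windows, so use a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Same computation, once serially and once spread over forked workers.
t_serial <- system.time(s <- lapply(x, cheap))[["elapsed"]]
t_forked <- system.time(p <- mclapply(x, cheap, mc.cores = cores))[["elapsed"]]

# Both approaches must give the same answer; only the elapsed time differs.
stopifnot(identical(s, p))
print(c(serial = t_serial, forked = t_forked))
```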

Without more info from you regarding your problem, there's little we can do
to help, tho.

 -Steve



-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] Using multicores in R

2012-12-03 Thread Uwe Ligges



On 03.12.2012 11:14, moriah wrote:

> Hi,
>
> I have an R script that is time-consuming because it contains two nested
> loops of at least 5000 iterations each. I have tried to use the multicore
> package, but it doesn't seem to improve the elapsed time of the script (a
> shorter script, for example), and I can't use mclapply for technical
> reasons.


Errr, but otherwise multicore does not have an effect ...

See package "parallel" that offers various functions for parallel
computations. We cannot help much more if you do not tell us what the
technical reasons are why mclapply() does not work.


Best,
Uwe Ligges





> I was wondering how I can make my script use more cores and memory, because
> I am running it on a server and it is a shame that it uses only one core.
>
> Thanks!
> Moriah






[R] Using multicores in R

2012-12-03 Thread moriah
Hi,

I have an R script that is time-consuming because it contains two nested
loops of at least 5000 iterations each. I have tried to use the multicore
package, but it doesn't seem to improve the elapsed time of the script (a
shorter script, for example), and I can't use mclapply for technical
reasons.

I was wondering how I can make my script use more cores and memory, because
I am running it on a server and it is a shame that it uses only one core.

Thanks!
Moriah
 


