Re: [Rd] [R] multithreading calling from the rpy Python package

2006-10-12 Thread René J.V. Bertin
Thanks, Duncan,

> It is a mixture of two things. Yes, R is not thread safe so if
> two system threads were to access R concurrently, bad things would
> happen a.s.

  That's clear, yes. :-/ And a pity, but so be it.

> It is also an issue when Python is compiled and linked with
> threaded options and routines from  the system, e.g. libpthread
> and R is not.  When R is dynamically loaded into the Python
> process, unless R is very carefully compiled, symbols (i.e. routines)

I built Python with --enable-threads, but I don't think R has a build
option for this?

> that R uses will come from the Python executable and these may not
> agree with R's view at compilation. And bad things happen.

  But that would also happen in single-threaded applications, and it
doesn't. Unless I'm understanding you wrong...

> This depends on your operating system, and it doesn't appear that
> you have told us what that is. Bad boy :-)

  Indeed it depends on the OS. Read again. It says (somewhere...) that
I'm using Mac OS X 10.4.8 :P . And under that OS, symbols are not
visible by default across shared libraries.

> This is an issue with Rpy, RSPython, RSPerl, R apache module, rJava, ...

Rpy only allows the creation of a single R "instance". Even if more were
possible, it probably wouldn't help to create as many instances as
there are threads, right? The "memory not mapped" error message
suggests one thread tries to access memory that was just freed by
another thread. A bit surprising maybe that this happens in a function
that appears to be intended to be recursive (judging from the
traceback). As far as I understand, thread-safe means re-entrant which
means recursive-safe too...

...

> e.g. make it extensible at the native level.  For stat. computing
> to continue to grow and for all of us to be able to explore newer
> areas, we probably need to think about building infrastructure for the
> next 5- 10 years and not continue to tweak a model that has been around
> for 30 years.  How we do this requires some serious thought

  I couldn't agree more, but I have no suggestions.

> and evaluating trade-offs of building things ourselves with a small
> community or leveraging other existing or emerging systems, e.g. Python,
> Perl6/Parrot, etc.

  Well, Python is great, numpy and scipy allow one to do serious work,
but there are things in which R has a clear advantage. Just to name
some: handling of missing values is one (and the reason I'm not using
numpy or scipy's var function). Slicing is another (somewhat
cumbersome in Python), data.frames yet another. I'm not sure how easy
it would be to extend Python's syntax to accommodate something as
useful as

a[ is.na(a) ] <- -1
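
For comparison, a rough plain-Python counterpart of the R assignment above. It assumes missing values are represented as float NaN; Python has no built-in NA, so this only approximates R's semantics. NumPy arrays also accept a boolean mask on the left-hand side, as in `a[numpy.isnan(a)] = -1`, but the sketch sticks to the standard library so that it runs anywhere.

```python
import math

a = [1.0, float("nan"), 3.5, float("nan")]

# Slice assignment replaces the contents in place, as the R statement does.
a[:] = [-1.0 if math.isnan(x) else x for x in a]
```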

R.B.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [R] multithreading calling from the rpy Python package

2006-10-20 Thread René J.V. Bertin
Since Python has been mentioned in this context: Could not Python's
threading model and implementation serve as a guideline?

From a few simple benchmarks I've run, it seems as if the Python
interpreter itself is thread-safe but not threadable. That is, when I
run something "pure Python" like a recursive function that returns the
nth Fibonacci number in parallel, there is no speed-up for 2 threads
on a dual-processor machine. However, calling sleep in parallel does
scale down with the number of threads, even on a single-processor ;)
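
A minimal sketch of the kind of benchmark described above; it is not the original test code. The helper `timed`, the problem size `N` and the sleep duration `NAP` are illustrative choices. On CPython builds that use a global interpreter lock, one would expect the two-thread Fibonacci run to take roughly twice as long as the one-thread run, while the sleeping threads overlap; the actual numbers depend on the interpreter build and the machine.

```python
import threading
import time


def fib(n):
    """Naive recursive Fibonacci: pure-Python, CPU-bound work."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def timed(target, args, n_threads):
    """Run target(*args) in n_threads threads; return elapsed wall-clock seconds."""
    threads = [threading.Thread(target=target, args=args) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


N = 20     # small enough to finish quickly; raise it for a real measurement
NAP = 0.2  # seconds each thread sleeps

cpu_1 = timed(fib, (N,), 1)
cpu_2 = timed(fib, (N,), 2)             # twice the total work; compare with cpu_1
sleep_1 = timed(time.sleep, (NAP,), 1)
sleep_4 = timed(time.sleep, (NAP,), 4)  # four sleeps that can overlap

print(f"fib:   1 thread {cpu_1:.3f}s, 2 threads {cpu_2:.3f}s")
print(f"sleep: 1 thread {sleep_1:.3f}s, 4 threads {sleep_4:.3f}s")
```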

Real-life code does tend to speed up somewhat, though never as much as
one would hope.

Just an idea...

René



Re: [Rd] [R] multithreading calling from the rpy Python package

2006-10-20 Thread René J.V. Bertin
> But it still remains to be seen whether the extra work to introduce
> threads is warranted. Will people actually use them in R and will it
> have a significant impact on the computations or simply make writing
> GUIs within R slightly easier to manage?

  If threads can be set up easily, why not? Now that multi-core
machines are becoming more easily available...

It is not just about reducing computation time, btw. Not so long ago,
I was setting up a system in Matlab to do concurrent sampling of a DAQ
and an eye-tracker, and to show and record the sampled data. The DAQ
toolbox fires off its own thread that does the actual sampling and can
be configured to call a Matlab callback function at a predetermined
interval.
The eye-tracker code is single-threaded. If Matlab had been
threadable, I'd have been able to sample the tracker in a different
thread, and not miss out on the data coming in while plotting.
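
The arrangement described above, with one thread sampling while another plots, could be sketched in Python as follows. This is an illustration under assumptions, not the Matlab setup: `read_tracker` is a hypothetical stand-in for a blocking read from the eye-tracker, and the short `time.sleep` in the main loop stands in for a slow plotting step.

```python
import queue
import threading
import time

samples = queue.Queue()  # thread-safe hand-off between sampler and consumer


def read_tracker(i):
    """Hypothetical stand-in for a blocking read from the eye-tracker."""
    time.sleep(0.001)
    return i


def sampler(n_samples):
    """Background thread: keeps sampling regardless of what the main thread does."""
    for i in range(n_samples):
        samples.put(read_tracker(i))


t = threading.Thread(target=sampler, args=(50,))
t.start()

collected = []
while t.is_alive() or not samples.empty():
    time.sleep(0.01)            # stand-in for a slow plotting step
    while not samples.empty():  # drain whatever arrived in the meantime
        collected.append(samples.get())
t.join()
```

The queue buffers samples that arrive while the main thread is busy, so none are dropped even though the consumer is slower than the producer.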

> One of the reasons I am hesitant to use Python as a framework on
> which to build a new system is the "thread-safe but not threadable"
> issue. Also, it is not easily extensible in an object oriented manner

  Well, I didn't mean to suggest that it would be the perfect solution.
It seemed like a potentially worthwhile, feasible temporary solution
that would allow at least some multithreading. I don't see how it is
not easily extensible in an OO manner, though. The Python threads I
use *are* objects (and very similar apparently to Java's threading
model).
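
To illustrate the point about threads being objects: in Python's standard `threading` module a thread can be subclassed and given its own state, much as with `java.lang.Thread`. The class name `SquareWorker` and its attributes are made up for this sketch.

```python
import threading


class SquareWorker(threading.Thread):
    """A thread is an ordinary object: subclass it and override run()."""

    def __init__(self, value):
        super().__init__()
        self.value = value
        self.result = None

    def run(self):  # executed in the new thread after start()
        self.result = self.value ** 2


workers = [SquareWorker(v) for v in range(5)]
for w in workers:
    w.start()
for w in workers:
    w.join()

squares = [w.result for w in workers]
```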

Best,
René



Re: [Rd] [R] multithreading calling from the rpy Python package

2006-10-28 Thread René J.V. Bertin
I don't want to keep hammering on the possible interest of Python in
this context... but have you seen this?

http://ipython.scipy.org/moin/Parallel_Computing

I know, not exactly the same as multithreading ;)
