[Rd] Commenting conventions

2011-01-13 Thread dhinds
This might be a dumb question, but I couldn't figure out how to find
the answer: why is it that comments in R documentation files (i.e. in
examples) typically start with a double hash (##) instead of a single
hash?

-- Dave

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Commenting conventions

2011-01-14 Thread dhinds
Erik Iverson  wrote:

> dhi...@sonic.net wrote:
> > This might be a dumb question, but I couldn't figure out how to find
> > the answer: why is it that comments in R documentation files (i.e. in
> > examples) typically start with a double hash (##) instead of a single
> > hash?

> See the second paragraph in section 7.5 for the likely answer.

> http://ess.r-project.org/Manual/ess.html#Indenting

Ahh.  I'd forgotten the (setq ess-fancy-comments nil) in my .emacs
file!  I thought the explanation would turn up in an R coding
standards document and/or in Writing R Documentation Files, and it
isn't an easy thing to google.

-- Dave



Re: [Rd] Moderating consequences of garbage collection when in C

2011-11-09 Thread dhinds
Martin Morgan  wrote:
> Allocating many small objects triggers numerous garbage collections as R 
> grows its memory, seriously degrading performance. The specific use case 
> is in creating a STRSXP of several 1,000,000's of elements of 60-100 
> characters each; a simplified illustration understating the effects 
> (because there is initially little to garbage collect, in contrast to an 
> R session with several packages loaded) is below.

What a coincidence -- I was just going to post a question about why it
is so slow to create a STRSXP of ~10,000,000 unique elements, each ~10
characters long.  I had noticed that this seemed to show much worse
than linear scaling.  I had not thought of garbage collection as the
culprit -- but indeed it is.  By manipulating the GC trigger, I can
make this operation take as little as 3 seconds (with no GC) or as
long as 76 seconds (with 31 garbage collections).

-- Dave



Re: [Rd] Moderating consequences of garbage collection when in C

2011-11-14 Thread dhinds
dhi...@sonic.net wrote:
> Martin Morgan  wrote:
> > Allocating many small objects triggers numerous garbage collections as R 
> > grows its memory, seriously degrading performance. The specific use case 
> > is in creating a STRSXP of several 1,000,000's of elements of 60-100 
> > characters each; a simplified illustration understating the effects 
> > (because there is initially little to garbage collect, in contrast to an 
> > R session with several packages loaded) is below.

> What a coincidence -- I was just going to post a question about why it
> is so slow to create a STRSXP of ~10,000,000 unique elements, each ~10
> characters long.  I had noticed that this seemed to show much worse
> than linear scaling.  I had not thought of garbage collection as the
> culprit -- but indeed it is.  By manipulating the GC trigger, I can
> make this operation take as little as 3 seconds (with no GC) or as
> long as 76 seconds (with 31 garbage collections).

I had done some google searches on this issue, since it seemed like it
should not be too uncommon, but the only other hit I could come up
with was a thread from 2006:

https://stat.ethz.ch/pipermail/r-devel/2006-November/043446.html

In any case, one issue with your suggested workaround is that it
requires knowing how much additional storage is needed, which may be
an expensive operation to determine.  I've just tried implementing a
different approach, which is to define two new functions to either
disable or enable GC.  The function to disable GC first invokes
R_gc_full() to shrink the heap as much as possible, then sets a flag.
Then in R_gc_internal(), I first check that flag, and if it is set, I
call AdjustHeapSize(size_needed) and exit immediately.

These calls could be used to bracket any code section that expects to
make lots of calls to R's memory allocator.  The down side is that
this approach requires that all paths out of such a code section
(including error handling) need to take care to unset the GC-disabled
flag.  I think I would want to hear from someone on the R team about
whether they think this is a good idea.

A final alternative might be to provide a vectorized version of mkChar
that would accept a char ** and use one of these methods internally,
rather than exporting the underlying methods as part of R's API.  I
don't know if there are other clear use cases where GC is a serious
bottleneck, besides constructing large vectors of mostly unique
strings.  Such a function would be less generally useful since it 
would require that the full vector of C strings be assembled at one
time.

-- Dave



Re: [Rd] Moderating consequences of garbage collection when in C

2011-11-14 Thread dhinds
Martin Morgan  wrote:
> On 11/14/2011 11:47 AM, dhi...@sonic.net wrote:
> > dhi...@sonic.net wrote:
> >> Martin Morgan  wrote:
> >
> > I had done some google searches on this issue, since it seemed like it
> > should not be too uncommon, but the only other hit I could come up
> > with was a thread from 2006:
> >
> > https://stat.ethz.ch/pipermail/r-devel/2006-November/043446.html
> >
> > In any case, one issue with your suggested workaround is that it
> > requires knowing how much additional storage is needed, which may be
> > an expensive operation to determine.  I've just tried implementing a
> > different approach, which is to define two new functions to either
> > disable or enable GC.  The function to disable GC first invokes
> > R_gc_full() to shrink the heap as much as possible, then sets a flag.
> > Then in R_gc_internal(), I first check that flag, and if it is set, I
> > call AdjustHeapSize(size_needed) and exit immediately.

> I think this is a better approach; mine seriously understated the 
> complexity of figuring out required size.

> > These calls could be used to bracket any code section that expects to
> > make lots of calls to R's memory allocator.  The down side is that
> > this approach requires that all paths out of such a code section
> > (including error handling) need to take care to unset the GC-disabled
> > flag.  I think I would want to hear from someone on the R team about
> > whether they think this is a good idea.
> >

> Another place where this comes up is during package load, especially for 
> packages with many S4 instances.

Do you know if this is all happening inside a C function that could
handle disabling and enabling GC?  Or would it require doing this at
the R level?  For testing, I am turning GC on and off at the R level
but I am thinking about where we would need to check for failures to
re-enable GC.  I suppose one approach would be to provide an R wrapper
that would evaluate an expression with GC disabled using tryCatch to
guarantee that it would exit with GC enabled.

> > system.time(as.character(1:1000))
>    user  system elapsed
>  61.908   0.297  62.303

I get 6 seconds for this with GC disabled.

> There's a hierarchy of CHARSXP / STRSXP, so maybe that could be 
> exploited in the mark phase?

I haven't explored whether GC could be made smarter so that this isn't
as big of a hit.  I don't really understand the GC process.

-- Dave



Re: [Rd] Moderating consequences of garbage collection when in C

2011-11-14 Thread dhinds
Martin Morgan  wrote:

> > Do you know if this is all happening inside a C function that could
> > handle disabling and enabling GC?  Or would it require doing this at
> > the R level?  For testing, I am turning GC on and off at the R level

> Generally complicated operations across multiple function calls. 
> Something like

> f = function() {
>     state <- gcdisable(TRUE)
>     on.exit(gcdisable(state))
>     as.character(1:1000)
> }

> might be used.

Here is how I've implemented the core part of this (for discussion,
not a complete patch)

-- Dave




--- memory.c.orig   2011-04-04 15:05:04.0 -0700
+++ memory.c        2011-11-14 15:21:42.0 -0800
@@ -98,6 +98,7 @@
 */
 
 static int gc_reporting = 0;
+static int gc_disabled = 0;
 static int gc_count = 0;
 
 #ifdef TESTING_WRITE_BARRIER
@@ -2467,6 +2468,17 @@
 R_gc_internal(size_needed);
 }
 
+SEXP attribute_hidden do_gcdisable(SEXP call, SEXP op, SEXP args, SEXP rho)
+{
+    int i;
+    SEXP old = ScalarLogical(gc_disabled);
+    checkArity(op, args);
+    i = asLogical(CAR(args));
+    if (i != NA_LOGICAL)
+        gc_disabled = i;
+    return old;
+}
+
 #ifdef _R_HAVE_TIMING_
 double R_getClockIncrement(void);
 void R_getProcTime(double *data);
@@ -2541,6 +2553,14 @@
 SEXP first_bad_sexp_type_sexp = NULL;
 int first_bad_sexp_type_line = 0;
 
+    if (gc_disabled) {
+        AdjustHeapSize(size_needed);
+        if (NO_FREE_NODES() || VHEAP_FREE() < size_needed) {
+            gc_disabled = 0;
+            error("Heap adjustment failed -- enabling GC");
+        } else return;
+    }
+
  again:
 
 gc_count++;



[Rd] Maybe a bug in warning() for condition objects?

2006-10-02 Thread dhinds

I'm using R-2.3.1 but the code in question is the same in the
01-Oct-2006 snapshot for release 2.4.0.  I'd like to evaluate an
expression, catching errors and handling them as warnings.  My first
attempt:

  x <- tryCatch(lm(xyzzy), error=warning)

didn't work; the error is still treated as an error, not a warning.
So I thought, hmmm, the condition is still classed as an "error", how
about if I change that:

  as.warning <- function(e) warning(simpleWarning(e$message,e$call))
  x <- tryCatch(lm(xyzzy), error=as.warning)

Still no luck.  But this works:

  as.warning <- function(e) .signalSimpleWarning(e$message,e$call)
  x <- tryCatch(lm(xyzzy), error=as.warning)

I think the problem here is that warning() contains the code:

withRestarts({
.Internal(.signalCondition(cond, message, call))
.Internal(.dfltStop(message, call))
}, muffleWarning = function() NULL)

i.e., the default action is .dfltStop(), not .dfltWarn().  Is this
intentional?  It seems to make more sense to me for the default action
for conditions passed to warning() to be .dfltWarn(), but I may well
be misunderstanding something.

-- David Hinds



[Rd] Bug in warning() for condition objects (PR#9274)

2006-10-03 Thread dhinds
Full_Name: David Hinds
Version: 2.4.0
OS: Windows XP
Submission from: (NULL) (64.168.232.238)


A (maybe naive) use of tryCatch to trap errors and report as warnings does not
work, i.e.:

  x <- tryCatch(lm(xyzzy), error=warning)

In src/library/base/R/stop.R, the warning() function contains the following
code, for handling condition objects:

withRestarts({
.Internal(.signalCondition(cond, message, call))
.Internal(.dfltStop(message, call))
}, muffleWarning = function() NULL)

So all conditions result in calling .dfltStop().  It would seem more useful
and/or consistent for warning() to call .dfltWarn(), or in the alternative, to
choose between .dfltStop and .dfltWarn based on the class of the condition
object.



Re: [Rd] Watch out for the latest Cygwin upgrade

2006-10-25 Thread dhinds
Duncan Murdoch <[EMAIL PROTECTED]> wrote:
> I just updated my copy of Cygwin to the latest version, and now Windows 
> builds are failing on that machine.  The only parts of the R toolset I 
> changed were the Cygwin dlls.  I haven't tracked down exactly what the 
> problem is, and probably won't be able to do so for a few days.

> So if you're a Windows user thinking about a Cygwin upgrade, be prepared 
> for problems...

The change that bit me is that the latest cygwin bash is unhappy with
cr/lf line endings in scripts.  Which breaks some of the "R CMD ..."
scripts in non-obvious ways.  The error messages are hard to interpret
because they have embedded "\r" characters, causing some of the text
of the messages to be overwritten.

There is a sort-of workaround in the latest release (the "igncr" shell
option) but that seems to still be in flux so I decided to revert to
the previous release for now.

There seems to be a somewhat cavalier attitude among Cygwin developers
about backwards compatibility.  They've said that their primary focus
is on making cygwin as linux-like as possible, and they're willing to
sacrifice interoperability to do so.

-- Dave



Re: [Rd] DBI + ROracle problem with parser ?? (PR#9424)

2006-12-21 Thread dhinds
[EMAIL PROTECTED] wrote:

> doesn't:

> dbGetQuery(conn, "\nselect * from dual")

> dbGetQuery(conn, "select\n * from dual")

> dbGetQuery(conn, "/* comment */ select * from dual")

This sounds like my doing.  What version of Oracle are you using?
Oracle 9i has a bug that interferes with the documented mechanism for
asking Oracle for the type of an SQL statement (i.e. whether it is a
query returning row data, or a statement that modifies rows).  So I
asked David James for a quick fix that consisted of checking the
beginning of the SQL for either "select" or "with".

We could be more sophisticated about parsing things, I guess skipping
over any arbitrary combination of comments and white space.  Or, if
you're using a version of Oracle not affected by the bug, you can edit
src/Makefile and comment out the line:

  WORKAROUND = "-DRS_ORA_SQLGLS_WORKAROUND"
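
The more tolerant check mentioned above (skipping any mix of comments and
white space before looking for "select" or "with") could be sketched roughly
as follows.  This is only an illustration, not the ROracle code: the function
names are made up, and it assumes that only /* ... */ block comments and "--"
line comments need to be skipped.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Skip any run of white space, block comments and "--" line comments. */
static const char *skip_space_and_comments(const char *s)
{
    for (;;) {
        while (isspace((unsigned char) *s)) s++;
        if (s[0] == '/' && s[1] == '*') {
            const char *end = strstr(s + 2, "*/");
            if (!end) return s + strlen(s);   /* unterminated comment */
            s = end + 2;
        } else if (s[0] == '-' && s[1] == '-') {
            while (*s && *s != '\n') s++;     /* comment runs to end of line */
        } else {
            return s;
        }
    }
}

/* Case-insensitive test that s starts with the lower-case keyword kw,
   and that the keyword is not just the prefix of a longer identifier. */
static int has_keyword(const char *s, const char *kw)
{
    size_t i, n = strlen(kw);
    for (i = 0; i < n; i++)
        if (tolower((unsigned char) s[i]) != kw[i]) return 0;
    return !(isalnum((unsigned char) s[n]) || s[n] == '_');
}

/* Hypothetical helper: nonzero if the statement looks like a query. */
int looks_like_query(const char *sql)
{
    assert(sql != NULL);
    sql = skip_space_and_comments(sql);
    return has_keyword(sql, "select") || has_keyword(sql, "with");
}
```

With a check like this, all three statements quoted above would be treated as
queries, while something like "update ..." would not.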

-- Dave



Re: [Rd] "ROracle" Packages is not to be installed (PR#10652)

2008-01-29 Thread dhinds
[EMAIL PROTECTED] wrote:

> /opt/oracle/product/10g/lib/libclntst10.a: file not recognized: File truncated

Here:

http://osdir.com/ml/lang.r.mac/2006-08/msg00031.html

I found a suggestion to do this:

$ cd $ORACLE_HOME/bin
$ ./genclntst

-- Dave



[Rd] RFC: "loop connections"

2005-08-22 Thread dhinds
I've just implemented a generalization of R's text connections, to
also support reading/writing raw binary data.  There is very little
new code to speak of.  For input connections, I wrote code to populate
the old text connection buffer from a raw vector, and provided a new
raw_read() method.  For output connections, I wrote a raw_write() to
append to a raw vector.  On input, the mode (text or binary) is
determined by the data type of the input object; on output, I use the
requested output mode (i.e. "w" / "wb").  For example:

 > con <- loopConnection("r", "wb")
 > a <- c(10,100,1000)
 > writeBin(a, con, size=4)
 > r
  [1] 00 00 20 41 00 00 c8 42 00 00 7a 44
 > close(con)
 > con <- loopConnection(r)
 > readBin(con, "double", n=3, size=4)
 [1]   10  100 1000
 > close(con)
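
To make the byte layout in that session concrete, here is a small standalone
C sketch that decodes the same twelve bytes.  It assumes the bytes are
little-endian IEEE 754 single-precision values (which is what the output
above looks like) and that the C "float" type is a 32-bit IEEE single; the
function name is made up for illustration and is not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode n little-endian IEEE 754 single-precision values from src
   into dst, widening each one to double. */
void unpack_le_floats(const unsigned char *src, size_t n, double *dst)
{
    size_t i;
    assert(src != NULL && dst != NULL);
    for (i = 0; i < n; i++) {
        const unsigned char *p = src + 4 * i;
        /* assemble the 32-bit pattern independently of host byte order */
        uint32_t bits = (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
                        ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
        float f;
        memcpy(&f, &bits, sizeof f);   /* reinterpret the bits as a float */
        dst[i] = f;
    }
}
```

Feeding it 00 00 20 41 00 00 c8 42 00 00 7a 44 gives back 10, 100 and 1000,
matching the readBin() result.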

I think "loop connection" is a better name for this sort of connection
than "text connection" was even for the old version; that confuses the
mode of the connection (text vs binary) with the mechanism (file,
socket, etc).

I've appended a patch to the end of this message.  As implemented
here, textConnection is replaced by loopConnection but functionally
this is a superset of the old textConnection.  For compatibility, one
could add:

  textConnection <- function(...) loopConnection(...)

The patch is against R-2.1.1.  I can investigate whether any changes
are required for the current development tree.  I can also update the
documentation files as required.  I thought I'd first check whether
anyone else thought this was worth inclusion before spending more time
on it.

The raw_write() code could be improved with smarter memory allocation
(grabbing bigger chunks rather than reallocating the raw vector for
every write), but this is at least a proof of principle.
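
As a rough idea of what "grabbing bigger chunks" could look like, here is a
standalone sketch that appends to a plain malloc'ed byte buffer and doubles
its capacity when it runs out of room.  It is not written against R's raw
vectors or connection structures; the type and function names are
hypothetical, and it ignores overflow of the capacity for very large buffers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    unsigned char *data;
    size_t len;   /* bytes in use */
    size_t cap;   /* bytes allocated */
} bytebuf;

/* Append n bytes from src; returns 0 on success, -1 if realloc fails
   (in which case the buffer is left unchanged). */
int bytebuf_append(bytebuf *b, const void *src, size_t n)
{
    assert(b != NULL && (src != NULL || n == 0));
    if (n > b->cap - b->len) {
        size_t newcap = b->cap ? b->cap : 64;
        unsigned char *p;
        while (newcap - b->len < n) newcap *= 2;   /* geometric growth */
        p = realloc(b->data, newcap);
        if (!p) return -1;
        b->data = p;
        b->cap = newcap;
    }
    if (n) memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}
```

Because the capacity doubles, a long series of small writes needs only a
logarithmic number of reallocations instead of one per write.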

-- David Hinds



--- src/main/connections.c.orig 2005-06-17 19:05:02.0 -0700
+++ src/main/connections.c  2005-08-22 15:54:03.156038200 -0700
@@ -1644,13 +1644,13 @@
 return ans;
 }
 
-/* --- text connections - */
+/* --- loop connections - */
 
 /* read a R character vector into a buffer */
 static void text_init(Rconnection con, SEXP text)
 {
 int i, nlines = length(text), nchars = 0;
-Rtextconn this = (Rtextconn)con->private;
+Rloopconn this = (Rloopconn)con->private;
 
 for(i = 0; i < nlines; i++)
nchars += strlen(CHAR(STRING_ELT(text, i))) + 1;
@@ -1668,19 +1668,35 @@
 this->cur = this->save = 0;
 }
 
-static Rboolean text_open(Rconnection con)
+/* read a R raw vector into a buffer */
+static void raw_init(Rconnection con, SEXP raw)
+{
+int nbytes = length(raw);
+Rloopconn this = (Rloopconn)con->private;
+
+this->data = (char *) malloc(nbytes);
+if(!this->data) {
+   free(this); free(con->description); free(con->class); free(con);
+   error(_("cannot allocate memory for raw connection"));
+}
+memcpy(this->data, RAW(raw), nbytes);
+this->nchars = nbytes;
+this->cur = this->save = 0;
+}
+
+static Rboolean loop_open(Rconnection con)
 {
 con->save = -1000;
 return TRUE;
 }
 
-static void text_close(Rconnection con)
+static void loop_close(Rconnection con)
 {
 }
 
-static void text_destroy(Rconnection con)
+static void loop_destroy(Rconnection con)
 {
-Rtextconn this = (Rtextconn)con->private;
+Rloopconn this = (Rloopconn)con->private;
 
 free(this->data);
 /* this->cur = this->nchars = 0; */
@@ -1689,7 +1705,7 @@
 
 static int text_fgetc(Rconnection con)
 {
-Rtextconn this = (Rtextconn)con->private;
+Rloopconn this = (Rloopconn)con->private;
 if(this->save) {
int c;
c = this->save;
@@ -1700,48 +1716,69 @@
 else return (int) (this->data[this->cur++]);
 }
 
-static double text_seek(Rconnection con, double where, int origin, int rw)
+static double loop_seek(Rconnection con, double where, int origin, int rw)
 {
-if(where >= 0) error(_("seek is not relevant for text connection"));
+if(where >= 0) error(_("seek is not relevant for loop connection"));
 return 0; /* if just asking, always at the beginning */
 }
 
-static Rconnection newtext(char *description, SEXP text)
+static size_t raw_read(void *ptr, size_t size, size_t nitems,
+  Rconnection con)
+{
+Rloopconn this = (Rloopconn)con->private;
+if (this->cur + size*nitems > this->nchars) {
+   nitems = (this->nchars - this->cur)/size;
+   memcpy(ptr, this->data+this->cur, size*nitems);
+   this->cur = this->nchars;
+} else {
+   memcpy(ptr, this->data+this->cur, size*nitems);
+   this->cur += size*nitems;
+}
+return nitems;
+}
+
+static Rconnection newloop(char *description, SEXP data)
 {
 Rconnection new;
 new = (Rconnection) malloc(sizeof(struct Rconn));
-if(!new) error(_("allocation of text connection failed"));
-new->class = (char *) malloc(strlen("textConnection") + 1);
+if(!new) error(_("allocation of loop connection failed"));
+ 

Re: [Rd] Typo(s) in proc.time.Rd and comment about ?proc.time (PR#8092)

2005-08-24 Thread dhinds
[EMAIL PROTECTED] wrote:
> On Wed, 24 Aug 2005 [EMAIL PROTECTED] wrote:

> > I just downloaded the file
> >
> > ftp://ftp.stat.math.ethz.ch/Software/R/R-devel.tar.gz
> >
> > and within proc.time.Rd, the second paragraph of the \value
> > section contains a typo:

> I believe your understanding of the English language is different from the 
> author here, who is English.  (You on the other hand seem to think there 
> is no need to give your country in your address when writing an address in 
> Denmark.)  The preferred language for R documentation is English (and not 
> American).

> > The resolution of the times will be system-specific; it is common for
> > them to be recorded to of the order of 1/100 second, and elapsed [...]
> > ^
> >
> > I'd say replacing "to of" with just "of" would grammatically
> > fix the sentence.

> I'd say it was correct and your correction is incorrect.  In English we 
> say `recorded to 1/100th of a second', not `recorded 1/100th second'.

The correction was incorrect, but so was the original.  I've never
heard the expression "of the order of"; common usage (in English or
American, as far as I know) is "on the order of".  Your "recorded to
1/100th of a second" is also ok.

> > Second, the \note{} section for Unix-like machines reads:
> >
> > It is possible to compile \R without support for \code{proc.time},
> > when the function will throw an error.
> >
> > I believe this is ungrammatical and suggest replacing
> > "when the function will throw an error" with "in which
> > case the function will throw an error".

> Again, the statement given is the intended meaning.

I think it might be clearer to say, "it is possible to compile R without
support for proc.time, when the function *would* throw an error".

-- Dave



Re: [Rd] RFC: "loop connections"

2005-08-26 Thread dhinds
I accidentally left one small change out of my previous patch.

So... no response to my request for comments.  Does that mean no one
has an opinion about whether this is a good idea or not?  I'd
appreciate a response from an R core member one way or the other; if
this is not the right way to get a response, should I email people
instead?

-- David Hinds


--- src/include/Internal.h.orig 2005-05-20 05:51:37.0 -0700
+++ src/include/Internal.h  2005-08-22 15:46:48.968190600 -0700
@@ -518,7 +518,7 @@
 SEXP do_pushback(SEXP, SEXP, SEXP, SEXP);
 SEXP do_pushbacklength(SEXP, SEXP, SEXP, SEXP);
 SEXP do_clearpushback(SEXP, SEXP, SEXP, SEXP);
-SEXP do_textconnection(SEXP, SEXP, SEXP, SEXP);
+SEXP do_loopconnection(SEXP, SEXP, SEXP, SEXP);
 SEXP do_getallconnections(SEXP, SEXP, SEXP, SEXP);
 SEXP do_sumconnection(SEXP, SEXP, SEXP, SEXP);
 SEXP do_download(SEXP, SEXP, SEXP, SEXP);



Re: [Rd] RFC: "loop connections"

2005-08-26 Thread dhinds
Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
> OK.  I guess you want one of the core people to respond but in the
> interim can you explain the terminology "loop"?   
> Also, do you have any prototypical applications in mind?

"loop" is short for "loopback".  A loop or loopback device is one that
just returns the data sent to it.

The prototypical applications are the same sort of applications text
connections are used for: data transformation, in this case of raw
binary data, rather than formatted text data.  In my case, I needed to
interpret a "long raw" column from an Oracle table, that consisted of
packed single precision floating point numbers.

The caTools package on CRAN includes less capable raw2bin and bin2raw
functions, used to implement Base64 encoders and decoders.

-- Dave



Re: [Rd] RFC: "loop connections"

2005-08-27 Thread dhinds
Martin Maechler <[EMAIL PROTECTED]> wrote:

> In the mean time, I think it has become clear that
> "loopconnection" isn't necessarily a better name, and that
> textConnection() has been there in "the S literature" for a
> good reason and for quite a while.
> Let's forget about the naming and the exact UI for the moment.

That is entirely fine with me.

> I think the main point of David's proposal is still worth
> consideration:  One way to see text connections is as a way to
> treat some kind of R objects as "generalized files" i.e., connections.
> And AFAICS David proposes to enlarge the kind of R objects that
> can be dealt with as connections 
>   from  {"character"}
>   to    {"character", "raw"}
> something which has some appeal to me.
> IIUC, Brian Ripley is doubting the potential use for the
> proposed generalization, whereas David makes a point of someone
> else (the 'caTools' author) having written raw2bin / bin2raw function
> for a related use case.

> Maybe you can elaborate on the above a bit, David?

I'm not sure what more can be said on the subject.  Most connection
types support both text-mode and binary-mode, so this is partly a
proposal for symmetry and consistency.  Prof. Ripley is correct that
binary anonymous connections provide overlapping functionality, but
the semantics are slightly different, and performance is different.  I
don't see an advantage for having the "text-like" connection only
support text access.

I ran some quick benchmarks on three implementations, where the task
was conversion back and forth between a numeric vector of length 1000,
and a packed raw vector of single precision floats, repeated 1000
times.  The first method uses a new anonymous connection for each
transformation.  The second reuses a single anonymous connection.  The
third uses a new raw textConnection for each transformation.

  method      usr  sys  elapsed
  anonymous   1.5  9.5     14.6
  persistent  1.1  0.1      1.2
  raw         0.9  0.0      0.9

Setting up and tearing down anonymous connections is very slow (at
least on Windows) because it requires substantial OS intervention.  If
a program can be easily organized so that a single connection can be
used, performance is much better.

I would appreciate feedback on how to improve raw_write() for the case
of appending to an existing vector.  Is it possible to reserve free
space at the end of a vector for appending?  I see that there is a
distinction between LENGTH() and TRUELENGTH() but I'm not sure if this
is the intended use.

> In any case, as you might have guessed by now, R-core would have
> been more positive to a proposal to generalize current
> textConnection() - fully back-compatibly - rather than renaming
> it first.

I have no interest in sacrificing back compatibility; I did intend
that there would always be a textConnection() entry point, if only as
a wrapper for the new constructor.  The only reason for a new name
(and I'm certainly open to suggestions) is because the notion of a
binary or raw textConnection seemed wrong.

-- David Hinds



Re: [Rd] floating point control on windows

2005-08-29 Thread dhinds
Chris Paulse <[EMAIL PROTECTED]> wrote:
> Hi,

> I'm sure that this question has come up many times before.  When I load an R
> extension dll I've built with the Microsoft compiler, I get the warning:

> Warning message:

> DLL attempted to change FPU control word from 8001f to 9001f

I think the simplest fix for this problem is to add
"fp10.obj" to the link line for your code.  This file is provided by
Microsoft to flip the precision of the run time library to 80 bits.
The linker should find it automatically.

http://msdn.microsoft.com/library/en-us/vclib/html/_crt_floating.2d.point_support.asp

-- David Hinds



Re: [Rd] RFC: "loop connections"

2005-08-30 Thread dhinds
Gabor Grothendieck <[EMAIL PROTECTED]> wrote:

> Just to be concrete, suppose one wants to run the following as a
> concurrent process to R.  (What it does is it implicitly sets x to
> zero and then for each line of stdin it adds the first field of the
> input to x and prints that to stdout unless the first field is
> "exit" in which case it exits.  gawk has an implicit read/process
> loop so one does not have to specify the read step.  The fflush()
> command just makes sure that output is emitted, rather than
> buffered, as it is produced.)

It seems you're just trying to reinvent fifo and/or pipe connections
for interprocess communication.  That is not directly related to the
problem I wanted to address.

-- Dave



Re: [Rd] RFC: "loop connections"

2005-08-31 Thread dhinds
Martin Maechler <[EMAIL PROTECTED]> wrote:

> I think the main point of David's proposal is still worth
> consideration:  One way to see text connections is as a way to
> treat some kind of R objects as "generalized files" i.e., connections.

To summarize the motivation for the proposal, again:

- There are two modes of connections: text and binary.  The operations
  supported on text and binary connections are mostly disjoint.  Most
  connection classes (socket, file, etc) support both modes.

- textConnection() binds a character vector to a text connection.
  There is no equivalent for a binary connection.  There are
  workarounds (i.e. anonymous connections, equivalent to temporary
  files), but these have substantial performance penalties.

- Both connection modes have useful applications.  textConnection() is
  useful, or it would not exist.  Orthogonality is good, special cases
  are bad.

- Only about 50 lines of code are required to implement a binary form
  of textConnection() in the R core.  Implementing this functionality
  in a separate package requires substantially more code.

- I need it, and in at least one case, another R package developer has
  implemented it using temporary files (caTools).  I also just noticed
  that Duncan Murdoch recently proposed the EXACT SAME feature on
  r-help:

  https://stat.ethz.ch/pipermail/r-help/2005-April/067651.html

I think that just about sums it up.  I've attached a smaller patch
that makes fewer changes to R source, doesn't change any existing
function names, etc.  The feature adds 400 bytes to the size of R.dll.

-- Dave



--- src/main/connections.c.orig 2005-06-17 19:05:02.0 -0700
+++ src/main/connections.c  2005-08-31 15:26:19.947195100 -0700
@@ -1644,7 +1644,7 @@
 return ans;
 }
 
-/* --- text connections - */
+/* --- text and raw connections - */
 
 /* read a R character vector into a buffer */
 static void text_init(Rconnection con, SEXP text)
@@ -1668,6 +1668,22 @@
 this->cur = this->save = 0;
 }
 
+/* read a R raw vector into a buffer */
+static void raw_init(Rconnection con, SEXP raw)
+{
+int nbytes = length(raw);
+Rtextconn this = (Rtextconn)con->private;
+
+this->data = (char *) malloc(nbytes);
+if(!this->data) {
+   free(this); free(con->description); free(con->class); free(con);
+   error(_("cannot allocate memory for raw connection"));
+}
+memcpy(this->data, RAW(raw), nbytes);
+this->nchars = nbytes;
+this->cur = this->save = 0;
+}
+
 static Rboolean text_open(Rconnection con)
 {
 con->save = -1000;
@@ -1702,41 +1718,60 @@
 
 static double text_seek(Rconnection con, double where, int origin, int rw)
 {
-if(where >= 0) error(_("seek is not relevant for text connection"));
+if(where >= 0) error(_("seek is not relevant for this connection"));
 return 0; /* if just asking, always at the beginning */
 }
 
-static Rconnection newtext(char *description, SEXP text)
+static size_t raw_read(void *ptr, size_t size, size_t nitems,
+  Rconnection con)
+{
+Rtextconn this = (Rtextconn)con->private;
+if (this->cur + size*nitems > this->nchars) {
+   nitems = (this->nchars - this->cur)/size;
+   memcpy(ptr, this->data+this->cur, size*nitems);
+   this->cur = this->nchars;
+} else {
+   memcpy(ptr, this->data+this->cur, size*nitems);
+   this->cur += size*nitems;
+}
+return nitems;
+}
+
+static Rconnection newtext(char *description, SEXP data)
 {
 Rconnection new;
+int isText = isString(data);
 new = (Rconnection) malloc(sizeof(struct Rconn));
-if(!new) error(_("allocation of text connection failed"));
-new->class = (char *) malloc(strlen("textConnection") + 1);
-if(!new->class) {
-   free(new);
-   error(_("allocation of text connection failed"));
-}
-strcpy(new->class, "textConnection");
+if(!new) goto f1;
+new->class = (char *) malloc(strlen("textConnection") + 1);
+if(!new->class) goto f2;
+sprintf(new->class, "%sConnection", isText ? "text" : "raw");
 new->description = (char *) malloc(strlen(description) + 1);
-if(!new->description) {
-   free(new->class); free(new);
-   error(_("allocation of text connection failed"));
-}
+if(!new->description) goto f3;
 init_con(new, description, "r");
 new->isopen = TRUE;
 new->canwrite = FALSE;
 new->open = &text_open;
 new->close = &text_close;
 new->destroy = &text_destroy;
-new->fgetc = &text_fgetc;
 new->seek = &text_seek;
 new->private = (void*) malloc(sizeof(struct textconn));
-if(!new->private) {
-   free(new->description); free(new->class); free(new);
-   error(_("allocation of text connection failed"));
+if(!new->private) goto f4;
+new->text = isText;
+if (new->text) {
+   new->fgetc = &text_fgetc;
+   text_init(new, data);
+} else {
+   new->re

Re: [Rd] RFC: rawConnection (was "loop connections")

2005-08-31 Thread dhinds
Duncan Murdoch <[EMAIL PROTECTED]> wrote:

> I would implement it differently from the way you did.  I'd call it
> a rawConnection, taking a raw variable (or converting something else
> using as.raw) as the input, and providing both text and binary
> read/write modes (using the same conventions for text mode as a file
> connection would).  It *should* support seek, at least in binary
> mode.

I was trying to reuse as much of the textConnection semantics and
underlying code as possible...

Having a rawConnection() entry point is simple enough.  Seeking also
seems straightforward.  I'm not so sure about using as.raw().  I
wondered about that, but also thought that rather than coercing to
raw, it might make more sense to cast atomic vector types to raw,
byte-for-byte.
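
To illustrate why seeking looks straightforward when the whole buffer lives in memory, here is a minimal sketch. The names `membuf` and `membuf_seek` are hypothetical and are not part of R's connections.c, and the origin codes follow the stdio SEEK_SET/SEEK_CUR/SEEK_END convention, which is an assumption rather than the numbering R uses internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>   /* SEEK_SET, SEEK_CUR, SEEK_END */

/* Hypothetical in-memory buffer with a read cursor. */
struct membuf {
    const unsigned char *data;
    size_t nbytes;   /* total bytes available */
    size_t cur;      /* current position, 0 <= cur <= nbytes */
};

/* Move the cursor.  Returns the previous position, or -1 if the
   requested position falls outside the buffer (cursor unchanged). */
long membuf_seek(struct membuf *b, long where, int origin)
{
    long old, base;

    assert(b != NULL && b->cur <= b->nbytes);
    old = (long) b->cur;

    switch (origin) {
    case SEEK_SET: base = 0; break;
    case SEEK_CUR: base = old; break;
    case SEEK_END: base = (long) b->nbytes; break;
    default: return -1;
    }
    /* Reject targets before the start or past the end. */
    if (where < -base || where > (long) b->nbytes - base)
        return -1;
    b->cur = (size_t) (base + where);
    return old;
}
```

Returning the old position mirrors the "if just asking" behaviour of the existing text_seek, where a query leaves the cursor alone.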

Can you give an example of where a text-mode raw connection would be
a useful thing?

-- Dave



Re: [Rd] RFC: rawConnection (was "loop connections")

2005-09-01 Thread dhinds
Duncan Murdoch <[EMAIL PROTECTED]> wrote:
> > 
> > Having a rawConnection() entry point is simple enough.  Seeking also
> > seems straightforward.  I'm not so sure about using as.raw().  I
> > wondered about that, but also thought that rather than coercing to
> > raw, it might make more sense to cast atomic vector types to raw,
> > byte-for-byte.

> I'd prefer as.raw, so that we don't end up with two incompatible ways to 
> convert other objects to raw objects.

An advantage of no as.raw() would be that you could create a raw
connection on an object without making an extra copy, which was
another of your requests.  But there would be a lack of symmetry,
because you could "r" from an arbitrary R object, but only "w" to raw,
unless there was also a way of specifying a type for the result
vector.

Having the backing store be an R object with no copy does seem tricky,
however.  Currently, textConnection() makes a copy for "r" connections
but writes directly to an R object for "w" connections.  The "w" case
is buggy; you can crash R by removing the target object while the
connection is being used.  I'm not familiar enough with R internals to
know how to fix that.  Maybe the object has to be searched for every
time the connection is used, to avoid potentially stale pointers?

> > Can you give an example of where a text-mode raw connection would be
> > a useful thing?

> No, but someone else might.  Why unnecessarily let the source of the 
> bytes determine the mode of the connection?  In the case of 
> textConnection, there are natural line breaks, so a text mode connection 
> makes sense.  A raw object can contain anything, so why wouldn't someone 
> want to put text in it some day?

It seems that a text-mode raw connection would be equivalent to a
textConnection on the result of rawToChar(), no?

While some of these possibilities seem like they might be useful, I'm
not sure that all need to be implemented immediately.  If we can agree
on the basic interface and semantics, then we could implement a basic
version now, and relax restrictions on the arguments later as needed?

-- Dave



Re: [Rd] RFC: rawConnection (was "loop connections")

2005-09-01 Thread dhinds
Duncan Murdoch <[EMAIL PROTECTED]> wrote:

> I think the cost of duplicating as.raw is worse than the cost of using 
> extra memory.  If the lack of symmetry bothers you, a solution is to 
> require a raw object as input.

It wouldn't exactly be duplicating as.raw since this way of converting
to raw is actually to do nothing at all, just to treat the object as
if it is already raw.  But, I don't have a strong opinion.
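
The difference between the two conversions can be sketched in plain C, outside of R. The function names below are hypothetical. The first mimics element-wise coercion in the spirit of as.raw(), which as far as I know maps out-of-range values to 0; the second simply reinterprets the existing bytes, so its output length and contents depend on sizeof(int) and on byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Element-wise coercion in the spirit of as.raw(): one output byte per
   input element; values outside 0..255 become 0.  Returns bytes written. */
size_t coerce_to_raw(const int *x, size_t n, unsigned char *out)
{
    assert(x != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = (x[i] >= 0 && x[i] <= 255) ? (unsigned char) x[i] : 0;
    return n;
}

/* Byte-for-byte view: the object representation is copied unchanged,
   so the result has n * sizeof(int) bytes and is byte-order dependent. */
size_t cast_to_raw(const int *x, size_t n, unsigned char *out)
{
    assert(x != NULL && out != NULL);
    memcpy(out, x, n * sizeof(int));
    return n * sizeof(int);
}
```

The second form loses no information and needs no conversion pass, which is the "do nothing at all" point made above, but the two results are not interchangeable.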

>  > Currently, textConnection() makes a copy for "r" connections
> > but writes directly to an R object for "w" connections.  The "w" case
> > is buggy; you can crash R by removing the target object while the
> > connection is being used.  I'm not familiar enough with R internals to
> > know how to fix that.  Maybe the object has to be searched for every
> > time the connection is used, to avoid potentially stale pointers?

> I've been having an argument with some other people about something 
> related to this.  I think they would say that the language doesn't 
> support writing to a variable.

I tried changing textConnection output connections to look up the
destination object on every access and that seems to solve the problem
without being terribly expensive.
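
As a rough, self-contained illustration of the idea (not the actual change to connections.c), the writer below keeps only the target's name and resolves it on every write. The `env`, `binding`, `env_lookup`, `env_remove` and `write_line` names are all hypothetical stand-ins for an R environment and its lookup routines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ENV_SLOTS 8

/* Hypothetical stand-in for an R environment: names bound to counters. */
struct binding { const char *name; int *value; };
struct env { struct binding slot[ENV_SLOTS]; };

/* Return the value bound to name, or NULL if it is unbound. */
int *env_lookup(struct env *e, const char *name)
{
    for (size_t i = 0; i < ENV_SLOTS; i++)
        if (e->slot[i].name && strcmp(e->slot[i].name, name) == 0)
            return e->slot[i].value;
    return NULL;
}

/* Remove a binding, as rm() would. */
void env_remove(struct env *e, const char *name)
{
    for (size_t i = 0; i < ENV_SLOTS; i++)
        if (e->slot[i].name && strcmp(e->slot[i].name, name) == 0) {
            e->slot[i].name = NULL;
            e->slot[i].value = NULL;
        }
}

/* The writer resolves the target by name on each write, so a removed
   target is reported instead of dereferencing a stale pointer.
   Returns 0 on success, -1 if the target is unbound. */
int write_line(struct env *e, const char *target)
{
    assert(e != NULL && target != NULL);
    int *lines = env_lookup(e, target);
    if (lines == NULL)
        return -1;
    ++*lines;   /* stands in for appending one line of output */
    return 0;
}
```

The cost is one lookup per write, which is small next to the work of appending the line itself.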

> If so, then a binary mode rawConnection (with mention of the way to 
> convert in the Rd file) would be good enough for me.

It seems we are coming back to something close to what I had
originally implemented?

-- Dave



Re: [Rd] RFC: rawConnection (was "loop connections")

2005-09-03 Thread dhinds
Duncan Murdoch <[EMAIL PROTECTED]> wrote:

> Probably!  The differences I still know about are:

>   - I'd like the name to reflect the data source, so rawConnection or 
> something similar rather than overloading textConnection.

>   - It needs a man page, or to be included on the textConnection man page.

Here is an updated patch, with the rawConnection() entry point, and a
man page, against today's R-devel snapshot.  This also fixes (text or
raw) output connections to verify that the target object still exists
before writing to that object.

-- Dave



--- src/main/connections.c.orig 2005-08-29 17:47:35.0 -0700
+++ src/main/connections.c  2005-09-03 13:34:25.098514900 -0700
@@ -1678,7 +1678,7 @@
 return ans;
 }
 
-/* --- text connections - */
+/* --- text and raw connections - */
 
 /* read a R character vector into a buffer */
 static void text_init(Rconnection con, SEXP text)
@@ -1702,6 +1702,22 @@
 this->cur = this->save = 0;
 }
 
+/* read a R raw vector into a buffer */
+static void raw_init(Rconnection con, SEXP raw)
+{
+int nbytes = length(raw);
+Rtextconn this = (Rtextconn)con->private;
+
+this->data = (char *) malloc(nbytes);
+if(!this->data) {
+   free(this); free(con->description); free(con->class); free(con);
+   error(_("cannot allocate memory for raw connection"));
+}
+memcpy(this->data, RAW(raw), nbytes);
+this->nchars = nbytes;
+this->cur = this->save = 0;
+}
+
 static Rboolean text_open(Rconnection con)
 {
 con->save = -1000;
@@ -1736,41 +1752,60 @@
 
 static double text_seek(Rconnection con, double where, int origin, int rw)
 {
-if(where >= 0) error(_("seek is not relevant for text connection"));
+if(where >= 0) error(_("seek is not relevant for this connection"));
 return 0; /* if just asking, always at the beginning */
 }
 
-static Rconnection newtext(char *description, SEXP text)
+static size_t raw_read(void *ptr, size_t size, size_t nitems,
+  Rconnection con)
+{
+Rtextconn this = (Rtextconn)con->private;
+if (this->cur + size*nitems > this->nchars) {
+   nitems = (this->nchars - this->cur)/size;
+   memcpy(ptr, this->data+this->cur, size*nitems);
+   this->cur = this->nchars;
+} else {
+   memcpy(ptr, this->data+this->cur, size*nitems);
+   this->cur += size*nitems;
+}
+return nitems;
+}
+
+static Rconnection newtext(char *description, SEXP data)
 {
 Rconnection new;
+int isText = isString(data);
 new = (Rconnection) malloc(sizeof(struct Rconn));
-if(!new) error(_("allocation of text connection failed"));
-new->class = (char *) malloc(strlen("textConnection") + 1);
-if(!new->class) {
-   free(new);
-   error(_("allocation of text connection failed"));
-}
-strcpy(new->class, "textConnection");
+if(!new) goto f1;
+new->class = (char *) malloc(strlen("textConnection") + 1);
+if(!new->class) goto f2;
+sprintf(new->class, "%sConnection", isText ? "text" : "raw");
 new->description = (char *) malloc(strlen(description) + 1);
-if(!new->description) {
-   free(new->class); free(new);
-   error(_("allocation of text connection failed"));
-}
+if(!new->description) goto f3;
 init_con(new, description, "r");
 new->isopen = TRUE;
 new->canwrite = FALSE;
 new->open = &text_open;
 new->close = &text_close;
 new->destroy = &text_destroy;
-new->fgetc = &text_fgetc;
 new->seek = &text_seek;
 new->private = (void*) malloc(sizeof(struct textconn));
-if(!new->private) {
-   free(new->description); free(new->class); free(new);
-   error(_("allocation of text connection failed"));
+if(!new->private) goto f4;
+new->text = isText;
+if (new->text) {
+   new->fgetc = &text_fgetc;
+   text_init(new, data);
+} else {
+   new->read = &raw_read;
+   raw_init(new, data);
 }
-text_init(new, text);
 return new;
+
+f4: free(new->description);
+f3: free(new->class);
+f2: free(new);
+f1: error(_("allocation of %s connection failed"),
+ isText ? "text" : "raw");
 }
 
 static void outtext_close(Rconnection con)
@@ -1780,10 +1815,13 @@
 int idx = ConnIndex(con);
 
 if(strlen(this->lastline) > 0) {
-   PROTECT(tmp = lengthgets(this->data, ++this->len));
+   tmp = findVar1(this->namesymbol, VECTOR_ELT(OutTextData, idx),
+  STRSXP, FALSE);
+   if (tmp == R_UnboundValue)
+   error(_("connection endpoint unbound"));
+   PROTECT(tmp = lengthgets(tmp, ++this->len));
SET_STRING_ELT(tmp, this->len - 1, mkChar(this->lastline));
defineVar(this->namesymbol, tmp, VECTOR_ELT(OutTextData, idx));
-   this->data = tmp;
UNPROTECT(1);
 }
 SET_VECTOR_ELT(OutTextData, idx, R_NilValue);
@@ -1843,10 +1881,13 @@
if(q) {
int idx = ConnIndex(con);

[Rd] A memory management question

2005-09-03 Thread dhinds
Can someone explain the use of SETLENGTH() and SETTRUELENGTH()?

I would like to allocate a vector and reserve some space at the end,
so that it appears shorter than its allocated size and I can append to
it more efficiently, without requiring a new copy every time.  So I'd
like to use SETLENGTH() with a shorter apparent length, and bump this
up as needed until I've used the entire space.
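
Outside of R, the pattern I have in mind is the usual length-versus-capacity split, sketched below in plain C. The `ivec` type and its functions are hypothetical and say nothing about how SETLENGTH() or TRUELENGTH actually behave; they only show the intended bookkeeping.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable vector: len is the apparent length,
   cap the number of elements actually allocated. */
struct ivec { int *data; size_t len, cap; };

/* Allocate room for cap elements but expose none of them yet. */
int ivec_init(struct ivec *v, size_t cap)
{
    if (cap == 0)
        cap = 1;
    v->data = malloc(cap * sizeof *v->data);
    if (!v->data)
        return -1;
    v->len = 0;
    v->cap = cap;
    return 0;
}

/* Append one element; reallocate (and copy) only once the reserve
   is used up, doubling the capacity when that happens. */
int ivec_push(struct ivec *v, int x)
{
    assert(v->len <= v->cap);
    if (v->len == v->cap) {
        size_t ncap = v->cap * 2;
        int *p = realloc(v->data, ncap * sizeof *v->data);
        if (!p)
            return -1;
        v->data = p;
        v->cap = ncap;
    }
    v->data[v->len++] = x;
    return 0;
}
```

Readers of the vector only ever see `len` elements, while `cap` records how much room is left before the next copy.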

There are only a couple users of SETLENGTH() in R, and they all appear
at first glance to be pointless: a few routines use allocVector() and
then call SETLENGTH() to set the vector length to the value that was
just allocated.  What are valid uses for SETLENGTH()?  And what are
the intended semantics for "truelength" as opposed to the regular
length?

If GC happens and an object is moved, and its apparent LENGTH()
differs from its allocated length, does GC preserve the allocated
length, or the updated LENGTH()?  Is there any way to get at the
original allocated length, given an SEXP?

-- Dave



Re: [Rd] A memory management question

2005-09-05 Thread dhinds
Luke Tierney <[EMAIL PROTECTED]> wrote:

> It might or might not work now but is not guaranteed to do so reliably
> in the future.  Seeing the risks of leaving SETLENGTH exposed, it is
> very likely that SETLENGTH will be removed from the sources after the
> 2.2.0 release.

> If you provide your own methods to read and write the external pointer
> then you don't need this; this is safer than relying on undocumented
> behavior of [ and [<- in any case.  You also then don't need to use
> R_PreserveObject unless you really need to use it from the C level
> outside of a context where an R reference exists.

I'm not sure I follow this.  Maybe I should explain the context for
the problem.

textConnection("xyz", "w") creates a connection, the output of which
is deposited in a char vector named "xyz", which is updated line by
line as output is sent to the connection.  The current code maintains
a pointer to "xyz" in the form of an unprotected SEXP.  Hence if the
user does rm(xyz), bad things happen.  A small bug, I admit.

I think the best fix is to use a protected reference to the result
vector.  I think this is safe and doesn't rely on any abuse of the
interfaces.

There's also a performance issue, that the result is updated after
every line of output, resulting in a vast amount of copying if a large
result is accumulated.  This is the part that could be fixed by using
SETLENGTH to manage the length of the protected result vector.
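
To put a number on "a vast amount of copying", the sketch below counts how many elements get moved under two growth policies. It is a back-of-the-envelope model with hypothetical function names, not a measurement of R itself: it assumes every resize copies all existing elements.

```c
#include <assert.h>
#include <stddef.h>

/* Elements copied when the vector is reallocated to fit exactly on
   every append: 0 + 1 + ... + (n-1) = n(n-1)/2. */
size_t copies_grow_by_one(size_t n)
{
    size_t copied = 0;
    for (size_t len = 0; len < n; len++)
        copied += len;      /* existing elements moved to the new copy */
    return copied;
}

/* Elements copied when capacity doubles each time it is exhausted,
   starting from capacity 1: fewer than 2n in total. */
size_t copies_doubling(size_t n)
{
    size_t copied = 0, cap = 1;
    for (size_t len = 0; len < n; len++) {
        if (len == cap) {   /* reserve used up: copy and double */
            copied += len;
            cap *= 2;
        }
    }
    assert(n == 0 || copied < 2 * n);
    return copied;
}
```

For 1000 lines of output the first policy moves 499500 elements and the second 1023, which is the quadratic-versus-linear gap at issue here.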

I'm not sure what you mean by undocumented behavior of [ and [<-.  I
think all I'm relying on is that, as long as an outstanding reference
to the result vector exists, R has to make sure the reference
remains valid, and hence can't change the memory allocation of the
result vector in any way.  I don't care what else happens to the
contents of the vector, as long as I get to control when it is
released.  It is ok with me if the user modifies the result vector
in-place, since my reference stays valid.  So I don't actually care
how [ and [<- work.

I think the only undocumented thing I'm relying on is that the memory
manager doesn't pay attention to the LENGTH of objects that it isn't
actively doing anything to.  Currently, it actually only uses LENGTH
in one spot: for updating R_LargeVallocSize when a large vector is
released.  The true allocation sizes for individual objects are always
kept in another place (either by malloc, or in the node class of the
object).

It seems like in this limited usage, SETLENGTH does represent a useful
feature, by permitting safe over-allocation of a protected object, and
might be worth preserving (and documenting) for that purpose.  

Of course, the real problem here is the semantics of textConnection(),
which make life much more difficult and can't be changed because they
are specified outside of R.

-- Dave



Re: [Rd] A memory management question

2005-09-05 Thread dhinds
Luke Tierney <[EMAIL PROTECTED]> wrote:

> I am not comfortable making this available at this point.  It might be
> useful to have but would need careful thought.  Without some way to
> find out the true length there are potential problems.  Without some
> way of making sure the fields in VECSXP and STRSXP that are added are
> valid there are potential problems (not the first time but if the size
> is shrunk and then increased).  Not that this can't be resolved but it
> would take time that I don't have now, and this isn't high priority
> enough to schedule in the near future.  So for now you should not use
> SETLENGTH if you want your code to work beyond 2.2.0.

Ok, that's fine... given the lack of other valid uses of SETLENGTH, it
doesn't seem worth preserving it just for this one debatable usage.

> It may be possible to expand the semantics by adding a logical
> argument that controls whether the vector is to be over-allocated and
> filled with zero length strings and truncated to the true length on
> close.  Another variant would be to have a logical argument that says
> to keep the input internally and provide a function, say
> textConnectionOutput, to retrieve the internal output.

These are possible... or optionally just don't reveal the intermediate
output at all, and just make the final result visible on close...

-- Dave



[Rd] Updated rawConnection() patch

2005-09-18 Thread dhinds
Here's an update of my rawConnection() implementation.  In addition to
providing a raw version of textConnection(), this fixes two existing
issues with textConnection(): one is that the current textConnection()
implementation carries around unprotected SEXP pointers, the other is
a performance problem due to prolific copying of the output buffer as
output is accumulated line by line.

This new version uses a separate buffer for connection output, which
is extended in larger chunks, so that resize operations are less
frequent.  And the buffer is hidden behind an active binding, so that
the user can't corrupt it.

My original need for this is largely addressed by Brian Ripley's
recent extension of readBin/writeBin to operate on raw vectors as well
as connections, in the latest development tree.  But I think having a
raw version of textConnection is still a bit more orthogonal and
flexible, and requires very little code.

-- Dave


--- ./src/include/Internal.h.orig   2005-08-29 17:47:27.0 -0700
+++ ./src/include/Internal.h   2005-09-18 00:32:08.196336200 -0700
@@ -525,6 +525,7 @@
 SEXP do_pushbacklength(SEXP, SEXP, SEXP, SEXP);
 SEXP do_clearpushback(SEXP, SEXP, SEXP, SEXP);
 SEXP do_textconnection(SEXP, SEXP, SEXP, SEXP);
+SEXP do_graboutput(SEXP, SEXP, SEXP, SEXP);
 SEXP do_getallconnections(SEXP, SEXP, SEXP, SEXP);
 SEXP do_sumconnection(SEXP, SEXP, SEXP, SEXP);
 SEXP do_download(SEXP, SEXP, SEXP, SEXP);
--- ./src/include/Rconnections.h.orig   2005-08-03 08:50:36.0 -0700
+++ ./src/include/Rconnections.h   2005-09-17 23:56:01.875475000 -0700
@@ -94,8 +94,7 @@
 
 typedef struct outtextconn {
 int len;  /* number of lines */
-SEXP namesymbol;
-SEXP data;
+SEXP namesymbol, data, venv;
 char *lastline;
 int lastlinelength; /* buffer size */
 } *Routtextconn;
--- ./src/library/base/man/rawconnections.Rd.orig   2005-09-18 11:37:18.004405000 -0700
+++ ./src/library/base/man/rawconnections.Rd   2005-09-18 11:37:00.535655300 -0700
@@ -0,0 +1,71 @@
+\name{rawConnection}
+\alias{rawConnection}
+\title{Raw Connections}
+\description{
+  Input and output raw connections.
+}
+\usage{
+rawConnection(object, open = "r", local = FALSE)
+}
+\arguments{
+  \item{object}{raw or character.  A description of the connection. 
+For an input this is an \R raw vector object, and for an output
+connection the name for the \R raw vector to receive the
+output.
+  }
+  \item{open}{character.  Either \code{"rb"} (or equivalently \code{""})
+for an input connection or \code{"wb"} or \code{"ab"} for an output
+connection.}
+  \item{local}{logical.  Used only for output connections.  If \code{TRUE},
+output is assigned to a variable in the calling environment.  Otherwise
+the global environment is used.}
+}
+\details{
+  An input raw connection is opened and the raw vector is copied
+  at the time the connection object is created, and \code{close}
+  destroys the copy.
+
+  An output raw connection is opened and creates an \R raw vector of
+  the given name in the user's workspace or in the calling
+  environment, depending on the value of the \code{local} argument.
+  This object will at all times hold the accumulated output to the
+  connection.
+
+  Opening a raw connection with \code{open = "ab"} will attempt to
+  append to an existing raw vector with the given name in the user's
+  workspace or the calling environment.  If none is found (even if an
+  object exists of the right name but the wrong type) a new raw vector
+  will be created, with a warning.
+
+  You cannot \code{seek} on a raw connection, and \code{seek} will
+  always return zero as the position.
+}
+
+\value{
+  A binary-mode connection object of class \code{"rawConnection"}
+  which inherits from class \code{"connection"}.
+}
+
+\seealso{
+  \code{\link{connections}}, \code{\link{showConnections}},
+  \code{\link{readBin}}, \code{\link{writeBin}},
+  \code{\link{textConnection}}.
+}
+
+\examples{
+zz <- rawConnection("foo", "wb")
+writeBin(1:2, zz)
+writeBin(1:8, zz, size=1)
+writeBin(pi, zz, size=4)
+close(zz)
+foo
+
+zz <- rawConnection(foo)
+readBin(zz, "integer", n=2)
+sprintf("\%04x", readBin(zz, "integer", n=2, size=2))
+sprintf("\%08x", readBin(zz, "integer", endian="swap"))
+readBin(zz, "numeric", n=1, size=4)
+close(zz)
+}
+\keyword{file}
+\keyword{connection}
--- ./src/library/base/man/textconnections.Rd.orig  2005-09-03 13:55:48.274305900 -0700
+++ ./src/library/base/man/textconnections.Rd   2005-09-18 11:37:03.457530300 -0700
@@ -45,16 +45,11 @@
 }
 
 \value{
-  A connection object of class \code{"textConnection"} which inherits
-  from class \code{"connection"}.
+  A text-mode connection object of class \code{"textConnection"} which
+  inherits from class \code{"connection"}.
 }
 
 \note{
-  As output text connections keep the character vector up to date
-  line-by-line, they are relatively expensive to use, and it is often
-  better to use an anonymous \code{\link{file

[Rd] Future plans for raw data type?

2005-09-27 Thread dhinds
I've been working with raw vectors quite a bit and was wondering if
the R team might comment on where they see raw vector support going in
the long run.  Is the intent that 'raw' will eventually become a first
class data type on the same level as 'integer'?  Or should 'raw' have 
more limited support, by design?

For example, with very minor changes to subassign.c to implement some
automatic coercions, raw vectors can become arguments to ifelse() and
can be members of data frames.  Would this be desirable?

-- David Hinds



Re: [Rd] Problems with autoconf example from r-ext.

2005-10-05 Thread dhinds
Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
> The current R-exts.texi has

> AC_INIT([RODBC], 1.1.4) dnl package name, version

> and that is crucially different from your example.  Autoconf 2.59 has a 
> barely documented back-compatibility mode that is invoked for AC_INIT with 
> just one argument.

I was tripped up by this same issue, and was not easily able to figure
out from the autoconf documentation how AC_INIT had changed over time.
The one-argument AC_INIT, for the version of autoconf I was using
(2.57), expects its argument to be a path to a file that is relatively
unique to the package.

However, this isn't actually related to the problem at hand:

> > R CMD INSTALL
> > --configure-args='--with-sbmlode-lib=/data/opt/sbmlodesolve/include \
> > --with-sbmlode-include=/data/opt/sbmlodesolve/lib' \
> > SBMLodeSolveR

This is a shell quoting error.  Remove the '\' inside your quoted
--configure-args argument.  Inside single quotes a backslash is not a
line continuation: both the backslash and the newline are passed
literally in the string given to the configure script, which confuses
the argument parser.  You don't need the backslash because a quoted
string is automatically continued until the closing quote is seen.

-- Dave
