Re: [Rd] ALTREP wrappers and factors

2019-07-18 Thread King Jiefei
Hi Kylie,

For your question, I don't think a wrapper can completely solve your
problem. The duplication occurs because your variable y has a reference
count greater than 1 (see the highlighted NAM(7) below), so even if you have
a wrapper, any change to the wrapper's value can still trigger the duplication.

> .Internal(inspect(y))
> @7fb0ce78c0f0 13 INTSXP g0c0 *[NAM(7)]* matter vector (mode=3, len=3, mem=0)


My guess is that the *matter:::as.altrep* function assigns the variable *y*
to a local variable, which increases the reference count. For example:

*This would not cause a duplication*

> > a=c(1,2,3)
> > .Internal(inspect(a))
> @0x2384f530 14 REALSXP g0c3 [NAM(1)] (len=3, tl=0) 1,2,3
> > attr(a,"dim")=c(1,3)
> > .Internal(inspect(a))
> @0x2384f530 14 REALSXP g0c3 [NAM(1),ATT] (len=3, tl=0) 1,2,3
> ATTRIB:
>   @0x23864b58 02 LISTSXP g0c0 []
> TAG: @0x044b1a90 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "dim"
> (has value)
> @0x2384cb48 13 INTSXP g0c1 [NAM(7)] (len=2, tl=0) 1,3
>

*This would cause a duplication, even though the function test does
nothing.*

> > test<-function(x) x1=x
> > a=c(1,2,3)
> > .Internal(inspect(a))
> @0x2384f260 14 REALSXP g0c3 [NAM(1)] (len=3, tl=0) 1,2,3
> > test(a)
> > .Internal(inspect(a))
> @0x2384f260 14 REALSXP g0c3 [NAM(7)] (len=3, tl=0) 1,2,3
> > attr(a,"dim")=c(1,3)
> > .Internal(inspect(a))
> @0x2384f0d0 14 REALSXP g0c3 [NAM(1),ATT] (len=3, tl=0) 1,2,3
> ATTRIB:
>   @0x238666c0 02 LISTSXP g0c0 []
> TAG: @0x044b1a90 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "dim"
> (has value)
> @0x2384c6e8 13 INTSXP g0c1 [NAM(7)] (len=2, tl=0) 1,3
>
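
The copy-on-modify rule behind these two outputs can also be seen without
.Internal(inspect()). The following is a minimal sketch in plain R, not taken
from the matter package; the names a and b are only for illustration. While a
second binding to the same vector exists, modifying one binding must leave the
other untouched, so R has to duplicate. (The exact NAM() values shown above
depend on the R version; R 4.0.0 and later use reference counting instead of
NAMED.)

```r
a <- c(1, 2, 3)
b <- a                      # second binding to the same vector
attr(a, "dim") <- c(1, 3)   # a is modified, so R must work on a copy

# b still refers to the original, unmodified vector
print(dim(a))
print(dim(b))
```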


If that is the case and you are 100% sure the reference count should be 1
for your variable *y*, my solution is to call *SET_NAMED* in C++ to reset
the reference count. Note that you need to unbind your local variable
before you reset the count. To return an unbound SEXP, the call to the C++
function should be placed at the end of your *matter:::as.altrep* function.
I don't know if there is a simpler way to do that, and I'd be happy to hear
any opinions.


Also, I notice that you are using ALTREP to create a wrapper for your
*matter_vec* class. I'm an author of the AltWrapper package, which can
define an ALTREP class purely at the R level. It can add attributes to an
ALTREP object when the object is created, and the result has the correct
reference count. The simplest example would be

*CODE*
```
library(AltWrapper)
inspectFunc <- function(x) cat("Altrep object\n")
lengthFunc <- function(x) return(length(x))
getPtrFunc <- function(x, writeable) return(x)

setAltClass(className = "test", classType = "real")
setAltMethod(className = "test", inspect = inspectFunc)
setAltMethod(className = "test", getLength = lengthFunc)
setAltMethod(className = "test", getDataptr = getPtrFunc)

A = runif(6)
A_alt = makeAltrep(className = "test", x = A, attributes = list(dim = c(2, 3)))
```
*RESULT*
```
> .Internal(inspect(A_alt))
@0x2385ac00 14 REALSXP g0c0 [NAM(1),ATT] Altrep object
ATTRIB:
  @0x2385a8b8 02 LISTSXP g0c0 []
TAG: @0x044b1a90 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "dim" (has
value)
@0x2384d590 13 INTSXP g0c1 [NAM(7)] (len=2, tl=0) 2,3
> A_alt
          [,1]     [,2]      [,3]
[1,] 0.9430458 0.548670 0.4148741
[2,] 0.9550899 0.251857 0.6077540
```
I will be happy to talk more about it if you are interested in the package.
It is available at
https://github.com/Jiefei-Wang/AltWrapper

Best,
Jiefei


On Thu, Jul 18, 2019 at 3:28 AM Bemis, Kylie 
wrote:

> Hello,
>
> I’m experimenting with ALTREP and was wondering if there is a preferred
> way to create an ALTREP wrapper vector without using
> .Internal(wrap_meta(…)), which R CMD check doesn’t like since it uses an
> .Internal() function.
>
> I was trying to create a factor that used an ALTREP integer, but
> attempting to set the class and levels attributes always ended up
> duplicating and materializing the integer vector. Using the wrapper avoided
> this issue.
>
> Here is my initial ALTREP integer vector:
>
> > fc0 <- factor(c("a", "a", "b"))
> >
> > y <- matter::as.matter(as.integer(fc0))
> > y <- matter:::as.altrep(y)
> >
> > .Internal(inspect(y))
> @7fb0ce78c0f0 13 INTSXP g0c0 [NAM(7)] matter vector (mode=3, len=3, mem=0)
>
> Here is what I get without a wrapper:
>
> > fc1 <- structure(y, class="factor", levels=levels(fc0))
> > .Internal(inspect(fc1))
> @7fb0cae66408 13 INTSXP g0c2 [OBJ,NAM(2),ATT] (len=3, tl=0) 1,1,2
> ATTRIB:
>   @7fb0ce771868 02 LISTSXP g0c0 []
> TAG: @7fb0c80043d0 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "class" (has
> value)
> @7fb0c9fcbe90 16 STRSXP g0c1 [NAM(7)] (len=1, tl=0)
>   @7fb0c80841a0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached]
> "factor"
> TAG: @7fb0c8004050 01 SYMSXP g1c0 [MARK,NAM(7),LCK,gp=0x4000] "levels"
> (has value)
> @7fb0d1dd58c8 16 STRSXP g0c2 [MARK,NAM(7)] (len=2, tl=0)
>   @7fb0c81bf4c0 09 CHARSXP g1c1 

Re: [Rd] strange increase in the reference number

2019-07-15 Thread King Jiefei
Hi Duncan, Gabriel, and Brodie

Thanks for the explanations and references. Brodie's blog post covers
exactly this problem without going into too many technical details. I
would recommend reading it to anyone who is interested. I really
appreciate all of your answers.


Best,
Jiefei

On Sat, Jul 13, 2019 at 2:19 PM brodie gaslam via R-devel <
r-devel@r-project.org> wrote:

> Re ENSURE_NAMEDMAX, I am unsure but think this happens in
> (src/eval.c@492):
>
> static SEXP forcePromise(SEXP e)
> {
>     if (PRVALUE(e) == R_UnboundValue) {
>         /* ... SNIP ...*/
>         val = eval(PRCODE(e), PRENV(e));
>         /* ... SNIP ...*/
>         SET_PRSEEN(e, 0);
>         SET_PRVALUE(e, val);
>         ENSURE_NAMEDMAX(val); <<<<<<< HERE
>         SET_PRENV(e, R_NilValue);
>     }
>     return PRVALUE(e);
> }
>
> as part of the evaluation of the closure.  `forcePromise` is called
> in eval (src/eval.c@656).  It's been a while since I've looked at the
> mechanics of how the native version of `eval` works so I could be completely
> wrong.
>
> B.
>
> PS: line references are in r-devel@76287.
>
>
> On Friday, July 12, 2019, 4:38:06 PM EDT, Gabriel Becker <
> gabembec...@gmail.com> wrote:
>
>
>
>
>
> Hi Jiefei and Duncan,
>
> I suspect what is likely happening is that one of ENSURE_NAMEDMAX or
> MARK_NOT_MUTABLE is being hit for x. These used to set named to 3, but now
> set it to 7 (i.e. the previous and current NAMEDMAX values, respectively).
>
> Because these are macros rather than C functions, it's not easy to figure
> out why one of them is being invoked from do_isvector (a cursory
> exploration didn't reveal what was going on, at least to me) and I don't
> have the time to dig super deeply into this right now, but perhaps Luke or
> Tomas know why this is happening off the top of their heads.
>
> Sorry I can't be of more help.
>
> ~G
>
>
>
> On Fri, Jul 12, 2019 at 11:47 AM Duncan Murdoch 
> wrote:
>
> > On 12/07/2019 1:22 p.m., King Jiefei wrote:
> > > Hi,
> > >
> > > I just found a strange increase in the reference number and I'm wondering
> > > if there is any reason for it, here is the code.
> > >
> > >> a=c(1,2,3)
> > >> .Internal(inspect(a))
> > > @0x1bf0b9b0 14 REALSXP g0c3 [NAM(1)] (len=3, tl=0) 1,2,3
> > >> is.vector(a)
> > > [1] TRUE
> > >> .Internal(inspect(a))
> > > @0x1bf0b9b0 14 REALSXP g0c3 [NAM(7)] (len=3, tl=0) 1,2,3
> > >
> > > The variable *a* initially has a reference count of 1. After calling the
> > > *is.vector* function, the count goes to 7, which I believe is the highest
> > > value allowed in R. I also tried other R functions: *is.atomic*,
> > > *is.integer* and *is.numeric* do not increase the reference count, but
> > > *typeof* does. Is this intentional?
> >
> > is.vector() is a closure that calls .Internal.  is.atomic(),
> > is.integer() and is.numeric() are all primitives.
> >
> > Generally speaking closures that call .Internal are easier to implement
> > (e.g. is.vector can use the regular mechanism to set a default for its
> > second argument), but less efficient in CPU time.  From its help page,
> > it appears that the logic for is.vector() is a lot more complex than for
> > the others, so that implementation does make sense.
> >
> > So why does NAMED go to 7?  Initially, the vector is bound to a.  Within
> > is.vector, it is bound to the local variable x.  At this point there are
> > two names bound to the same object, so it has to be considered
> > immutable.  There's really no difference between any of the values of 2
> > or more in the memory manager.  (But see
> > http://developer.r-project.org/Refcnt.html for some plans.  That
> > document is from about 5 years ago; I don't know the current state.)
> >
> > Duncan Murdoch
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> [[alternative HTML version deleted]]


[Rd] strange increase in the reference number

2019-07-12 Thread King Jiefei
Hi,

I just found a strange increase in the reference number and I'm wondering
if there is any reason for it, here is the code.

> a=c(1,2,3)
> .Internal(inspect(a))
@0x1bf0b9b0 14 REALSXP g0c3 [NAM(1)] (len=3, tl=0) 1,2,3
> is.vector(a)
[1] TRUE
> .Internal(inspect(a))
@0x1bf0b9b0 14 REALSXP g0c3 [NAM(7)] (len=3, tl=0) 1,2,3

The variable *a* initially has a reference count of 1. After calling the
*is.vector* function, the count goes to 7, which I believe is the highest
value allowed in R. I also tried other R functions: *is.atomic*,
*is.integer* and *is.numeric* do not increase the reference count, but
*typeof* does. Is this intentional?
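
One way to check which of these functions are closures and which are
primitives (the distinction raised in the replies to this question) is
sketched below. It only uses base R; the variable names funs and kinds are
chosen here for illustration. Closures bind their argument to a local
variable when called, whereas primitives do not.

```r
funs <- list(is.vector = is.vector, typeof = typeof,
             is.atomic = is.atomic, is.integer = is.integer,
             is.numeric = is.numeric)

# Classify each function as a primitive or an ordinary closure
kinds <- vapply(funs,
                function(f) if (is.primitive(f)) "primitive" else "closure",
                character(1))
print(kinds)
```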

Best,
Jiefei



Re: [Rd] Fast way to call an R function from C++?

2019-06-18 Thread King Jiefei
Hello Kevin and Iñaki,

Thanks for your quick responses. I sincerely appreciate them! I can see how
complicated it is to interact with R from C. Iñaki's suggestion is very
helpful: I saw a large performance gain from turning the flag on,
but sadly the best performance it can offer still cannot beat R itself. It
is interesting to see that C++ is slower than R in this special case, despite
the common belief that C++ code is the faster one... Anyway, thanks
again for your suggestions and references!

Best,
Jiefei


On Tue, Jun 18, 2019 at 2:39 PM Iñaki Ucar  wrote:

> For reference, your benchmark using UNWIND_PROTECT:
>
> > system.time(test(testFunc, evn$x))
>    user  system elapsed
>   0.331   0.000   0.331
> > system.time(test(C_test1, testFunc, evn$x))
>    user  system elapsed
>   2.029   0.000   2.036
> > system.time(test(C_test2, expr, evn))
>    user  system elapsed
>   2.307   0.000   2.313
> > system.time(test(C_test3, testFunc, evn$x))
>    user  system elapsed
>   2.131   0.000   2.138
>
> Iñaki
>
> On Tue, 18 Jun 2019 at 20:35, Iñaki Ucar  wrote:
> >
> > On Tue, 18 Jun 2019 at 19:41, King Jiefei  wrote:
> > >
> > > [...]
> > >
> > > It is clear that calling an R function in R is the fastest; it is
> > > about 5X faster than `R_forceAndCall` and `Rf_eval`. The latter two
> > > functions have similar performance, and using Rcpp is the slowest. Is this
> > > expected? Why is calling an R function from C++ much slower than calling
> > > the function from R? Is there any faster way to do the function call in C++?
> >
> > Yes, there is: enable fast evaluation by setting
> > -DRCPP_USE_UNWIND_PROTECT, or alternatively, use
> >
> > // [[Rcpp::plugins(unwindProtect)]]
> >
> > Iñaki
>
>
>
> --
> Iñaki Úcar
>



[Rd] Fast way to call an R function from C++?

2019-06-18 Thread King Jiefei
Hi,

I'm looking for the most efficient way to call an R function from C++ in a
package. I know there are two functions (`R_forceAndCall` and `Rf_eval`)
that can do the "call" part, but both are slow compared to calling the same
function in R. I also tried Rcpp, and it is the slowest. Here is my
test code:

C++ code:
```
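#include <Rcpp.h>
using namespace Rcpp;

// Build the call f(x) and evaluate it with R_forceAndCall.
// [[Rcpp::export]]
SEXP C_test1(SEXP f, SEXP x) {
    SEXP call = PROTECT(Rf_lang2(f, x));
    SEXP val = R_forceAndCall(call, 1, R_GlobalEnv);
    UNPROTECT(1);
    return val;
}

// Evaluate a pre-built expression in the given environment.
// [[Rcpp::export]]
SEXP C_test2(SEXP expr, SEXP env) {
    SEXP val = Rf_eval(expr, env);
    return val;
}

// Call f through the Rcpp::Function wrapper.
// [[Rcpp::export]]
SEXP C_test3(SEXP f, SEXP x) {
    Function fun(f);
    return fun(x);
}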
```

R code:
```
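testFunc<-function(x){
  x=x^2
  return(x)
}
x=c(1,2,3)  # placeholder: the original post does not show how x was defined
evn=new.env()
evn$x=x
expr=quote(testFunc(evn$x))

testFunc(evn$x)            # plain R call
C_test1(testFunc, evn$x)   # R_forceAndCall
C_test2(expr,evn)          # Rf_eval
C_test3(testFunc,evn$x)    # Rcpp::Function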
```

For the results, I run each function 1,000,000 times:

   - testFunc : 0.47 sec
   - C_test1 : 2.46 sec
   - C_test2 : 2.74 sec
   - C_test3 : 18.86 sec
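
The timing loop itself is not shown in this post. A minimal sketch of how such
a measurement could be set up in plain R follows; the helper repeat_call, its
argument n, and the input vector are assumptions made for illustration and are
not the code that produced the numbers above.

```r
testFunc <- function(x) {
  x <- x^2
  return(x)
}

# Hypothetical helper (not from the post): call f(...) n times.
repeat_call <- function(f, ..., n = 1e5) {
  for (i in seq_len(n)) f(...)
  invisible(NULL)
}

x <- c(1, 2, 3)   # placeholder input
timing <- system.time(repeat_call(testFunc, x))
print(timing[["elapsed"]])
```

The same helper could be given C_test1, C_test2 or C_test3 as f (with their
respective arguments) to compare the call paths on equal terms.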

It is clear that calling an R function in R is the fastest; it is
about 5X faster than `R_forceAndCall` and `Rf_eval`. The latter two
functions have similar performance, and using Rcpp is the slowest. Is this
expected? Why is calling an R function from C++ much slower than calling
the function from R? Is there any faster way to do the function call in C++?

Best,
Jiefei



Re: [Rd] "if" function in pure R?

2019-05-26 Thread King Jiefei
Hi Alexandre,

I'm not an R expert so this is only my personal thought:

I don't think you can achieve exactly what you want. A possible solution
would be to define a binary operator %*%, where you can replace the asterisk
with any name you want. Such an operator is special because it takes
two arguments, the left operand and the right operand. You can then
call the `substitute` function to get its arguments as unevaluated
expressions and proceed to do what you want. Here is an example to
show the idea.

*Code:*

`%myOperator%` <- function(x, y) {
  x = substitute(x)
  y = substitute(y)
  return(list(x, y))
}


myIf(i == 1, arg1) %myOperator% {
  doSomeThing
}


*Results:*

[[1]]
myIf(i == 1, arg1)

[[2]]
{
doSomeThing
}
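
Going one step further than capturing the expressions, the sketch below
evaluates the captured block only when a condition holds. The operator name
%then% is hypothetical and chosen only for illustration. Note that the
condition needs parentheses, because %any% operators bind more tightly than
comparison operators such as ==.

```r
# Hypothetical operator: run the right-hand block only if the condition is TRUE.
`%then%` <- function(cond, body) {
  body_expr <- substitute(body)   # capture the block without evaluating it
  caller <- parent.frame()        # evaluate in the caller's environment
  if (isTRUE(cond)) eval(body_expr, caller) else invisible(NULL)
}

i <- 1
result <- (i == 1) %then% {
  "ran"
}
skipped <- (i == 2) %then% {
  stop("this block is never evaluated")
}
```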

I hope that helps.

Best,
Jiefei

On Sun, May 26, 2019 at 4:45 AM Alexandre Courtiol <
alexandre.court...@gmail.com> wrote:

> Hi all,
>
> Could anyone refer me to a good source to learn how to program a simple
> control-flow construct* in R, or provide me with a simple example?
>
> Control-flow constructs are programmed as primitives, but I would like to
> be able to do that (if possible) in pure R.
>
> The general context is that those functions are a mystery to me. The
> motivating example is that I would like to create a function that behaves
> similarly to base::`if` with an extra argument to the function (e.g. to
> include an error rate on the condition).
>
> Many thanks,
>
> Alex
>
> * control-flow constructs are functions such as if, for, while... that
> allow for call of the form fn(x) expr to work (see ?Control).
>
> --
> Alexandre Courtiol
>
> http://sites.google.com/site/alexandrecourtiol/home
>
> *"Science is the belief in the ignorance of experts"*, R. Feynman
