Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18465
  
    I referred to http://adv-r.had.co.nz/Rcpp.html and your link.
    
    I tested as follows:
    
    **R test alone**
    
    ```
    vi tmp.R
    ```
    
    I copied and pasted the code under **Before** and **After** below, and then ran
    
    
    ```
    Rscript tmp.R
    ```
    
    Before
    
    ```R
    library(Rcpp)
    cppFunction('double takeLog(double val) {
        try {
            if (val <= 0.0) {           // log() not defined here
                throw std::range_error("Inadmissible value");
            }
            return log(val);
        } catch(std::exception &ex) {
            forward_exception_to_r(ex);
        } catch(...) {
            ::Rf_error("c++ exception (unknown reason)");
        }
        return NA_REAL;             // not reached
    }')
    
    for(i in 0:10000) {
      p <- parallel:::mcfork()
      if (inherits(p, "masterProcess")) {
        takeLog(-1.0)
        print("unreachable")
        tools::pskill(child, tools::SIGUSR1)
      }
    }
    
    print("end")
    Sys.sleep(10L)
    ```
    
    
    
    After
    
    
    ```R
    library(Rcpp)
    cppFunction('double takeLog(double val) {
        try {
            if (val <= 0.0) {           // log() not defined here
                throw std::range_error("Inadmissible value");
            }
            return log(val);
        } catch(std::exception &ex) {
            forward_exception_to_r(ex);
        } catch(...) {
            ::Rf_error("c++ exception (unknown reason)");
        }
        return NA_REAL;             // not reached
    }')
    
    for(i in 0:10000) {
      p <- parallel:::mcfork()
      if (inherits(p, "masterProcess")) {
        takeLog(-1.0)
        print("unreachable")
      }
    
      children <- suppressWarnings(parallel:::selectChildren(timeout = 0))
      if (is.integer(children)) {
        lapply(children, function(child) {
          print(parallel:::readChild(child))
          tools::pskill(child, tools::SIGUSR1)
        })
      }
    }
    
    print("end")
    Sys.sleep(10L)
    ```
    
    The symptoms are similar to those in https://github.com/apache/spark/pull/18465#issuecomment-313049544
    
    
    **End to end**
    
    I could not do this with `cppFunction` as I did above, because of the error below:
    
    ```
    Error in as.character(node[[1]]) :
      cannot coerce type 'builtin' to vector of type 'character'
    ```
    
    So I did the following instead:
    
    ```
    vi takeLog.cpp
    ```
    
    Copied and pasted the following:
    
    ```cpp
    #include <Rcpp.h>
    
    using namespace Rcpp;
    
    // [[Rcpp::export]]
    double takeLog(double val) {
        try {
            if (val <= 0.0) {           // log() not defined here
                throw std::range_error("Inadmissible value");
            }
            return log(val);
        } catch(std::exception &ex) {
            forward_exception_to_r(ex);
        } catch(...) {
            ::Rf_error("c++ exception (unknown reason)");
        }
        return NA_REAL;             // not reached
    }
    ```
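    
    For reference, the exception path that `takeLog(-1.0)` exercises can be sketched without Rcpp. This is only a minimal standalone sketch and not part of the test itself: `takeLogChecked` and `takeLogErrorMessage` are hypothetical names, and returning `ex.what()` stands in for `forward_exception_to_r`, on the assumption that the exception text is roughly what ends up in the R error.
    
    ```cpp
    #include <cmath>
    #include <stdexcept>
    #include <string>
    
    // Same range check as takeLog above, minus the Rcpp glue.
    double takeLogChecked(double val) {
        if (val <= 0.0) {  // log() not defined here
            throw std::range_error("Inadmissible value");
        }
        return std::log(val);
    }
    
    // Hypothetical helper: returns the text of the caught exception,
    // or an empty string if nothing was thrown. In the Rcpp version
    // the catch branches forward the exception to R instead.
    std::string takeLogErrorMessage(double val) {
        try {
            takeLogChecked(val);
        } catch (const std::exception &ex) {
            return ex.what();
        } catch (...) {
            return "c++ exception (unknown reason)";
        }
        return "";
    }
    ```
    
    With this sketch, `takeLogErrorMessage(-1.0)` takes the first `catch` branch, which is the branch the SparkR run below is meant to hit on every call.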
    
    Then I ran the following with SparkR:
    
    ```R
    func <- function(key, x) {
      library(Rcpp)
      path <- "/.../spark/takeLog.cpp"
      sourceCpp(path)
      takeLog(-1.0)
    }
    
    df <- createDataFrame(list(list(1L, 1, "1", 0.1)), c("a", "b", "c", "d"))
    collect(gapply(df, "a", func, schema(df)))
    # ... repeat the line above 30 times
    collect(gapply(df, "a", function(key, x) { x }, schema(df)))
    ```
    
    Both before and after, the symptoms are also similar to those in https://github.com/apache/spark/pull/18465#issuecomment-313055990
    


