Jeet created SPARK-34791:
----------------------------

             Summary: SparkR throws node stack overflow
                 Key: SPARK-34791
                 URL: https://issues.apache.org/jira/browse/SPARK-34791
             Project: Spark
          Issue Type: Question
          Components: SparkR
    Affects Versions: 3.0.1
            Reporter: Jeet


SparkR throws a "node stack overflow" error when running the code below (sample follows) on 
R 4.0.2 with Spark 3.0.1.

The same code works on R 3.3.3 with Spark 2.2.1 (and SparkR 2.4.5).

{code:r}
source('sample.R')
myclsr = myclosure_func()
myclsr$get_some_date('2021-01-01')

## spark.lapply throws node stack overflow
result = spark.lapply(c('2021-01-01', '2021-01-02'), function (rdate) {
    source('sample.R')
    another_closure = myclosure_func()
    return(another_closure$get_some_date(rdate))
})

{code}

sample.R

{code:r}
## Utility function that calls itself recursively
getPreviousBusinessDate <- function(asofdate) {
    asdt <- as.Date(asofdate) - 1

    wd <- format(asdt, "%A")
    if (wd == "Saturday" || wd == "Sunday") {
        return(getPreviousBusinessDate(asdt))
    }

    return(asdt)
}

## Closure that calls the utility function
myclosure_func <- function() {
    myclosure <- list()

    get_some_date <- function(random_date) {
        return(getPreviousBusinessDate(random_date))
    }
    myclosure$get_some_date <- get_some_date

    return(myclosure)
}

{code}


This seems to be caused by sourcing sample.R twice: once before the Spark 
session is invoked, and again inside the spark.lapply worker function.
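If the self-recursive utility function is what trips the serialization of the closure, an iterative rewrite may sidestep the error. This is an untested sketch of a possible workaround, not a confirmed fix; it keeps the same behaviour (step back one day until a weekday is reached) without the function calling itself:

{code:r}
## Hypothetical iterative variant of getPreviousBusinessDate:
## loops instead of recursing, so no self-referencing call chain
## needs to be captured and shipped to the workers.
getPreviousBusinessDate <- function(asofdate) {
    asdt <- as.Date(asofdate) - 1
    while (format(asdt, "%A") %in% c("Saturday", "Sunday")) {
        asdt <- asdt - 1
    }
    return(asdt)
}
{code}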




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
