huonw commented on a change in pull request #24414: [SPARK-22044][SQL] Add `cost` and `codegen` arguments to `explain`
URL: https://github.com/apache/spark/pull/24414#discussion_r278398094
##########
File path: R/pkg/R/DataFrame.R
##########
@@ -147,19 +155,16 @@ setMethod("schema",
 #' sparkR.session()
 #' path <- "path/to/file.json"
 #' df <- read.json(path)
-#' explain(df, TRUE)
+#' explain(df)
+#' explain(df, extended = TRUE)
+#' explain(df, codegen = TRUE)
+#' explain(df, cost = TRUE)
 #'}
 #' @note explain since 1.4.0
 setMethod("explain",
           signature(x = "SparkDataFrame"),
-          function(x, extended = FALSE) {
-            queryExec <- callJMethod(x@sdf, "queryExecution")
-            if (extended) {
-              cat(callJMethod(queryExec, "toString"))
-            } else {
-              execPlan <- callJMethod(queryExec, "executedPlan")
-              cat(callJMethod(execPlan, "toString"))
-            }
+          function(x, extended = FALSE, codegen = FALSE, cost = FALSE) {

Review comment:
> does this change the result (by default when extended = FALSE, codegen = FALSE, cost = FALSE) from before?

Yes, but it changes it to match the output of Scala Spark's `.explain` and SQL's `EXPLAIN ...`. For instance, given a file `test.json` that contains:

```json
{"a": 1,"b":1.2}
{"a": 2,"b":3.4}
{"a": 3,"b":4.5}
```

2.4:

```
> explain(read.json("/tmp/test.json"))
*(1) FileScan json [a#24L,b#25] Batched: false, Format: JSON, Location: InMemoryFileIndex[file:/private/tmp/test.json], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<a:bigint,b:double>
```

This PR:

```
> explain(read.json("/tmp/test.json"))
== Physical Plan ==
*(1) FileScan json [a#37L,b#38] Batched: false, DataFilters: [], Format: JSON, Location: InMemoryFileIndex[file:/private/tmp/test.json], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<a:bigint,b:double>
```

Scala (for reference).
This is on `master`, but 2.4 is similar:

```
scala> spark.read.json("/tmp/test.json").explain
== Physical Plan ==
*(1) FileScan json [a#19L,b#20] Batched: false, DataFilters: [], Format: JSON, Location: InMemoryFileIndex[file:/tmp/test.json], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<a:bigint,b:double>
```

> can you check / test explain(df, TRUE) if the same as explain(df, extended = TRUE)

I added a test.