[jira] [Commented] (SPARK-22063) Upgrade lintr to latest commit sha1 ID
[ https://issues.apache.org/jira/browse/SPARK-22063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892595#comment-16892595 ]

Manu Zhang commented on SPARK-22063:

Is there any update on this thread? Which lint-r version is used in the build now? I also find that upgrading lint-r would upgrade testthat to the latest version, while [SparkR requires testthat 1.0.2|https://github.com/apache/spark/blob/master/docs/building-spark.md#running-r-tests].

> Upgrade lintr to latest commit sha1 ID
> --------------------------------------
>
>                 Key: SPARK-22063
>                 URL: https://issues.apache.org/jira/browse/SPARK-22063
>             Project: Spark
>          Issue Type: Improvement
>          Components: SparkR
>    Affects Versions: 2.3.0
>            Reporter: Hyukjin Kwon
>            Priority: Minor
>
> Currently, we set lintr to {{jimhester/lintr@a769c0b}} (see [this pr|https://github.com/apache/spark/commit/7d1175011c976756efcd4e4e4f70a8fd6f287026]) and SPARK-14074.
> Today, I tried to upgrade to the latest, https://github.com/jimhester/lintr/commit/5431140ffea65071f1327625d4a8de9688fa7e72
> This fixes many bugs and now finds many instances that I have observed and thought should be caught from time to time:
> {code}
> inst/worker/worker.R:71:10: style: Remove spaces before the left parenthesis in a function call.
> return (output)
> ^
> R/column.R:241:1: style: Lines should not be more than 100 characters.
> #' \href{https://spark.apache.org/docs/latest/sparkr.html#data-type-mapping-between-r-and-spark}{
> ^~~~
> R/context.R:332:1: style: Variable and function names should not be longer than 30 characters.
> spark.getSparkFilesRootDirectory <- function() {
> ^~~~
> R/DataFrame.R:1912:1: style: Lines should not be more than 100 characters.
> #' @param j,select expression for the single Column or a list of columns to select from the SparkDataFrame.
> ^~~
> R/DataFrame.R:1918:1: style: Lines should not be more than 100 characters.
> #' @return A new SparkDataFrame containing only the rows that meet the condition with selected columns.
> ^~~
> R/DataFrame.R:2597:22: style: Remove spaces before the left parenthesis in a function call.
> return (joinRes)
> ^
> R/DataFrame.R:2652:1: style: Variable and function names should not be longer than 30 characters.
> generateAliasesForIntersectedCols <- function (x, intersectedColNames, suffix) {
> ^
> R/DataFrame.R:2652:47: style: Remove spaces before the left parenthesis in a function call.
> generateAliasesForIntersectedCols <- function (x, intersectedColNames, suffix) {
> ^
> R/DataFrame.R:2660:14: style: Remove spaces before the left parenthesis in a function call.
> stop ("The following column name: ", newJoin, " occurs more than once in the 'DataFrame'.",
> ^
> R/DataFrame.R:3047:1: style: Lines should not be more than 100 characters.
> #' @note The statistics provided by \code{summary} were change in 2.3.0 use \link{describe} for previous defaults.
> ^~
> R/DataFrame.R:3754:1: style: Lines should not be more than 100 characters.
> #' If grouping expression is missing \code{cube} creates a single global aggregate and is equivalent to
> ^~~
> R/DataFrame.R:3789:1: style: Lines should not be more than 100 characters.
> #' If grouping expression is missing \code{rollup} creates a single global aggregate and is equivalent to
> ^
> R/deserialize.R:46:10: style: Remove spaces before the left parenthesis in a function call.
> switch (type,
> ^
> R/functions.R:41:1: style: Lines should not be more than 100 characters.
> #' @param x Column to compute on. In \code{window}, it must be a time Column of \code{TimestampType}.
> ^
> R/functions.R:93:1: style: Lines should not be more than 100 characters.
> #' @param x Column to compute on. In \code{shiftLeft}, \code{shiftRight} and \code{shiftRightUnsigned},
> ^~~
> R/functions.R:483:52: style: Remove spaces before the left parenthesis in a function call.
> jcols <- lapply(list(x, ...), function (x) {
> ^
> R/functions.R:679:52: style: Remove spaces before the left
[jira] [Commented] (SPARK-22063) Upgrade lintr to latest commit sha1 ID
[ https://issues.apache.org/jira/browse/SPARK-22063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16187574#comment-16187574 ]

Felix Cheung commented on SPARK-22063:
--------------------------------------

Sure, I think we could even start with something simple: install.packages(..., lib =) (or install_github(..., lib=)) and then library(..., lib.loc).
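A minimal sketch of the suggestion in this message, assuming the lint job may keep its packages in a private library directory. The `lint_lib` path and the repository URL are assumptions made for illustration and are not taken from the Spark build; only base R calls are used, and the install step is left commented out because it needs network access.

```r
# Hypothetical private library location for the lint job (not part of Spark).
lint_lib <- file.path(tempdir(), "lint-r-lib")
dir.create(lint_lib, recursive = TRUE, showWarnings = FALSE)

# Install into the private library instead of the site library.
# Commented out because it needs network access; the repos URL is an assumption.
# install.packages("lintr", lib = lint_lib, repos = "https://cloud.r-project.org")

# Check whether lintr is present in the private library only.
has_lintr <- "lintr" %in% rownames(installed.packages(lib.loc = lint_lib))

if (has_lintr) {
  # Load lintr from the private library rather than from the default paths.
  library(lintr, lib.loc = lint_lib)
} else {
  message("lintr is not installed in ", lint_lib)
}
```

Because both the install and the load name the library explicitly, the packages already installed on the build machine are left untouched. Whether `install_github` accepts a `lib` argument in the same way is not checked here.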
[jira] [Commented] (SPARK-22063) Upgrade lintr to latest commit sha1 ID
[ https://issues.apache.org/jira/browse/SPARK-22063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16187551#comment-16187551 ]

Shivaram Venkataraman commented on SPARK-22063:
-----------------------------------------------

[~shaneknapp] [~felixcheung] Let's move the discussion to the JIRA?

I think there are a couple of ways to address this issue. The first, as [~hyukjin.kwon] pointed out, is to make the lint-r script do the installation. I am not much in favor of that, as it would result in the script affecting packages at runtime. Instead, I was wondering whether we could create R environments for each Spark version; https://stackoverflow.com/questions/24283171/virtual-environment-in-r has a bunch of ideas on how to do this. Any thoughts on the approaches listed there?
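One way the per-version environments mentioned in this message could look in base R is sketched below. The helper `use_version_lib`, the `base_dir` default, and the `spark-<version>` directory naming are all hypothetical; this is not one of the specific approaches from the linked Stack Overflow question, just the underlying `.libPaths()` mechanism.

```r
# Hypothetical helper: one library directory per Spark version.
# `base_dir` and the "spark-<version>" naming are assumptions for illustration.
use_version_lib <- function(spark_version,
                            base_dir = file.path(tempdir(), "r-envs")) {
  lib_dir <- file.path(base_dir, paste0("spark-", spark_version))
  dir.create(lib_dir, recursive = TRUE, showWarnings = FALSE)
  # Put the version-specific library first on the search path, so installs
  # and loads in this session prefer it over the default libraries.
  .libPaths(c(lib_dir, .libPaths()))
  invisible(lib_dir)
}

lib_23 <- use_version_lib("2.3.0")
print(.libPaths()[1])
```

Packages installed while this library is first on the path land in the version-specific directory, so builds of different Spark branches could pin different lintr and testthat versions without touching each other.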
[jira] [Commented] (SPARK-22063) Upgrade lintr to latest commit sha1 ID
[ https://issues.apache.org/jira/browse/SPARK-22063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16187328#comment-16187328 ]

Hyukjin Kwon commented on SPARK-22063:
--------------------------------------

The lint failures were fixed first in https://github.com/apache/spark/pull/19290; however, lintr has not actually been upgraded to {{jimhester/lintr@5431140}} due to the concern of breaking other builds. Please see the discussion in the PR if you are interested. This is not yet fully solved.
[jira] [Commented] (SPARK-22063) Upgrade lintr to latest commit sha1 ID
[ https://issues.apache.org/jira/browse/SPARK-22063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172950#comment-16172950 ]

Apache Spark commented on SPARK-22063:
--------------------------------------

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/19290
[jira] [Commented] (SPARK-22063) Upgrade lintr to latest commit sha1 ID
[ https://issues.apache.org/jira/browse/SPARK-22063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16171987#comment-16171987 ]

Hyukjin Kwon commented on SPARK-22063:
--------------------------------------

Will open a PR within a few days.