[ https://issues.apache.org/jira/browse/SPARK-39959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17574864#comment-17574864 ]
Yikun Jiang edited comment on SPARK-39959 at 8/3/22 4:59 PM:
-------------------------------------------------------------

1. It looks like the Docker cache is not working (so it does a full refresh).
- [https://www.diffchecker.com/TpOlQsg1] Judging from the results, this might be related to a change in the GitHub Actions Docker configuration?
- We need to do a full refresh of the Dockerfile to make the cache work.
2. roxygen2 was upgraded to 7.2.1; this should be the root cause of the SparkR job failure.

was (Author: yikunkero):
1. It looks like the Docker cache is not working (so it does a full refresh).
- [https://www.diffchecker.com/TpOlQsg1] Judging from the results, this might be related to a change in the GitHub Actions Docker configuration?
- We need to do a full refresh of the Dockerfile to make the cache work.
2. roxygen2 was upgraded to 7.2.1; this should be the root cause.

> Recover SparkR CRAN check in GitHub Actions CI
> ----------------------------------------------
>
> Key: SPARK-39959
> URL: https://issues.apache.org/jira/browse/SPARK-39959
> Project: Spark
> Issue Type: Test
> Components: Project Infra, SparkR
> Affects Versions: 3.4.0
> Reporter: Hyukjin Kwon
> Priority: Major
>
> {code}
> cd R
> ./install-dev.sh
> ./check-cran.sh
> {code}
> fails (I think with the latest dependencies of the R documentation build, e.g.,
> rmarkdown or roxygen2) as below in the current CI
> (https://github.com/apache/spark/runs/7623722912?check_suite_focus=true)
> {code}
> * checking for missing documentation entries ... 
WARNING
> Undocumented code objects:
> ‘%<=>%’ ‘add_months’ ‘agg’ ‘approxCountDistinct’ ‘approxQuantile’
> ‘approx_count_distinct’ ‘arrange’ ‘array_aggregate’ ‘array_contains’
> ‘array_distinct’ ‘array_except’ ‘array_exists’ ‘array_filter’
> ‘array_forall’ ‘array_intersect’ ‘array_join’ ‘array_max’ ‘array_min’
> ‘array_position’ ‘array_remove’ ‘array_repeat’ ‘array_sort’
> ‘array_to_vector’ ‘array_transform’ ‘array_union’ ‘arrays_overlap’
> ‘arrays_zip’ ‘arrays_zip_with’ ‘as.DataFrame’ ‘as.data.frame’ ‘asc’
> ‘asc_nulls_first’ ‘asc_nulls_last’ ‘ascii’ ‘assert_true’ ‘avg’
> ‘awaitTermination’ ‘base64’ ‘between’ ‘bin’ ‘bit_length’ ‘bitwiseNOT’
> ‘bitwise_not’ ‘broadcast’ ‘bround’ ‘cache’ ‘cacheTable’
> ‘cancelJobGroup’ ‘cast’ ‘cbrt’ ‘ceil’ ‘checkpoint’ ‘clearCache’
> ‘clearJobGroup’ ‘coalesce’ ‘collect’ ‘collect_list’ ‘collect_set’
> ‘colnames’ ‘colnames<-’ ‘coltypes’ ‘coltypes<-’ ‘column’ ‘columns’
> ‘concat’ ‘concat_ws’ ‘contains’ ‘conv’ ‘corr’ ‘cot’ ‘count’
> ‘countDistinct’ ‘count_distinct’ ‘cov’ ‘covar_pop’ ‘covar_samp’
> ‘crc32’ ‘createDataFrame’ ‘createExternalTable’
> ‘createOrReplaceTempView’ ‘createTable’ ‘create_array’ ‘create_map’
> ‘crossJoin’ ‘crosstab’ ‘csc’ ‘cube’ ‘cume_dist’ ‘currentCatalog’
> ‘currentDatabase’ ‘current_date’ ‘current_timestamp’ ‘dapply’
> ‘dapplyCollect’ ‘databaseExists’ ‘date_add’ ‘date_format’ ‘date_sub’
> ‘date_trunc’ ‘datediff’ ‘dayofmonth’ ‘dayofweek’ ‘dayofyear’ ‘decode’
> ‘degrees’ ‘dense_rank’ ‘desc’ ‘desc_nulls_first’ ‘desc_nulls_last’
> ‘describe’ ‘distinct’ ‘drop’ ‘dropDuplicates’ ‘dropFields’
> ‘dropTempTable’ ‘dropTempView’ ‘dropna’ ‘dtypes’ ‘element_at’
> ‘encode’ ‘endsWith’ ‘except’ ‘exceptAll’ ‘explain’ ‘explode’
> ‘explode_outer’ ‘expr’ ‘fillna’ ‘filter’ ‘first’ ‘flatten’
> ‘format_number’ ‘format_string’ ‘freqItems’ ‘from_avro’ ‘from_csv’
> ‘from_json’ ‘from_unixtime’ ‘from_utc_timestamp’ ‘functionExists’
> ‘gapply’ ‘gapplyCollect’ ‘getDatabase’ ‘getField’ ‘getFunc’ ‘getItem’
> ‘getLocalProperty’ ‘getNumPartitions’ ‘getTable’ ‘greatest’ ‘groupBy’
> ‘group_by’ ‘grouping_bit’ ‘grouping_id’ ‘hash’ ‘hex’ ‘hint’
> ‘histogram’ ‘hour’ ‘hypot’ ‘ilike’ ‘initcap’ ‘input_file_name’
> ‘insertInto’ ‘install.spark’ ‘instr’ ‘intersect’ ‘intersectAll’
> ‘isActive’ ‘isLocal’ ‘isNaN’ ‘isNotNull’ ‘isNull’ ‘isStreaming’
> ‘isnan’ ‘join’ ‘kurtosis’ ‘lag’ ‘last’ ‘lastProgress’ ‘last_day’
> ‘lead’ ‘least’ ‘levenshtein’ ‘like’ ‘limit’ ‘listCatalogs’
> ‘listColumns’ ‘listDatabases’ ‘listFunctions’ ‘listTables’ ‘lit’
> ‘loadDF’ ‘localCheckpoint’ ‘locate’ ‘lower’ ‘lpad’ ‘ltrim’
> ‘make_date’ ‘map_concat’ ‘map_entries’ ‘map_filter’ ‘map_from_arrays’
> ‘map_from_entries’ ‘map_keys’ ‘map_values’ ‘map_zip_with’ ‘max_by’
> ‘md5’ ‘min_by’ ‘minute’ ‘monotonically_increasing_id’ ‘month’
> ‘months_between’ ‘mutate’ ‘n’ ‘n_distinct’ ‘na.omit’ ‘nanvl’ ‘negate’
> ‘next_day’ ‘not’ ‘nth_value’ ‘ntile’ ‘octet_length’ ‘orderBy’
> ‘otherwise’ ‘over’ ‘overlay’ ‘partitionBy’ ‘percent_rank’
> ‘percentile_approx’ ‘persist’ ‘pivot’ ‘pmod’ ‘posexplode’
> ‘posexplode_outer’ ‘predict’ ‘print.jobj’ ‘print.structField’
> ‘print.structType’ ‘print.summary.DecisionTreeClassificationModel’
> ‘print.summary.DecisionTreeRegressionModel’
> ‘print.summary.GBTClassificationModel’
> ‘print.summary.GBTRegressionModel’
> ‘print.summary.GeneralizedLinearRegressionModel’
> ‘print.summary.KSTest’
> ‘print.summary.RandomForestClassificationModel’
> ‘print.summary.RandomForestRegressionModel’ ‘printSchema’ ‘product’
> ‘quarter’ ‘queryName’ ‘radians’ ‘raise_error’ ‘rand’ ‘randn’
> ‘randomSplit’ ‘rangeBetween’ ‘rank’ ‘rbind’ ‘read.df’ ‘read.jdbc’
> ‘read.json’ ‘read.ml’ ‘read.orc’ ‘read.parquet’ ‘read.stream’
> ‘read.text’ ‘recoverPartitions’ ‘refreshByPath’ ‘refreshTable’
> ‘regexp_extract’ ‘regexp_replace’ ‘registerTempTable’ ‘rename’
> ‘repartition’ ‘repartitionByRange’ ‘repeat_string’ ‘reverse’ ‘rint’
> ‘rlike’ ‘rollup’ ‘row_number’ ‘rowsBetween’ ‘rpad’ ‘rtrim’ ‘sample’
> ‘sampleBy’ ‘sample_frac’ ‘saveAsTable’ ‘saveDF’ ‘schema’
> ‘schema_of_csv’ ‘schema_of_json’ ‘sd’ ‘sec’ ‘second’ ‘select’
> ‘selectExpr’ ‘setCheckpointDir’ ‘setCurrentCatalog’
> ‘setCurrentDatabase’ ‘setJobDescription’ ‘setJobGroup’
> ‘setLocalProperty’ ‘setLogLevel’ ‘sha1’ ‘sha2’ ‘shiftLeft’
> ‘shiftRight’ ‘shiftRightUnsigned’ ‘shiftleft’ ‘shiftright’
> ‘shiftrightunsigned’ ‘showDF’ ‘shuffle’ ‘signum’ ‘size’ ‘skewness’
> ‘slice’ ‘sort_array’ ‘soundex’ ‘spark.addFile’ ‘spark.als’
> ‘spark.assignClusters’ ‘spark.associationRules’
> ‘spark.bisectingKmeans’ ‘spark.decisionTree’
> ‘spark.findFrequentSequentialPatterns’ ‘spark.fmClassifier’
> ‘spark.fmRegressor’ ‘spark.fpGrowth’ ‘spark.freqItemsets’
> ‘spark.gaussianMixture’ ‘spark.gbt’ ‘spark.getSparkFiles’
> ‘spark.getSparkFilesRootDirectory’ ‘spark.glm’ ‘spark.isoreg’
> ‘spark.kmeans’ ‘spark.kstest’ ‘spark.lapply’ ‘spark.lda’ ‘spark.lm’
> ‘spark.logit’ ‘spark.mlp’ ‘spark.naiveBayes’ ‘spark.perplexity’
> ‘spark.posterior’ ‘spark.randomForest’ ‘spark.survreg’
> ‘spark.svmLinear’ ‘sparkR.callJMethod’ ‘sparkR.callJStatic’
> ‘sparkR.conf’ ‘sparkR.init’ ‘sparkR.newJObject’ ‘sparkR.session’
> ‘sparkR.session.stop’ ‘sparkR.stop’ ‘sparkR.uiWebUrl’
> ‘sparkR.version’ ‘sparkRHive.init’ ‘sparkRSQL.init’
> ‘spark_partition_id’ ‘split_string’ ‘sql’ ‘startsWith’ ‘status’
> ‘stddev’ ‘stddev_pop’ ‘stddev_samp’ ‘stopQuery’ ‘storageLevel’
> ‘struct’ ‘structField’ ‘structField.character’ ‘structField.jobj’
> ‘structType’ ‘structType.character’ ‘structType.jobj’
> ‘structType.structField’ ‘subset’ ‘substring_index’ ‘sumDistinct’
> ‘sum_distinct’ ‘summarize’ ‘summary’ ‘tableExists’ ‘tableNames’
> ‘tableToDF’ ‘tables’ ‘take’ ‘timestamp_seconds’ ‘toDegrees’ ‘toJSON’
> ‘toRadians’ ‘to_avro’ ‘to_csv’ ‘to_date’ ‘to_json’ ‘to_timestamp’
> ‘to_utc_timestamp’ ‘transform’ ‘transform_keys’ ‘transform_values’
> ‘translate’ ‘trim’ ‘unbase64’ ‘uncacheTable’ ‘unhex’ ‘union’
> ‘unionAll’ ‘unionByName’ ‘unix_timestamp’ ‘unpersist’ ‘upper’ ‘var’
> ‘var_pop’ ‘var_samp’ ‘variance’ ‘vector_to_array’ ‘weekofyear’ ‘when’
> ‘where’ ‘window’ ‘windowOrderBy’ ‘windowPartitionBy’ ‘withColumn’
> ‘withColumnRenamed’ ‘withField’ ‘withWatermark’ ‘write.df’
> ‘write.jdbc’ ‘write.json’ ‘write.ml’ ‘write.orc’ ‘write.parquet’
> ‘write.stream’ ‘write.text’ ‘xxhash64’ ‘year’
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org