[jira] [Commented] (SPARK-22327) R CRAN check fails on non-latest branches
[ https://issues.apache.org/jira/browse/SPARK-22327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214181#comment-16214181 ]

Apache Spark commented on SPARK-22327:
--------------------------------------

User 'felixcheung' has created a pull request for this issue:
https://github.com/apache/spark/pull/19550

> R CRAN check fails on non-latest branches
> -----------------------------------------
>
>                 Key: SPARK-22327
>                 URL: https://issues.apache.org/jira/browse/SPARK-22327
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>    Affects Versions: 1.6.4, 2.0.3, 2.1.3, 2.2.1, 2.3.0
>            Reporter: Felix Cheung
>
> Fails with this warning:
>
> * checking CRAN incoming feasibility ... WARNING
> Maintainer: 'Shivaram Venkataraman '
> Insufficient package version (submitted: 2.0.3, existing: 2.1.2)
> Unknown, possibly mis-spelled, fields in DESCRIPTION:
>   'RoxygenNote'
> WARNING: There was 1 warning.
> NOTE: There were 2 notes.
>
> We have seen this in branch-1.6 and branch-2.0, and it would be a problem for branch-2.1 after we ship 2.2.1.
> The root cause is the package version check in the CRAN check: once SparkR package version 2.1.2 was first published, any older version fails the version check. As far as we know, there is no way to skip this version check.
> Also, there was previously a NOTE about the new maintainer.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-22327) R CRAN check fails on non-latest branches
[ https://issues.apache.org/jira/browse/SPARK-22327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214180#comment-16214180 ]

Apache Spark commented on SPARK-22327:
--------------------------------------

User 'felixcheung' has created a pull request for this issue:
https://github.com/apache/spark/pull/19549
[jira] [Assigned] (SPARK-22327) R CRAN check fails on non-latest branches
[ https://issues.apache.org/jira/browse/SPARK-22327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-22327:
------------------------------------

    Assignee:     (was: Apache Spark)
[jira] [Assigned] (SPARK-22327) R CRAN check fails on non-latest branches
[ https://issues.apache.org/jira/browse/SPARK-22327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-22327:
------------------------------------

    Assignee: Apache Spark
[jira] [Assigned] (SPARK-22303) [SQL] Getting java.sql.SQLException: Unsupported type 101 for BINARY_DOUBLE
[ https://issues.apache.org/jira/browse/SPARK-22303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-22303:
------------------------------------

    Assignee:     (was: Apache Spark)

> [SQL] Getting java.sql.SQLException: Unsupported type 101 for BINARY_DOUBLE
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-22303
>                 URL: https://issues.apache.org/jira/browse/SPARK-22303
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Kohki Nishio
>            Priority: Minor
>
> When a table contains columns such as BINARY_DOUBLE or BINARY_FLOAT, the JDBC connector throws an SQLException:
> {code}
> java.sql.SQLException: Unsupported type 101
>   at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$getCatalystType(JdbcUtils.scala:235)
>   at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$8.apply(JdbcUtils.scala:292)
>   at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$8.apply(JdbcUtils.scala:292)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.getSchema(JdbcUtils.scala:291)
>   at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:64)
>   at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:113)
>   at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:47)
>   at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:306)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146)
> {code}
> These types are Oracle-specific ones, described here:
> https://docs.oracle.com/cd/E11882_01/timesten.112/e21642/types.htm#TTSQL148
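The "Unsupported type 101" above comes from JdbcUtils failing to map a vendor-specific JDBC type code that java.sql.Types does not define. A minimal Python sketch of that kind of type-code mapping (illustrative only, not Spark's actual code; the two constants mirror oracle.jdbc.OracleTypes, and the string type names stand in for Catalyst types):

```python
# Hypothetical sketch of a dialect-level JDBC type-code mapping.
# 101 is Oracle's BINARY_DOUBLE code (the one in this issue's title);
# 100 is its companion BINARY_FLOAT, per oracle.jdbc.OracleTypes.
ORACLE_BINARY_FLOAT = 100
ORACLE_BINARY_DOUBLE = 101

def get_catalyst_type(jdbc_type_code):
    """Map a vendor JDBC type code to a Spark SQL type name,
    or raise -- like JdbcUtils.getCatalystType does for unknown codes."""
    mapping = {
        ORACLE_BINARY_FLOAT: "FloatType",
        ORACLE_BINARY_DOUBLE: "DoubleType",
    }
    if jdbc_type_code not in mapping:
        raise ValueError("Unsupported type %d" % jdbc_type_code)
    return mapping[jdbc_type_code]
```

A dialect that recognizes these codes avoids the exception; codes outside the mapping still fail, matching the behavior reported above.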
[jira] [Commented] (SPARK-22303) [SQL] Getting java.sql.SQLException: Unsupported type 101 for BINARY_DOUBLE
[ https://issues.apache.org/jira/browse/SPARK-22303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214164#comment-16214164 ]

Apache Spark commented on SPARK-22303:
--------------------------------------

User 'taroplus' has created a pull request for this issue:
https://github.com/apache/spark/pull/19548
[jira] [Assigned] (SPARK-22303) [SQL] Getting java.sql.SQLException: Unsupported type 101 for BINARY_DOUBLE
[ https://issues.apache.org/jira/browse/SPARK-22303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-22303:
------------------------------------

    Assignee: Apache Spark
[jira] [Commented] (SPARK-22303) [SQL] Getting java.sql.SQLException: Unsupported type 101 for BINARY_DOUBLE
[ https://issues.apache.org/jira/browse/SPARK-22303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214163#comment-16214163 ]

Kohki Nishio commented on SPARK-22303:
--------------------------------------

https://github.com/apache/spark/pull/19548
[jira] [Reopened] (SPARK-22303) [SQL] Getting java.sql.SQLException: Unsupported type 101 for BINARY_DOUBLE
[ https://issues.apache.org/jira/browse/SPARK-22303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kohki Nishio reopened SPARK-22303:
----------------------------------

Spark supports Oracle-specific types; I'm working on a PR, please keep this open.
[jira] [Updated] (SPARK-21657) Spark has exponential time complexity to explode(array of structs)
[ https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ruslan Dautkhanov updated SPARK-21657:
--------------------------------------

    Affects Version/s: 2.3.0
           Issue Type: Bug  (was: Improvement)

> Spark has exponential time complexity to explode(array of structs)
> ------------------------------------------------------------------
>
>                 Key: SPARK-21657
>                 URL: https://issues.apache.org/jira/browse/SPARK-21657
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, SQL
>    Affects Versions: 2.0.0, 2.1.0, 2.1.1, 2.2.0, 2.3.0
>            Reporter: Ruslan Dautkhanov
>               Labels: cache, caching, collections, nested_types, performance, pyspark, sparksql, sql
>          Attachments: ExponentialTimeGrowth.PNG, nested-data-generator-and-test.py
>
> It can take up to half a day to explode a modest-sized nested collection (0.5m) on a recent Xeon processor.
> See the attached pyspark script that reproduces this problem.
> {code}
> cached_df = sqlc.sql('select individ, hholdid, explode(amft) from ' + table_name).cache()
> print cached_df.count()
> {code}
> This script generates a number of tables with the same total number of records across all nested collections (see the `scaling` variable in the loops). The `scaling` variable scales up how many nested elements are in each record, but scales down the number of records in the table by the same factor, so the total number of records stays the same.
> Time grows exponentially (notice the log-10 vertical axis scale):
> !ExponentialTimeGrowth.PNG!
> At a scaling of 50,000 (see the attached pyspark script), it took 7 hours to explode the nested collections (!) of 8k records.
> After 1000 elements in a nested collection, time grows exponentially.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
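For a rough intuition of how a constant total element count can still produce runaway runtimes, here is a back-of-the-envelope cost model (an illustrative assumption, not an analysis from the JIRA): if exploding copies each input row once per output element, and the row's width grows with the nested-array length, then the copied bytes grow linearly with `scaling` even though records × elements stays fixed.

```python
# Hypothetical cost model: one full-row copy per exploded output element.
# Wider rows (bigger nested arrays) make each copy proportionally costlier,
# so shrinking the record count does not cancel out the growth.

def modeled_copy_cost(total_elements, scaling):
    records = total_elements // scaling   # fewer, wider records
    elements_per_record = scaling
    row_width = elements_per_record       # each row carries its whole array
    # one row copy per exploded output element:
    return records * elements_per_record * row_width

base = modeled_copy_cost(500_000, 10)        # many small records
big = modeled_copy_cost(500_000, 50_000)     # few huge records
assert big // base == 5_000                  # cost grew with `scaling`
```

Under this model cost is linear in `scaling`; the reporter's measurements grow even faster than that, so copying alone may not be the whole story.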
[jira] [Assigned] (SPARK-22302) Remove manual backports for subprocess.check_output and check_call
[ https://issues.apache.org/jira/browse/SPARK-22302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reassigned SPARK-22302:
------------------------------------

    Assignee: Hyukjin Kwon

> Remove manual backports for subprocess.check_output and check_call
> ------------------------------------------------------------------
>
>                 Key: SPARK-22302
>                 URL: https://issues.apache.org/jira/browse/SPARK-22302
>             Project: Spark
>          Issue Type: Improvement
>          Components: Project Infra
>    Affects Versions: 2.3.0
>            Reporter: Hyukjin Kwon
>            Assignee: Hyukjin Kwon
>            Priority: Trivial
>             Fix For: 2.3.0
>
> This JIRA is loosely related to SPARK-21573. To my knowledge, given past cases and investigations, Python 2.6 could still be used in Jenkins, and it looks like it fails to execute some other scripts.
> In this particular case, it was:
> {code}
> cd dev && python2.6
> {code}
> {code}
> >>> from sparktestsupport import shellutils
> >>> shellutils.subprocess_check_call("ls")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "sparktestsupport/shellutils.py", line 46, in subprocess_check_call
>     retcode = call(*popenargs, **kwargs)
> NameError: global name 'call' is not defined
> {code}
> Please see https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3950/console
> Since we dropped Python 2.6.x support, it looks better to remove those workarounds and print out explicit error messages, so that the effort of finding the root causes of such cases is not duplicated.
[jira] [Resolved] (SPARK-22302) Remove manual backports for subprocess.check_output and check_call
[ https://issues.apache.org/jira/browse/SPARK-22302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-22302.
----------------------------------

       Resolution: Fixed
    Fix Version/s: 2.3.0

Issue resolved by pull request 19524
[https://github.com/apache/spark/pull/19524]
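The backports being removed existed because Python 2.6's subprocess module lacked check_output, which is why dev/sparktestsupport carried its own shims. With 2.6 support dropped, the stdlib versions can be used directly; a minimal sketch of the calls involved (using sys.executable so it runs anywhere):

```python
import subprocess
import sys

# check_call: returns 0 on success, raises CalledProcessError otherwise
assert subprocess.check_call([sys.executable, "-c", "pass"]) == 0

# check_output: captures stdout, also raising on a non-zero exit
out = subprocess.check_output([sys.executable, "-c", "print('hello')"])
assert out.strip() == b"hello"

# a non-zero exit status surfaces as CalledProcessError with the returncode
try:
    subprocess.check_call([sys.executable, "-c", "raise SystemExit(1)"])
except subprocess.CalledProcessError as e:
    assert e.returncode == 1
```

The explicit CalledProcessError (rather than a shim's NameError) is exactly the clearer failure mode the issue asks for.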
[jira] [Created] (SPARK-22328) ClosureCleaner misses referenced superclass fields, gives them null values
Ryan Williams created SPARK-22328:
----------------------------------

             Summary: ClosureCleaner misses referenced superclass fields, gives them null values
                 Key: SPARK-22328
                 URL: https://issues.apache.org/jira/browse/SPARK-22328
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.2.0
            Reporter: Ryan Williams

[Runnable repro here|https://github.com/ryan-williams/spark-bugs/tree/closure]:

Superclass with some fields:

{code}
abstract class App extends Serializable {
  // SparkContext stub
  @transient lazy val sc = new SparkContext(
    new SparkConf()
      .setAppName("test")
      .setMaster("local[4]")
      .set("spark.ui.showConsoleProgress", "false")
  )

  // These fields get missed by the ClosureCleaner in some situations
  val n1 = 111
  val s1 = "aaa"

  // Simple scaffolding to exercise passing a closure to RDD.foreach in subclasses
  def rdd = sc.parallelize(1 to 1)
  def run(name: String): Unit = {
    print(s"$name:\t")
    body()
    sc.stop()
  }
  def body(): Unit
}
{code}

Running a simple Spark job with various instantiations of this class:

{code}
object Main {
  /** [[App]]s generated this way will not correctly detect references to [[App.n1]] in Spark closures */
  val fn = () ⇒ new App {
    val n2 = 222
    val s2 = "bbb"
    def body(): Unit = rdd.foreach { _ ⇒ println(s"$n1, $n2, $s1, $s2") }
  }

  /** Doesn't serialize closures correctly */
  val app1 = fn()

  /** Works fine */
  val app2 = new App {
    val n2 = 222
    val s2 = "bbb"
    def body(): Unit = rdd.foreach { _ ⇒ println(s"$n1, $n2, $s1, $s2") }
  }

  /** [[App]]s created this way also work fine */
  def makeApp(): App = new App {
    val n2 = 222
    val s2 = "bbb"
    def body(): Unit = rdd.foreach { _ ⇒ println(s"$n1, $n2, $s1, $s2") }
  }

  val app3 = makeApp()      // ok
  val fn2 = () ⇒ makeApp()  // ok

  def main(args: Array[String]): Unit = {
    fn().run("fn")    // bad: n1 → 0, s1 → null
    app1.run("app1")  // bad: n1 → 0, s1 → null
    app2.run("app2")  // ok
    app3.run("app3")  // ok
    fn2().run("fn2")  // ok
  }
}
{code}

Build + run:

{code}
$ sbt run
…
fn:     0, 222, null, bbb
app1:   0, 222, null, bbb
app2:   111, 222, aaa, bbb
app3:   111, 222, aaa, bbb
fn2:    111, 222, aaa, bbb
{code}

The first two versions have {{0}} and {{null}}, respectively, for the {{App.n1}} and {{App.s1}} fields. Something about this syntax causes the problem:

{code}
() => new App { … }
{code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (SPARK-14540) Support Scala 2.12 closures and Java 8 lambdas in ClosureCleaner
[ https://issues.apache.org/jira/browse/SPARK-14540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213890#comment-16213890 ]

Sean Owen commented on SPARK-14540:
-----------------------------------

Thanks [~lrytz] -- by the way, I confirmed that 2.12.4 does fix this particular issue. I'm on to other issues in Spark with respect to the new lambda-based implementation of closures in Scala. For example, closures compile to functions with names containing "$Lambda$" rather than "$anonfun$", and some classes that turn up for cleaning have names that don't map to the class file that they're in. I've gotten through a few of these issues and may post a WIP PR for feedback, but haven't resolved them all.

> Support Scala 2.12 closures and Java 8 lambdas in ClosureCleaner
> ----------------------------------------------------------------
>
>                 Key: SPARK-14540
>                 URL: https://issues.apache.org/jira/browse/SPARK-14540
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Spark Core
>            Reporter: Josh Rosen
>
> Using https://github.com/JoshRosen/spark/tree/build-for-2.12, I tried running ClosureCleanerSuite with Scala 2.12 and ran into two bad test failures:
> {code}
> [info] - toplevel return statements in closures are identified at cleaning time *** FAILED *** (32 milliseconds)
> [info]   Expected exception org.apache.spark.util.ReturnStatementInClosureException to be thrown, but no exception was thrown. (ClosureCleanerSuite.scala:57)
> {code}
> and
> {code}
> [info] - user provided closures are actually cleaned *** FAILED *** (56 milliseconds)
> [info]   Expected ReturnStatementInClosureException, but got org.apache.spark.SparkException: Job aborted due to stage failure: Task not serializable: java.io.NotSerializableException: java.lang.Object
> [info]     - element of array (index: 0)
> [info]     - array (class "[Ljava.lang.Object;", size: 1)
> [info]     - field (class "java.lang.invoke.SerializedLambda", name: "capturedArgs", type: "class [Ljava.lang.Object;")
> [info]     - object (class "java.lang.invoke.SerializedLambda", SerializedLambda[capturingClass=class org.apache.spark.util.TestUserClosuresActuallyCleaned$, functionalInterfaceMethod=scala/runtime/java8/JFunction1$mcII$sp.apply$mcII$sp:(I)I, implementation=invokeStatic org/apache/spark/util/TestUserClosuresActuallyCleaned$.org$apache$spark$util$TestUserClosuresActuallyCleaned$$$anonfun$69:(Ljava/lang/Object;I)I, instantiatedMethodType=(I)I, numCaptured=1])
> [info]     - element of array (index: 0)
> [info]     - array (class "[Ljava.lang.Object;", size: 1)
> [info]     - field (class "java.lang.invoke.SerializedLambda", name: "capturedArgs", type: "class [Ljava.lang.Object;")
> [info]     - object (class "java.lang.invoke.SerializedLambda", SerializedLambda[capturingClass=class org.apache.spark.rdd.RDD, functionalInterfaceMethod=scala/Function3.apply:(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;, implementation=invokeStatic org/apache/spark/rdd/RDD.org$apache$spark$rdd$RDD$$$anonfun$20$adapted:(Lscala/Function1;Lorg/apache/spark/TaskContext;Ljava/lang/Object;Lscala/collection/Iterator;)Lscala/collection/Iterator;, instantiatedMethodType=(Lorg/apache/spark/TaskContext;Ljava/lang/Object;Lscala/collection/Iterator;)Lscala/collection/Iterator;, numCaptured=1])
> [info]     - field (class "org.apache.spark.rdd.MapPartitionsRDD", name: "f", type: "interface scala.Function3")
> [info]     - object (class "org.apache.spark.rdd.MapPartitionsRDD", MapPartitionsRDD[2] at apply at Transformer.scala:22)
> [info]     - field (class "scala.Tuple2", name: "_1", type: "class java.lang.Object")
> [info]     - root object (class "scala.Tuple2", (MapPartitionsRDD[2] at apply at Transformer.scala:22,org.apache.spark.SparkContext$$Lambda$957/431842435@6e803685)).
> [info] This means the closure provided by user is not actually cleaned. (ClosureCleanerSuite.scala:78)
> {code}
> We'll need to figure out a closure cleaning strategy which works for 2.12 lambdas.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (SPARK-21551) pyspark's collect fails when getaddrinfo is too slow
[ https://issues.apache.org/jira/browse/SPARK-21551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-21551:
------------------------------

    Fix Version/s: 2.0.3

> pyspark's collect fails when getaddrinfo is too slow
> ----------------------------------------------------
>
>                 Key: SPARK-21551
>                 URL: https://issues.apache.org/jira/browse/SPARK-21551
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.1.0
>            Reporter: peay
>            Assignee: peay
>            Priority: Critical
>             Fix For: 2.0.3, 2.1.3, 2.2.1, 2.3.0
>
> Pyspark's {{RDD.collect}}, as well as {{DataFrame.toLocalIterator}} and {{DataFrame.collect}}, all work by starting an ephemeral server in the driver and having Python connect to it to download the data.
> All three are implemented along the lines of:
> {code}
> port = self._jdf.collectToPython()
> return list(_load_from_socket(port, BatchedSerializer(PickleSerializer())))
> {code}
> The server has **a hardcoded timeout of 3 seconds** (https://github.com/apache/spark/blob/e26dac5feb02033f980b1e69c9b0ff50869b6f9e/core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala#L695) -- i.e., the Python process has 3 seconds to connect to it from the very moment the driver server starts.
> In general, that seems fine, but I have been encountering frequent timeouts leading to `Exception: could not open socket`.
> After investigating a bit, it turns out that {{_load_from_socket}} makes a call to {{getaddrinfo}}:
> {code}
> def _load_from_socket(port, serializer):
>     sock = None
>     # Support for both IPv4 and IPv6.
>     # On most of IPv6-ready systems, IPv6 will take precedence.
>     for res in socket.getaddrinfo("localhost", port, socket.AF_UNSPEC, socket.SOCK_STREAM):
>         .. connect ..
> {code}
> I am not sure why, but while most such calls to {{getaddrinfo}} on my machine only take a couple milliseconds, about 10% of them take between 2 and 10 seconds, leading to about 10% of jobs failing. I don't think we can always expect {{getaddrinfo}} to return instantly. More generally, Python may sometimes pause for a couple seconds, which may not leave enough time for the process to connect to the server.
> Especially since the server timeout is hardcoded, I think it would be best to set a rather generous value (15 seconds?) to avoid such situations.
> A {{getaddrinfo}}-specific fix could avoid doing it every single time, or do it before starting up the driver server.
>
> cc SPARK-677 [~davies]

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
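The {{_load_from_socket}} loop quoted above can be fleshed out into a small self-contained sketch (the helper name is hypothetical, and the 15-second timeout is the reporter's suggested generous value, not what Spark ships): resolve "localhost" for both address families and try each result in order, falling through to the next on failure.

```python
import socket

def connect_local(port, timeout=15.0):
    """Connect to a local ephemeral server, trying IPv6 and IPv4 results
    of getaddrinfo in order -- a sketch of pyspark's connect loop."""
    last_err = None
    # AF_UNSPEC asks for both IPv4 and IPv6 candidates; on most
    # IPv6-ready systems the IPv6 result comes first.
    for res in socket.getaddrinfo("localhost", port, socket.AF_UNSPEC,
                                  socket.SOCK_STREAM):
        af, socktype, proto, _canonname, sockaddr = res
        try:
            sock = socket.socket(af, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock
        except socket.error as e:
            last_err = e  # try the next resolved address
    raise Exception("could not open socket: %s" % last_err)
```

Note the fallthrough matters: if the server listens only on 127.0.0.1 but getaddrinfo returns ::1 first, the IPv6 attempt fails and the IPv4 one succeeds.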
[jira] [Resolved] (SPARK-22303) [SQL] Getting java.sql.SQLException: Unsupported type 101 for BINARY_DOUBLE
[ https://issues.apache.org/jira/browse/SPARK-22303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-22303.
-
Resolution: Won't Fix

> [SQL] Getting java.sql.SQLException: Unsupported type 101 for BINARY_DOUBLE
> -
>
> Key: SPARK-22303
> URL: https://issues.apache.org/jira/browse/SPARK-22303
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Kohki Nishio
> Priority: Minor
>
> When a table contains columns such as BINARY_DOUBLE or BINARY_FLOAT, the
> JDBC connector throws a SQLException:
> {code}
> java.sql.SQLException: Unsupported type 101
>   at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$getCatalystType(JdbcUtils.scala:235)
>   at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$8.apply(JdbcUtils.scala:292)
>   at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$8.apply(JdbcUtils.scala:292)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.getSchema(JdbcUtils.scala:291)
>   at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:64)
>   at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:113)
>   at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:47)
>   at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:306)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146)
> {code}
> These types are Oracle-specific; they are described here:
> https://docs.oracle.com/cd/E11882_01/timesten.112/e21642/types.htm#TTSQL148

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
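Although the ticket was resolved as Won't Fix, Spark's JDBC data source exposes an extension point that covers this case: a custom JdbcDialect can map vendor-specific type codes (such as Oracle's 100/BINARY_FLOAT and 101/BINARY_DOUBLE) to Catalyst types. The following is a hedged sketch, not anything from the ticket itself; the dialect object name and the assumption that Oracle reports these exact type codes are mine, but the `JdbcDialect`/`JdbcDialects.registerDialect` API is the real Spark extension point.

```scala
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}
import org.apache.spark.sql.types._

// Hypothetical dialect: map Oracle's BINARY_FLOAT (type code 100) and
// BINARY_DOUBLE (type code 101) to Catalyst types so that schema
// resolution in JdbcUtils.getCatalystType does not fail.
object OracleBinaryDialect extends JdbcDialect {
  override def canHandle(url: String): Boolean =
    url.startsWith("jdbc:oracle")

  override def getCatalystType(
      sqlType: Int,
      typeName: String,
      size: Int,
      md: MetadataBuilder): Option[DataType] =
    sqlType match {
      case 100 => Some(FloatType)   // BINARY_FLOAT
      case 101 => Some(DoubleType)  // BINARY_DOUBLE
      case _   => None              // defer to the built-in mapping
    }
}

// Register before calling spark.read.jdbc(...) against the Oracle table:
JdbcDialects.registerDialect(OracleBinaryDialect)
```

Registered dialects are consulted before Spark's default type mapping, so only the unsupported codes need to be handled; returning None for everything else keeps the stock behavior.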
[jira] [Updated] (SPARK-22327) R CRAN check fails on non-latest branches
[ https://issues.apache.org/jira/browse/SPARK-22327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Felix Cheung updated SPARK-22327:
-
Description:
with warning

* checking CRAN incoming feasibility ... WARNING
Maintainer: 'Shivaram Venkataraman '
Insufficient package version (submitted: 2.0.3, existing: 2.1.2)
Unknown, possibly mis-spelled, fields in DESCRIPTION:
'RoxygenNote'

WARNING: There was 1 warning.
NOTE: There were 2 notes.

We have seen this in branch-1.6, branch-2.0, and this would be a problem for branch-2.1 after we ship 2.2.1.

The root cause of the issue is the package version check in the CRAN check: once SparkR 2.1.2 was first published, any older version fails the version check. As far as we know, there is no way to skip this check.

Also, there was previously a NOTE on the new maintainer.

was:
with warning

* checking CRAN incoming feasibility ... WARNING
Maintainer: 'Shivaram Venkataraman '
Insufficient package version (submitted: 2.0.3, existing: 2.1.2)
Unknown, possibly mis-spelled, fields in DESCRIPTION:
'RoxygenNote'

We have seen this in branch-1.6, branch-2.0, and this would be a problem for branch-2.1 after we ship 2.2.1.

The root cause of the issue is the package version check in the CRAN check: once SparkR 2.1.2 was first published, any older version fails the version check. As far as we know, there is no way to skip this check.

Also, there was previously a NOTE on the new maintainer.

> R CRAN check fails on non-latest branches
> -
>
> Key: SPARK-22327
> URL: https://issues.apache.org/jira/browse/SPARK-22327
> Project: Spark
> Issue Type: Bug
> Components: SparkR
> Affects Versions: 1.6.4, 2.0.3, 2.1.3, 2.2.1, 2.3.0
> Reporter: Felix Cheung
>
> with warning
>
> * checking CRAN incoming feasibility ... WARNING
> Maintainer: 'Shivaram Venkataraman '
> Insufficient package version (submitted: 2.0.3, existing: 2.1.2)
> Unknown, possibly mis-spelled, fields in DESCRIPTION:
> 'RoxygenNote'
>
> WARNING: There was 1 warning.
> NOTE: There were 2 notes.
>
> We have seen this in branch-1.6, branch-2.0, and this would be a problem for branch-2.1 after we ship 2.2.1.
>
> The root cause of the issue is the package version check in the CRAN check: once SparkR 2.1.2 was first published, any older version fails the version check. As far as we know, there is no way to skip this check.
>
> Also, there was previously a NOTE on the new maintainer.
[jira] [Comment Edited] (SPARK-22327) R CRAN check fails on non-latest branches
[ https://issues.apache.org/jira/browse/SPARK-22327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213798#comment-16213798 ]

Felix Cheung edited comment on SPARK-22327 at 10/21/17 7:45 AM:
-
In contrast, this is from master:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82919/consoleFull

* checking CRAN incoming feasibility ... NOTE
Maintainer: 'Shivaram Venkataraman '
Unknown, possibly mis-spelled, fields in DESCRIPTION:
'RoxygenNote'

* checking package dependencies ... NOTE
No repository set, so cyclic dependency check skipped

* checking R code for possible problems ... NOTE
Found the following calls to attach():
File 'SparkR/R/DataFrame.R':
attach(newEnv, pos = pos, name = name, warn.conflicts = warn.conflicts)
See section 'Good practice' in '?attach'.

NOTE: There were 3 notes.

So it should have 3 notes, or (2 notes + 1 warning), as the one NOTE turns into a WARNING for the insufficient package version.
[jira] [Updated] (SPARK-22327) R CRAN check fails on non-latest branches
[ https://issues.apache.org/jira/browse/SPARK-22327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Felix Cheung updated SPARK-22327:
-
Description:
with warning

* checking CRAN incoming feasibility ... WARNING
Maintainer: 'Shivaram Venkataraman '
Insufficient package version (submitted: 2.0.3, existing: 2.1.2)
Unknown, possibly mis-spelled, fields in DESCRIPTION:
'RoxygenNote'

We have seen this in branch-1.6, branch-2.0, and this would be a problem for branch-2.1 after we ship 2.2.1.

The root cause of the issue is the package version check in the CRAN check: once SparkR 2.1.2 was first published, any older version fails the version check. As far as we know, there is no way to skip this check.

Also, there was previously a NOTE on the new maintainer.

was:
with warning

Insufficient package version (submitted: 2.0.3, existing: 2.1.2)

We have seen this in branch-1.6, branch-2.0, and this would be a problem for branch-2.1 after we ship 2.2.1.

The root cause of the issue is the package version check in the CRAN check: once SparkR 2.1.2 was first published, any older version fails the version check. As far as we know, there is no way to skip this check.

Also, there was previously a NOTE on the new maintainer.
[jira] [Updated] (SPARK-22327) R CRAN check fails on non-latest branches
[ https://issues.apache.org/jira/browse/SPARK-22327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Felix Cheung updated SPARK-22327:
-
Description:
with warning

Insufficient package version (submitted: 2.0.3, existing: 2.1.2)

We have seen this in branch-1.6, branch-2.0, and this would be a problem for branch-2.1 after we ship 2.2.1.

The root cause of the issue is the package version check in the CRAN check: once SparkR 2.1.2 was first published, any older version fails the version check. As far as we know, there is no way to skip this check.

Also, there was previously a NOTE on the new maintainer.

was:
with error

Insufficient package version (submitted: 2.0.3, existing: 2.1.2)

We have seen this in branch-1.6, branch-2.0, and this would be a problem for branch-2.1 after we ship 2.2.1
[jira] [Commented] (SPARK-22327) R CRAN check fails on non-latest branches
[ https://issues.apache.org/jira/browse/SPARK-22327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213781#comment-16213781 ]

Felix Cheung commented on SPARK-22327:
-
https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3956/consoleFull
[jira] [Updated] (SPARK-22327) R CRAN check fails on non-latest branches
[ https://issues.apache.org/jira/browse/SPARK-22327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Felix Cheung updated SPARK-22327:
-
Affects Version/s: 2.3.0
[jira] [Updated] (SPARK-22327) R CRAN check fails on non-latest branches
[ https://issues.apache.org/jira/browse/SPARK-22327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Felix Cheung updated SPARK-22327:
-
Affects Version/s: 2.2.1
[jira] [Created] (SPARK-22327) R CRAN check fails on non-latest branches
Felix Cheung created SPARK-22327:
-
Summary: R CRAN check fails on non-latest branches
Key: SPARK-22327
URL: https://issues.apache.org/jira/browse/SPARK-22327
Project: Spark
Issue Type: Bug
Components: SparkR
Affects Versions: 1.6.4, 2.0.3, 2.1.3
Reporter: Felix Cheung

with error

Insufficient package version (submitted: 2.0.3, existing: 2.1.2)

We have seen this in branch-1.6, branch-2.0, and this would be a problem for branch-2.1 after we ship 2.2.1
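The failure mode described in this thread can be illustrated with a small sketch. This is not CRAN's actual implementation, only a simplified model of the incoming feasibility rule the ticket describes: a submission is rejected unless its version strictly exceeds the version already published, so once SparkR 2.1.2 exists, a 2.0.3 maintenance release can never pass.

```scala
// Simplified model (assumption: versions compare component-wise,
// lexicographically) of CRAN's "Insufficient package version" rule.
def parseVersion(v: String): Seq[Int] =
  v.split("\\.").toSeq.map(_.toInt)

def isAccepted(submitted: String, existing: String): Boolean = {
  import scala.math.Ordering.Implicits._
  // The submitted version must be strictly greater than the published one.
  parseVersion(submitted) > parseVersion(existing)
}

// isAccepted("2.0.3", "2.1.2")  -> false: "Insufficient package version"
// isAccepted("2.1.3", "2.1.2")  -> true: a newer patch on the latest line passes
```

This is why only the latest release line can publish to CRAN, and why maintenance releases on older branches (1.6.x, 2.0.x) fail the check with no known way to opt out.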
[jira] [Commented] (SPARK-20331) Broaden support for Hive partition pruning predicate pushdown
[ https://issues.apache.org/jira/browse/SPARK-20331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213775#comment-16213775 ]

Apache Spark commented on SPARK-20331:
-
User 'gatorsmile' has created a pull request for this issue:
https://github.com/apache/spark/pull/19547

> Broaden support for Hive partition pruning predicate pushdown
> -
>
> Key: SPARK-20331
> URL: https://issues.apache.org/jira/browse/SPARK-20331
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.1.0
> Reporter: Michael Allman
> Assignee: Michael Allman
> Fix For: 2.3.0
>
> Spark 2.1 introduced scalable support for Hive tables with huge numbers of partitions. Key to leveraging this support is the ability to prune unnecessary table partitions to answer queries. Spark supports a subset of the class of partition pruning predicates that the Hive metastore supports. If a user writes a query with a partition pruning predicate that is *not* supported by Spark, Spark falls back to loading all partitions and pruning client-side. We want to broaden Spark's current partition pruning predicate pushdown capabilities.
>
> One of the key missing capabilities is support for disjunctions. For example, for a table partitioned by date, specifying a predicate like
> {code}date = 20161011 or date = 20161014{code}
> will result in Spark fetching all partitions. For a table partitioned by date and hour, querying a range of hours across dates can be quite difficult to accomplish without fetching all partition metadata.
>
> The current partition pruning support handles only comparisons against literals. We can expand that to foldable expressions by evaluating them at planning time.
>
> We can also implement support for the "IN" comparison by expanding it to a sequence of "OR"s.
>
> This ticket covers those enhancements.
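The "IN expanded to a sequence of ORs" idea from SPARK-20331 can be sketched as a plain string rewrite. This is an illustrative simplification (the helper name and string-based representation are mine, not the actual Catalyst implementation, which operates on expression trees): an IN list becomes a disjunction of equality comparisons, which the Hive metastore's partition filter language can already push down.

```scala
// Hypothetical sketch: rewrite `col IN (v1, v2, ...)` as
// `col = v1 or col = v2 or ...` so the metastore can prune partitions.
def expandIn(column: String, values: Seq[String]): String =
  values.map(v => s"$column = $v").mkString(" or ")

// expandIn("date", Seq("20161011", "20161014"))
//   == "date = 20161011 or date = 20161014"
```

The same shape applies to the foldable-expression enhancement: constant sub-expressions are evaluated at planning time into literals, after which they fit the existing literal-comparison pushdown path.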