[jira] [Commented] (SPARK-45424) Regression in CSV schema inference when timestamps do not match specified timestampFormat
[ https://issues.apache.org/jira/browse/SPARK-45424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17772365#comment-17772365 ] Andy Grove commented on SPARK-45424: The regression seems to have been introduced in https://issues.apache.org/jira/browse/SPARK-39280 and/or https://issues.apache.org/jira/browse/SPARK-39281 Commits: [https://github.com/apache/spark/commit/b1c0d599ba32a4562ae1697e3f488264f1d03c76] [https://github.com/apache/spark/commit/3192bbd29585607d43d0819c6c2d3ac00180261a] [~fanjia] Do you understand why this behavior has changed? > Regression in CSV schema inference when timestamps do not match specified > timestampFormat > - > > Key: SPARK-45424 > URL: https://issues.apache.org/jira/browse/SPARK-45424 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.5.0 >Reporter: Andy Grove >Priority: Major > > There is a regression in Spark 3.5.0 when inferring the schema of CSV files > containing timestamps, where a column will be inferred as a timestamp even if > the contents do not match the specified timestampFormat. 
> *Test Data* > I have the following CSV file: > {code:java} > 2884-06-24T02:45:51.138 > 2884-06-24T02:45:51.138 > 2884-06-24T02:45:51.138 > {code} > *Spark 3.4.0 Behavior (correct)* > In Spark 3.4.0, if I specify the correct timestamp format, then the schema is > inferred as timestamp: > {code:java} > scala> val df = spark.read.option("timestampFormat", > "yyyy-MM-dd'T'HH:mm:ss.SSS").option("inferSchema", > true).csv("/tmp/timestamps.csv") > df: org.apache.spark.sql.DataFrame = [_c0: timestamp] > {code} > If I specify an incompatible timestampFormat, then the schema is inferred as > string: > {code:java} > scala> val df = spark.read.option("timestampFormat", > "yyyy-MM-dd'T'HH:mm:ss").option("inferSchema", > true).csv("/tmp/timestamps.csv") > df: org.apache.spark.sql.DataFrame = [_c0: string] > {code} > *Spark 3.5.0* > In Spark 3.5.0, the column will be inferred as timestamp even if the data > does not match the specified timestampFormat. > {code:java} > scala> val df = spark.read.option("timestampFormat", > "yyyy-MM-dd'T'HH:mm:ss").option("inferSchema", > true).csv("/tmp/timestamps.csv") > df: org.apache.spark.sql.DataFrame = [_c0: timestamp] > {code} > Reading the DataFrame then results in an error: > {code:java} > Caused by: java.time.format.DateTimeParseException: Text > '2884-06-24T02:45:51.138' could not be parsed, unparsed text found at index 19 > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
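The parse failure above can be reproduced outside Spark with plain java.time. The sketch below (class name made up for illustration) shows why a pattern without a fractional-seconds field rejects these values: the first 19 characters ("2884-06-24T02:45:51") match the pattern, but ".138" is left unconsumed, so parsing fails at index 19 with exactly the message reported in the issue.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class TimestampFormatDemo {
    public static void main(String[] args) {
        // Pattern without a fractional-seconds field, like the incompatible
        // timestampFormat option in the report above.
        DateTimeFormatter noMillis = DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss");
        try {
            LocalDateTime.parse("2884-06-24T02:45:51.138", noMillis);
        } catch (DateTimeParseException e) {
            // ".138" is left over after the pattern is satisfied,
            // so parsing fails with "unparsed text found at index 19".
            System.out.println(e.getMessage());
        }

        // A pattern that includes .SSS consumes the whole string and succeeds.
        DateTimeFormatter withMillis = DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS");
        System.out.println(LocalDateTime.parse("2884-06-24T02:45:51.138", withMillis));
    }
}
```

This only demonstrates the underlying java.time behavior; the regression itself is that Spark 3.5.0's schema inference no longer performs this strict check before choosing a timestamp type.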
[jira] [Updated] (SPARK-45424) Regression in CSV schema inference when timestamps do not match specified timestampFormat
[ https://issues.apache.org/jira/browse/SPARK-45424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-45424: --- Summary: Regression in CSV schema inference when timestamps do not match specified timestampFormat (was: Regression in CSV schema inference when timestampFormat is specified) > Regression in CSV schema inference when timestamps do not match specified > timestampFormat > - > > Key: SPARK-45424 > URL: https://issues.apache.org/jira/browse/SPARK-45424 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.5.0 >Reporter: Andy Grove >Priority: Major > > There is a regression in Spark 3.5.0 when inferring the schema of CSV files > containing timestamps, where a column will be inferred as a timestamp even if > the contents do not match the specified timestampFormat. > *Test Data* > I have the following CSV file: > {code:java} > 2884-06-24T02:45:51.138 > 2884-06-24T02:45:51.138 > 2884-06-24T02:45:51.138 > {code} > *Spark 3.4.0 Behavior (correct)* > In Spark 3.4.0, if I specify the correct timestamp format, then the schema is > inferred as timestamp: > {code:java} > scala> val df = spark.read.option("timestampFormat", > "yyyy-MM-dd'T'HH:mm:ss.SSS").option("inferSchema", > true).csv("/tmp/timestamps.csv") > df: org.apache.spark.sql.DataFrame = [_c0: timestamp] > {code} > If I specify an incompatible timestampFormat, then the schema is inferred as > string: > {code:java} > scala> val df = spark.read.option("timestampFormat", > "yyyy-MM-dd'T'HH:mm:ss").option("inferSchema", > true).csv("/tmp/timestamps.csv") > df: org.apache.spark.sql.DataFrame = [_c0: string] > {code} > *Spark 3.5.0* > In Spark 3.5.0, the column will be inferred as timestamp even if the data > does not match the specified timestampFormat. 
> {code:java} > scala> val df = spark.read.option("timestampFormat", > "yyyy-MM-dd'T'HH:mm:ss").option("inferSchema", > true).csv("/tmp/timestamps.csv") > df: org.apache.spark.sql.DataFrame = [_c0: timestamp] > {code} > Reading the DataFrame then results in an error: > {code:java} > Caused by: java.time.format.DateTimeParseException: Text > '2884-06-24T02:45:51.138' could not be parsed, unparsed text found at index 19 > {code}
[jira] [Updated] (SPARK-45424) Regression in CSV schema inference when timestampFormat is specified
[ https://issues.apache.org/jira/browse/SPARK-45424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-45424: --- Description: There is a regression in Spark 3.5.0 when inferring the schema of CSV files containing timestamps, where a column will be inferred as a timestamp even if the contents do not match the specified timestampFormat. *Test Data* I have the following CSV file: {code:java} 2884-06-24T02:45:51.138 2884-06-24T02:45:51.138 2884-06-24T02:45:51.138 {code} *Spark 3.4.0 Behavior (correct)* In Spark 3.4.0, if I specify the correct timestamp format, then the schema is inferred as timestamp: {code:java} scala> val df = spark.read.option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss.SSS").option("inferSchema", true).csv("/tmp/timestamps.csv") df: org.apache.spark.sql.DataFrame = [_c0: timestamp] {code} If I specify an incompatible timestampFormat, then the schema is inferred as string: {code:java} scala> val df = spark.read.option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss").option("inferSchema", true).csv("/tmp/timestamps.csv") df: org.apache.spark.sql.DataFrame = [_c0: string] {code} *Spark 3.5.0* In Spark 3.5.0, the column will be inferred as timestamp even if the data does not match the specified timestampFormat. {code:java} scala> val df = spark.read.option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss").option("inferSchema", true).csv("/tmp/timestamps.csv") df: org.apache.spark.sql.DataFrame = [_c0: timestamp] {code} Reading the DataFrame then results in an error: {code:java} Caused by: java.time.format.DateTimeParseException: Text '2884-06-24T02:45:51.138' could not be parsed, unparsed text found at index 19 {code} was: There is a regression in Spark 3.5.0 when inferring the schema of files containing timestamps, where a column will be inferred as a timestamp even if the contents do not match the specified timestampFormat. 
*Test Data* I have the following csv file: {code:java} 2884-06-24T02:45:51.138 2884-06-24T02:45:51.138 2884-06-24T02:45:51.138 {code} *Spark 3.4.0 Behavior (correct)* In Spark 3.4.0, if I specify the correct timestamp format, then the schema is inferred as timestamp: {code:java} scala> val df = spark.read.option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss.SSS").option("inferSchema", true).csv("/tmp/timestamps.csv") df: org.apache.spark.sql.DataFrame = [_c0: timestamp] {code} If I specify an incompatible timestampFormat, then the schema is inferred as string: {code:java} scala> val df = spark.read.option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss").option("inferSchema", true).csv("/tmp/timestamps.csv") df: org.apache.spark.sql.DataFrame = [_c0: string] {code} *Spark 3.5.0* In Spark 3.5.0, the column will be inferred as timestamp even if the data does not match the specified timestampFormat. {code:java} scala> val df = spark.read.option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss").option("inferSchema", true).csv("/tmp/timestamps.csv") df: org.apache.spark.sql.DataFrame = [_c0: timestamp] {code} Reading the DataFrame then results in an error: {code:java} Caused by: java.time.format.DateTimeParseException: Text '2884-06-24T02:45:51.138' could not be parsed, unparsed text found at index 19 {code} > Regression in CSV schema inference when timestampFormat is specified > > > Key: SPARK-45424 > URL: https://issues.apache.org/jira/browse/SPARK-45424 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.5.0 >Reporter: Andy Grove >Priority: Major > > There is a regression in Spark 3.5.0 when inferring the schema of CSV files > containing timestamps, where a column will be inferred as a timestamp even if > the contents do not match the specified timestampFormat. 
> *Test Data* > I have the following CSV file: > {code:java} > 2884-06-24T02:45:51.138 > 2884-06-24T02:45:51.138 > 2884-06-24T02:45:51.138 > {code} > *Spark 3.4.0 Behavior (correct)* > In Spark 3.4.0, if I specify the correct timestamp format, then the schema is > inferred as timestamp: > {code:java} > scala> val df = spark.read.option("timestampFormat", > "yyyy-MM-dd'T'HH:mm:ss.SSS").option("inferSchema", > true).csv("/tmp/timestamps.csv") > df: org.apache.spark.sql.DataFrame = [_c0: timestamp] > {code} > If I specify an incompatible timestampFormat, then the schema is inferred as > string: > {code:java} > scala> val df = spark.read.option("timestampFormat", > "yyyy-MM-dd'T'HH:mm:ss").option("inferSchema", > true).csv("/tmp/timestamps.csv") > df: org.apache.spark.sql.DataFrame = [_c0: string] > {code} > *Spark 3.5.0* > In Spark 3.5.0, the column will be inferred as timestamp even if the data > does not match the specified timestampFormat. > {code:java} > scala> val df = spark.read.option("timestampFormat", > "yyyy-MM-dd'T'HH:mm:ss").option(
[jira] [Created] (SPARK-45424) Regression in CSV schema inference when timestampFormat is specified
Andy Grove created SPARK-45424: -- Summary: Regression in CSV schema inference when timestampFormat is specified Key: SPARK-45424 URL: https://issues.apache.org/jira/browse/SPARK-45424 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.5.0 Reporter: Andy Grove There is a regression in Spark 3.5.0 when inferring the schema of files containing timestamps, where a column will be inferred as a timestamp even if the contents do not match the specified timestampFormat. *Test Data* I have the following csv file: {code:java} 2884-06-24T02:45:51.138 2884-06-24T02:45:51.138 2884-06-24T02:45:51.138 {code} *Spark 3.4.0 Behavior (correct)* In Spark 3.4.0, if I specify the correct timestamp format, then the schema is inferred as timestamp: {code:java} scala> val df = spark.read.option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss.SSS").option("inferSchema", true).csv("/tmp/timestamps.csv") df: org.apache.spark.sql.DataFrame = [_c0: timestamp] {code} If I specify an incompatible timestampFormat, then the schema is inferred as string: {code:java} scala> val df = spark.read.option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss").option("inferSchema", true).csv("/tmp/timestamps.csv") df: org.apache.spark.sql.DataFrame = [_c0: string] {code} *Spark 3.5.0* In Spark 3.5.0, the column will be inferred as timestamp even if the data does not match the specified timestampFormat. {code:java} scala> val df = spark.read.option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss").option("inferSchema", true).csv("/tmp/timestamps.csv") df: org.apache.spark.sql.DataFrame = [_c0: timestamp] {code} Reading the DataFrame then results in an error: {code:java} Caused by: java.time.format.DateTimeParseException: Text '2884-06-24T02:45:51.138' could not be parsed, unparsed text found at index 19 {code}
[jira] [Updated] (SPARK-44616) Hive Generic UDF support no longer supports short-circuiting of argument evaluation
[ https://issues.apache.org/jira/browse/SPARK-44616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-44616: --- Description: PR [https://github.com/apache/spark/pull/39555] changed DeferredObject to no longer contain a function, and instead contains a value. This removes the deferred evaluation capability and means that HiveGenericUDF implementations can no longer short-circuit the evaluation of their arguments, which could be a performance issue for some users. Here is a relevant javadoc comment from the Hive source for DeferredObject: {code:java} /** * A Defered Object allows us to do lazy-evaluation and short-circuiting. * GenericUDF use DeferedObject to pass arguments. */ public static interface DeferredObject { {code} was: PR https://github.com/apache/spark/pull/39555 changed DeferredObject to no longer contain a function, and instead contains a value. This removes the deferred evaluation capability and means that HiveGenericUDF implementations can no longer short-circuit the evaluation of their arguments, which could be a performance issue for some users. Here is a relevant javadoc comment from the Hive source for DeferredObject: {{{ /** * A Defered Object allows us to do lazy-evaluation and short-circuiting. * GenericUDF use DeferedObject to pass arguments. */ public static interface DeferredObject { }}} > Hive Generic UDF support no longer supports short-circuiting of argument > evaluation > --- > > Key: SPARK-44616 > URL: https://issues.apache.org/jira/browse/SPARK-44616 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.5.0 >Reporter: Andy Grove >Priority: Major > > PR [https://github.com/apache/spark/pull/39555] changed DeferredObject to no > longer contain a function, and instead contains a value. 
This removes the > deferred evaluation capability and means that HiveGenericUDF implementations > can no longer short-circuit the evaluation of their arguments, which could be > a performance issue for some users. > Here is a relevant javadoc comment from the Hive source for DeferredObject: > {code:java} > /** >* A Defered Object allows us to do lazy-evaluation and short-circuiting. >* GenericUDF use DeferedObject to pass arguments. >*/ > public static interface DeferredObject { > {code}
[jira] [Updated] (SPARK-44616) Hive Generic UDF support no longer supports short-circuiting of argument evaluation
[ https://issues.apache.org/jira/browse/SPARK-44616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-44616: --- Description: PR https://github.com/apache/spark/pull/39555 changed DeferredObject to no longer contain a function, and instead contains a value. This removes the deferred evaluation capability and means that HiveGenericUDF implementations can no longer short-circuit the evaluation of their arguments, which could be a performance issue for some users. Here is a relevant javadoc comment from the Hive source for DeferredObject: {{{ /** * A Defered Object allows us to do lazy-evaluation and short-circuiting. * GenericUDF use DeferedObject to pass arguments. */ public static interface DeferredObject { }}} was:PR https://github.com/apache/spark/pull/39555 changed DeferredObject to no longer contain a function, and instead contains a value. This removes the deferred evaluation capability and means that HiveGenericUDF implementations can no longer short-circuit the evaluation of their arguments, which could be a performance issue for some users. > Hive Generic UDF support no longer supports short-circuiting of argument > evaluation > --- > > Key: SPARK-44616 > URL: https://issues.apache.org/jira/browse/SPARK-44616 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.5.0 >Reporter: Andy Grove >Priority: Major > > PR https://github.com/apache/spark/pull/39555 changed DeferredObject to no > longer contain a function, and instead contains a value. This removes the > deferred evaluation capability and means that HiveGenericUDF implementations > can no longer short-circuit the evaluation of their arguments, which could be > a performance issue for some users. > Here is a relevant javadoc comment from the Hive source for DeferredObject: > {{{ > /** >* A Defered Object allows us to do lazy-evaluation and short-circuiting. >* GenericUDF use DeferedObject to pass arguments. 
>*/ > public static interface DeferredObject { > }}}
[jira] [Created] (SPARK-44616) Hive Generic UDF support no longer supports short-circuiting of argument evaluation
Andy Grove created SPARK-44616: -- Summary: Hive Generic UDF support no longer supports short-circuiting of argument evaluation Key: SPARK-44616 URL: https://issues.apache.org/jira/browse/SPARK-44616 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.5.0 Reporter: Andy Grove PR https://github.com/apache/spark/pull/39555 changed DeferredObject to no longer contain a function, and instead contains a value. This removes the deferred evaluation capability and means that HiveGenericUDF implementations can no longer short-circuit the evaluation of their arguments, which could be a performance issue for some users.
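The lazy-evaluation contract described above can be illustrated with plain Java Suppliers. This is an analogy, not Hive's actual API: the class and method names below are hypothetical stand-ins showing why a deferred (function-valued) argument enables short-circuiting while an eager (plain value) argument does not.

```java
import java.util.function.Supplier;

public class ShortCircuitDemo {
    // Hypothetical stand-in for a UDF like IF(cond, a, b): with deferred
    // arguments, only the branch that is actually selected gets evaluated.
    static int lazyIf(boolean cond, Supplier<Integer> a, Supplier<Integer> b) {
        return cond ? a.get() : b.get();
    }

    static int expensive() {
        throw new IllegalStateException("should never be evaluated");
    }

    public static void main(String[] args) {
        // The expensive branch is never invoked, so no exception is thrown.
        System.out.println(lazyIf(true, () -> 42, ShortCircuitDemo::expensive));
        // With eager arguments (a plain value, as in the changed DeferredObject),
        // both branches would have to be computed before the call even starts:
        //   lazyIf(true, wrap(42), wrap(expensive()));  // expensive() runs first
    }
}
```

The PR referenced above replaces the Supplier-like indirection with a precomputed value, which is what removes the ability to skip the unselected branch.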
[jira] [Updated] (SPARK-39991) AQE should use available column statistics from completed query stages
[ https://issues.apache.org/jira/browse/SPARK-39991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-39991: --- Description: In QueryStageExec.computeStats we copy partial statistics from materialized query stages by calling QueryStageExec#getRuntimeStatistics, which in turn calls ShuffleExchangeLike#runtimeStatistics or BroadcastExchangeLike#runtimeStatistics. Only dataSize and numOutputRows are copied into the new Statistics object: {code:scala} def computeStats(): Option[Statistics] = if (isMaterialized) { val runtimeStats = getRuntimeStatistics val dataSize = runtimeStats.sizeInBytes.max(0) val numOutputRows = runtimeStats.rowCount.map(_.max(0)) Some(Statistics(dataSize, numOutputRows, isRuntime = true)) } else { None } {code} I would like to also copy over the column statistics stored in Statistics.attributeMap so that they can be fed back into the logical plan optimization phase. This is a small change as shown below: {code:scala} def computeStats(): Option[Statistics] = if (isMaterialized) { val runtimeStats = getRuntimeStatistics val dataSize = runtimeStats.sizeInBytes.max(0) val numOutputRows = runtimeStats.rowCount.map(_.max(0)) val attributeStats = runtimeStats.attributeStats Some(Statistics(dataSize, numOutputRows, attributeStats, isRuntime = true)) } else { None } {code} The Spark implementations of ShuffleExchangeLike and BroadcastExchangeLike do not currently provide such column statistics, but other custom implementations can. was: In QueryStageExec.computeStats we copy partial statistics from materialized query stages by calling QueryStageExec#getRuntimeStatistics, which in turn calls ShuffleExchangeLike#runtimeStatistics or BroadcastExchangeLike#runtimeStatistics. 
Only dataSize and numOutputRows are copied into the new Statistics object: {code:scala} def computeStats(): Option[Statistics] = if (isMaterialized) { val runtimeStats = getRuntimeStatistics val dataSize = runtimeStats.sizeInBytes.max(0) val numOutputRows = runtimeStats.rowCount.map(_.max(0)) Some(Statistics(dataSize, numOutputRows, isRuntime = true)) } else { None } {code} I would like to also copy over the column statistics stored in Statistics.attributeMap so that they can be fed back into the logical plan optimization phase. This is a small change as shown below: {code:scala} def computeStats(): Option[Statistics] = if (isMaterialized) { val runtimeStats = getRuntimeStatistics val dataSize = runtimeStats.sizeInBytes.max(0) val numOutputRows = runtimeStats.rowCount.map(_.max(0)) val attributeStats = runtimeStats.attributeStats Some(Statistics(dataSize, numOutputRows, attributeStats, isRuntime = true)) } else { None } {code} The Spark implementations of ShuffleExchangeLike and BroadcastExchangeLike do not currently provide such column statistics, but other custom implementations can. > AQE should use available column statistics from completed query stages > -- > > Key: SPARK-39991 > URL: https://issues.apache.org/jira/browse/SPARK-39991 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Andy Grove >Priority: Major > > In QueryStageExec.computeStats we copy partial statistics from materialized > query stages by calling QueryStageExec#getRuntimeStatistics, which in turn > calls ShuffleExchangeLike#runtimeStatistics or > BroadcastExchangeLike#runtimeStatistics. 
> Only dataSize and numOutputRows are copied into the new Statistics object: > {code:scala} > def computeStats(): Option[Statistics] = if (isMaterialized) { > val runtimeStats = getRuntimeStatistics > val dataSize = runtimeStats.sizeInBytes.max(0) > val numOutputRows = runtimeStats.rowCount.map(_.max(0)) > Some(Statistics(dataSize, numOutputRows, isRuntime = true)) > } else { > None > } > {code} > I would like to also copy over the column statistics stored in > Statistics.attributeMap so that they can be fed back into the logical plan > optimization phase. This is a small change as shown below: > {code:scala} > def computeStats(): Option[Statistics] = if (isMaterialized) { > val runtimeStats = getRuntimeStatistics > val dataSize = runtimeStats.sizeInBytes.max(0) > val numOutputRows = runtimeStats.rowCount.map(_.max(0)) > val attributeStats = runtimeStats.attributeStats > Some(Statistics(dataSize, numOutputRows, attributeStats, isRuntime = > true)) > } else { > None > } > {code} > The Spark implementations of ShuffleExchangeLike and BroadcastExchangeLike do > not currently provide such column statistics, but other custom > implementations can.
[jira] [Created] (SPARK-39991) AQE should use available column statistics from completed query stages
Andy Grove created SPARK-39991: -- Summary: AQE should use available column statistics from completed query stages Key: SPARK-39991 URL: https://issues.apache.org/jira/browse/SPARK-39991 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.3.0 Reporter: Andy Grove In QueryStageExec.computeStats we copy partial statistics from materialized query stages by calling QueryStageExec#getRuntimeStatistics, which in turn calls ShuffleExchangeLike#runtimeStatistics or BroadcastExchangeLike#runtimeStatistics. Only dataSize and numOutputRows are copied into the new Statistics object: {code:scala} def computeStats(): Option[Statistics] = if (isMaterialized) { val runtimeStats = getRuntimeStatistics val dataSize = runtimeStats.sizeInBytes.max(0) val numOutputRows = runtimeStats.rowCount.map(_.max(0)) Some(Statistics(dataSize, numOutputRows, isRuntime = true)) } else { None } {code} I would like to also copy over the column statistics stored in Statistics.attributeMap so that they can be fed back into the logical plan optimization phase. This is a small change as shown below: {code:scala} def computeStats(): Option[Statistics] = if (isMaterialized) { val runtimeStats = getRuntimeStatistics val dataSize = runtimeStats.sizeInBytes.max(0) val numOutputRows = runtimeStats.rowCount.map(_.max(0)) val attributeStats = runtimeStats.attributeStats Some(Statistics(dataSize, numOutputRows, attributeStats, isRuntime = true)) } else { None } {code} The Spark implementations of ShuffleExchangeLike and BroadcastExchangeLike do not currently provide such column statistics, but other custom implementations can.
[jira] [Commented] (SPARK-30788) Support `SimpleDateFormat` and `FastDateFormat` as legacy date/timestamp formatters
[ https://issues.apache.org/jira/browse/SPARK-30788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17503123#comment-17503123 ] Andy Grove commented on SPARK-30788: [~cloud_fan] [~maxgekk] I think the fix version on this issue should be 3.3.0 and not 3.0.0 ? > Support `SimpleDateFormat` and `FastDateFormat` as legacy date/timestamp > formatters > --- > > Key: SPARK-30788 > URL: https://issues.apache.org/jira/browse/SPARK-30788 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.0.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > Fix For: 3.0.0 > > > To be absolutely sure that Spark 3.0 is compatible with 2.4 when > spark.sql.legacy.timeParser.enabled is set to true, need to support > SimpleDateFormat and FastDateFormat as legacy parsers/formatters in > TimestampFormatter. > Spark 2.4.x uses the following parsers for parsing/formatting date/timestamp > strings: > # DateTimeFormat in CSV/JSON datasource > # SimpleDateFormat - is used in JDBC datasource, in partitions parsing. > # SimpleDateFormat in strong mode (lenient = false). It is used by the > date_format, from_unixtime, unix_timestamp and to_unix_timestamp functions. > Spark 3.0 should use the same parsers in those cases.
[jira] [Updated] (SPARK-38060) Inconsistent behavior from JSON option allowNonNumericNumbers
[ https://issues.apache.org/jira/browse/SPARK-38060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-38060: --- Description: The behavior of the JSON option allowNonNumericNumbers is not consistent: 1. Some NaN and Infinity values are still parsed when the option is set to false 2. Some values are parsed differently depending on whether they are quoted or not (see results for positive and negative Infinity) h2. Input data {code:java} { "number": "NaN" } { "number": NaN } { "number": "+INF" } { "number": +INF } { "number": "-INF" } { "number": -INF } { "number": "INF" } { "number": INF } { "number": Infinity } { "number": +Infinity } { "number": -Infinity } { "number": "Infinity" } { "number": "+Infinity" } { "number": "-Infinity" } {code} h2. Setup {code:java} import org.apache.spark.sql.types._ val schema = StructType(Seq(StructField("number", DataTypes.FloatType, false))) {code} h2. allowNonNumericNumbers = false {code:java} val df = spark.read.format("json").schema(schema).option("allowNonNumericNumbers", "false").json("nan_valid.json") df.show +---------+ | number| +---------+ | NaN| | null| | null| | null| | null| | null| | null| | null| | null| | null| | null| | Infinity| | null| |-Infinity| +---------+ {code} h2. allowNonNumericNumbers = true {code:java} val df = spark.read.format("json").schema(schema).option("allowNonNumericNumbers", "true").json("nan_valid.json") df.show +---------+ | number| +---------+ | NaN| | NaN| | null| | Infinity| | null| |-Infinity| | null| | null| | Infinity| | Infinity| |-Infinity| | Infinity| | null| |-Infinity| +---------+{code} was: The behavior of the JSON option allowNonNumericNumbers is not consistent and still supports parsing NaN and Infinity values in some cases when the option is set to false. h2. 
Input data {code:java} { "number": "NaN" } { "number": NaN } { "number": "+INF" } { "number": +INF } { "number": "-INF" } { "number": -INF } { "number": "INF" } { "number": INF } { "number": Infinity } { "number": +Infinity } { "number": -Infinity } { "number": "Infinity" } { "number": "+Infinity" } { "number": "-Infinity" } {code} h2. Setup {code:java} import org.apache.spark.sql.types._ val schema = StructType(Seq(StructField("number", DataTypes.FloatType, false))) {code} h2. allowNonNumericNumbers = false {code:java} val df = spark.read.format("json").schema(schema).option("allowNonNumericNumbers", "false").json("nan_valid.json") df.show +---------+ | number| +---------+ | NaN| | null| | null| | null| | null| | null| | null| | null| | null| | null| | null| | Infinity| | null| |-Infinity| +---------+ {code} h2. allowNonNumericNumbers = true {code:java} val df = spark.read.format("json").schema(schema).option("allowNonNumericNumbers", "true").json("nan_valid.json") df.show +---------+ | number| +---------+ | NaN| | NaN| | null| | Infinity| | null| |-Infinity| | null| | null| | Infinity| | Infinity| |-Infinity| | Infinity| | null| |-Infinity| +---------+{code} > Inconsistent behavior from JSON option allowNonNumericNumbers > - > > Key: SPARK-38060 > URL: https://issues.apache.org/jira/browse/SPARK-38060 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.2.0 > Environment: Running Spark 3.2.0 in local mode on Ubuntu 20.04.3 LTS >Reporter: Andy Grove >Priority: Minor > > The behavior of the JSON option allowNonNumericNumbers is not consistent: > 1. Some NaN and Infinity values are still parsed when the option is set to > false > 2. Some values are parsed differently depending on whether they are quoted or > not (see results for positive and negative Infinity) > h2. 
Input data > {code:java} > { "number": "NaN" } > { "number": NaN } > { "number": "+INF" } > { "number": +INF } > { "number": "-INF" } > { "number": -INF } > { "number": "INF" } > { "number": INF } > { "number": Infinity } > { "number": +Infinity } > { "number": -Infinity } > { "number": "Infinity" } > { "number": "+Infinity" } > { "number": "-Infinity" } > {code} > h2. Setup > {code:java} > import org.apache.spark.sql.types._ > val schema = StructType(Seq(StructField("number", DataTypes.FloatType, > false))) {code} > h2. allowNonNumericNumbers = false > {code:java} > val df = spark.read.format("json").schema(schema).option("allowNonNumericNumbers", > "false").json("nan_valid.json") > df.show > +---------+ > | number| > +---------+ > | NaN| > | null| > | null| > | null| > | null| > | null| > | null| > | null| > | null| > | null| > | null| > | Infinity| > | null| > |-Infinity| > +---------+ {code} > h2. allowNonNumericNumbers =
[jira] [Created] (SPARK-38060) Inconsistent behavior from JSON option allowNonNumericNumbers
Andy Grove created SPARK-38060: -- Summary: Inconsistent behavior from JSON option allowNonNumericNumbers Key: SPARK-38060 URL: https://issues.apache.org/jira/browse/SPARK-38060 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.2.0 Environment: Running Spark 3.2.0 in local mode on Ubuntu 20.04.3 LTS Reporter: Andy Grove The behavior of the JSON option allowNonNumericNumbers is not consistent and still supports parsing NaN and Infinity values in some cases when the option is set to false. h2. Input data {code:java} { "number": "NaN" } { "number": NaN } { "number": "+INF" } { "number": +INF } { "number": "-INF" } { "number": -INF } { "number": "INF" } { "number": INF } { "number": Infinity } { "number": +Infinity } { "number": -Infinity } { "number": "Infinity" } { "number": "+Infinity" } { "number": "-Infinity" } {code} h2. Setup {code:java} import org.apache.spark.sql.types._ val schema = StructType(Seq(StructField("number", DataTypes.FloatType, false))) {code} h2. allowNonNumericNumbers = false {code:java} val df = spark.read.format("json").schema(schema).option("allowNonNumericNumbers", "false").json("nan_valid.json") df.show +---------+ | number| +---------+ | NaN| | null| | null| | null| | null| | null| | null| | null| | null| | null| | null| | Infinity| | null| |-Infinity| +---------+ {code} h2. allowNonNumericNumbers = true {code:java} val df = spark.read.format("json").schema(schema).option("allowNonNumericNumbers", "true").json("nan_valid.json") df.show +---------+ | number| +---------+ | NaN| | NaN| | null| | Infinity| | null| |-Infinity| | null| | null| | Infinity| | Infinity| |-Infinity| | Infinity| | null| |-Infinity| +---------+{code}
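One plausible contributor to the quoted-versus-unquoted asymmetry, shown here with plain Java rather than Spark's actual parsing path (so treat it as an illustration, not a diagnosis): Java's own Float.parseFloat accepts the spellings "NaN", "Infinity", "+Infinity", and "-Infinity" but rejects the "INF" variants, so any fallback that converts a quoted string to a float would reproduce exactly the pattern seen in the df.show output above.

```java
public class NonNumericDemo {
    public static void main(String[] args) {
        String[] inputs = {"NaN", "Infinity", "+Infinity", "-Infinity", "INF", "+INF", "-INF"};
        for (String s : inputs) {
            try {
                // Float.parseFloat recognizes NaN/Infinity spellings only.
                System.out.println(s + " -> " + Float.parseFloat(s));
            } catch (NumberFormatException e) {
                // The INF spellings are not valid Java float literals.
                System.out.println(s + " -> NumberFormatException");
            }
        }
    }
}
```

Running this prints a float for the NaN/Infinity spellings and NumberFormatException for the three INF variants, matching the rows that came back null only for INF inputs.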
[jira] [Created] (SPARK-36666) [SQL] Regression in AQEShuffleReadExec
Andy Grove created SPARK-36666: -- Summary: [SQL] Regression in AQEShuffleReadExec Key: SPARK-36666 URL: https://issues.apache.org/jira/browse/SPARK-36666 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.2.0 Reporter: Andy Grove I am currently testing the RAPIDS Accelerator for Apache Spark with the Spark 3.2 release candidate and there is a regression in AQEShuffleReadExec where it now throws an exception if the shuffle's output partitioning does not match a specific list of schemes. The problem can be solved by returning UnknownPartitioning, as it does in some cases, rather than throwing an exception. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36666) [SQL] Regression in AQEShuffleReadExec
[ https://issues.apache.org/jira/browse/SPARK-36666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-36666: --- Description: I am currently testing the RAPIDS Accelerator for Apache Spark with the Spark 3.2 release candidate and there is a regression in AQEShuffleReadExec where it now throws an exception if the shuffle's output partitioning does not match a specific list of schemes. The problem can be solved by returning UnknownPartitioning, as it already does in some cases, rather than throwing an exception. was: I am currently testing the RAPIDS Accelerator for Apache Spark with the Spark 3.2 release candidate and there is a regression in AQEShuffleReadExec where it now throws an exception if the shuffle's output partitioning does not match a specific list of schemes. The problem can be solved by returning UnknownPartitioning, as it does in some cases, rather than throwing an exception. > [SQL] Regression in AQEShuffleReadExec > -- > > Key: SPARK-36666 > URL: https://issues.apache.org/jira/browse/SPARK-36666 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.2.0 >Reporter: Andy Grove >Priority: Major > > I am currently testing the RAPIDS Accelerator for Apache Spark with the Spark > 3.2 release candidate and there is a regression in AQEShuffleReadExec where > it now throws an exception if the shuffle's output partitioning does not > match a specific list of schemes. > The problem can be solved by returning UnknownPartitioning, as it already > does in some cases, rather than throwing an exception. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
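The fix the ticket proposes follows a general defensive pattern: when a component meets a partitioning scheme it does not recognize, it should degrade to an explicit "unknown" value rather than throw, so downstream planning can proceed conservatively. A minimal sketch of that pattern in plain Python (scheme names and function are hypothetical, not Spark's AQE code):

```python
# Schemes this hypothetical component knows how to handle.
KNOWN_SCHEMES = {"hash", "range", "single"}

def resolve_partitioning(scheme: str) -> str:
    """Return a recognized scheme, or an explicit 'unknown' fallback.

    Sketches the pre-regression behavior described in the ticket:
    fall back to UnknownPartitioning instead of raising.
    """
    if scheme in KNOWN_SCHEMES:
        return scheme
    # Raising here is the regression; returning a sentinel keeps
    # callers working, just without partitioning guarantees.
    return "unknown"

print(resolve_partitioning("hash"))    # hash
print(resolve_partitioning("custom"))  # unknown
```

The design trade-off: the fallback loses information (callers cannot exploit the custom partitioning) but keeps third-party plans executable, which is what the RAPIDS plugin needs here.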
[jira] [Updated] (SPARK-36076) [SQL] ArrayIndexOutOfBounds in CAST string to timestamp
[ https://issues.apache.org/jira/browse/SPARK-36076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-36076: --- Summary: [SQL] ArrayIndexOutOfBounds in CAST string to timestamp (was: [SQL] ArrayIndexOutOfBounds in CAST string to date) > [SQL] ArrayIndexOutOfBounds in CAST string to timestamp > --- > > Key: SPARK-36076 > URL: https://issues.apache.org/jira/browse/SPARK-36076 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.1 >Reporter: Andy Grove >Priority: Major > > I discovered this bug during some fuzz testing. > {code:java} > __ > / __/__ ___ _/ /__ > _\ \/ _ \/ _ `/ __/ '_/ >/___/ .__/\_,_/_/ /_/\_\ version 3.1.1 > /_/ > > Using Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_282) > Type in expressions to have them evaluated. > Type :help for more information.scala> > scala> import org.apache.spark.sql.types.DataTypes > scala> val df = Seq(":8:434421+ 98:38").toDF("c0") > df: org.apache.spark.sql.DataFrame = [c0: string] > scala> val df2 = df.withColumn("c1", col("c0").cast(DataTypes.TimestampType)) > df2: org.apache.spark.sql.DataFrame = [c0: string, c1: timestamp] > scala> df2.show > java.lang.ArrayIndexOutOfBoundsException: 9 > at > org.apache.spark.sql.catalyst.util.DateTimeUtils$.stringToTimestamp(DateTimeUtils.scala:328) > at > org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToTimestamp$2(Cast.scala:455) > at > org.apache.spark.sql.catalyst.expressions.CastBase.buildCast(Cast.scala:295) > at > org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToTimestamp$1(Cast.scala:451) > at > org.apache.spark.sql.catalyst.expressions.CastBase.nullSafeEval(Cast.scala:840) > at > org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:476) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36076) [SQL] ArrayIndexOutOfBounds in CAST string to date
[ https://issues.apache.org/jira/browse/SPARK-36076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-36076: --- Description: I discovered this bug during some fuzz testing. {code:java} __ / __/__ ___ _/ /__ _\ \/ _ \/ _ `/ __/ '_/ /___/ .__/\_,_/_/ /_/\_\ version 3.1.1 /_/ Using Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_282) Type in expressions to have them evaluated. Type :help for more information.scala> scala> import org.apache.spark.sql.types.DataTypes scala> val df = Seq(":8:434421+ 98:38").toDF("c0") df: org.apache.spark.sql.DataFrame = [c0: string] scala> val df2 = df.withColumn("c1", col("c0").cast(DataTypes.TimestampType)) df2: org.apache.spark.sql.DataFrame = [c0: string, c1: timestamp] scala> df2.show java.lang.ArrayIndexOutOfBoundsException: 9 at org.apache.spark.sql.catalyst.util.DateTimeUtils$.stringToTimestamp(DateTimeUtils.scala:328) at org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToTimestamp$2(Cast.scala:455) at org.apache.spark.sql.catalyst.expressions.CastBase.buildCast(Cast.scala:295) at org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToTimestamp$1(Cast.scala:451) at org.apache.spark.sql.catalyst.expressions.CastBase.nullSafeEval(Cast.scala:840) at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:476) {code} was: {code:java} __ / __/__ ___ _/ /__ _\ \/ _ \/ _ `/ __/ '_/ /___/ .__/\_,_/_/ /_/\_\ version 3.1.1 /_/ Using Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_282) Type in expressions to have them evaluated. 
Type :help for more information.scala> scala> import org.apache.spark.sql.types.DataTypes scala> val df = Seq(":8:434421+ 98:38").toDF("c0") df: org.apache.spark.sql.DataFrame = [c0: string] scala> val df2 = df.withColumn("c1", col("c0").cast(DataTypes.TimestampType)) df2: org.apache.spark.sql.DataFrame = [c0: string, c1: timestamp] scala> df2.show java.lang.ArrayIndexOutOfBoundsException: 9 at org.apache.spark.sql.catalyst.util.DateTimeUtils$.stringToTimestamp(DateTimeUtils.scala:328) at org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToTimestamp$2(Cast.scala:455) at org.apache.spark.sql.catalyst.expressions.CastBase.buildCast(Cast.scala:295) at org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToTimestamp$1(Cast.scala:451) at org.apache.spark.sql.catalyst.expressions.CastBase.nullSafeEval(Cast.scala:840) at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:476) {code} > [SQL] ArrayIndexOutOfBounds in CAST string to date > -- > > Key: SPARK-36076 > URL: https://issues.apache.org/jira/browse/SPARK-36076 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.1 >Reporter: Andy Grove >Priority: Major > > I discovered this bug during some fuzz testing. > {code:java} > __ > / __/__ ___ _/ /__ > _\ \/ _ \/ _ `/ __/ '_/ >/___/ .__/\_,_/_/ /_/\_\ version 3.1.1 > /_/ > > Using Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_282) > Type in expressions to have them evaluated. 
> Type :help for more information.scala> > scala> import org.apache.spark.sql.types.DataTypes > scala> val df = Seq(":8:434421+ 98:38").toDF("c0") > df: org.apache.spark.sql.DataFrame = [c0: string] > scala> val df2 = df.withColumn("c1", col("c0").cast(DataTypes.TimestampType)) > df2: org.apache.spark.sql.DataFrame = [c0: string, c1: timestamp] > scala> df2.show > java.lang.ArrayIndexOutOfBoundsException: 9 > at > org.apache.spark.sql.catalyst.util.DateTimeUtils$.stringToTimestamp(DateTimeUtils.scala:328) > at > org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToTimestamp$2(Cast.scala:455) > at > org.apache.spark.sql.catalyst.expressions.CastBase.buildCast(Cast.scala:295) > at > org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToTimestamp$1(Cast.scala:451) > at > org.apache.spark.sql.catalyst.expressions.CastBase.nullSafeEval(Cast.scala:840) > at > org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:476) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36076) [SQL] ArrayIndexOutOfBounds in CAST string to date
[ https://issues.apache.org/jira/browse/SPARK-36076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-36076: --- Description: {code:java} __ / __/__ ___ _/ /__ _\ \/ _ \/ _ `/ __/ '_/ /___/ .__/\_,_/_/ /_/\_\ version 3.1.1 /_/ Using Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_282) Type in expressions to have them evaluated. Type :help for more information.scala> scala> import org.apache.spark.sql.types.DataTypes scala> val df = Seq(":8:434421+ 98:38").toDF("c0") df: org.apache.spark.sql.DataFrame = [c0: string] scala> val df2 = df.withColumn("c1", col("c0").cast(DataTypes.TimestampType)) df2: org.apache.spark.sql.DataFrame = [c0: string, c1: timestamp] scala> df2.show java.lang.ArrayIndexOutOfBoundsException: 9 at org.apache.spark.sql.catalyst.util.DateTimeUtils$.stringToTimestamp(DateTimeUtils.scala:328) at org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToTimestamp$2(Cast.scala:455) at org.apache.spark.sql.catalyst.expressions.CastBase.buildCast(Cast.scala:295) at org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToTimestamp$1(Cast.scala:451) at org.apache.spark.sql.catalyst.expressions.CastBase.nullSafeEval(Cast.scala:840) at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:476) {code} was: {code:java} __ / __/__ ___ _/ /__ _\ \/ _ \/ _ `/ __/ '_/ /___/ .__/\_,_/_/ /_/\_\ version 3.1.1 /_/ Using Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_282) Type in expressions to have them evaluated. 
Type :help for more information.scala> spark.conf.set("spark.rapids.sql.enabled", "false")scala> val df = Seq(":8:434421+ 98:38").toDF("c0") df: org.apache.spark.sql.DataFrame = [c0: string]scala> val df2 = df.withColumn("c1", col("c0").cast(DataTypes.TimestampType)) :25: error: not found: value DataTypes val df2 = df.withColumn("c1", col("c0").cast(DataTypes.TimestampType)) ^scala> import org.spark.sql.types.DataTypes :23: error: object spark is not a member of package org import org.spark.sql.types.DataTypes ^scala> import org.apache.spark.sql.types.DataTypes import org.apache.spark.sql.types.DataTypesscala> val df2 = df.withColumn("c1", col("c0").cast(DataTypes.TimestampType)) df2: org.apache.spark.sql.DataFrame = [c0: string, c1: timestamp]scala> df2.show java.lang.ArrayIndexOutOfBoundsException: 9 at org.apache.spark.sql.catalyst.util.DateTimeUtils$.stringToTimestamp(DateTimeUtils.scala:328) at org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToTimestamp$2(Cast.scala:455) at org.apache.spark.sql.catalyst.expressions.CastBase.buildCast(Cast.scala:295) at org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToTimestamp$1(Cast.scala:451) at org.apache.spark.sql.catalyst.expressions.CastBase.nullSafeEval(Cast.scala:840) at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:476) {code} > [SQL] ArrayIndexOutOfBounds in CAST string to date > -- > > Key: SPARK-36076 > URL: https://issues.apache.org/jira/browse/SPARK-36076 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.1 >Reporter: Andy Grove >Priority: Major > > {code:java} > __ > / __/__ ___ _/ /__ > _\ \/ _ \/ _ `/ __/ '_/ >/___/ .__/\_,_/_/ /_/\_\ version 3.1.1 > /_/ > > Using Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_282) > Type in expressions to have them evaluated. 
> Type :help for more information.scala> > scala> import org.apache.spark.sql.types.DataTypes > scala> val df = Seq(":8:434421+ 98:38").toDF("c0") > df: org.apache.spark.sql.DataFrame = [c0: string] > scala> val df2 = df.withColumn("c1", col("c0").cast(DataTypes.TimestampType)) > df2: org.apache.spark.sql.DataFrame = [c0: string, c1: timestamp] > scala> df2.show > java.lang.ArrayIndexOutOfBoundsException: 9 > at > org.apache.spark.sql.catalyst.util.DateTimeUtils$.stringToTimestamp(DateTimeUtils.scala:328) > at > org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToTimestamp$2(Cast.scala:455) > at > org.apache.spark.sql.catalyst.expressions.CastBase.buildCast(Cast.scala:295) > at > org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToTimestamp$1(Cast.scala:451) > at > org.apache.spark.sql.catalyst.expressions.CastBase.nullSafeEval(Cast.scala:840) > at > org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:476) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) -
[jira] [Created] (SPARK-36076) [SQL] ArrayIndexOutOfBounds in CAST string to date
Andy Grove created SPARK-36076: -- Summary: [SQL] ArrayIndexOutOfBounds in CAST string to date Key: SPARK-36076 URL: https://issues.apache.org/jira/browse/SPARK-36076 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.1.1 Reporter: Andy Grove {code:java} __ / __/__ ___ _/ /__ _\ \/ _ \/ _ `/ __/ '_/ /___/ .__/\_,_/_/ /_/\_\ version 3.1.1 /_/ Using Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_282) Type in expressions to have them evaluated. Type :help for more information.scala> spark.conf.set("spark.rapids.sql.enabled", "false")scala> val df = Seq(":8:434421+ 98:38").toDF("c0") df: org.apache.spark.sql.DataFrame = [c0: string]scala> val df2 = df.withColumn("c1", col("c0").cast(DataTypes.TimestampType)) :25: error: not found: value DataTypes val df2 = df.withColumn("c1", col("c0").cast(DataTypes.TimestampType)) ^scala> import org.spark.sql.types.DataTypes :23: error: object spark is not a member of package org import org.spark.sql.types.DataTypes ^scala> import org.apache.spark.sql.types.DataTypes import org.apache.spark.sql.types.DataTypesscala> val df2 = df.withColumn("c1", col("c0").cast(DataTypes.TimestampType)) df2: org.apache.spark.sql.DataFrame = [c0: string, c1: timestamp]scala> df2.show java.lang.ArrayIndexOutOfBoundsException: 9 at org.apache.spark.sql.catalyst.util.DateTimeUtils$.stringToTimestamp(DateTimeUtils.scala:328) at org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToTimestamp$2(Cast.scala:455) at org.apache.spark.sql.catalyst.expressions.CastBase.buildCast(Cast.scala:295) at org.apache.spark.sql.catalyst.expressions.CastBase.$anonfun$castToTimestamp$1(Cast.scala:451) at org.apache.spark.sql.catalyst.expressions.CastBase.nullSafeEval(Cast.scala:840) at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:476) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: 
issues-h...@spark.apache.org
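The ArrayIndexOutOfBoundsException above comes from stringToTimestamp splitting the input into a fixed number of numeric segments and writing past the end of that array when a malformed string such as ":8:434421+ 98:38" contains more separators than expected. A hedged Python sketch of the failure mode and the missing bounds check (segment count and helper name are illustrative assumptions, not Spark's DateTimeUtils logic):

```python
from typing import List, Optional

# Assumed fixed segment count (year..fraction plus zone fields).
MAX_SEGMENTS = 9

def to_segments(s: str) -> Optional[List[int]]:
    """Split a date/time-like string into numeric segments.

    Returns None for malformed input instead of indexing past the
    array, which is the guard the reported code path lacked.
    """
    segments = [0] * MAX_SEGMENTS
    i = 0
    current = ""
    for ch in s:
        if ch.isdigit():
            current += ch
        else:
            if i >= MAX_SEGMENTS:  # without this check: index error
                return None        # malformed: too many separators
            segments[i] = int(current or 0)
            i += 1
            current = ""
    if i >= MAX_SEGMENTS:
        return None
    segments[i] = int(current or 0)
    return segments

print(to_segments("2021-07-09 12:34:56"))   # parses into segments
print(to_segments("1:2:3:4:5:6:7:8:9:10"))  # None: too many fields
```

With the guard in place, an unparseable string yields a null cast result (Spark's documented CAST behavior for bad input) instead of an uncaught runtime exception.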
[jira] [Updated] (SPARK-35881) [SQL] AQE does not support columnar execution for the final query stage
[ https://issues.apache.org/jira/browse/SPARK-35881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-35881: --- Affects Version/s: 3.0.3 3.1.2 > [SQL] AQE does not support columnar execution for the final query stage > --- > > Key: SPARK-35881 > URL: https://issues.apache.org/jira/browse/SPARK-35881 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.3, 3.1.2, 3.2.0 >Reporter: Andy Grove >Priority: Major > > In AdaptiveSparkPlanExec, a query is broken down into stages and these stages > are executed until the entire query has been executed. These stages can be > row-based or columnar. However, the final stage, produced by the private > getFinalPhysicalPlan method, is always assumed to be row-based. The only way > to execute the final stage is by calling the various doExecute methods on > AdaptiveSparkPlanExec, and doExecuteColumnar is not implemented. The > supportsColumnar method also always returns false. > In the RAPIDS Accelerator for Apache Spark, we currently call the private > getFinalPhysicalPlan method using reflection and then determine if that plan > is columnar or not, and then call the appropriate doExecute method, bypassing > the doExecute methods on AdaptiveSparkPlanExec. We would like a supported > mechanism for executing a columnar AQE plan so that we do not need to use > reflection. > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-35881) [SQL] AQE does not support columnar execution for the final query stage
[ https://issues.apache.org/jira/browse/SPARK-35881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-35881: --- Description: In AdaptiveSparkPlanExec, a query is broken down into stages and these stages are executed until the entire query has been executed. These stages can be row-based or columnar. However, the final stage, produced by the private getFinalPhysicalPlan method, is always assumed to be row-based. The only way to execute the final stage is by calling the various doExecute methods on AdaptiveSparkPlanExec, and doExecuteColumnar is not implemented. The supportsColumnar method also always returns false. In the RAPIDS Accelerator for Apache Spark, we currently call the private getFinalPhysicalPlan method using reflection and then determine if that plan is columnar or not, and then call the appropriate doExecute method, bypassing the doExecute methods on AdaptiveSparkPlanExec. We would like a supported mechanism for executing a columnar AQE plan so that we do not need to use reflection. was: In AdaptiveSparkPlanExec, a query is broken down into stages and these stages are executed until the entire query has been executed. These stages can be row-based or columnar. However, the final stage, produced by the private getFinalPhysicalPlan method, is always assumed to be row-based. The only way to execute the final stage is by calling the various doExecute methods on AdaptiveSparkPlanExec. The supportsColumnar method also always returns false, which is another limitation. However, AQE is special because we don't know if the final stage will be columnar or not until the child stages have been executed and the final stage has been re-planned and re-optimized, so we can't easily change the behavior of supportsColumnar. We can't just implement doExecuteColumnar because we don't know whether the final stage will be columnar or not until after we start executing the query. 
In the RAPIDS Accelerator for Apache Spark, we currently call the private getFinalPhysicalPlan method using reflection and then determine if that plan is columnar or not, and then call the appropriate doExecute method, bypassing the doExecute methods on AdaptiveSparkPlanExec. I propose that we make getFinalPhysicalPlan public, and part of the developer API, so that columnar plugins can call this method and determine if the final stage is columnar or not, and execute it appropriately. This would not affect any existing Spark functionality. We also need a mechanism for invoking finalPlanUpdate after the query has been executed. > [SQL] AQE does not support columnar execution for the final query stage > --- > > Key: SPARK-35881 > URL: https://issues.apache.org/jira/browse/SPARK-35881 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.2.0 >Reporter: Andy Grove >Priority: Major > > In AdaptiveSparkPlanExec, a query is broken down into stages and these stages > are executed until the entire query has been executed. These stages can be > row-based or columnar. However, the final stage, produced by the private > getFinalPhysicalPlan method, is always assumed to be row-based. The only way > to execute the final stage is by calling the various doExecute methods on > AdaptiveSparkPlanExec, and doExecuteColumnar is not implemented. The > supportsColumnar method also always returns false. > In the RAPIDS Accelerator for Apache Spark, we currently call the private > getFinalPhysicalPlan method using reflection and then determine if that plan > is columnar or not, and then call the appropriate doExecute method, bypassing > the doExecute methods on AdaptiveSparkPlanExec. We would like a supported > mechanism for executing a columnar AQE plan so that we do not need to use > reflection. 
> > > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-35881) [SQL] AQE does not support columnar execution for the final query stage
[ https://issues.apache.org/jira/browse/SPARK-35881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-35881: --- Description: In AdaptiveSparkPlanExec, a query is broken down into stages and these stages are executed until the entire query has been executed. These stages can be row-based or columnar. However, the final stage, produced by the private getFinalPhysicalPlan method, is always assumed to be row-based. The only way to execute the final stage is by calling the various doExecute methods on AdaptiveSparkPlanExec. The supportsColumnar method also always returns false, which is another limitation. However, AQE is special because we don't know if the final stage will be columnar or not until the child stages have been executed and the final stage has been re-planned and re-optimized, so we can't easily change the behavior of supportsColumnar. We can't just implement doExecuteColumnar because we don't know whether the final stage will be columnar or not until after we start executing the query. In the RAPIDS Accelerator for Apache Spark, we currently call the private getFinalPhysicalPlan method using reflection and then determine if that plan is columnar or not, and then call the appropriate doExecute method, bypassing the doExecute methods on AdaptiveSparkPlanExec. I propose that we make getFinalPhysicalPlan public, and part of the developer API, so that columnar plugins can call this method and determine if the final stage is columnar or not, and execute it appropriately. This would not affect any existing Spark functionality. We also need a mechanism for invoking finalPlanUpdate after the query has been executed. was: In AdaptiveSparkPlanExec, a query is broken down into stages and these stages are executed until the entire query has been executed. These stages can be row-based or columnar. However, the final stage, produced by the private getFinalPhysicalPlan method, is always assumed to be row-based. 
The only way to execute the final stage is by calling the various doExecute methods on AdaptiveSparkPlanExec. The supportsColumnar method also always returns false, which is another limitation. However, AQE is special because we don't know if the final stage will be columnar or not until the child stages have been executed and the final stage has been re-planned and re-optimized, so we can't easily change the behavior of supportsColumnar. We can't just implement doExecuteColumnar because we don't know whether the final stage will be columnar or not until after we start executing the query. In the RAPIDS Accelerator for Apache Spark, we currently call the private getFinalPhysicalPlan method using reflection and then determine if that plan is columnar or not, and then call the appropriate doExecute method, bypassing the doExecute methods on AdaptiveSparkPlanExec. I propose that we make getFinalPhysicalPlan public, and part of the developer API, so that columnar plugins can call this method and determine if the final stage is columnar or not, and execute it appropriately. This would not affect any existing Spark functionality. > [SQL] AQE does not support columnar execution for the final query stage > --- > > Key: SPARK-35881 > URL: https://issues.apache.org/jira/browse/SPARK-35881 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.2.0 >Reporter: Andy Grove >Priority: Major > > In AdaptiveSparkPlanExec, a query is broken down into stages and these stages > are executed until the entire query has been executed. These stages can be > row-based or columnar. However, the final stage, produced by the private > getFinalPhysicalPlan method, is always assumed to be row-based. The only way > to execute the final stage is by calling the various doExecute methods on > AdaptiveSparkPlanExec. The supportsColumnar method also always returns false, > which is another limitation. 
However, AQE is special because we don't know if > the final stage will be columnar or not until the child stages have been > executed and the final stage has been re-planned and re-optimized, so we > can't easily change the behavior of supportsColumnar. We can't just implement > doExecuteColumnar because we don't know whether the final stage will be > columnar or not until after we start executing the query. > In the RAPIDS Accelerator for Apache Spark, we currently call the private > getFinalPhysicalPlan method using reflection and then determine if that plan > is columnar or not, and then call the appropriate doExecute method, > bypassing the doExecute methods on AdaptiveSparkPlanExec. > I propose that we make getFinalPhysicalPlan public, and part of the developer > API, so that columnar plugins
[jira] [Updated] (SPARK-35881) [SQL] AQE does not support columnar execution for the final query stage
[ https://issues.apache.org/jira/browse/SPARK-35881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-35881: --- Description: In AdaptiveSparkPlanExec, a query is broken down into stages and these stages are executed until the entire query has been executed. These stages can be row-based or columnar. However, the final stage, produced by the private getFinalPhysicalPlan method, is always assumed to be row-based. The only way to execute the final stage is by calling the various doExecute methods on AdaptiveSparkPlanExec. The supportsColumnar method also always returns false, which is another limitation. However, AQE is special because we don't know if the final stage will be columnar or not until the child stages have been executed and the final stage has been re-planned and re-optimized, so we can't easily change the behavior of supportsColumnar. We can't just implement doExecuteColumnar because we don't know whether the final stage will be columnar or not until after we start executing the query. In the RAPIDS Accelerator for Apache Spark, we currently call the private getFinalPhysicalPlan method using reflection and then determine if that plan is columnar or not, and then call the appropriate doExecute method, bypassing the doExecute methods on AdaptiveSparkPlanExec. I propose that we make getFinalPhysicalPlan public, and part of the developer API, so that columnar plugins can call this method and determine if the final stage is columnar or not, and execute it appropriately. This would not affect any existing Spark functionality. was: In AdaptiveSparkPlanExec, a query is broken down into stages and these stages are executed until the entire query has been executed. These stages can be row-based or columnar. However, the final stage, produced by the private getFinalPhysicalPlan method, is always assumed to be row-based. 
The only way to execute the final stage is by calling the various doExecute methods on AdaptiveSparkPlanExec. The supportsColumnar method also always returns false, which is another limitation. However, AQE is special because we don't know if the final stage will be columnar or not until the child stages have been executed and the final stage has been re-planned and re-optimized, so we can't easily change the behavior of supportsColumnar. We can't just implement doExecuteColumnar because we don't know whether the final stage will be columnar or not until after we start executing the query. In the RAPIDS Accelerator for Apache Spark, we currently call the private getFinalPhysicalPlan method using reflection and then invoke that plan, bypassing the doExecute methods on AdaptiveSparkPlanExec. I propose that we make getFinalPhysicalPlan public, and part of the developer API, so that columnar plugins can call this method and determine if the final stage is columnar or not, and execute it appropriately. This would not affect any existing Spark functionality. > [SQL] AQE does not support columnar execution for the final query stage > --- > > Key: SPARK-35881 > URL: https://issues.apache.org/jira/browse/SPARK-35881 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.2.0 >Reporter: Andy Grove >Priority: Major > > In AdaptiveSparkPlanExec, a query is broken down into stages and these stages > are executed until the entire query has been executed. These stages can be > row-based or columnar. However, the final stage, produced by the private > getFinalPhysicalPlan method, is always assumed to be row-based. The only way > to execute the final stage is by calling the various doExecute methods on > AdaptiveSparkPlanExec. The supportsColumnar method also always returns false, > which is another limitation. 
However, AQE is special because we don't know if > the final stage will be columnar or not until the child stages have been > executed and the final stage has been re-planned and re-optimized, so we > can't easily change the behavior of supportsColumnar. We can't just implement > doExecuteColumnar because we don't know whether the final stage will be > columnar or not until after we start executing the query. > In the RAPIDS Accelerator for Apache Spark, we currently call the private > getFinalPhysicalPlan method using reflection and then determine if that plan > is columnar or not, and then call the appropriate doExecute method, > bypassing the doExecute methods on AdaptiveSparkPlanExec. > I propose that we make getFinalPhysicalPlan public, and part of the developer > API, so that columnar plugins can call this method and determine if the final > stage is columnar or not, and execute it appropriately. This would not affect > any existing Spark functionality. >
[jira] [Created] (SPARK-35881) [SQL] AQE does not support columnar execution for the final query stage
Andy Grove created SPARK-35881: -- Summary: [SQL] AQE does not support columnar execution for the final query stage Key: SPARK-35881 URL: https://issues.apache.org/jira/browse/SPARK-35881 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.2.0 Reporter: Andy Grove In AdaptiveSparkPlanExec, a query is broken down into stages and these stages are executed until the entire query has been executed. These stages can be row-based or columnar. However, the final stage, produced by the private getFinalPhysicalPlan method, is always assumed to be row-based. The only way to execute the final stage is by calling the various doExecute methods on AdaptiveSparkPlanExec. The supportsColumnar method also always returns false, which is another limitation. However, AQE is special because we don't know if the final stage will be columnar or not until the child stages have been executed and the final stage has been re-planned and re-optimized, so we can't easily change the behavior of supportsColumnar. We can't just implement doExecuteColumnar because we don't know whether the final stage will be columnar or not until after we start executing the query. In the RAPIDS Accelerator for Apache Spark, we currently call the private getFinalPhysicalPlan method using reflection and then invoke that plan, bypassing the doExecute methods on AdaptiveSparkPlanExec. I propose that we make getFinalPhysicalPlan public, and part of the developer API, so that columnar plugins can call this method and determine if the final stage is columnar or not, and execute it appropriately. This would not affect any existing Spark functionality. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
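The dispatch the ticket asks for, once the final plan can be obtained through a public API, is straightforward: ask the plan whether it is columnar and call the matching execute path. A minimal Python sketch of that shape (class and method names are hypothetical stand-ins, not Spark's developer API):

```python
class PlanStage:
    """Stand-in for a physical plan node returned by the (currently
    private) getFinalPhysicalPlan method."""

    def __init__(self, columnar: bool):
        self._columnar = columnar

    def supports_columnar(self) -> bool:
        return self._columnar

    def do_execute(self) -> str:
        return "row-based rows"

    def do_execute_columnar(self) -> str:
        return "columnar batches"

def execute_final_stage(final_plan: PlanStage) -> str:
    """What a columnar plugin would do if the final plan were exposed:
    inspect it after re-planning, then pick the execute path."""
    if final_plan.supports_columnar():
        return final_plan.do_execute_columnar()
    return final_plan.do_execute()

print(execute_final_stage(PlanStage(columnar=True)))   # columnar batches
print(execute_final_stage(PlanStage(columnar=False)))  # row-based rows
```

The point of the sketch is the ordering: the columnar-or-not decision can only be made on the *final* plan, after child stages have run and the plan has been re-optimized, which is why a static supportsColumnar on AdaptiveSparkPlanExec itself cannot answer it.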
[jira] [Updated] (SPARK-35767) [SQL] CoalesceExec can execute child plan twice
[ https://issues.apache.org/jira/browse/SPARK-35767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-35767: --- Priority: Minor (was: Major) > [SQL] CoalesceExec can execute child plan twice > --- > > Key: SPARK-35767 > URL: https://issues.apache.org/jira/browse/SPARK-35767 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.2, 3.1.2 >Reporter: Andy Grove >Priority: Minor > > CoalesceExec calls `child.execute()` in the if condition and throws away the > results, then calls `child.execute()` again in the else condition. This could > cause a section of the plan to be executed twice. > {code:java} > protected override def doExecute(): RDD[InternalRow] = { > if (numPartitions == 1 && child.execute().getNumPartitions < 1) { > // Make sure we don't output an RDD with 0 partitions, when claiming that > we have a > // `SinglePartition`. > new CoalesceExec.EmptyRDDWithPartitions(sparkContext, numPartitions) > } else { > child.execute().coalesce(numPartitions, shuffle = false) > } > } {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35767) [SQL] CoalesceExec can execute child plan twice
Andy Grove created SPARK-35767: -- Summary: [SQL] CoalesceExec can execute child plan twice Key: SPARK-35767 URL: https://issues.apache.org/jira/browse/SPARK-35767 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.1.2, 3.0.2 Reporter: Andy Grove CoalesceExec calls `child.execute()` in the if condition and throws away the results, then calls `child.execute()` again in the else condition. This could cause a section of the plan to be executed twice. {code:java} protected override def doExecute(): RDD[InternalRow] = { if (numPartitions == 1 && child.execute().getNumPartitions < 1) { // Make sure we don't output an RDD with 0 partitions, when claiming that we have a // `SinglePartition`. new CoalesceExec.EmptyRDDWithPartitions(sparkContext, numPartitions) } else { child.execute().coalesce(numPartitions, shuffle = false) } } {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
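One possible fix, sketched here as an assumption rather than necessarily the change Spark ultimately adopted, is to call child.execute() once and reuse the resulting RDD in both branches:

{code:java}
// Sketch of a possible fix: build the child RDD exactly once so that the
// child plan cannot be prepared/executed twice.
protected override def doExecute(): RDD[InternalRow] = {
  val rdd = child.execute()
  if (numPartitions == 1 && rdd.getNumPartitions < 1) {
    // Make sure we don't output an RDD with 0 partitions, when claiming that
    // we have a `SinglePartition`.
    new CoalesceExec.EmptyRDDWithPartitions(sparkContext, numPartitions)
  } else {
    rdd.coalesce(numPartitions, shuffle = false)
  }
}
{code}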
[jira] [Commented] (SPARK-32304) Flaky Test: AdaptiveQueryExecSuite.multiple joins
[ https://issues.apache.org/jira/browse/SPARK-32304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17347817#comment-17347817 ] Andy Grove commented on SPARK-32304: [~dongjoon] This is the issue we just saw in branch-3.0. > Flaky Test: AdaptiveQueryExecSuite.multiple joins > - > > Key: SPARK-32304 > URL: https://issues.apache.org/jira/browse/SPARK-32304 > Project: Spark > Issue Type: Sub-task > Components: SQL, Tests >Affects Versions: 3.1.0 >Reporter: Hyukjin Kwon >Priority: Major > > {code} > AdaptiveQueryExecSuite: > - Change merge join to broadcast join (313 milliseconds) > - Reuse the parallelism of CoalescedShuffleReaderExec in > LocalShuffleReaderExec (265 milliseconds) > - Reuse the default parallelism in LocalShuffleReaderExec (230 milliseconds) > - Empty stage coalesced to 0-partition RDD (514 milliseconds) > - Scalar subquery (406 milliseconds) > - Scalar subquery in later stages (500 milliseconds) > - multiple joins *** FAILED *** (739 milliseconds) > ArrayBuffer(BroadcastHashJoin [b#251429], [a#251438], Inner, BuildLeft > :- BroadcastQueryStage 5 > : +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[3, int, > false] as bigint))), [id=#504817] > : +- CustomShuffleReader local > :+- ShuffleQueryStage 4 > : +- Exchange hashpartitioning(b#251429, 5), true, [id=#504777] > : +- *(7) SortMergeJoin [key#251418], [a#251428], Inner > : :- *(5) Sort [key#251418 ASC NULLS FIRST], false, 0 > : : +- CustomShuffleReader coalesced > : : +- ShuffleQueryStage 0 > : :+- Exchange hashpartitioning(key#251418, 5), > true, [id=#504656] > : : +- *(1) Filter (isnotnull(value#251419) AND > (cast(value#251419 as int) = 1)) > : : +- *(1) SerializeFromObject > [knownnotnull(assertnotnull(input[0, > org.apache.spark.sql.test.SQLTestData$TestData, true])).key AS key#251418, > staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, > fromString, knownnotnull(assertnotnull(input[0, > 
org.apache.spark.sql.test.SQLTestData$TestData, true])).value, true, false) > AS value#251419] > : : +- Scan[obj#251417] > : +- *(6) Sort [a#251428 ASC NULLS FIRST], false, 0 > :+- CustomShuffleReader coalesced > : +- ShuffleQueryStage 1 > : +- Exchange hashpartitioning(a#251428, 5), true, > [id=#504663] > : +- *(2) Filter (b#251429 = 1) > :+- *(2) SerializeFromObject > [knownnotnull(assertnotnull(input[0, > org.apache.spark.sql.test.SQLTestData$TestData2, true])).a AS a#251428, > knownnotnull(assertnotnull(input[0, > org.apache.spark.sql.test.SQLTestData$TestData2, true])).b AS b#251429] > : +- Scan[obj#251427] > +- BroadcastHashJoin [n#251498], [a#251438], Inner, BuildRight > :- CustomShuffleReader local > : +- ShuffleQueryStage 2 > : +- Exchange hashpartitioning(n#251498, 5), true, [id=#504680] > :+- *(3) Filter (n#251498 = 1) > : +- *(3) SerializeFromObject > [knownnotnull(assertnotnull(input[0, > org.apache.spark.sql.test.SQLTestData$LowerCaseData, true])).n AS n#251498, > staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType, > fromString, knownnotnull(assertnotnull(input[0, > org.apache.spark.sql.test.SQLTestData$LowerCaseData, true])).l, true, false) > AS l#251499] > : +- Scan[obj#251497] > +- BroadcastQueryStage 6 > +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, > int, false] as bigint))), [id=#504830] >+- CustomShuffleReader local > +- ShuffleQueryStage 3 > +- Exchange hashpartitioning(a#251438, 5), true, [id=#504694] > +- *(4) Filter (a#251438 = 1) >+- *(4) SerializeFromObject > [knownnotnull(assertnotnull(input[0, > org.apache.spark.sql.test.SQLTestData$TestData3, true])).a AS a#251438, > unwrapoption(IntegerType, knownnotnull(assertnotnull(input[0, > org.apache.spark.sql.test.SQLTestData$TestData3, true])).b) AS b#251439] > +- Scan[obj#251437] > , BroadcastHashJoin [n#251498], [a#251438], Inner, BuildRight > :- CustomShuffleReader local > : +- ShuffleQueryStage 2 > : +- Exchange hashpartitioning(n#251498, 5), 
true, [id=#504680] > :+- *(3) Filter (n#251498 = 1) > : +- *(3) SerializeFromObject > [knownnotnull(assertnotnull(input[0, > org.apache.spark.sql.test.SQLTestData$Lowe
[jira] [Updated] (SPARK-35353) Cross-building docker images to ARM64 is failing (with Ubuntu host)
[ https://issues.apache.org/jira/browse/SPARK-35353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-35353: --- Summary: Cross-building docker images to ARM64 is failing (with Ubuntu host) (was: Cross-building docker images to ARM64 is failing) > Cross-building docker images to ARM64 is failing (with Ubuntu host) > --- > > Key: SPARK-35353 > URL: https://issues.apache.org/jira/browse/SPARK-35353 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 3.1.1 >Reporter: Andy Grove >Priority: Minor > > I was trying to cross-build Spark 3.1.1 for ARM64 so that I could deploy to a > Raspberry Pi Kubernetes cluster this weekend and the Docker build fails. > Here are the commands I used: > {code:java} > docker buildx create --use > ./bin/docker-image-tool.sh -n -r andygrove -t 3.1.1 -X build {code} > The Docker build for ARM64 fails on the following command: > {code:java} > apt install -y bash tini libc6 libpam-modules krb5-user libnss3 procps{code} > The install fails with "Error while loading /usr/sbin/dpkg-split: No such > file or directory". > Here is a fragment of the output showing the relevant error message. 
> {code:java} > #6 6.034 Get:35 https://deb.debian.org/debian buster/main arm64 libnss3 arm64 > 2:3.42.1-1+deb10u3 [1082 kB] > #6 6.102 Get:36 https://deb.debian.org/debian buster/main arm64 psmisc arm64 > 23.2-1 [122 kB] > #6 6.109 Get:37 https://deb.debian.org/debian buster/main arm64 tini arm64 > 0.18.0-1 [194 kB] > #6 6.767 debconf: delaying package configuration, since apt-utils is not > installed > #6 6.883 Fetched 18.1 MB in 1s (13.4 MB/s) > #6 6.956 Error while loading /usr/sbin/dpkg-split: No such file or directory > #6 6.959 Error while loading /usr/sbin/dpkg-deb: No such file or directory > #6 6.961 dpkg: error processing archive > /tmp/apt-dpkg-install-NdOR40/00-libncurses6_6.1+20181013-2+deb10u2_arm64.deb > (--unpack): > {code} > My host environment details: > * Ubuntu 18.04.5 LTS > * Docker version 20.10.6, build 370c289 > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35353) Cross-building docker images to ARM64 is failing
[ https://issues.apache.org/jira/browse/SPARK-35353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17342066#comment-17342066 ] Andy Grove commented on SPARK-35353: The issue seems specific to running on an Ubuntu host. I reproduced the issue on two computers. It works fine on my Macbook Pro though. > Cross-building docker images to ARM64 is failing > > > Key: SPARK-35353 > URL: https://issues.apache.org/jira/browse/SPARK-35353 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 3.1.1 >Reporter: Andy Grove >Priority: Minor > > I was trying to cross-build Spark 3.1.1 for ARM64 so that I could deploy to a > Raspberry Pi Kubernetes cluster this weekend and the Docker build fails. > Here are the commands I used: > {code:java} > docker buildx create --use > ./bin/docker-image-tool.sh -n -r andygrove -t 3.1.1 -X build {code} > The Docker build for ARM64 fails on the following command: > {code:java} > apt install -y bash tini libc6 libpam-modules krb5-user libnss3 procps{code} > The install fails with "Error while loading /usr/sbin/dpkg-split: No such > file or directory". > Here is a fragment of the output showing the relevant error message. 
> {code:java} > #6 6.034 Get:35 https://deb.debian.org/debian buster/main arm64 libnss3 arm64 > 2:3.42.1-1+deb10u3 [1082 kB] > #6 6.102 Get:36 https://deb.debian.org/debian buster/main arm64 psmisc arm64 > 23.2-1 [122 kB] > #6 6.109 Get:37 https://deb.debian.org/debian buster/main arm64 tini arm64 > 0.18.0-1 [194 kB] > #6 6.767 debconf: delaying package configuration, since apt-utils is not > installed > #6 6.883 Fetched 18.1 MB in 1s (13.4 MB/s) > #6 6.956 Error while loading /usr/sbin/dpkg-split: No such file or directory > #6 6.959 Error while loading /usr/sbin/dpkg-deb: No such file or directory > #6 6.961 dpkg: error processing archive > /tmp/apt-dpkg-install-NdOR40/00-libncurses6_6.1+20181013-2+deb10u2_arm64.deb > (--unpack): > {code} > My host environment details: > * Ubuntu 18.04.5 LTS > * Docker version 20.10.6, build 370c289 > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35353) Cross-building docker images to ARM64 is failing
Andy Grove created SPARK-35353: -- Summary: Cross-building docker images to ARM64 is failing Key: SPARK-35353 URL: https://issues.apache.org/jira/browse/SPARK-35353 Project: Spark Issue Type: Bug Components: Kubernetes Affects Versions: 3.1.1 Reporter: Andy Grove I was trying to cross-build Spark 3.1.1 for ARM64 so that I could deploy to a Raspberry Pi Kubernetes cluster this weekend and the Docker build fails. Here are the commands I used: {code:java} docker buildx create --use ./bin/docker-image-tool.sh -n -r andygrove -t 3.1.1 -X build {code} The Docker build for ARM64 fails on the following command: {code:java} apt install -y bash tini libc6 libpam-modules krb5-user libnss3 procps{code} The install fails with "Error while loading /usr/sbin/dpkg-split: No such file or directory". Here is a fragment of the output showing the relevant error message. {code:java} #6 6.034 Get:35 https://deb.debian.org/debian buster/main arm64 libnss3 arm64 2:3.42.1-1+deb10u3 [1082 kB] #6 6.102 Get:36 https://deb.debian.org/debian buster/main arm64 psmisc arm64 23.2-1 [122 kB] #6 6.109 Get:37 https://deb.debian.org/debian buster/main arm64 tini arm64 0.18.0-1 [194 kB] #6 6.767 debconf: delaying package configuration, since apt-utils is not installed #6 6.883 Fetched 18.1 MB in 1s (13.4 MB/s) #6 6.956 Error while loading /usr/sbin/dpkg-split: No such file or directory #6 6.959 Error while loading /usr/sbin/dpkg-deb: No such file or directory #6 6.961 dpkg: error processing archive /tmp/apt-dpkg-install-NdOR40/00-libncurses6_6.1+20181013-2+deb10u2_arm64.deb (--unpack): {code} My host environment details: * Ubuntu 18.04.5 LTS * Docker version 20.10.6, build 370c289 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35093) [SQL] AQE columnar mismatch on exchange reuse
Andy Grove created SPARK-35093: -- Summary: [SQL] AQE columnar mismatch on exchange reuse Key: SPARK-35093 URL: https://issues.apache.org/jira/browse/SPARK-35093 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.1.1, 3.0.2 Reporter: Andy Grove With AQE enabled, AdaptiveSparkPlanExec will attempt to reuse exchanges that are semantically equal. This is done by comparing the canonicalized plan for two Exchange nodes to see if they are the same. Unfortunately this does not take into account the fact that two exchanges with the same canonical plan might be replaced by a plugin in a way that makes them not compatible. For example, a plugin could create one version with supportsColumnar=true and another with supportsColumnar=false. It is not valid to re-use exchanges if there is a supportsColumnar mismatch. I have tested a fix for this and will put up a PR once I figure out how to write the tests. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
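The stricter reuse check described above can be illustrated with a minimal sketch. This is hypothetical (the actual fix may be structured differently): two exchanges are only interchangeable when their canonicalized plans match and they agree on supportsColumnar.

{code:java}
// Hypothetical helper: exchanges are reusable only when they are semantically
// equal AND produce the same kind of output (row-based vs columnar). A plugin
// may have replaced one of two canonically-equal exchanges with a columnar
// version, so canonical equality alone is not sufficient.
import org.apache.spark.sql.execution.exchange.Exchange

def canReuseExchange(a: Exchange, b: Exchange): Boolean =
  a.sameResult(b) && a.supportsColumnar == b.supportsColumnar
{code}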
[jira] [Created] (SPARK-34682) Regression in "operating on canonicalized plan" check in CustomShuffleReaderExec
Andy Grove created SPARK-34682: -- Summary: Regression in "operating on canonicalized plan" check in CustomShuffleReaderExec Key: SPARK-34682 URL: https://issues.apache.org/jira/browse/SPARK-34682 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.1.1 Reporter: Andy Grove Fix For: 3.2.0, 3.1.2 In Spark 3.0.2 if I attempt to execute on a canonicalized version of CustomShuffleReaderExec I get an error "operating on canonicalized plan", as expected. There is a regression in Spark 3.1.1 where this check can never be reached because of a new call to sendDriverMetrics that was added prior to the check. This method will fail if operating on a canonicalized plan because it assumes the existence of metrics that do not exist if this is a canonicalized plan. {code:java} private lazy val shuffleRDD: RDD[_] = { sendDriverMetrics() shuffleStage.map { stage => stage.shuffle.getShuffleRDD(partitionSpecs.toArray) }.getOrElse { throw new IllegalStateException("operating on canonicalized plan") } }{code} The specific error looks like this: {code:java} java.util.NoSuchElementException: key not found: numPartitions at scala.collection.immutable.Map$EmptyMap$.apply(Map.scala:101) at scala.collection.immutable.Map$EmptyMap$.apply(Map.scala:99) at org.apache.spark.sql.execution.adaptive.CustomShuffleReaderExec.sendDriverMetrics(CustomShuffleReaderExec.scala:122) at org.apache.spark.sql.execution.adaptive.CustomShuffleReaderExec.shuffleRDD$lzycompute(CustomShuffleReaderExec.scala:182) at org.apache.spark.sql.execution.adaptive.CustomShuffleReaderExec.shuffleRDD(CustomShuffleReaderExec.scala:181) at org.apache.spark.sql.execution.adaptive.CustomShuffleReaderExec.doExecuteColumnar(CustomShuffleReaderExec.scala:196) {code} I think the fix is simply to avoid calling sendDriverMetrics if the plan is canonicalized and I am planning on creating a PR to fix this. 
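One way to restore the check, sketched here as an assumption rather than the final patch, is to defer the sendDriverMetrics call until after the plan is known not to be canonicalized:

{code:java}
// Sketch of the proposed fix: only report driver metrics once shuffleStage is
// known to be defined, so the "operating on canonicalized plan" check is
// reachable again for canonicalized plans.
private lazy val shuffleRDD: RDD[_] = {
  shuffleStage.map { stage =>
    // Metrics only exist on a non-canonicalized plan, so send them here.
    sendDriverMetrics()
    stage.shuffle.getShuffleRDD(partitionSpecs.toArray)
  }.getOrElse {
    throw new IllegalStateException("operating on canonicalized plan")
  }
}
{code}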
[jira] [Commented] (SPARK-34645) [K8S] Driver pod stuck in Running state after job completes
[ https://issues.apache.org/jira/browse/SPARK-34645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17296391#comment-17296391 ] Andy Grove commented on SPARK-34645: I can reproduce this with Spark 3.1.1 + JDK 11 as well (and works fine with JDK 8). I will see if I can make a repro case. > [K8S] Driver pod stuck in Running state after job completes > --- > > Key: SPARK-34645 > URL: https://issues.apache.org/jira/browse/SPARK-34645 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 3.0.2 > Environment: Kubernetes: > {code:java} > Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.2", > GitCommit:"f5743093fd1c663cb0cbc89748f730662345d44d", GitTreeState:"clean", > BuildDate:"2020-09-16T13:41:02Z", GoVersion:"go1.15", Compiler:"gc", > Platform:"linux/amd64"} > Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.5", > GitCommit:"2166946f41b36dea2c4626f90a77706f426cdea2", GitTreeState:"clean", > BuildDate:"2019-03-25T15:19:22Z", GoVersion:"go1.11.5", Compiler:"gc", > Platform:"linux/amd64"} > {code} >Reporter: Andy Grove >Priority: Major > > I am running automated benchmarks in k8s, using spark-submit in cluster mode, > so the driver runs in a pod. > When running with Spark 3.0.1 and 3.1.1 everything works as expected and I > see the Spark context being shut down after the job completes. > However, when running with Spark 3.0.2 I do not see the context get shut down > and the driver pod is stuck in the Running state indefinitely. > This is the output I see after job completion with 3.0.1 and 3.1.1 and this > output does not appear with 3.0.2. With 3.0.2 there is no output at all after > the job completes. 
> {code:java} > 2021-03-05 20:09:24,576 INFO spark.SparkContext: Invoking stop() from > shutdown hook > 2021-03-05 20:09:24,592 INFO server.AbstractConnector: Stopped > Spark@784499d0{HTTP/1.1, (http/1.1)}{0.0.0.0:4040} > 2021-03-05 20:09:24,594 INFO ui.SparkUI: Stopped Spark web UI at > http://benchmark-runner-3e8a38780400e0d1-driver-svc.default.svc:4040 > 2021-03-05 20:09:24,599 INFO k8s.KubernetesClusterSchedulerBackend: Shutting > down all executors > 2021-03-05 20:09:24,600 INFO > k8s.KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Asking each > executor to shut down > 2021-03-05 20:09:24,609 WARN k8s.ExecutorPodsWatchSnapshotSource: Kubernetes > client has been closed (this is expected if the application is shutting down.) > 2021-03-05 20:09:24,719 INFO spark.MapOutputTrackerMasterEndpoint: > MapOutputTrackerMasterEndpoint stopped! > 2021-03-05 20:09:24,736 INFO memory.MemoryStore: MemoryStore cleared > 2021-03-05 20:09:24,738 INFO storage.BlockManager: BlockManager stopped > 2021-03-05 20:09:24,744 INFO storage.BlockManagerMaster: BlockManagerMaster > stopped > 2021-03-05 20:09:24,752 INFO > scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: > OutputCommitCoordinator stopped! > 2021-03-05 20:09:24,768 INFO spark.SparkContext: Successfully stopped > SparkContext > 2021-03-05 20:09:24,768 INFO util.ShutdownHookManager: Shutdown hook called > 2021-03-05 20:09:24,769 INFO util.ShutdownHookManager: Deleting directory > /var/data/spark-67fa44df-e86c-463a-a149-25d95817ff8e/spark-a5476c14-c103-4108-b733-961400485d8a > 2021-03-05 20:09:24,772 INFO util.ShutdownHookManager: Deleting directory > /tmp/spark-9d6261f5-4394-472b-9c9a-e22bde877814 > 2021-03-05 20:09:24,778 INFO impl.MetricsSystemImpl: Stopping s3a-file-system > metrics system... > 2021-03-05 20:09:24,779 INFO impl.MetricsSystemImpl: s3a-file-system metrics > system stopped. > 2021-03-05 20:09:24,779 INFO impl.MetricsSystemImpl: s3a-file-system metrics > system shutdown complete. 
> {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34645) [K8S] Driver pod stuck in Running state after job completes
[ https://issues.apache.org/jira/browse/SPARK-34645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17296358#comment-17296358 ] Andy Grove commented on SPARK-34645: I just confirmed that this only happens with JDK 11 and works fine with JDK 8. > [K8S] Driver pod stuck in Running state after job completes > --- > > Key: SPARK-34645 > URL: https://issues.apache.org/jira/browse/SPARK-34645 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 3.0.2 > Environment: Kubernetes: > {code:java} > Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.2", > GitCommit:"f5743093fd1c663cb0cbc89748f730662345d44d", GitTreeState:"clean", > BuildDate:"2020-09-16T13:41:02Z", GoVersion:"go1.15", Compiler:"gc", > Platform:"linux/amd64"} > Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.5", > GitCommit:"2166946f41b36dea2c4626f90a77706f426cdea2", GitTreeState:"clean", > BuildDate:"2019-03-25T15:19:22Z", GoVersion:"go1.11.5", Compiler:"gc", > Platform:"linux/amd64"} > {code} >Reporter: Andy Grove >Priority: Major > > I am running automated benchmarks in k8s, using spark-submit in cluster mode, > so the driver runs in a pod. > When running with Spark 3.0.1 and 3.1.1 everything works as expected and I > see the Spark context being shut down after the job completes. > However, when running with Spark 3.0.2 I do not see the context get shut down > and the driver pod is stuck in the Running state indefinitely. > This is the output I see after job completion with 3.0.1 and 3.1.1 and this > output does not appear with 3.0.2. With 3.0.2 there is no output at all after > the job completes. 
> {code:java} > 2021-03-05 20:09:24,576 INFO spark.SparkContext: Invoking stop() from > shutdown hook > 2021-03-05 20:09:24,592 INFO server.AbstractConnector: Stopped > Spark@784499d0{HTTP/1.1, (http/1.1)}{0.0.0.0:4040} > 2021-03-05 20:09:24,594 INFO ui.SparkUI: Stopped Spark web UI at > http://benchmark-runner-3e8a38780400e0d1-driver-svc.default.svc:4040 > 2021-03-05 20:09:24,599 INFO k8s.KubernetesClusterSchedulerBackend: Shutting > down all executors > 2021-03-05 20:09:24,600 INFO > k8s.KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Asking each > executor to shut down > 2021-03-05 20:09:24,609 WARN k8s.ExecutorPodsWatchSnapshotSource: Kubernetes > client has been closed (this is expected if the application is shutting down.) > 2021-03-05 20:09:24,719 INFO spark.MapOutputTrackerMasterEndpoint: > MapOutputTrackerMasterEndpoint stopped! > 2021-03-05 20:09:24,736 INFO memory.MemoryStore: MemoryStore cleared > 2021-03-05 20:09:24,738 INFO storage.BlockManager: BlockManager stopped > 2021-03-05 20:09:24,744 INFO storage.BlockManagerMaster: BlockManagerMaster > stopped > 2021-03-05 20:09:24,752 INFO > scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: > OutputCommitCoordinator stopped! > 2021-03-05 20:09:24,768 INFO spark.SparkContext: Successfully stopped > SparkContext > 2021-03-05 20:09:24,768 INFO util.ShutdownHookManager: Shutdown hook called > 2021-03-05 20:09:24,769 INFO util.ShutdownHookManager: Deleting directory > /var/data/spark-67fa44df-e86c-463a-a149-25d95817ff8e/spark-a5476c14-c103-4108-b733-961400485d8a > 2021-03-05 20:09:24,772 INFO util.ShutdownHookManager: Deleting directory > /tmp/spark-9d6261f5-4394-472b-9c9a-e22bde877814 > 2021-03-05 20:09:24,778 INFO impl.MetricsSystemImpl: Stopping s3a-file-system > metrics system... > 2021-03-05 20:09:24,779 INFO impl.MetricsSystemImpl: s3a-file-system metrics > system stopped. > 2021-03-05 20:09:24,779 INFO impl.MetricsSystemImpl: s3a-file-system metrics > system shutdown complete. 
> {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-34645) [K8S] Driver pod stuck in Running state after job completes
Andy Grove created SPARK-34645: -- Summary: [K8S] Driver pod stuck in Running state after job completes Key: SPARK-34645 URL: https://issues.apache.org/jira/browse/SPARK-34645 Project: Spark Issue Type: Bug Components: Kubernetes Affects Versions: 3.0.2 Environment: Kubernetes: {code:java} Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.2", GitCommit:"f5743093fd1c663cb0cbc89748f730662345d44d", GitTreeState:"clean", BuildDate:"2020-09-16T13:41:02Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"} Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.5", GitCommit:"2166946f41b36dea2c4626f90a77706f426cdea2", GitTreeState:"clean", BuildDate:"2019-03-25T15:19:22Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"} {code} Reporter: Andy Grove I am running automated benchmarks in k8s, using spark-submit in cluster mode, so the driver runs in a pod. When running with Spark 3.0.1 and 3.1.1 everything works as expected and I see the Spark context being shut down after the job completes. However, when running with Spark 3.0.2 I do not see the context get shut down and the driver pod is stuck in the Running state indefinitely. This is the output I see after job completion with 3.0.1 and 3.1.1 and this output does not appear with 3.0.2. With 3.0.2 there is no output at all after the job completes. 
{code:java} 2021-03-05 20:09:24,576 INFO spark.SparkContext: Invoking stop() from shutdown hook 2021-03-05 20:09:24,592 INFO server.AbstractConnector: Stopped Spark@784499d0{HTTP/1.1, (http/1.1)}{0.0.0.0:4040} 2021-03-05 20:09:24,594 INFO ui.SparkUI: Stopped Spark web UI at http://benchmark-runner-3e8a38780400e0d1-driver-svc.default.svc:4040 2021-03-05 20:09:24,599 INFO k8s.KubernetesClusterSchedulerBackend: Shutting down all executors 2021-03-05 20:09:24,600 INFO k8s.KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Asking each executor to shut down 2021-03-05 20:09:24,609 WARN k8s.ExecutorPodsWatchSnapshotSource: Kubernetes client has been closed (this is expected if the application is shutting down.) 2021-03-05 20:09:24,719 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 2021-03-05 20:09:24,736 INFO memory.MemoryStore: MemoryStore cleared 2021-03-05 20:09:24,738 INFO storage.BlockManager: BlockManager stopped 2021-03-05 20:09:24,744 INFO storage.BlockManagerMaster: BlockManagerMaster stopped 2021-03-05 20:09:24,752 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! 2021-03-05 20:09:24,768 INFO spark.SparkContext: Successfully stopped SparkContext 2021-03-05 20:09:24,768 INFO util.ShutdownHookManager: Shutdown hook called 2021-03-05 20:09:24,769 INFO util.ShutdownHookManager: Deleting directory /var/data/spark-67fa44df-e86c-463a-a149-25d95817ff8e/spark-a5476c14-c103-4108-b733-961400485d8a 2021-03-05 20:09:24,772 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-9d6261f5-4394-472b-9c9a-e22bde877814 2021-03-05 20:09:24,778 INFO impl.MetricsSystemImpl: Stopping s3a-file-system metrics system... 2021-03-05 20:09:24,779 INFO impl.MetricsSystemImpl: s3a-file-system metrics system stopped. 2021-03-05 20:09:24,779 INFO impl.MetricsSystemImpl: s3a-file-system metrics system shutdown complete. 
{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-33744) Canonicalization error in SortAggregate
[ https://issues.apache.org/jira/browse/SPARK-33744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-33744: --- Affects Version/s: 3.0.1 > Canonicalization error in SortAggregate > --- > > Key: SPARK-33744 > URL: https://issues.apache.org/jira/browse/SPARK-33744 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.1, 3.1.0 >Reporter: Andy Grove >Priority: Minor > > The canonicalization plan for a simple aggregate query is different each time > for SortAggregate but not for HashAggregate. > The issue can be demonstrated by adding the following unit tests to > SQLQuerySuite. The HashAggregate test passes and the SortAggregate test fails. > The first test has numeric input and the second test is operating on strings, > which forces the use of SortAggregate rather than HashAggregate. > {code:java} > test("HashAggregate canonicalization") { > val data = Seq((1, 1)).toDF("c0", "c1") > val df1 = data.groupBy(col("c0")).agg(first("c1")) > val df2 = data.groupBy(col("c0")).agg(first("c1")) > assert(df1.queryExecution.executedPlan.canonicalized == > df2.queryExecution.executedPlan.canonicalized) > } > test("SortAggregate canonicalization") { > val data = Seq(("a", "a")).toDF("c0", "c1") > val df1 = data.groupBy(col("c0")).agg(first("c1")) > val df2 = data.groupBy(col("c0")).agg(first("c1")) > assert(df1.queryExecution.executedPlan.canonicalized == > df2.queryExecution.executedPlan.canonicalized) > } {code} > The SortAggregate test fails with the following output . 
> {code:java} > SortAggregate(key=[none#0], functions=[first(none#0, false)], output=[none#0, > #1]) > +- *(2) Sort [none#0 ASC NULLS FIRST], false, 0 >+- Exchange hashpartitioning(none#0, 5), ENSURE_REQUIREMENTS, [id=#105] > +- SortAggregate(key=[none#0], functions=[partial_first(none#1, > false)], output=[none#0, none#2, none#3]) > +- *(1) Sort [none#0 ASC NULLS FIRST], false, 0 > +- *(1) Project [none#0 AS #0, none#1 AS #1] >+- *(1) LocalTableScan [none#0, none#1] > did not equal > SortAggregate(key=[none#0], functions=[first(none#0, false)], output=[none#0, > #1]) > +- *(2) Sort [none#0 ASC NULLS FIRST], false, 0 >+- Exchange hashpartitioning(none#0, 5), ENSURE_REQUIREMENTS, [id=#148] > +- SortAggregate(key=[none#0], functions=[partial_first(none#1, > false)], output=[none#0, none#2, none#3]) > +- *(1) Sort [none#0 ASC NULLS FIRST], false, 0 > +- *(1) Project [none#0 AS #0, none#1 AS #1] >+- *(1) LocalTableScan [none#0, none#1] {code} > The error is caused by the resultExpression for the aggregate function being > assigned a new ExprId in the final aggregate. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-33744) Canonicalization error in SortAggregate
[ https://issues.apache.org/jira/browse/SPARK-33744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-33744: --- Affects Version/s: (was: 3.1.0) > Canonicalization error in SortAggregate > --- > > Key: SPARK-33744 > URL: https://issues.apache.org/jira/browse/SPARK-33744 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.1 >Reporter: Andy Grove >Priority: Minor > > The canonicalization plan for a simple aggregate query is different each time > for SortAggregate but not for HashAggregate. > The issue can be demonstrated by adding the following unit tests to > SQLQuerySuite. The HashAggregate test passes and the SortAggregate test fails. > The first test has numeric input and the second test is operating on strings, > which forces the use of SortAggregate rather than HashAggregate. > {code:java} > test("HashAggregate canonicalization") { > val data = Seq((1, 1)).toDF("c0", "c1") > val df1 = data.groupBy(col("c0")).agg(first("c1")) > val df2 = data.groupBy(col("c0")).agg(first("c1")) > assert(df1.queryExecution.executedPlan.canonicalized == > df2.queryExecution.executedPlan.canonicalized) > } > test("SortAggregate canonicalization") { > val data = Seq(("a", "a")).toDF("c0", "c1") > val df1 = data.groupBy(col("c0")).agg(first("c1")) > val df2 = data.groupBy(col("c0")).agg(first("c1")) > assert(df1.queryExecution.executedPlan.canonicalized == > df2.queryExecution.executedPlan.canonicalized) > } {code} > The SortAggregate test fails with the following output . 
> {code:java} > SortAggregate(key=[none#0], functions=[first(none#0, false)], output=[none#0, > #1]) > +- *(2) Sort [none#0 ASC NULLS FIRST], false, 0 >+- Exchange hashpartitioning(none#0, 5), ENSURE_REQUIREMENTS, [id=#105] > +- SortAggregate(key=[none#0], functions=[partial_first(none#1, > false)], output=[none#0, none#2, none#3]) > +- *(1) Sort [none#0 ASC NULLS FIRST], false, 0 > +- *(1) Project [none#0 AS #0, none#1 AS #1] >+- *(1) LocalTableScan [none#0, none#1] > did not equal > SortAggregate(key=[none#0], functions=[first(none#0, false)], output=[none#0, > #1]) > +- *(2) Sort [none#0 ASC NULLS FIRST], false, 0 >+- Exchange hashpartitioning(none#0, 5), ENSURE_REQUIREMENTS, [id=#148] > +- SortAggregate(key=[none#0], functions=[partial_first(none#1, > false)], output=[none#0, none#2, none#3]) > +- *(1) Sort [none#0 ASC NULLS FIRST], false, 0 > +- *(1) Project [none#0 AS #0, none#1 AS #1] >+- *(1) LocalTableScan [none#0, none#1] {code} > The error is caused by the resultExpression for the aggregate function being > assigned a new ExprId in the final aggregate. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-33744) Canonicalization error in SortAggregate
Andy Grove created SPARK-33744:
--

             Summary: Canonicalization error in SortAggregate
                 Key: SPARK-33744
                 URL: https://issues.apache.org/jira/browse/SPARK-33744
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.1.0
            Reporter: Andy Grove

The canonicalized plan for a simple aggregate query is different each time for SortAggregate but not for HashAggregate.

The issue can be demonstrated by adding the following unit tests to SQLQuerySuite. The HashAggregate test passes and the SortAggregate test fails. The first test has numeric input and the second operates on strings, which forces the use of SortAggregate rather than HashAggregate.

{code:java}
test("HashAggregate canonicalization") {
  val data = Seq((1, 1)).toDF("c0", "c1")
  val df1 = data.groupBy(col("c0")).agg(first("c1"))
  val df2 = data.groupBy(col("c0")).agg(first("c1"))
  assert(df1.queryExecution.executedPlan.canonicalized ==
    df2.queryExecution.executedPlan.canonicalized)
}

test("SortAggregate canonicalization") {
  val data = Seq(("a", "a")).toDF("c0", "c1")
  val df1 = data.groupBy(col("c0")).agg(first("c1"))
  val df2 = data.groupBy(col("c0")).agg(first("c1"))
  assert(df1.queryExecution.executedPlan.canonicalized ==
    df2.queryExecution.executedPlan.canonicalized)
}
{code}

The SortAggregate test fails with the following output.
{code:java}
SortAggregate(key=[none#0], functions=[first(none#0, false)], output=[none#0, #1])
+- *(2) Sort [none#0 ASC NULLS FIRST], false, 0
   +- Exchange hashpartitioning(none#0, 5), ENSURE_REQUIREMENTS, [id=#105]
      +- SortAggregate(key=[none#0], functions=[partial_first(none#1, false)], output=[none#0, none#2, none#3])
         +- *(1) Sort [none#0 ASC NULLS FIRST], false, 0
            +- *(1) Project [none#0 AS #0, none#1 AS #1]
               +- *(1) LocalTableScan [none#0, none#1]
did not equal
SortAggregate(key=[none#0], functions=[first(none#0, false)], output=[none#0, #1])
+- *(2) Sort [none#0 ASC NULLS FIRST], false, 0
   +- Exchange hashpartitioning(none#0, 5), ENSURE_REQUIREMENTS, [id=#148]
      +- SortAggregate(key=[none#0], functions=[partial_first(none#1, false)], output=[none#0, none#2, none#3])
         +- *(1) Sort [none#0 ASC NULLS FIRST], false, 0
            +- *(1) Project [none#0 AS #0, none#1 AS #1]
               +- *(1) LocalTableScan [none#0, none#1]
{code}

The error is caused by the resultExpression for the aggregate function being assigned a new ExprId in the final aggregate. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
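The ExprId mismatch described above can be seen in miniature with a hypothetical sketch (these are not Spark's actual classes): each expression carries a globally unique id, so two plans built independently are never structurally equal unless canonicalization rewrites the ids to normalized values.

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-ins for Spark's NamedExpression/ExprId, for illustration
// only. Each attribute takes a fresh id from a global counter, mimicking how
// a new ExprId is allocated every time an expression is constructed.
object CanonicalizationSketch {
  private val counter = new AtomicLong(0)

  final case class Attr(name: String, id: Long) {
    // Canonicalization normalizes away the unstable parts: the name becomes
    // "none" and the id a position-based value (0 here, single attribute).
    def canonicalized: Attr = Attr("none", 0)
  }

  def attr(name: String): Attr = Attr(name, counter.incrementAndGet())
}
```

If the final aggregate allocates a fresh id for its result expression instead of reusing the existing one, the canonicalized forms diverge exactly as in the failing test: `attr("c0") != attr("c0")` because the ids differ, while `attr("c0").canonicalized == attr("c0").canonicalized`.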
[jira] [Resolved] (SPARK-32671) Race condition in MapOutputTracker.getStatistics
[ https://issues.apache.org/jira/browse/SPARK-32671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove resolved SPARK-32671. Resolution: Invalid I was mistaken about this issue. > Race condition in MapOutputTracker.getStatistics > > > Key: SPARK-32671 > URL: https://issues.apache.org/jira/browse/SPARK-32671 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.0.0, 3.0.1 >Reporter: Andy Grove >Priority: Major > > MapOutputTracker.getStatistics builds an array of partition sizes for a > shuffle id and in some cases uses multiple threads running in parallel to > update this array. This code is not thread-safe and the output is > non-deterministic when there are multiple MapStatus entries for the same > partition. > We have unit tests such as the skewed join tests in AdaptiveQueryExecSuite > that depend on the output being deterministic, and intermittent failures in > these tests led me to track this bug down. > The issue is trivial to fix by using an AtomicLong when building the array of > partition sizes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-32671) Race condition in MapOutputTracker.getStatistics
Andy Grove created SPARK-32671:
--

             Summary: Race condition in MapOutputTracker.getStatistics
                 Key: SPARK-32671
                 URL: https://issues.apache.org/jira/browse/SPARK-32671
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 3.0.0, 3.0.1
            Reporter: Andy Grove

MapOutputTracker.getStatistics builds an array of partition sizes for a shuffle id and, in some cases, uses multiple threads running in parallel to update this array. This code is not thread-safe, and the output is non-deterministic when there are multiple MapStatus entries for the same partition.

We have unit tests, such as the skewed join tests in AdaptiveQueryExecSuite, that depend on the output being deterministic, and intermittent failures in these tests led me to track this bug down.

The issue is trivial to fix by using an AtomicLong when building the array of partition sizes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
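A minimal sketch of the fix described above, using `java.util.concurrent.atomic.AtomicLongArray` so that concurrent size updates are atomic. The object and method names are illustrative, not Spark's actual code:

```scala
import java.util.concurrent.atomic.AtomicLongArray

// Illustrative only: accumulate per-partition shuffle sizes from several
// threads. With a plain Array[Long], `sizes(p) += n` is a read-modify-write
// sequence and concurrent writers can lose updates; AtomicLongArray.addAndGet
// performs each update atomically, so the totals are deterministic.
object AtomicPartitionSizes {
  def aggregate(numPartitions: Int, batches: Seq[Seq[(Int, Long)]]): Array[Long] = {
    val sizes = new AtomicLongArray(numPartitions)
    val threads = batches.map { batch =>
      new Thread(() => batch.foreach { case (p, n) => sizes.addAndGet(p, n) })
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
    Array.tabulate(numPartitions)(sizes.get)
  }
}
```

Because each `addAndGet` is atomic, the result is the same regardless of how the threads interleave, which is the property the AdaptiveQueryExecSuite tests rely on.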
[jira] [Commented] (SPARK-29640) [K8S] Intermittent "java.net.UnknownHostException: kubernetes.default.svc" in Spark driver
[ https://issues.apache.org/jira/browse/SPARK-29640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994510#comment-16994510 ] Andy Grove commented on SPARK-29640: Closing this as not a bug. We have confirmed that this is due to the way certain EKS clusters are set up. The issue only happens when Spark is on the same node as a CoreDNS pod and only happens intermittently even then. We have experienced the same issue with applications other than Spark as well. > [K8S] Intermittent "java.net.UnknownHostException: kubernetes.default.svc" in > Spark driver > -- > > Key: SPARK-29640 > URL: https://issues.apache.org/jira/browse/SPARK-29640 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.4 >Reporter: Andy Grove >Priority: Major > > We are running into intermittent DNS issues where the Spark driver fails to > resolve "kubernetes.default.svc" when trying to create executors. We are > running Spark 2.4.4 (with the patch for SPARK-28921) in cluster mode in EKS. > This happens approximately 10% of the time. 
> Here is the stack trace: > {code:java} > Exception in thread "main" org.apache.spark.SparkException: External > scheduler cannot be instantiated > at > org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2794) > at org.apache.spark.SparkContext.(SparkContext.scala:493) > at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2520) > at > org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:935) > at > org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:926) > at scala.Option.getOrElse(Option.scala:121) > at > org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:926) > at com.rms.execution.test.SparkPiTask$.main(SparkPiTask.scala:36) > at com.rms.execution.test.SparkPiTask.main(SparkPiTask.scala) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) > at > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845) > at > org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161) > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184) > at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86) > at > org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920) > at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929) > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) > Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: > [get] for kind: [Pod] with name: > [wf-5-69674f15d0fc45-1571354060179-driver] in namespace: > [tenant-8-workflows] failed. 
> at > io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64) > at > io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72) > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:229) > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:162) > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator$$anonfun$1.apply(ExecutorPodsAllocator.scala:57) > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator$$anonfun$1.apply(ExecutorPodsAllocator.scala:55) > at scala.Option.map(Option.scala:146) > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.(ExecutorPodsAllocator.scala:55) > at > org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:89) > at > org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2788) > ... 20 more > Caused by: java.net.UnknownHostException: kubernetes.default.svc: Try again > at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method) > at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929) > at > java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324) > at java.net.InetAddress.getAllByName0(InetAddress.java:1277) > at java.net.InetAddress.getAllByName(InetAddress.java:119
[jira] [Issue Comment Deleted] (SPARK-29640) [K8S] Intermittent "java.net.UnknownHostException: kubernetes.default.svc" in Spark driver
[ https://issues.apache.org/jira/browse/SPARK-29640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-29640: --- Comment: was deleted (was: We were finally able to get to a root cause on this so I'm documenting it here in the hopes that it helps someone else in the future. The issue was due to the way that routing was set up on our EKS clusters combined with the fact that we were using an NLB rather than ELB along with nginx ingress controllers. Specifically, NLB does not support "hairpinning" as explained in [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-troubleshooting.html] In layman's terms, if pod A tries to communicate with pod B, and both pods are on the same node and the request egresses from the node and is then routed back to the node via NLB and nginx controller then the request can never succeed and will time out. Switching to an ELB resolves the issue but a better solution is to use cluster local addressing so that communicate between pods on the same nodes uses the local network.) > [K8S] Intermittent "java.net.UnknownHostException: kubernetes.default.svc" in > Spark driver > -- > > Key: SPARK-29640 > URL: https://issues.apache.org/jira/browse/SPARK-29640 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.4 >Reporter: Andy Grove >Priority: Major > > We are running into intermittent DNS issues where the Spark driver fails to > resolve "kubernetes.default.svc" when trying to create executors. We are > running Spark 2.4.4 (with the patch for SPARK-28921) in cluster mode in EKS. > This happens approximately 10% of the time. 
> Here is the stack trace: > {code:java} > Exception in thread "main" org.apache.spark.SparkException: External > scheduler cannot be instantiated > at > org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2794) > at org.apache.spark.SparkContext.(SparkContext.scala:493) > at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2520) > at > org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:935) > at > org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:926) > at scala.Option.getOrElse(Option.scala:121) > at > org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:926) > at com.rms.execution.test.SparkPiTask$.main(SparkPiTask.scala:36) > at com.rms.execution.test.SparkPiTask.main(SparkPiTask.scala) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) > at > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845) > at > org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161) > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184) > at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86) > at > org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920) > at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929) > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) > Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: > [get] for kind: [Pod] with name: > [wf-5-69674f15d0fc45-1571354060179-driver] in namespace: > [tenant-8-workflows] failed. 
> at > io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64) > at > io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72) > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:229) > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:162) > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator$$anonfun$1.apply(ExecutorPodsAllocator.scala:57) > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator$$anonfun$1.apply(ExecutorPodsAllocator.scala:55) > at scala.Option.map(Option.scala:146) > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.(ExecutorPodsAllocator.scala:55) > at > org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:
[jira] [Commented] (SPARK-29640) [K8S] Intermittent "java.net.UnknownHostException: kubernetes.default.svc" in Spark driver
[ https://issues.apache.org/jira/browse/SPARK-29640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994257#comment-16994257 ] Andy Grove commented on SPARK-29640:

We were finally able to get to a root cause on this, so I'm documenting it here in the hopes that it helps someone else in the future.

The issue was due to the way that routing was set up on our EKS clusters, combined with the fact that we were using an NLB rather than an ELB along with nginx ingress controllers. Specifically, NLB does not support "hairpinning", as explained in [https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-troubleshooting.html]

In layman's terms, if pod A tries to communicate with pod B, and both pods are on the same node, and the request egresses from the node and is then routed back to the node via the NLB and nginx controller, then the request can never succeed and will time out.

Switching to an ELB resolves the issue, but a better solution is to use cluster-local addressing so that communication between pods on the same node uses the local network.

> [K8S] Intermittent "java.net.UnknownHostException: kubernetes.default.svc" in Spark driver
> --
>
> Key: SPARK-29640
> URL: https://issues.apache.org/jira/browse/SPARK-29640
> Project: Spark
> Issue Type: Bug
> Components: Kubernetes
> Affects Versions: 2.4.4
> Reporter: Andy Grove
> Priority: Major
>
> We are running into intermittent DNS issues where the Spark driver fails to resolve "kubernetes.default.svc" when trying to create executors. We are running Spark 2.4.4 (with the patch for SPARK-28921) in cluster mode in EKS. This happens approximately 10% of the time.
> Here is the stack trace: > {code:java} > Exception in thread "main" org.apache.spark.SparkException: External > scheduler cannot be instantiated > at > org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2794) > at org.apache.spark.SparkContext.(SparkContext.scala:493) > at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2520) > at > org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:935) > at > org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:926) > at scala.Option.getOrElse(Option.scala:121) > at > org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:926) > at com.rms.execution.test.SparkPiTask$.main(SparkPiTask.scala:36) > at com.rms.execution.test.SparkPiTask.main(SparkPiTask.scala) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) > at > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845) > at > org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161) > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184) > at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86) > at > org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920) > at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929) > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) > Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: > [get] for kind: [Pod] with name: > [wf-5-69674f15d0fc45-1571354060179-driver] in namespace: > [tenant-8-workflows] failed. 
> at > io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64) > at > io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72) > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:229) > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:162) > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator$$anonfun$1.apply(ExecutorPodsAllocator.scala:57) > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator$$anonfun$1.apply(ExecutorPodsAllocator.scala:55) > at scala.Option.map(Option.scala:146) > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.(ExecutorPodsAllocator.scala:55) > at > org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(Kubern
[jira] [Resolved] (SPARK-29640) [K8S] Intermittent "java.net.UnknownHostException: kubernetes.default.svc" in Spark driver
[ https://issues.apache.org/jira/browse/SPARK-29640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove resolved SPARK-29640. Resolution: Not A Bug > [K8S] Intermittent "java.net.UnknownHostException: kubernetes.default.svc" in > Spark driver > -- > > Key: SPARK-29640 > URL: https://issues.apache.org/jira/browse/SPARK-29640 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.4 >Reporter: Andy Grove >Priority: Major > > We are running into intermittent DNS issues where the Spark driver fails to > resolve "kubernetes.default.svc" when trying to create executors. We are > running Spark 2.4.4 (with the patch for SPARK-28921) in cluster mode in EKS. > This happens approximately 10% of the time. > Here is the stack trace: > {code:java} > Exception in thread "main" org.apache.spark.SparkException: External > scheduler cannot be instantiated > at > org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2794) > at org.apache.spark.SparkContext.(SparkContext.scala:493) > at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2520) > at > org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:935) > at > org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:926) > at scala.Option.getOrElse(Option.scala:121) > at > org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:926) > at com.rms.execution.test.SparkPiTask$.main(SparkPiTask.scala:36) > at com.rms.execution.test.SparkPiTask.main(SparkPiTask.scala) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) > at > 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845) > at > org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161) > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184) > at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86) > at > org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920) > at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929) > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) > Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: > [get] for kind: [Pod] with name: > [wf-5-69674f15d0fc45-1571354060179-driver] in namespace: > [tenant-8-workflows] failed. > at > io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64) > at > io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72) > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:229) > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:162) > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator$$anonfun$1.apply(ExecutorPodsAllocator.scala:57) > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator$$anonfun$1.apply(ExecutorPodsAllocator.scala:55) > at scala.Option.map(Option.scala:146) > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.(ExecutorPodsAllocator.scala:55) > at > org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:89) > at > org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2788) > ... 
20 more > Caused by: java.net.UnknownHostException: kubernetes.default.svc: Try again > at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method) > at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929) > at > java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324) > at java.net.InetAddress.getAllByName0(InetAddress.java:1277) > at java.net.InetAddress.getAllByName(InetAddress.java:1193) > at java.net.InetAddress.getAllByName(InetAddress.java:1127) > at okhttp3.Dns$1.lookup(Dns.java:39) > at > okhttp3.internal.connection.RouteSelector.resetNextInetSocketAddress(RouteSelector.java:171) > at > okhttp3.internal.connection.RouteSelector.nextProxy(RouteSelector.java:137) > at okhttp3.
[jira] [Reopened] (SPARK-29640) [K8S] Intermittent "java.net.UnknownHostException: kubernetes.default.svc" in Spark driver
[ https://issues.apache.org/jira/browse/SPARK-29640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove reopened SPARK-29640: Re-opening this as the issue came back and is ongoing. I am exploring other solutions and will post here if/when I find a solution that works. > [K8S] Intermittent "java.net.UnknownHostException: kubernetes.default.svc" in > Spark driver > -- > > Key: SPARK-29640 > URL: https://issues.apache.org/jira/browse/SPARK-29640 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.4 >Reporter: Andy Grove >Priority: Major > > We are running into intermittent DNS issues where the Spark driver fails to > resolve "kubernetes.default.svc" when trying to create executors. We are > running Spark 2.4.4 (with the patch for SPARK-28921) in cluster mode in EKS. > This happens approximately 10% of the time. > Here is the stack trace: > {code:java} > Exception in thread "main" org.apache.spark.SparkException: External > scheduler cannot be instantiated > at > org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2794) > at org.apache.spark.SparkContext.(SparkContext.scala:493) > at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2520) > at > org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:935) > at > org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:926) > at scala.Option.getOrElse(Option.scala:121) > at > org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:926) > at com.rms.execution.test.SparkPiTask$.main(SparkPiTask.scala:36) > at com.rms.execution.test.SparkPiTask.main(SparkPiTask.scala) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at 
java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) > at > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845) > at > org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161) > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184) > at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86) > at > org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920) > at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929) > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) > Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: > [get] for kind: [Pod] with name: > [wf-5-69674f15d0fc45-1571354060179-driver] in namespace: > [tenant-8-workflows] failed. > at > io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64) > at > io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72) > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:229) > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:162) > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator$$anonfun$1.apply(ExecutorPodsAllocator.scala:57) > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator$$anonfun$1.apply(ExecutorPodsAllocator.scala:55) > at scala.Option.map(Option.scala:146) > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.(ExecutorPodsAllocator.scala:55) > at > org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:89) > at > org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2788) > ... 
20 more > Caused by: java.net.UnknownHostException: kubernetes.default.svc: Try again > at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method) > at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929) > at > java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324) > at java.net.InetAddress.getAllByName0(InetAddress.java:1277) > at java.net.InetAddress.getAllByName(InetAddress.java:1193) > at java.net.InetAddress.getAllByName(InetAddress.java:1127) > at okhttp3.Dns$1.lookup(Dns.java:39) > at > okhttp3.internal.connection.RouteSelector.resetNextInetSocketAddress(RouteSelector.
[jira] [Resolved] (SPARK-29640) [K8S] Intermittent "java.net.UnknownHostException: kubernetes.default.svc" in Spark driver
[ https://issues.apache.org/jira/browse/SPARK-29640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove resolved SPARK-29640. Resolution: Not A Bug Specifying TCP mode for DNS lookups did not help and this turned out to be an env issue caused by two nodes in the cluster. Deleting the nodes and creating new ones resolved it. > [K8S] Intermittent "java.net.UnknownHostException: kubernetes.default.svc" in > Spark driver > -- > > Key: SPARK-29640 > URL: https://issues.apache.org/jira/browse/SPARK-29640 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.4 >Reporter: Andy Grove >Priority: Major > > We are running into intermittent DNS issues where the Spark driver fails to > resolve "kubernetes.default.svc" when trying to create executors. We are > running Spark 2.4.4 (with the patch for SPARK-28921) in cluster mode in EKS. > This happens approximately 10% of the time. > Here is the stack trace: > {code:java} > Exception in thread "main" org.apache.spark.SparkException: External > scheduler cannot be instantiated > at > org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2794) > at org.apache.spark.SparkContext.(SparkContext.scala:493) > at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2520) > at > org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:935) > at > org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:926) > at scala.Option.getOrElse(Option.scala:121) > at > org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:926) > at com.rms.execution.test.SparkPiTask$.main(SparkPiTask.scala:36) > at com.rms.execution.test.SparkPiTask.main(SparkPiTask.scala) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) > at > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845) > at > org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161) > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184) > at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86) > at > org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920) > at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929) > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) > Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: > [get] for kind: [Pod] with name: > [wf-5-69674f15d0fc45-1571354060179-driver] in namespace: > [tenant-8-workflows] failed. > at > io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64) > at > io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72) > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:229) > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:162) > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator$$anonfun$1.apply(ExecutorPodsAllocator.scala:57) > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator$$anonfun$1.apply(ExecutorPodsAllocator.scala:55) > at scala.Option.map(Option.scala:146) > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.(ExecutorPodsAllocator.scala:55) > at > org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:89) > at > 
org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2788) > ... 20 more > Caused by: java.net.UnknownHostException: kubernetes.default.svc: Try again > at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method) > at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929) > at > java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324) > at java.net.InetAddress.getAllByName0(InetAddress.java:1277) > at java.net.InetAddress.getAllByName(InetAddress.java:1193) > at java.net.InetAddress.getAllByName(InetAddress.java:1127) > at okhttp3.Dns$1.lookup(Dns.java:39) > at > okhttp3.internal.conn
[jira] [Commented] (SPARK-29640) [K8S] Intermittent "java.net.UnknownHostException: kubernetes.default.svc" in Spark driver
[ https://issues.apache.org/jira/browse/SPARK-29640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16963175#comment-16963175 ] Andy Grove commented on SPARK-29640: A hacky workaround is to wait for DNS to resolve before creating the Spark context:
{code:java}
import java.net.{InetAddress, UnknownHostException}

def waitForDns(): Unit = {
  val host = "kubernetes.default.svc"
  println(s"Resolving $host ...")
  val t1 = System.currentTimeMillis()
  var attempts = 0
  // Retry for up to 15 seconds, polling every 100 ms
  while (System.currentTimeMillis() - t1 < 15000) {
    try {
      attempts += 1
      val address = InetAddress.getByName(host)
      println(s"Resolved $host as ${address.getHostAddress()} after $attempts attempt(s)")
      return
    } catch {
      case _: UnknownHostException =>
        println(s"Failed to resolve $host due to UnknownHostException (attempt $attempts)")
        Thread.sleep(100)
    }
  }
}
{code}
> [K8S] Intermittent "java.net.UnknownHostException: kubernetes.default.svc" in > Spark driver > -- > > Key: SPARK-29640 > URL: https://issues.apache.org/jira/browse/SPARK-29640 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.4 >Reporter: Andy Grove >Priority: Major > Fix For: 2.4.5 > > > We are running into intermittent DNS issues where the Spark driver fails to > resolve "kubernetes.default.svc" when trying to create executors. We are > running Spark 2.4.4 (with the patch for SPARK-28921) in cluster mode in EKS. > This happens approximately 10% of the time.
[jira] [Updated] (SPARK-29640) [K8S] Intermittent "java.net.UnknownHostException: kubernetes.default.svc" in Spark driver
[ https://issues.apache.org/jira/browse/SPARK-29640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-29640: --- Description: We are running into intermittent DNS issues where the Spark driver fails to resolve "kubernetes.default.svc" when trying to create executors. We are running Spark 2.4.4 (with the patch for SPARK-28921) in cluster mode in EKS. This happens approximately 10% of the time.
[jira] [Updated] (SPARK-29640) [K8S] Intermittent "java.net.UnknownHostException: kubernetes.default.svc" in Spark driver
[ https://issues.apache.org/jira/browse/SPARK-29640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-29640: --- Description: We are running into intermittent DNS issues where the Spark driver fails to resolve "kubernetes.default.svc" when trying to create executors. We are running Spark 2.4.4 (with the patch for SPARK-28921) in cluster mode. This happens approximately 10% of the time.
[jira] [Commented] (SPARK-29640) [K8S] Make it possible to set DNS option to use TCP instead of UDP
[ https://issues.apache.org/jira/browse/SPARK-29640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16963059#comment-16963059 ] Andy Grove commented on SPARK-29640: Installing a node-local DNS cache daemon might be another workaround for this issue: https://github.com/kubernetes/kubernetes/issues/56903#issuecomment-511750647 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
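For reference, the TCP-mode DNS workaround discussed in this issue is applied through the pod spec's dnsConfig. A minimal sketch (assuming a glibc-based container image, where the resolver honors the resolv.conf "use-vc" option; musl-based images such as Alpine ignore it):

```yaml
# Sketch: ask the glibc resolver to use TCP instead of UDP for DNS lookups
# by injecting a resolv.conf option into the driver/executor pod spec.
spec:
  dnsConfig:
    options:
      - name: use-vc
```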
[jira] [Created] (SPARK-29640) [K8S] Make it possible to set DNS option to use TCP instead of UDP
Andy Grove created SPARK-29640: -- Summary: [K8S] Make it possible to set DNS option to use TCP instead of UDP Key: SPARK-29640 URL: https://issues.apache.org/jira/browse/SPARK-29640 Project: Spark Issue Type: Improvement Components: Kubernetes Affects Versions: 2.4.4 Reporter: Andy Grove Fix For: 2.4.5 We are running into intermittent DNS issues where the Spark driver fails to resolve "kubernetes.default.svc" and this seems to be caused by [https://github.com/kubernetes/kubernetes/issues/76790] One suggested workaround is to specify TCP mode for DNS lookups in the pod spec ([https://github.com/kubernetes/kubernetes/issues/56903#issuecomment-424498508]). I would like the ability to provide a flag to spark-submit to specify to use TCP mode for DNS lookups. I am working on a PR for this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-28921) Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1.13.10, 1.12.10, 1.11.10)
[ https://issues.apache.org/jira/browse/SPARK-28921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-28921: --- Affects Version/s: 2.3.0 2.3.1 2.4.0 2.4.1 2.4.2 2.4.4 > Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1,13.10, > 1.12.10, 1.11.10) > --- > > Key: SPARK-28921 > URL: https://issues.apache.org/jira/browse/SPARK-28921 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.3.0, 2.3.1, 2.3.3, 2.4.0, 2.4.1, 2.4.2, 2.4.3, 2.4.4 >Reporter: Paul Schweigert >Assignee: Andy Grove >Priority: Major > Fix For: 2.4.5, 3.0.0 > > > Spark jobs are failing on latest versions of Kubernetes when jobs attempt to > provision executor pods (jobs like Spark-Pi that do not launch executors run > without a problem): > > Here's an example error message: > > {code:java} > 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors > from Kubernetes. > 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors > from Kubernetes.19/08/30 01:29:09 WARN WatchConnectionManager: Exec Failure: > HTTP 403, Status: 403 - > java.net.ProtocolException: Expected HTTP 101 response but was '403 > Forbidden' > at > okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216) > at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183) > at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141) > at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > > at java.lang.Thread.run(Thread.java:748) > {code} > > Looks like the issue is caused by fixes for a recent CVE : > CVE: [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-14809] > Fix: [https://github.com/fabric8io/kubernetes-client/pull/1669] > > Looks like upgrading kubernetes-client to 4.4.2 would solve this issue. 
-- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Resolved] (SPARK-28925) Update Kubernetes-client to 4.4.2 to be compatible with Kubernetes 1.13 and 1.14
[ https://issues.apache.org/jira/browse/SPARK-28925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove resolved SPARK-28925. Resolution: Duplicate > Update Kubernetes-client to 4.4.2 to be compatible with Kubernetes 1.13 and > 1.14 > > > Key: SPARK-28925 > URL: https://issues.apache.org/jira/browse/SPARK-28925 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.3.3, 2.4.3 >Reporter: Eric >Priority: Minor > > Hello, > If you use Spark with Kubernetes 1.13 or 1.14 you will see this error: > {code:java} > {"time": "2019-08-28T09:56:11.866Z", "lvl":"INFO", "logger": > "org.apache.spark.internal.Logging", > "thread":"kubernetes-executor-snapshots-subscribers-0","msg":"Going to > request 1 executors from Kubernetes."} > {"time": "2019-08-28T09:56:12.028Z", "lvl":"WARN", "logger": > "io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2", > "thread":"OkHttp https://kubernetes.default.svc/...","msg":"Exec Failure: > HTTP 403, Status: 403 - "} > java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden' > {code} > Apparently the bug is fixed here: > [https://github.com/fabric8io/kubernetes-client/pull/1669] > We have currently compiled Spark source code with Kubernetes-client 4.4.2 and > it's working great on our cluster. We are using Kubernetes 1.13.10. > > Could it be possible to update that dependency version? > > Thanks! -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
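For anyone building Spark (or another project) against the Fabric8 client directly, the version bump requested here is a one-line dependency change. A hedged sketch in Maven terms (the coordinates are the standard io.fabric8 ones; the exact changes to Spark's own pom may differ):

```xml
<!-- Sketch: pin the Fabric8 Kubernetes client to 4.4.2, which includes the
     fix for the HTTP 403 WebSocket response (fabric8io/kubernetes-client#1669). -->
<dependency>
  <groupId>io.fabric8</groupId>
  <artifactId>kubernetes-client</artifactId>
  <version>4.4.2</version>
</dependency>
```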
[jira] [Commented] (SPARK-28921) Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1.13.10, 1.12.10, 1.11.10)
[ https://issues.apache.org/jira/browse/SPARK-28921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920411#comment-16920411 ] Andy Grove commented on SPARK-28921: [~dongjoon] we are seeing it on both of the EKS clusters where we are running Spark jobs. I imagine it affects all EKS clusters? The versions we are using are 1.11.10 and 1.12.10; full version info: {code:java} Server Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.10-eks-7f15cc", GitCommit:"7f15ccb4e58f112866f7ddcfebf563f199558488", GitTreeState:"clean", BuildDate:"2019-08-19T17:46:02Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"} {code} {code:java} Server Version: version.Info{Major:"1", Minor:"12+", GitVersion:"v1.12.10-eks-825e5d", GitCommit:"825e5de08cb05714f9b224cd6c47d9514df1d1a7", GitTreeState:"clean", BuildDate:"2019-08-18T03:58:32Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"} {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (SPARK-28921) Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1.13.10, 1.12.10, 1.11.10)
[ https://issues.apache.org/jira/browse/SPARK-28921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-28921: --- Summary: Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1,13.10, 1.12.10, 1.11.10) (was: Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1,13.10, 1.11.10)) -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (SPARK-28921) Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1.13.10, 1.11.10)
[ https://issues.apache.org/jira/browse/SPARK-28921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-28921: --- Summary: Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1,13.10, 1.11.10) (was: Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1,13.10)) -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (SPARK-28921) Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1.13.10)
[ https://issues.apache.org/jira/browse/SPARK-28921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920133#comment-16920133 ] Andy Grove commented on SPARK-28921: Here's a PR with the fix against the master branch, since it didn't automatically link to this JIRA: https://github.com/apache/spark/pull/25640 -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (SPARK-28921) Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1.13.10)
[ https://issues.apache.org/jira/browse/SPARK-28921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-28921: --- Affects Version/s: 2.3.3
[jira] [Comment Edited] (SPARK-28925) Update Kubernetes-client to 4.4.2 to be compatible with Kubernetes 1.13 and 1.14
[ https://issues.apache.org/jira/browse/SPARK-28925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919907#comment-16919907 ] Andy Grove edited comment on SPARK-28925 at 8/30/19 9:54 PM: - This also impacts Spark 2.3.3 on EKS 1.11 due to security patches that were rolled out in the past week. {code:java} Server Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.10-eks-7f15cc", GitCommit:"7f15ccb4e58f112866f7ddcfebf563f199558488", GitTreeState:"clean", BuildDate:"2019-08-19T17:46:02Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"} {code} was (Author: andygrove): This also impacts Spark 2.3.3 > Update Kubernetes-client to 4.4.2 to be compatible with Kubernetes 1.13 and > 1.14 > > > Key: SPARK-28925 > URL: https://issues.apache.org/jira/browse/SPARK-28925 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.3.3, 2.4.3 >Reporter: Eric >Priority: Minor > > Hello, > If you use Spark with Kubernetes 1.13 or 1.14 you will see this error: > {code:java} > {"time": "2019-08-28T09:56:11.866Z", "lvl":"INFO", "logger": > "org.apache.spark.internal.Logging", > "thread":"kubernetes-executor-snapshots-subscribers-0","msg":"Going to > request 1 executors from Kubernetes."} > {"time": "2019-08-28T09:56:12.028Z", "lvl":"WARN", "logger": > "io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2", > "thread":"OkHttp https://kubernetes.default.svc/...","msg":"Exec Failure: > HTTP 403, Status: 403 - "} > java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden' > {code} > Apparently the bug is fixed here: > [https://github.com/fabric8io/kubernetes-client/pull/1669] > We have currently compiled Spark source code with Kubernetes-client 4.4.2 and > it's working great on our cluster. We are using Kubernetes 1.13.10. > > Could it be possible to update that dependency version? > > Thanks! 
[jira] [Comment Edited] (SPARK-28925) Update Kubernetes-client to 4.4.2 to be compatible with Kubernetes 1.13 and 1.14
[ https://issues.apache.org/jira/browse/SPARK-28925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919907#comment-16919907 ] Andy Grove edited comment on SPARK-28925 at 8/30/19 9:55 PM: - This also impacts Spark 2.3.3 on EKS 1.11 due to security patches that were rolled out in the past week. {code:java} Server Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.10-eks-7f15cc", GitCommit:"7f15ccb4e58f112866f7ddcfebf563f199558488", GitTreeState:"clean", BuildDate:"2019-08-19T17:46:02Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"} {code} I experimented with replacing {{kubernetes-client.jar}} with version 4.4.2 and it did resolve this issue, but caused other issues, so isn't a real option for a workaround for my use case. was (Author: andygrove): This also impacts Spark 2.3.3 on EKS 1.11 due to security patches that were rolled out in the past week. {code:java} Server Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.10-eks-7f15cc", GitCommit:"7f15ccb4e58f112866f7ddcfebf563f199558488", GitTreeState:"clean", BuildDate:"2019-08-19T17:46:02Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"} {code} > Update Kubernetes-client to 4.4.2 to be compatible with Kubernetes 1.13 and > 1.14 > > > Key: SPARK-28925 > URL: https://issues.apache.org/jira/browse/SPARK-28925 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.3.3, 2.4.3 >Reporter: Eric >Priority: Minor > > Hello, > If you use Spark with Kubernetes 1.13 or 1.14 you will see this error: > {code:java} > {"time": "2019-08-28T09:56:11.866Z", "lvl":"INFO", "logger": > "org.apache.spark.internal.Logging", > "thread":"kubernetes-executor-snapshots-subscribers-0","msg":"Going to > request 1 executors from Kubernetes."} > {"time": "2019-08-28T09:56:12.028Z", "lvl":"WARN", "logger": > "io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2", > "thread":"OkHttp 
https://kubernetes.default.svc/...","msg":"Exec Failure: > HTTP 403, Status: 403 - "} > java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden' > {code} > Apparently the bug is fixed here: > [https://github.com/fabric8io/kubernetes-client/pull/1669] > We have currently compiled Spark source code with Kubernetes-client 4.4.2 and > it's working great on our cluster. We are using Kubernetes 1.13.10. > > Could it be possible to update that dependency version? > > Thanks!
[jira] [Updated] (SPARK-28925) Update Kubernetes-client to 4.4.2 to be compatible with Kubernetes 1.13 and 1.14
[ https://issues.apache.org/jira/browse/SPARK-28925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-28925: --- Affects Version/s: 2.3.3 > Update Kubernetes-client to 4.4.2 to be compatible with Kubernetes 1.13 and > 1.14 > > > Key: SPARK-28925 > URL: https://issues.apache.org/jira/browse/SPARK-28925 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.3.3, 2.4.3 >Reporter: Eric >Priority: Minor > > Hello, > If you use Spark with Kubernetes 1.13 or 1.14 you will see this error: > {code:java} > {"time": "2019-08-28T09:56:11.866Z", "lvl":"INFO", "logger": > "org.apache.spark.internal.Logging", > "thread":"kubernetes-executor-snapshots-subscribers-0","msg":"Going to > request 1 executors from Kubernetes."} > {"time": "2019-08-28T09:56:12.028Z", "lvl":"WARN", "logger": > "io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2", > "thread":"OkHttp https://kubernetes.default.svc/...","msg":"Exec Failure: > HTTP 403, Status: 403 - "} > java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden' > {code} > Apparently the bug is fixed here: > [https://github.com/fabric8io/kubernetes-client/pull/1669] > We have currently compiled Spark source code with Kubernetes-client 4.4.2 and > it's working great on our cluster. We are using Kubernetes 1.13.10. > > Could it be possible to update that dependency version? > > Thanks!
[jira] [Commented] (SPARK-28925) Update Kubernetes-client to 4.4.2 to be compatible with Kubernetes 1.13 and 1.14
[ https://issues.apache.org/jira/browse/SPARK-28925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919907#comment-16919907 ] Andy Grove commented on SPARK-28925: This also impacts Spark 2.3.3 > Update Kubernetes-client to 4.4.2 to be compatible with Kubernetes 1.13 and > 1.14 > > > Key: SPARK-28925 > URL: https://issues.apache.org/jira/browse/SPARK-28925 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.3 >Reporter: Eric >Priority: Minor > > Hello, > If you use Spark with Kubernetes 1.13 or 1.14 you will see this error: > {code:java} > {"time": "2019-08-28T09:56:11.866Z", "lvl":"INFO", "logger": > "org.apache.spark.internal.Logging", > "thread":"kubernetes-executor-snapshots-subscribers-0","msg":"Going to > request 1 executors from Kubernetes."} > {"time": "2019-08-28T09:56:12.028Z", "lvl":"WARN", "logger": > "io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2", > "thread":"OkHttp https://kubernetes.default.svc/...","msg":"Exec Failure: > HTTP 403, Status: 403 - "} > java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden' > {code} > Apparently the bug is fixed here: > [https://github.com/fabric8io/kubernetes-client/pull/1669] > We have currently compiled Spark source code with Kubernetes-client 4.4.2 and > it's working great on our cluster. We are using Kubernetes 1.13.10. > > Could it be possible to update that dependency version? > > Thanks!
[jira] [Commented] (SPARK-12932) Bad error message with trying to create Dataset from RDD of Java objects that are not bean-compliant
[ https://issues.apache.org/jira/browse/SPARK-12932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110785#comment-15110785 ] Andy Grove commented on SPARK-12932: Here is a pull request to change the error message: https://github.com/apache/spark/pull/10865 > Bad error message with trying to create Dataset from RDD of Java objects that > are not bean-compliant > > > Key: SPARK-12932 > URL: https://issues.apache.org/jira/browse/SPARK-12932 > Project: Spark > Issue Type: Bug > Components: Java API >Affects Versions: 1.6.0 > Environment: Ubuntu 15.10 / Java 8 >Reporter: Andy Grove > > When trying to create a Dataset from an RDD of Person (all using the Java > API), I got the error "java.lang.UnsupportedOperationException: no encoder > found for example_java.dataset.Person". This is not a very helpful error and > no other logging information was apparent to help troubleshoot this. > It turned out that the problem was that my Person class did not have a > default constructor and also did not have setter methods and that was the > root cause. > This JIRA is for implementing a more useful error message to help Java > developers who are trying out the Dataset API for the first time. 
> The full stack trace is: > {code} > Exception in thread "main" java.lang.UnsupportedOperationException: no > encoder found for example_java.common.Person > at > org.apache.spark.sql.catalyst.JavaTypeInference$.org$apache$spark$sql$catalyst$JavaTypeInference$$extractorFor(JavaTypeInference.scala:403) > at > org.apache.spark.sql.catalyst.JavaTypeInference$.extractorsFor(JavaTypeInference.scala:314) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.javaBean(ExpressionEncoder.scala:75) > at org.apache.spark.sql.Encoders$.bean(Encoder.scala:176) > at org.apache.spark.sql.Encoders.bean(Encoder.scala) > {code} > NOTE that if I do provide EITHER the default constructor OR the setters, but > not both, then I get a stack trace with much more useful information, but > omitting BOTH causes this issue. > The original source is below. > {code:title=Example.java} > public class JavaDatasetExample { > public static void main(String[] args) throws Exception { > SparkConf sparkConf = new SparkConf() > .setAppName("Example") > .setMaster("local[*]"); > JavaSparkContext sc = new JavaSparkContext(sparkConf); > SQLContext sqlContext = new SQLContext(sc); > List people = ImmutableList.of( > new Person("Joe", "Bloggs", 21, "NY") > ); > Dataset dataset = sqlContext.createDataset(people, > Encoders.bean(Person.class)); > {code} > {code:title=Person.java} > class Person implements Serializable { > String first; > String last; > int age; > String state; > public Person() { > } > public Person(String first, String last, int age, String state) { > this.first = first; > this.last = last; > this.age = age; > this.state = state; > } > public String getFirst() { > return first; > } > public String getLast() { > return last; > } > public int getAge() { > return age; > } > public String getState() { > return state; > } > } > {code}
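The root cause described above — a class missing a public no-arg constructor or setters — can be checked outside Spark with the JDK's own bean introspection. A minimal illustrative sketch (the `Person` variant, `NotABean`, and `isBeanCompliant` helper below are hypothetical examples, not part of Spark; they only approximate the reflection checks a bean encoder performs):

```java
import java.beans.BeanInfo;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.io.Serializable;

// Bean-compliant variant of the reporter's Person class: a public no-arg
// constructor plus a getter AND a setter for every field.
class Person implements Serializable {
    private String first;
    private int age;

    public Person() { }  // required: default constructor

    public String getFirst() { return first; }
    public void setFirst(String first) { this.first = first; }

    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }
}

// Counter-example: getter only, no setter, and no default constructor.
class NotABean implements Serializable {
    private final String first;
    public NotABean(String first) { this.first = first; }
    public String getFirst() { return first; }
}

public class BeanCheck {
    // True only if the class has a no-arg constructor and every introspected
    // property exposes both a read and a write method.
    static boolean isBeanCompliant(Class<?> cls) throws Exception {
        try {
            cls.getDeclaredConstructor();  // is a no-arg constructor present?
        } catch (NoSuchMethodException e) {
            return false;
        }
        BeanInfo info = Introspector.getBeanInfo(cls, Object.class);
        for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
            if (pd.getReadMethod() == null || pd.getWriteMethod() == null) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isBeanCompliant(Person.class));   // true
        System.out.println(isBeanCompliant(NotABean.class)); // false
    }
}
```

Running a check like this against a candidate class before passing it to Encoders.bean may surface the missing-constructor/missing-setter problem more directly than the "no encoder found" message.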
[jira] [Commented] (SPARK-12932) Bad error message with trying to create Dataset from RDD of Java objects that are not bean-compliant
[ https://issues.apache.org/jira/browse/SPARK-12932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110772#comment-15110772 ] Andy Grove commented on SPARK-12932: After reviewing the code for this, I think it is just a case of changing the error message text in `org.apache.spark.sql.catalyst.JavaTypeInference#extractorFor`. The error should be changed to: {code} s"Cannot infer type for Java class ${other.getName} because it is not bean-compliant" {code} > Bad error message with trying to create Dataset from RDD of Java objects that > are not bean-compliant > > > Key: SPARK-12932 > URL: https://issues.apache.org/jira/browse/SPARK-12932 > Project: Spark > Issue Type: Bug > Components: Java API >Affects Versions: 1.6.0 > Environment: Ubuntu 15.10 / Java 8 >Reporter: Andy Grove > > When trying to create a Dataset from an RDD of Person (all using the Java > API), I got the error "java.lang.UnsupportedOperationException: no encoder > found for example_java.dataset.Person". This is not a very helpful error and > no other logging information was apparent to help troubleshoot this. > It turned out that the problem was that my Person class did not have a > default constructor and also did not have setter methods and that was the > root cause. > This JIRA is for implementing a more useful error message to help Java > developers who are trying out the Dataset API for the first time. 
> The full stack trace is: > {code} > Exception in thread "main" java.lang.UnsupportedOperationException: no > encoder found for example_java.common.Person > at > org.apache.spark.sql.catalyst.JavaTypeInference$.org$apache$spark$sql$catalyst$JavaTypeInference$$extractorFor(JavaTypeInference.scala:403) > at > org.apache.spark.sql.catalyst.JavaTypeInference$.extractorsFor(JavaTypeInference.scala:314) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.javaBean(ExpressionEncoder.scala:75) > at org.apache.spark.sql.Encoders$.bean(Encoder.scala:176) > at org.apache.spark.sql.Encoders.bean(Encoder.scala) > {code} > NOTE that if I do provide EITHER the default constructor OR the setters, but > not both, then I get a stack trace with much more useful information, but > omitting BOTH causes this issue. > The original source is below. > {code:title=Example.java} > public class JavaDatasetExample { > public static void main(String[] args) throws Exception { > SparkConf sparkConf = new SparkConf() > .setAppName("Example") > .setMaster("local[*]"); > JavaSparkContext sc = new JavaSparkContext(sparkConf); > SQLContext sqlContext = new SQLContext(sc); > List people = ImmutableList.of( > new Person("Joe", "Bloggs", 21, "NY") > ); > Dataset dataset = sqlContext.createDataset(people, > Encoders.bean(Person.class)); > {code} > {code:title=Person.java} > class Person implements Serializable { > String first; > String last; > int age; > String state; > public Person() { > } > public Person(String first, String last, int age, String state) { > this.first = first; > this.last = last; > this.age = age; > this.state = state; > } > public String getFirst() { > return first; > } > public String getLast() { > return last; > } > public int getAge() { > return age; > } > public String getState() { > return state; > } > } > {code}
[jira] [Commented] (SPARK-12932) Bad error message with trying to create Dataset from RDD of Java objects that are not bean-compliant
[ https://issues.apache.org/jira/browse/SPARK-12932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109389#comment-15109389 ] Andy Grove commented on SPARK-12932: OK, I updated the JIRA with more info. Yes, I actually would like to have a go at providing a patch for this. It will be a good way for me to get more familiar with the internals of the Dataset API. > Bad error message with trying to create Dataset from RDD of Java objects that > are not bean-compliant > > > Key: SPARK-12932 > URL: https://issues.apache.org/jira/browse/SPARK-12932 > Project: Spark > Issue Type: Bug > Components: Java API >Affects Versions: 1.6.0 > Environment: Ubuntu 15.10 / Java 8 >Reporter: Andy Grove > > When trying to create a Dataset from an RDD of Person (all using the Java > API), I got the error "java.lang.UnsupportedOperationException: no encoder > found for example_java.dataset.Person". This is not a very helpful error and > no other logging information was apparent to help troubleshoot this. > It turned out that the problem was that my Person class did not have a > default constructor and also did not have setter methods and that was the > root cause. > This JIRA is for implementing a more useful error message to help Java > developers who are trying out the Dataset API for the first time. 
> The full stack trace is: > {code} > Exception in thread "main" java.lang.UnsupportedOperationException: no > encoder found for example_java.common.Person > at > org.apache.spark.sql.catalyst.JavaTypeInference$.org$apache$spark$sql$catalyst$JavaTypeInference$$extractorFor(JavaTypeInference.scala:403) > at > org.apache.spark.sql.catalyst.JavaTypeInference$.extractorsFor(JavaTypeInference.scala:314) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.javaBean(ExpressionEncoder.scala:75) > at org.apache.spark.sql.Encoders$.bean(Encoder.scala:176) > at org.apache.spark.sql.Encoders.bean(Encoder.scala) > {code} > NOTE that if I do provide EITHER the default constructor OR the setters, but > not both, then I get a stack trace with much more useful information, but > omitting BOTH causes this issue. > The original source is below. > {code:title=Example.java} > public class JavaDatasetExample { > public static void main(String[] args) throws Exception { > SparkConf sparkConf = new SparkConf() > .setAppName("Example") > .setMaster("local[*]"); > JavaSparkContext sc = new JavaSparkContext(sparkConf); > SQLContext sqlContext = new SQLContext(sc); > List people = ImmutableList.of( > new Person("Joe", "Bloggs", 21, "NY") > ); > Dataset dataset = sqlContext.createDataset(people, > Encoders.bean(Person.class)); > {code} > {code:title=Person.java} > class Person implements Serializable { > String first; > String last; > int age; > String state; > public Person() { > } > public Person(String first, String last, int age, String state) { > this.first = first; > this.last = last; > this.age = age; > this.state = state; > } > public String getFirst() { > return first; > } > public String getLast() { > return last; > } > public int getAge() { > return age; > } > public String getState() { > return state; > } > } > {code}
[jira] [Updated] (SPARK-12932) Bad error message with trying to create Dataset from RDD of Java objects that are not bean-compliant
[ https://issues.apache.org/jira/browse/SPARK-12932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-12932: --- Description: When trying to create a Dataset from an RDD of Person (all using the Java API), I got the error "java.lang.UnsupportedOperationException: no encoder found for example_java.dataset.Person". This is not a very helpful error and no other logging information was apparent to help troubleshoot this. It turned out that the problem was that my Person class did not have a default constructor and also did not have setter methods and that was the root cause. This JIRA is for implementing a more usful error message to help Java developers who are trying out the Dataset API for the first time. The full stack trace is: {code} Exception in thread "main" java.lang.UnsupportedOperationException: no encoder found for example_java.common.Person at org.apache.spark.sql.catalyst.JavaTypeInference$.org$apache$spark$sql$catalyst$JavaTypeInference$$extractorFor(JavaTypeInference.scala:403) at org.apache.spark.sql.catalyst.JavaTypeInference$.extractorsFor(JavaTypeInference.scala:314) at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.javaBean(ExpressionEncoder.scala:75) at org.apache.spark.sql.Encoders$.bean(Encoder.scala:176) at org.apache.spark.sql.Encoders.bean(Encoder.scala) {code} NOTE that if I do provide EITHER the default constructor OR the setters, but not both, then I get a stack trace with much more useful information, but omitting BOTH causes this issue. The original source is below. 
{code:title=Example.java} public class JavaDatasetExample { public static void main(String[] args) throws Exception { SparkConf sparkConf = new SparkConf() .setAppName("Example") .setMaster("local[*]"); JavaSparkContext sc = new JavaSparkContext(sparkConf); SQLContext sqlContext = new SQLContext(sc); List people = ImmutableList.of( new Person("Joe", "Bloggs", 21, "NY") ); Dataset dataset = sqlContext.createDataset(people, Encoders.bean(Person.class)); {code} {code:title=Person.java} class Person implements Serializable { String first; String last; int age; String state; public Person() { } public Person(String first, String last, int age, String state) { this.first = first; this.last = last; this.age = age; this.state = state; } public String getFirst() { return first; } public String getLast() { return last; } public int getAge() { return age; } public String getState() { return state; } } {code} was: When trying to create a Dataset from an RDD of Person (all using the Java API), I got the error "java.lang.UnsupportedOperationException: no encoder found for example_java.dataset.Person". This is not a very helpful error and no other logging information was apparent to help troubleshoot this. It turned out that the problem was that my Person class did not have a default constructor and also did not have setter methods and that was the root cause. This JIRA is for implementing a more usful error message to help Java developers who are trying out the Dataset API for the first time. 
The full stack trace is: {code} Exception in thread "main" java.lang.UnsupportedOperationException: no encoder found for example_java.common.Person at org.apache.spark.sql.catalyst.JavaTypeInference$.org$apache$spark$sql$catalyst$JavaTypeInference$$extractorFor(JavaTypeInference.scala:403) at org.apache.spark.sql.catalyst.JavaTypeInference$.extractorsFor(JavaTypeInference.scala:314) at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.javaBean(ExpressionEncoder.scala:75) at org.apache.spark.sql.Encoders$.bean(Encoder.scala:176) at org.apache.spark.sql.Encoders.bean(Encoder.scala) {code} NOTE that if I do provide EITHER the default constructor OR the setters, but not both, then I get a stack trace with much more useful information, but omitting BOTH causes this issue. The original source is below. {code:title=Example.java} public class JavaDatasetExample { public static void main(String[] args) throws Exception { SparkConf sparkConf = new SparkConf() .setAppName("Example") .setMaster("local[*]"); JavaSparkContext sc = new JavaSparkContext(sparkConf); SQLContext sqlContext = new SQLContext(sc); List people = ImmutableList.of( new Person("Joe", "Bloggs", 21, "NY") ); Dataset dataset = sqlContext.createDataset(people, Encoders.bean(Person.class)); {code} {code:title=Person.java} class Person implements Serializable { String first; String last; int age; String stat
[jira] [Updated] (SPARK-12932) Bad error message with trying to create Dataset from RDD of Java objects that are not bean-compliant
[ https://issues.apache.org/jira/browse/SPARK-12932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-12932: --- Description: When trying to create a Dataset from an RDD of Person (all using the Java API), I got the error "java.lang.UnsupportedOperationException: no encoder found for example_java.dataset.Person". This is not a very helpful error and no other logging information was apparent to help troubleshoot this. It turned out that the problem was that my Person class did not have a default constructor and also did not have setter methods and that was the root cause. This JIRA is for implementing a more usful error message to help Java developers who are trying out the Dataset API for the first time. The full stack trace is: {{Exception in thread "main" java.lang.UnsupportedOperationException: no encoder found for example_java.common.Person at org.apache.spark.sql.catalyst.JavaTypeInference$.org$apache$spark$sql$catalyst$JavaTypeInference$$extractorFor(JavaTypeInference.scala:403) at org.apache.spark.sql.catalyst.JavaTypeInference$.extractorsFor(JavaTypeInference.scala:314) at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.javaBean(ExpressionEncoder.scala:75) at org.apache.spark.sql.Encoders$.bean(Encoder.scala:176) at org.apache.spark.sql.Encoders.bean(Encoder.scala) }} NOTE that if I do provide EITHER the default constructor OR the setters, but not both, then I get a stack trace with much more useful information, but omitting BOTH causes this issue. The original source is below. 
{code:title=Example.java} public class JavaDatasetExample { public static void main(String[] args) throws Exception { SparkConf sparkConf = new SparkConf() .setAppName("Example") .setMaster("local[*]"); JavaSparkContext sc = new JavaSparkContext(sparkConf); SQLContext sqlContext = new SQLContext(sc); List people = ImmutableList.of( new Person("Joe", "Bloggs", 21, "NY") ); Dataset dataset = sqlContext.createDataset(people, Encoders.bean(Person.class)); {code} {code:title=Person.java} class Person implements Serializable { String first; String last; int age; String state; public Person() { } public Person(String first, String last, int age, String state) { this.first = first; this.last = last; this.age = age; this.state = state; } public String getFirst() { return first; } public String getLast() { return last; } public int getAge() { return age; } public String getState() { return state; } } {code} was: When trying to create a Dataset from an RDD of Person (all using the Java API), I got the error "java.lang.UnsupportedOperationException: no encoder found for example_java.dataset.Person". This is not a very helpful error and no other logging information was apparent to help troubleshoot this. It turned out that the problem was that my Person class did not have a default constructor and also did not have setter methods and that was the root cause. This JIRA is for implementing a more usful error message to help Java developers who are trying out the Dataset API for the first time. 
The full stack trace is: {unformatted} Exception in thread "main" java.lang.UnsupportedOperationException: no encoder found for example_java.common.Person at org.apache.spark.sql.catalyst.JavaTypeInference$.org$apache$spark$sql$catalyst$JavaTypeInference$$extractorFor(JavaTypeInference.scala:403) at org.apache.spark.sql.catalyst.JavaTypeInference$.extractorsFor(JavaTypeInference.scala:314) at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.javaBean(ExpressionEncoder.scala:75) at org.apache.spark.sql.Encoders$.bean(Encoder.scala:176) at org.apache.spark.sql.Encoders.bean(Encoder.scala) {unformatted} NOTE that if I do provide EITHER the default constructor OR the setters, but not both, then I get a stack trace with much more useful information, but omitting BOTH causes this issue. The original source is below. {code:title=Example.java} public class JavaDatasetExample { public static void main(String[] args) throws Exception { SparkConf sparkConf = new SparkConf() .setAppName("Example") .setMaster("local[*]"); JavaSparkContext sc = new JavaSparkContext(sparkConf); SQLContext sqlContext = new SQLContext(sc); List people = ImmutableList.of( new Person("Joe", "Bloggs", 21, "NY") ); Dataset dataset = sqlContext.createDataset(people, Encoders.bean(Person.class)); {code} {code:title=Person.java} class Person implements Serializable { String first; String last; int age; Strin
[jira] [Updated] (SPARK-12932) Bad error message with trying to create Dataset from RDD of Java objects that are not bean-compliant
[ https://issues.apache.org/jira/browse/SPARK-12932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated SPARK-12932: --- Description: When trying to create a Dataset from an RDD of Person (all using the Java API), I got the error "java.lang.UnsupportedOperationException: no encoder found for example_java.dataset.Person". This is not a very helpful error and no other logging information was apparent to help troubleshoot this. It turned out that the problem was that my Person class did not have a default constructor and also did not have setter methods and that was the root cause. This JIRA is for implementing a more usful error message to help Java developers who are trying out the Dataset API for the first time. The full stack trace is: {unformatted} Exception in thread "main" java.lang.UnsupportedOperationException: no encoder found for example_java.common.Person at org.apache.spark.sql.catalyst.JavaTypeInference$.org$apache$spark$sql$catalyst$JavaTypeInference$$extractorFor(JavaTypeInference.scala:403) at org.apache.spark.sql.catalyst.JavaTypeInference$.extractorsFor(JavaTypeInference.scala:314) at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.javaBean(ExpressionEncoder.scala:75) at org.apache.spark.sql.Encoders$.bean(Encoder.scala:176) at org.apache.spark.sql.Encoders.bean(Encoder.scala) {unformatted} NOTE that if I do provide EITHER the default constructor OR the setters, but not both, then I get a stack trace with much more useful information, but omitting BOTH causes this issue. The original source is below. 
{code:title=Example.java} public class JavaDatasetExample { public static void main(String[] args) throws Exception { SparkConf sparkConf = new SparkConf() .setAppName("Example") .setMaster("local[*]"); JavaSparkContext sc = new JavaSparkContext(sparkConf); SQLContext sqlContext = new SQLContext(sc); List people = ImmutableList.of( new Person("Joe", "Bloggs", 21, "NY") ); Dataset dataset = sqlContext.createDataset(people, Encoders.bean(Person.class)); {code} {code:title=Person.java} class Person implements Serializable { String first; String last; int age; String state; public Person() { } public Person(String first, String last, int age, String state) { this.first = first; this.last = last; this.age = age; this.state = state; } public String getFirst() { return first; } public String getLast() { return last; } public int getAge() { return age; } public String getState() { return state; } } {code} was: When trying to create a Dataset from an RDD of Person (all using the Java API), I got the error "java.lang.UnsupportedOperationException: no encoder found for example_java.dataset.Person". This is not a very helpful error and no other logging information was apparent to help troubleshoot this. It turned out that the problem was that my Person class did not have a default constructor and also did not have setter methods and that was the root cause. This JIRA is for implementing a more usful error message to help Java developers who are trying out the Dataset API for the first time. 
{code:title=Example.java} public class JavaDatasetExample { public static void main(String[] args) throws Exception { SparkConf sparkConf = new SparkConf() .setAppName("Example") .setMaster("local[*]"); JavaSparkContext sc = new JavaSparkContext(sparkConf); SQLContext sqlContext = new SQLContext(sc); List people = ImmutableList.of( new Person("Joe", "Bloggs", 21, "NY") ); Dataset dataset = sqlContext.createDataset(people, Encoders.bean(Person.class)); {code} {code:title=Person.java} class Person implements Serializable { String first; String last; int age; String state; public Person() { } public Person(String first, String last, int age, String state) { this.first = first; this.last = last; this.age = age; this.state = state; } public String getFirst() { return first; } public String getLast() { return last; } public int getAge() { return age; } public String getState() { return state; } } {code} > Bad error message with trying to create Dataset from RDD of Java objects that > are not bean-compliant > > > Key: SPARK-12932 > URL: https://issues.apache.org/jira/browse/SPARK-12932 > Project: Spark > Issue Type: Bug > Compo
[jira] [Updated] (SPARK-12932) Bad error message with trying to create Dataset from RDD of Java objects that are not bean-compliant
[ https://issues.apache.org/jira/browse/SPARK-12932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Grove updated SPARK-12932:
---
Description:
When trying to create a Dataset from an RDD of Person (all using the Java API), I got the error "java.lang.UnsupportedOperationException: no encoder found for example_java.dataset.Person". This is not a very helpful error, and no other logging information was apparent to help troubleshoot it. It turned out that the problem was that my Person class did not have a default constructor and also did not have setter methods, and that was the root cause. This JIRA is for implementing a more useful error message to help Java developers who are trying out the Dataset API for the first time.

{code:title=Example.java}
public class JavaDatasetExample {
    public static void main(String[] args) throws Exception {
        SparkConf sparkConf = new SparkConf()
            .setAppName("Example")
            .setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(sparkConf);
        SQLContext sqlContext = new SQLContext(sc);
        List<Person> people = ImmutableList.of(
            new Person("Joe", "Bloggs", 21, "NY")
        );
        Dataset<Person> dataset = sqlContext.createDataset(people, Encoders.bean(Person.class));
    }
}
{code}
{code:title=Person.java}
class Person implements Serializable {

    String first;
    String last;
    int age;
    String state;

    public Person() {
    }

    public Person(String first, String last, int age, String state) {
        this.first = first;
        this.last = last;
        this.age = age;
        this.state = state;
    }

    public String getFirst() {
        return first;
    }

    public String getLast() {
        return last;
    }

    public int getAge() {
        return age;
    }

    public String getState() {
        return state;
    }
}
{code}

was:
When trying to create a Dataset from an RDD of Person (all using the Java API), I get the error "java.lang.UnsupportedOperationException: no encoder found for example_java.dataset.Person".
{code:title=Example.java}
public class JavaDatasetExample {
    public static void main(String[] args) throws Exception {
        SparkConf sparkConf = new SparkConf()
            .setAppName("Example")
            .setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(sparkConf);
        SQLContext sqlContext = new SQLContext(sc);
        List<Person> people = ImmutableList.of(
            new Person("Joe", "Bloggs", 21, "NY")
        );
        Dataset<Person> dataset = sqlContext.createDataset(people, Encoders.bean(Person.class));
    }
}
{code}
{code:title=Person.java}
class Person implements Serializable {

    String first;
    String last;
    int age;
    String state;

    public Person() {
    }

    public Person(String first, String last, int age, String state) {
        this.first = first;
        this.last = last;
        this.age = age;
        this.state = state;
    }

    public String getFirst() {
        return first;
    }

    public String getLast() {
        return last;
    }

    public int getAge() {
        return age;
    }

    public String getState() {
        return state;
    }
}
{code}

Summary: Bad error message with trying to create Dataset from RDD of Java objects that are not bean-compliant (was: Cannot convert list of simple Java objects to Dataset (no encoder found))

> Bad error message with trying to create Dataset from RDD of Java objects that
> are not bean-compliant
>
> Key: SPARK-12932
> URL: https://issues.apache.org/jira/browse/SPARK-12932
> Project: Spark
> Issue Type: Bug
> Components: Java API
> Affects Versions: 1.6.0
> Environment: Ubuntu 15.10 / Java 8
> Reporter: Andy Grove
>
> When trying to create a Dataset from an RDD of Person (all using the Java
> API), I got the error "java.lang.UnsupportedOperationException: no encoder
> found for example_java.dataset.Person". This is not a very helpful error and
> no other logging information was apparent to help troubleshoot this.
> It turned out that the problem was that my Person class did not have a
> default constructor and also did not have setter methods and that was the
> root cause.
> This JIRA is for implementing a more useful error message to help Java
> developers who are trying out the Dataset API for the first time.
> {code:title=Example.java}
> public class JavaDatasetExample {
> public static void main(String[] args) throws Exception {
> SparkConf sparkConf = new SparkConf()
> .setAppName("Example")
> .setMaster("local[*]");
> JavaSparkContext sc = new JavaSparkContext(sparkConf);
> SQLContext
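The root cause named above (no default constructor and no setters) can be sketched without Spark on the classpath: the JDK's `Introspector` discovers a read/write bean property only when a matching getter *and* setter both exist, which is essentially the shape `Encoders.bean` needs in order to read and write each field. The `BeanCheck` class name and the added setters below are illustrative, not taken from the ticket.

```java
import java.beans.BeanInfo;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.io.Serializable;

public class BeanCheck {

    // Hypothetical bean-compliant variant of the reported Person class:
    // a public no-arg constructor plus getter/setter pairs for every field.
    public static class Person implements Serializable {
        private String first;
        private String last;
        private int age;
        private String state;

        public Person() { }  // no-arg constructor, required by bean encoders

        public String getFirst() { return first; }
        public void setFirst(String first) { this.first = first; }
        public String getLast() { return last; }
        public void setLast(String last) { this.last = last; }
        public int getAge() { return age; }
        public void setAge(int age) { this.age = age; }
        public String getState() { return state; }
        public void setState(String state) { this.state = state; }
    }

    public static void main(String[] args) throws Exception {
        // Stop at Object so only Person's own properties are inspected.
        BeanInfo info = Introspector.getBeanInfo(Person.class, Object.class);
        int writable = 0;
        for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
            // A property counts only when both getter and setter are present.
            if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                writable++;
            }
        }
        System.out.println(writable);  // 4 read/write properties
    }
}
```

With all four properties discoverable as read/write pairs, `Encoders.bean(Person.class)` would have everything it needs to build the encoder.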
[jira] [Created] (SPARK-12932) Cannot convert list of simple Java objects to Dataset (no encoder found)
Andy Grove created SPARK-12932:
---

Summary: Cannot convert list of simple Java objects to Dataset (no encoder found)
Key: SPARK-12932
URL: https://issues.apache.org/jira/browse/SPARK-12932
Project: Spark
Issue Type: Bug
Components: Java API
Affects Versions: 1.6.0
Environment: Ubuntu 15.10 / Java 8
Reporter: Andy Grove

When trying to create a Dataset from an RDD of Person (all using the Java API), I get the error "java.lang.UnsupportedOperationException: no encoder found for example_java.dataset.Person".

{code:title=Example.java}
public class JavaDatasetExample {
    public static void main(String[] args) throws Exception {
        SparkConf sparkConf = new SparkConf()
            .setAppName("Example")
            .setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(sparkConf);
        SQLContext sqlContext = new SQLContext(sc);
        List<Person> people = ImmutableList.of(
            new Person("Joe", "Bloggs", 21, "NY")
        );
        Dataset<Person> dataset = sqlContext.createDataset(people, Encoders.bean(Person.class));
    }
}
{code}
{code:title=Person.java}
class Person implements Serializable {

    String first;
    String last;
    int age;
    String state;

    public Person() {
    }

    public Person(String first, String last, int age, String state) {
        this.first = first;
        this.last = last;
        this.age = age;
        this.state = state;
    }

    public String getFirst() {
        return first;
    }

    public String getLast() {
        return last;
    }

    public int getAge() {
        return age;
    }

    public String getState() {
        return state;
    }
}
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
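The "no encoder found" failure above can be mimicked without Spark: a class with getters but no setters (as originally posted) exposes zero read/write properties to the JDK's `Introspector`, which is roughly the condition that leaves `Encoders.bean` with nothing to bind. This is a hedged sketch; `NoSetterCheck` is an invented name, and the `Person` here mirrors the getter-only version from the report.

```java
import java.beans.BeanInfo;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.io.Serializable;

public class NoSetterCheck {

    // Getter-only Person, as in the original report: no no-arg constructor,
    // no setters -- not a compliant JavaBean for encoding purposes.
    public static class Person implements Serializable {
        String first;
        String last;
        int age;
        String state;

        public Person(String first, String last, int age, String state) {
            this.first = first;
            this.last = last;
            this.age = age;
            this.state = state;
        }

        public String getFirst() { return first; }
        public String getLast() { return last; }
        public int getAge() { return age; }
        public String getState() { return state; }
    }

    public static void main(String[] args) throws Exception {
        BeanInfo info = Introspector.getBeanInfo(Person.class, Object.class);
        int readWrite = 0;
        for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
            // getWriteMethod() is null for every property: getters exist,
            // but there is no setter to pair with them.
            if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                readWrite++;
            }
        }
        System.out.println(readWrite);  // 0 read/write properties
    }
}
```

Zero usable properties is the situation the ticket asks Spark to report more helpfully than a bare "no encoder found".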
[jira] [Updated] (SPARK-12932) Cannot convert list of simple Java objects to Dataset (no encoder found)
[ https://issues.apache.org/jira/browse/SPARK-12932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Grove updated SPARK-12932:
---
Description:
When trying to create a Dataset from an RDD of Person (all using the Java API), I get the error "java.lang.UnsupportedOperationException: no encoder found for example_java.dataset.Person".

{code:title=Example.java}
public class JavaDatasetExample {
    public static void main(String[] args) throws Exception {
        SparkConf sparkConf = new SparkConf()
            .setAppName("Example")
            .setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(sparkConf);
        SQLContext sqlContext = new SQLContext(sc);
        List<Person> people = ImmutableList.of(
            new Person("Joe", "Bloggs", 21, "NY")
        );
        Dataset<Person> dataset = sqlContext.createDataset(people, Encoders.bean(Person.class));
    }
}
{code}
{code:title=Person.java}
class Person implements Serializable {

    String first;
    String last;
    int age;
    String state;

    public Person() {
    }

    public Person(String first, String last, int age, String state) {
        this.first = first;
        this.last = last;
        this.age = age;
        this.state = state;
    }

    public String getFirst() {
        return first;
    }

    public String getLast() {
        return last;
    }

    public int getAge() {
        return age;
    }

    public String getState() {
        return state;
    }
}
{code}

was:
When trying to create a Dataset from an RDD of Person (all using the Java API), I get the error "java.lang.UnsupportedOperationException: no encoder found for example_java.dataset.Person".
{code:title=Example.java}
public class JavaDatasetExample {
    public static void main(String[] args) throws Exception {
        SparkConf sparkConf = new SparkConf()
            .setAppName("Example")
            .setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(sparkConf);
        SQLContext sqlContext = new SQLContext(sc);
        List<Person> people = ImmutableList.of(
            new Person("Joe", "Bloggs", 21, "NY")
        );
        Dataset<Person> dataset = sqlContext.createDataset(people, Encoders.bean(Person.class));
    }
}
{code}
{code:title=Person.java}
class Person implements Serializable {

    String first;
    String last;
    int age;
    String state;

    public Person() {
    }

    public Person(String first, String last, int age, String state) {
        this.first = first;
        this.last = last;
        this.age = age;
        this.state = state;
    }

    public String getFirst() {
        return first;
    }

    public String getLast() {
        return last;
    }

    public int getAge() {
        return age;
    }

    public String getState() {
        return state;
    }
}
{code}

> Cannot convert list of simple Java objects to Dataset (no encoder found)
>
> Key: SPARK-12932
> URL: https://issues.apache.org/jira/browse/SPARK-12932
> Project: Spark
> Issue Type: Bug
> Components: Java API
> Affects Versions: 1.6.0
> Environment: Ubuntu 15.10 / Java 8
> Reporter: Andy Grove
>
> When trying to create a Dataset from an RDD of Person (all using the Java
> API), I get the error "java.lang.UnsupportedOperationException: no encoder
> found for example_java.dataset.Person".
> {code:title=Example.java}
> public class JavaDatasetExample {
>     public static void main(String[] args) throws Exception {
>         SparkConf sparkConf = new SparkConf()
>             .setAppName("Example")
>             .setMaster("local[*]");
>         JavaSparkContext sc = new JavaSparkContext(sparkConf);
>         SQLContext sqlContext = new SQLContext(sc);
>         List<Person> people = ImmutableList.of(
>             new Person("Joe", "Bloggs", 21, "NY")
>         );
>         Dataset<Person> dataset = sqlContext.createDataset(people,
>             Encoders.bean(Person.class));
>     }
> }
> {code}
> {code:title=Person.java}
> class Person implements Serializable {
>
>     String first;
>     String last;
>     int age;
>     String state;
>
>     public Person() {
>     }
>
>     public Person(String first, String last, int age, String state) {
>         this.first = first;
>         this.last = last;
>         this.age = age;
>         this.state = state;
>     }
>
>     public String getFirst() {
>         return first;
>     }
>
>     public String getLast() {
>         return last;
>     }
>
>     public int getAge() {
>         return age;
>     }
>
>     public String getState() {
>         return state;
>     }
> }
> {code}