[GitHub] spark issue #19171: [SPARK-21902][CORE] Print root cause for BlockManager#do...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19171 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81807/ --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19171: [SPARK-21902][CORE] Print root cause for BlockManager#do...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19171 Merged build finished. Test PASSed.
[GitHub] spark issue #19171: [SPARK-21902][CORE] Print root cause for BlockManager#do...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19171 **[Test build #81807 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81807/testReport)** for PR 19171 at commit [`86525f7`](https://github.com/apache/spark/commit/86525f7f7cb5c2656008ba92c44c724db186158e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #19130: [SPARK-21917][CORE][YARN] Supporting adding http(s) reso...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19130 Merged build finished. Test PASSed.
[GitHub] spark issue #19130: [SPARK-21917][CORE][YARN] Supporting adding http(s) reso...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19130 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81804/
[GitHub] spark issue #19130: [SPARK-21917][CORE][YARN] Supporting adding http(s) reso...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19130 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81805/
[GitHub] spark issue #19130: [SPARK-21917][CORE][YARN] Supporting adding http(s) reso...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19130 Merged build finished. Test PASSed.
[GitHub] spark issue #19130: [SPARK-21917][CORE][YARN] Supporting adding http(s) reso...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19130 **[Test build #81805 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81805/testReport)** for PR 19130 at commit [`fc2eb2b`](https://github.com/apache/spark/commit/fc2eb2b4cd0472441c4052adc83728440838c92d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #19130: [SPARK-21917][CORE][YARN] Supporting adding http(s) reso...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19130 **[Test build #81804 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81804/testReport)** for PR 19130 at commit [`1c5487c`](https://github.com/apache/spark/commit/1c5487c2c1af7db45dee327d0ded8ab43e04d5d2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #19240: [SPARK-22018][SQL]Preserve top-level alias metada...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19240
[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...
Github user kevinyu98 commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r139064657

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala ---
@@ -503,69 +504,307 @@ case class FindInSet(left: Expression, right: Expression) extends BinaryExpressi
   override def prettyName: String = "find_in_set"
 }

+trait String2TrimExpression extends Expression with ImplicitCastInputTypes {
+
+  override def dataType: DataType = StringType
+  override def inputTypes: Seq[AbstractDataType] = Seq.fill(children.size)(StringType)
+
+  override def nullable: Boolean = children.exists(_.nullable)
+  override def foldable: Boolean = children.forall(_.foldable)
+}
+
+object StringTrim {
+  def apply(str: Expression, trimStr: Expression): StringTrim = StringTrim(str, Some(trimStr))
+  def apply(str: Expression): StringTrim = StringTrim(str, None)
+}
+
 /**
- * A function that trim the spaces from both ends for the specified string.
+ * A function that takes a character string, removes the leading and trailing characters matching with the characters

--- End diff --

done.
[GitHub] spark issue #19235: [SPARK-14387][SPARK-19459][SQL] Enable Hive-1.x ORC comp...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19235 Merged build finished. Test PASSed.
[GitHub] spark issue #19235: [SPARK-14387][SPARK-19459][SQL] Enable Hive-1.x ORC comp...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19235 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81806/
[GitHub] spark issue #19235: [SPARK-14387][SPARK-19459][SQL] Enable Hive-1.x ORC comp...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19235 **[Test build #81806 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81806/testReport)** for PR 19235 at commit [`c6d2c35`](https://github.com/apache/spark/commit/c6d2c35cf200413ea0fab21c542de578f316039c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...
Github user kevinyu98 commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r139063658

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java ---
@@ -535,6 +585,51 @@ public UTF8String trimRight() {
     }
   }

+  /**
+   * Based on the given trim string, trim this string starting from right end
+   * This method searches each character in the source string starting from the right end, removes the character if it
+   * is in the trim string, stops at the first character which is not in the trim string, returns the new string.
+   * @param trimString the trim character string
+   */
+  public UTF8String trimRight(UTF8String trimString) {
+    if (trimString == null)
+      return null;
+    int charIdx = 0;
+    // number of characters from the source string
+    int numChars = 0;
+    // array of character length for the source string
+    int[] stringCharLen = new int[numBytes];
+    // array of the first byte position for each character in the source string
+    int[] stringCharPos = new int[numBytes];
+    // build the position and length array
+    while (charIdx < numBytes) {
+      stringCharPos[numChars] = charIdx;
+      stringCharLen[numChars] = numBytesForFirstByte(getByte(charIdx));

--- End diff --

done.
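The quoted Javadoc describes the loop in prose: scan from the right end, drop each character that appears in the trim string, and stop at the first one that does not. As a rough illustration only (Python's `str` hides the byte-level UTF-8 position/length bookkeeping that `UTF8String` has to do by hand), the same right-trim semantics can be sketched as:

```python
def trim_right(src: str, trim_chars: str) -> str:
    """Remove trailing characters of src that appear in trim_chars,
    stopping at the first character that is not in the trim set."""
    end = len(src)
    while end > 0 and src[end - 1] in trim_chars:
        end -= 1
    return src[:end]

print(trim_right("SSparkSQLS", "SL"))  # -> SSparkSQ
```

Note that `trim_chars` acts as a character set, not a substring: the trailing `L` and `S` are each removed individually until the scan reaches `Q`.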
[GitHub] spark issue #19239: [SPARK-22017] Take minimum of all watermark execs in Str...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19239 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81803/
[GitHub] spark issue #19239: [SPARK-22017] Take minimum of all watermark execs in Str...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19239 Merged build finished. Test PASSed.
[GitHub] spark issue #19239: [SPARK-22017] Take minimum of all watermark execs in Str...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19239 **[Test build #81803 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81803/testReport)** for PR 19239 at commit [`8b60538`](https://github.com/apache/spark/commit/8b605384d77fdeb63b28feabee74284a5ab1409a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...
Github user kevinyu98 commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r139063428

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java ---
@@ -535,6 +585,51 @@ public UTF8String trimRight() {
     }
   }

+  /**
+   * Based on the given trim string, trim this string starting from right end
+   * This method searches each character in the source string starting from the right end, removes the character if it
+   * is in the trim string, stops at the first character which is not in the trim string, returns the new string.
+   * @param trimString the trim character string
+   */
+  public UTF8String trimRight(UTF8String trimString) {
+    if (trimString == null)
+      return null;

--- End diff --

changed.
[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...
Github user kevinyu98 commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r139063323

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java ---
@@ -522,6 +537,41 @@ public UTF8String trimLeft() {
     }
   }

+  /**
+   * Based on the given trim string, trim this string starting from left end
+   * This method searches each character in the source string starting from the left end, removes the character if it
+   * is in the trim string, stops at the first character which is not in the trim string, returns the new string.
+   * @param trimString the trim character string
+   */
+  public UTF8String trimLeft(UTF8String trimString) {
+    if (trimString == null)
+      return null;

--- End diff --

changed.
[GitHub] spark issue #19226: [SPARK-21985][PySpark] PairDeserializer is broken for do...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19226 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81811/
[GitHub] spark issue #19226: [SPARK-21985][PySpark] PairDeserializer is broken for do...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19226 **[Test build #81811 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81811/testReport)** for PR 19226 at commit [`f6d42f4`](https://github.com/apache/spark/commit/f6d42f48e7b22e0758ff92e438e620f52fd95322).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #19226: [SPARK-21985][PySpark] PairDeserializer is broken for do...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19226 Merged build finished. Test PASSed.
[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...
Github user kevinyu98 commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r139062541

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala ---
@@ -503,69 +504,307 @@ case class FindInSet(left: Expression, right: Expression) extends BinaryExpressi
   override def prettyName: String = "find_in_set"
 }

+trait String2TrimExpression extends Expression with ImplicitCastInputTypes {
+
+  override def dataType: DataType = StringType
+  override def inputTypes: Seq[AbstractDataType] = Seq.fill(children.size)(StringType)
+
+  override def nullable: Boolean = children.exists(_.nullable)
+  override def foldable: Boolean = children.forall(_.foldable)
+}
+
+object StringTrim {
+  def apply(str: Expression, trimStr: Expression): StringTrim = StringTrim(str, Some(trimStr))
+  def apply(str: Expression): StringTrim = StringTrim(str, None)
+}
+
 /**
- * A function that trim the spaces from both ends for the specified string.
+ * A function that takes a character string, removes the leading and trailing characters matching with the characters
+ * in the trim string, returns the new string.
+ * If BOTH and trimStr keywords are not specified, it defaults to remove space character from both ends. The trim
+ * function will have one argument, which contains the source string.
+ * If BOTH and trimStr keywords are specified, it trims the characters from both ends, and the trim function will have
+ * two arguments, the first argument contains trimStr, the second argument contains the source string.
+ * trimStr: A character string to be trimmed from the source string, if it has multiple characters, the function
+ * searches for each character in the source string, removes the characters from the source string until it
+ * encounters the first non-match character.
+ * BOTH: removes any character from both ends of the source string that matches characters in the trim string.
  */
 @ExpressionDescription(
-  usage = "_FUNC_(str) - Removes the leading and trailing space characters from `str`.",
+  usage = """
+    _FUNC_(str) - Removes the leading and trailing space characters from `str`.
+    _FUNC_(BOTH trimStr FROM str) - Remove the leading and trailing trimString from `str`
+  """,
+  arguments = """
+    Arguments:
+      * str - a string expression
+      * trimString - the trim string
+      * BOTH, FROM - these are keywords to specify trimming the trim string from both ends of the string
+  """,
   examples = """
     Examples:
       > SELECT _FUNC_('SparkSQL ');
        SparkSQL
+      > SELECT _FUNC_(BOTH 'SL' FROM 'SSparkSQLS');
+       parkSQ
   """)
-case class StringTrim(child: Expression)
-  extends UnaryExpression with String2StringExpression {
+case class StringTrim(
+    srcStr: Expression,
+    trimStr: Option[Expression] = None)
+  extends String2TrimExpression {

-  def convert(v: UTF8String): UTF8String = v.trim()
+  def this(trimStr: Expression, srcStr: Expression) = this(srcStr, Option(trimStr))
+
+  def this(srcStr: Expression) = this(srcStr, None)

   override def prettyName: String = "trim"

+  override def children: Seq[Expression] = if (trimStr.isDefined) {
+    srcStr :: trimStr.get :: Nil
+  } else {
+    srcStr :: Nil
+  }
+
+  override def eval(input: InternalRow): Any = {
+    val srcString = srcStr.eval(input).asInstanceOf[UTF8String]
+    if (srcString == null) {
+      null
+    } else {
+      if (trimStr.isDefined) {
+        return srcString.trim(trimStr.get.eval(input).asInstanceOf[UTF8String])

--- End diff --

sure.
[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/18853

I provide 2 SQL scripts to validate the different results between Spark and Hive:

| Engine | [SPARK_21646_1.txt](https://github.com/apache/spark/files/1305185/SPARK_21646_1.txt) | [SPARK_21646_2.txt](https://github.com/apache/spark/files/1305186/SPARK_21646_2.txt) |
| - | :-: | -: |
| [Hive-2.2.0](http://mirrors.hust.edu.cn/apache/hive/hive-2.2.0/apache-hive-2.2.0-bin.tar.gz) | 0.10.61001 | 2017-09-14 |
| Spark | 100 | - |
[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...
Github user kevinyu98 commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r139062566

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala ---
@@ -503,69 +504,307 @@ case class FindInSet(left: Expression, right: Expression) extends BinaryExpressi
   override def prettyName: String = "find_in_set"
 }

+trait String2TrimExpression extends Expression with ImplicitCastInputTypes {
+
+  override def dataType: DataType = StringType
+  override def inputTypes: Seq[AbstractDataType] = Seq.fill(children.size)(StringType)
+
+  override def nullable: Boolean = children.exists(_.nullable)
+  override def foldable: Boolean = children.forall(_.foldable)
+}
+
+object StringTrim {
+  def apply(str: Expression, trimStr: Expression): StringTrim = StringTrim(str, Some(trimStr))
+  def apply(str: Expression): StringTrim = StringTrim(str, None)
+}
+
 /**
- * A function that trim the spaces from both ends for the specified string.
+ * A function that takes a character string, removes the leading and trailing characters matching with the characters
+ * in the trim string, returns the new string.
+ * If BOTH and trimStr keywords are not specified, it defaults to remove space character from both ends. The trim
+ * function will have one argument, which contains the source string.
+ * If BOTH and trimStr keywords are specified, it trims the characters from both ends, and the trim function will have
+ * two arguments, the first argument contains trimStr, the second argument contains the source string.
+ * trimStr: A character string to be trimmed from the source string, if it has multiple characters, the function
+ * searches for each character in the source string, removes the characters from the source string until it
+ * encounters the first non-match character.
+ * BOTH: removes any character from both ends of the source string that matches characters in the trim string.
  */
 @ExpressionDescription(
-  usage = "_FUNC_(str) - Removes the leading and trailing space characters from `str`.",
+  usage = """
+    _FUNC_(str) - Removes the leading and trailing space characters from `str`.
+    _FUNC_(BOTH trimStr FROM str) - Remove the leading and trailing trimString from `str`
+  """,
+  arguments = """
+    Arguments:
+      * str - a string expression
+      * trimString - the trim string
+      * BOTH, FROM - these are keywords to specify trimming the trim string from both ends of the string
+  """,
   examples = """
     Examples:
       > SELECT _FUNC_('SparkSQL ');
        SparkSQL
+      > SELECT _FUNC_(BOTH 'SL' FROM 'SSparkSQLS');
+       parkSQ
   """)
-case class StringTrim(child: Expression)
-  extends UnaryExpression with String2StringExpression {
+case class StringTrim(
+    srcStr: Expression,
+    trimStr: Option[Expression] = None)
+  extends String2TrimExpression {

-  def convert(v: UTF8String): UTF8String = v.trim()
+  def this(trimStr: Expression, srcStr: Expression) = this(srcStr, Option(trimStr))
+
+  def this(srcStr: Expression) = this(srcStr, None)

   override def prettyName: String = "trim"

+  override def children: Seq[Expression] = if (trimStr.isDefined) {
+    srcStr :: trimStr.get :: Nil
+  } else {
+    srcStr :: Nil
+  }
+
+  override def eval(input: InternalRow): Any = {
+    val srcString = srcStr.eval(input).asInstanceOf[UTF8String]
+    if (srcString == null) {
+      null
+    } else {
+      if (trimStr.isDefined) {
+        return srcString.trim(trimStr.get.eval(input).asInstanceOf[UTF8String])
+      } else {
+        return srcString.trim()

--- End diff --

changed
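The `TRIM(BOTH trimStr FROM str)` syntax under review treats `trimStr` as a set of characters rather than a substring. Incidentally, Python's built-in `str.strip` has the same character-set semantics, which makes the example in the quoted `ExpressionDescription` easy to check:

```python
# TRIM(BOTH 'SL' FROM 'SSparkSQLS') removes any leading/trailing character
# that appears in the trim set {'S', 'L'}, stopping at the first non-match
# on each side. str.strip(chars) behaves the same way.
result = "SSparkSQLS".strip("SL")
print(result)  # -> parkSQ
```

The leading `SS` and trailing `LS`/`S` fall away, but the interior `S` and `Q` survive because trimming stops at the first non-matching character from each end.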
[GitHub] spark issue #19210: Fix Graphite re-connects for Graphite instances behind E...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19210 Merged build finished. Test PASSed.
[GitHub] spark issue #19210: Fix Graphite re-connects for Graphite instances behind E...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19210 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81802/
[GitHub] spark issue #19210: Fix Graphite re-connects for Graphite instances behind E...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19210 **[Test build #81802 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81802/testReport)** for PR 19210 at commit [`0458123`](https://github.com/apache/spark/commit/0458123c1b3827f8b4b55eeb8bd5f7dbc749a4aa).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #19226: [SPARK-21985][PySpark] PairDeserializer is broken for do...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19226 **[Test build #81811 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81811/testReport)** for PR 19226 at commit [`f6d42f4`](https://github.com/apache/spark/commit/f6d42f48e7b22e0758ff92e438e620f52fd95322).
[GitHub] spark issue #19226: [SPARK-21985][PySpark] PairDeserializer is broken for do...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19226 Merged build finished. Test FAILed.
[GitHub] spark issue #19226: [SPARK-21985][PySpark] PairDeserializer is broken for do...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19226 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81810/
[GitHub] spark issue #19226: [SPARK-21985][PySpark] PairDeserializer is broken for do...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19226 **[Test build #81810 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81810/testReport)** for PR 19226 at commit [`54b7fd0`](https://github.com/apache/spark/commit/54b7fd0de15b1674fd8a285708ef669c29fb1ed9).
 * This patch **fails Python style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #19226: [SPARK-21985][PySpark] PairDeserializer is broken for do...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19226 **[Test build #81810 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81810/testReport)** for PR 19226 at commit [`54b7fd0`](https://github.com/apache/spark/commit/54b7fd0de15b1674fd8a285708ef669c29fb1ed9).
[GitHub] spark issue #19171: [SPARK-21902][CORE] Print root cause for BlockManager#do...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19171 **[Test build #81808 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81808/testReport)** for PR 19171 at commit [`a3ed8b3`](https://github.com/apache/spark/commit/a3ed8b38879bd017b9c9b2081cec81987e7e33ef).
[GitHub] spark issue #19136: [SPARK-15689][SQL] data source v2 read path
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19136 **[Test build #81809 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81809/testReport)** for PR 19136 at commit [`d2c86f4`](https://github.com/apache/spark/commit/d2c86f4339d59f227bef61b1c97f9770ce1233b9).
[GitHub] spark pull request #19171: [SPARK-21902][CORE] Print root cause for BlockMan...
Github user caneGuy commented on a diff in the pull request: https://github.com/apache/spark/pull/19171#discussion_r139058928

--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
@@ -988,6 +988,12 @@ private[spark] class BlockManager(
       logWarning(s"Putting block $blockId failed")
     }
     res
+} catch {
+  // Since removeBlockInternal may throw exception,
+  // we should print exception first to show root cause.
+  case e: Throwable =>
+    logWarning(s"Putting block $blockId failed due to exception $e.")

--- End diff --

Done @jerryshao
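The point of the diff above is ordering: log the original exception before running cleanup, so that a second exception thrown by `removeBlockInternal` cannot mask the root cause. A minimal sketch of that pattern in Python (the function and callback names here are hypothetical, not Spark's API):

```python
import logging

def put_block(block_id, do_put, remove_block_internal):
    """Hypothetical sketch: log the root cause first, then clean up and
    re-raise, so a failure during cleanup cannot hide the original error."""
    try:
        return do_put(block_id)
    except Exception as e:
        # Root cause is recorded before cleanup runs.
        logging.warning("Putting block %s failed due to exception %s.", block_id, e)
        remove_block_internal(block_id)  # may itself throw; the warning is already out
        raise
```

If the logging came after the cleanup call instead, a cleanup failure would propagate first and the original put failure would never appear in the logs.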
[GitHub] spark issue #16832: [SPARK-19490][SQL] ignore case sensitivity when filterin...
Github user LantaoJin commented on the issue: https://github.com/apache/spark/pull/16832 Duplicate of [SPARK-18572](https://github.com/apache/spark/pull/15998)
[GitHub] spark issue #19240: [SPARK-22018][SQL]Preserve top-level alias metadata when...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19240 Merged build finished. Test PASSed.
[GitHub] spark issue #19240: [SPARK-22018][SQL]Preserve top-level alias metadata when...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19240 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81801/
[GitHub] spark issue #19240: [SPARK-22018][SQL]Preserve top-level alias metadata when...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19240 **[Test build #81801 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81801/testReport)** for PR 19240 at commit [`b3e41a7`](https://github.com/apache/spark/commit/b3e41a7603e8bf917e9b596bdeb6afa51a32a695).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #19241: Spark on Kubernetes - basic scheduler backend [WI...
Github user foxish closed the pull request at: https://github.com/apache/spark/pull/19241
[GitHub] spark issue #19241: Spark on Kubernetes - basic scheduler backend [WIP]
Github user foxish commented on the issue: https://github.com/apache/spark/pull/19241 The unnecessary constants and config still need to be stripped out. Getting this out there to 1) serve as the framework for our first PR, and 2) get some insight on the unit test failures.
[GitHub] spark pull request #19241: Spark on Kubernetes - basic scheduler backend [WI...
GitHub user foxish opened a pull request: https://github.com/apache/spark/pull/19241

Spark on Kubernetes - basic scheduler backend [WIP]

Stripped out a lot of extraneous things to create this. Our first PR upstream will likely be this. (Note that it is created against the master branch, which is up-to-date.) There are a couple of unit test failures in the scheduler backend unit tests - could be something that changed upstream after 2.2; it needs a closer look. cc @mccheah @ash211 @erikerlandson @felixcheung

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache-spark-on-k8s/spark spark-kubernetes-1

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/19241.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #19241

commit c8312b318296aea890a6831d8a7acaadeea00bcf
Author: foxish
Date: 2017-09-15T03:10:24Z

    Spark on Kubernetes - basic scheduler backend
[GitHub] spark issue #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json support c...
Github user goldmedal commented on the issue: https://github.com/apache/spark/pull/19223 Thanks @HyukjinKwon @felixcheung @viirya
[GitHub] spark issue #19233: [Spark-22008][Streaming]Spark Streaming Dynamic Allocati...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19233 IIUC, streaming DRA seems to be obsolete code. Long ago when I played with it, there were some bugs, but it seems not many users used this feature. I'm not sure if we really need to put effort into this code, since we are now moving to Structured Streaming instead.
[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19223
[GitHub] spark issue #19231: [SPARK-22002][SQL] Read JDBC table use custom schema sup...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19231 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81800/ Test PASSed.
[GitHub] spark issue #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json support c...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19223 Merged to master.
[GitHub] spark issue #19231: [SPARK-22002][SQL] Read JDBC table use custom schema sup...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19231 Merged build finished. Test PASSed.
[GitHub] spark issue #19171: [SPARK-21902][CORE] Print root cause for BlockManager#do...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19171 **[Test build #81807 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81807/testReport)** for PR 19171 at commit [`86525f7`](https://github.com/apache/spark/commit/86525f7f7cb5c2656008ba92c44c724db186158e).
[GitHub] spark issue #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json support c...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19223 Thanks @felixcheung @HyukjinKwon
[GitHub] spark pull request #19235: [SPARK-14387][SPARK-19459][SQL] Enable Hive-1.x O...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19235#discussion_r139054047
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcSourceSuite.scala ---
@@ -187,8 +188,12 @@ abstract class OrcSuite extends QueryTest with TestHiveSingleton with BeforeAndA
         |STORED AS orc
         |LOCATION '$uri'""".stripMargin)
     val result = Row("a", "b ", "c", Seq("d "))
-    checkAnswer(spark.table("hive_orc"), result)
-    checkAnswer(spark.table("spark_orc"), result)
+    Seq("false", "true").foreach { value =>
+      withSQLConf(HiveUtils.CONVERT_METASTORE_ORC.key -> value) {
--- End diff --
This is a test case for the ORC file format. Based on #14471, I'm enabling this. For Parquet, I think we can proceed separately once ORC is finished.
[GitHub] spark issue #19231: [SPARK-22002][SQL] Read JDBC table use custom schema sup...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19231 **[Test build #81800 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81800/testReport)** for PR 19231 at commit [`06095f5`](https://github.com/apache/spark/commit/06095f52454a000d15d3df5845a383cb1e1dbddc). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #19130: [SPARK-21917][CORE][YARN] Supporting adding http(...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19130#discussion_r139053961
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -367,6 +368,54 @@ object SparkSubmit extends CommandLineUtils with Logging {
     }.orNull
   }

+    // When running in YARN, for some remote resources with scheme:
+    // 1. Hadoop FileSystem doesn't support them.
+    // 2. We explicitly bypass Hadoop FileSystem with "spark.yarn.dist.forceDownloadSchemes".
+    // We will download them to local disk prior to adding them to YARN's distributed cache.
+    // For yarn client mode, since we already download them with the above code, we only need
+    // to figure out the local path and replace the remote one.
+    if (clusterManager == YARN) {
+      sparkConf.setIfMissing(SecurityManager.SPARK_AUTH_SECRET_CONF, "unused")
+      val secMgr = new SecurityManager(sparkConf)
+      val forceDownloadSchemes = sparkConf.get(FORCE_DOWNLOAD_SCHEMES)
+
+      def shouldDownload(scheme: String): Boolean = {
+        val isFsAvailable = () => {
+          Try { FileSystem.getFileSystemClass(scheme, hadoopConf) }.isSuccess
+        }
+        forceDownloadSchemes.contains(scheme) || !isFsAvailable()
+      }
+
+      def downloadResource(resource: String): String = {
+        val uri = Utils.resolveURI(resource)
+        uri.getScheme match {
+          case "local" | "file" => resource
+          case e if shouldDownload(e) =>
+            val file = new File(targetDir, new Path(uri).getName)
+            if (file.exists()) {
+              file.toURI.toString
+            } else {
+              downloadFile(resource, targetDir, sparkConf, hadoopConf, secMgr)
+            }
+          case _ => uri.toString
+        }
+      }
+
+      args.primaryResource = Option(args.primaryResource).map { downloadResource }.orNull
+      args.files = Option(args.files).map { files =>
+        files.split(",").map(_.trim).filter(_.nonEmpty).map { downloadResource }.mkString(",")
--- End diff --
I added a helper method in `Utils` and changed the related code in `SparkSubmit`. There are still some other places that require changes, but I will not touch them in this PR.
[GitHub] spark issue #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json support c...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19223 Looks like it passed fine. Let me merge this one.
[GitHub] spark pull request #19206: [SPARK-19206][yarn]Client and ApplicationMaster r...
Github user Chaos-Ju closed the pull request at: https://github.com/apache/spark/pull/19206
[GitHub] spark issue #19171: [SPARK-21902][CORE] Print root cause for BlockManager#do...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19171 ok to test.
[GitHub] spark issue #19235: [SPARK-21997][SQL][WIP] Turn off spark.sql.hive.convertM...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19235 **[Test build #81806 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81806/testReport)** for PR 19235 at commit [`c6d2c35`](https://github.com/apache/spark/commit/c6d2c35cf200413ea0fab21c542de578f316039c).
[GitHub] spark pull request #19171: [SPARK-21902][CORE] Print root cause for BlockMan...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19171#discussion_r139053438
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
@@ -988,6 +988,12 @@ private[spark] class BlockManager(
         logWarning(s"Putting block $blockId failed")
       }
       res
+    } catch {
+      // Since removeBlockInternal may throw an exception,
+      // we should print the exception first to show the root cause.
+      case e: Throwable =>
+        logWarning(s"Putting block $blockId failed due to exception $e.")
--- End diff --
Would you please change it to `case NonFatal(e) =>`?
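[Editor's note] The distinction the reviewer is asking for can be sketched as follows (illustrative Scala only, not Spark's actual `BlockManager` code; `putWithLogging` is a hypothetical helper): `NonFatal` deliberately excludes fatal JVM errors such as `OutOfMemoryError`, so catching with `NonFatal(e)` logs the root cause while still letting fatal errors propagate unhandled.

```scala
import scala.util.control.NonFatal

// Minimal sketch of the requested pattern: catch only non-fatal throwables,
// log the root cause, and rethrow. Fatal errors (OutOfMemoryError, etc.) are
// not matched by NonFatal and propagate untouched.
def putWithLogging[T](blockId: String)(body: => T): T = {
  try {
    body
  } catch {
    case NonFatal(e) =>
      // In Spark this would be logWarning(...); println keeps the sketch self-contained.
      println(s"Putting block $blockId failed due to exception $e.")
      throw e
  }
}
```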
[GitHub] spark issue #19237: [SPARK-21987][SQL] fix a compatibility issue of sql even...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19237 Merged build finished. Test PASSed.
[GitHub] spark issue #19237: [SPARK-21987][SQL] fix a compatibility issue of sql even...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19237 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81799/ Test PASSed.
[GitHub] spark issue #19237: [SPARK-21987][SQL] fix a compatibility issue of sql even...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19237 **[Test build #81799 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81799/testReport)** for PR 19237 at commit [`93cacba`](https://github.com/apache/spark/commit/93cacbabfe72a1c7dba20d00612757ae6e11f854). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19235: [SPARK-21997][SQL][WIP] Turn off spark.sql.hive.convertM...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19235 Since this PR is invalid, I'll reuse this PR instead of creating a new one.
[GitHub] spark issue #14471: [SPARK-14387][SQL] Enable Hive-1.x ORC compatibility wit...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14471 Hi, @rajeshbalamohan. I'll refer to your commit for SPARK-19459. You'll be the main author in case of a merge.
[GitHub] spark issue #19130: [SPARK-21917][CORE][YARN] Supporting adding http(s) reso...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19130 **[Test build #81805 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81805/testReport)** for PR 19130 at commit [`fc2eb2b`](https://github.com/apache/spark/commit/fc2eb2b4cd0472441c4052adc83728440838c92d).
[GitHub] spark issue #19235: [SPARK-21997][SQL][WIP] Turn off spark.sql.hive.convertM...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19235 @gatorsmile and @vanzin. I'm comparing with ORC now. Previously, ORC failed for another reason; I'll make another PR for that. I found that #14471 is enough for ORC. In the case of ORC, ORC itself handles truncation on write. The padding is handled on read by the Hive-side `HiveCharWritable` via [HiveBaseChar.java](https://github.com/apache/hive/blob/master/common/src/java/org/apache/hadoop/hive/common/type/HiveBaseChar.java#L57). In the case of Parquet, I guess Parquet is the same, but there is no padding logic like HiveCharWritable in Spark.
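[Editor's note] The read-side CHAR padding described in the comment above can be sketched like this (an illustrative Scala re-implementation of the behavior of Hive's `HiveBaseChar.getPaddedValue`, not the actual Hive or Spark code): pad with trailing spaces up to the declared length, and leave values at or beyond that length untouched.

```scala
// Hypothetical sketch of CHAR(n) read-side padding: shorter values are
// right-padded with spaces to maxLength; longer/equal values pass through.
def getPaddedValue(value: String, maxLength: Int): String =
  if (value.length >= maxLength) value
  else value + " " * (maxLength - value.length)
```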
[GitHub] spark issue #19130: [SPARK-21917][CORE][YARN] Supporting adding http(s) reso...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19130 **[Test build #81804 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81804/testReport)** for PR 19130 at commit [`1c5487c`](https://github.com/apache/spark/commit/1c5487c2c1af7db45dee327d0ded8ab43e04d5d2).
[GitHub] spark issue #19133: [SPARK-21902][CORE] Uniform calling for DiskBlockManager...
Github user caneGuy commented on the issue: https://github.com/apache/spark/pull/19133 Actually, I initially put this together with [PR-19171](https://github.com/apache/spark/pull/19171), since I found the API was not unified when fixing that problem. All right, I will close this one. Could you help review 19171? @jerryshao Thanks.
[GitHub] spark pull request #19133: [SPARK-21902][CORE] Uniform calling for DiskBlock...
Github user caneGuy closed the pull request at: https://github.com/apache/spark/pull/19133
[GitHub] spark issue #19231: [SPARK-22002][SQL] Read JDBC table use custom schema sup...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19231 Merged build finished. Test PASSed.
[GitHub] spark issue #19231: [SPARK-22002][SQL] Read JDBC table use custom schema sup...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19231 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81798/ Test PASSed.
[GitHub] spark issue #19231: [SPARK-22002][SQL] Read JDBC table use custom schema sup...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19231 **[Test build #81798 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81798/testReport)** for PR 19231 at commit [`1ee4ea0`](https://github.com/apache/spark/commit/1ee4ea0a23b257caa6c3fc7c2b2b73e154314f02). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19239: [SPARK-22017] Take minimum of all watermark execs in Str...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19239 **[Test build #81803 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81803/testReport)** for PR 19239 at commit [`8b60538`](https://github.com/apache/spark/commit/8b605384d77fdeb63b28feabee74284a5ab1409a).
[GitHub] spark issue #19239: [SPARK-22017] Take minimum of all watermark execs in Str...
Github user joseph-torres commented on the issue: https://github.com/apache/spark/pull/19239 Addressed comments.
[GitHub] spark issue #14158: [SPARK-13547] [SQL] [WEBUI] Add SQL query in web UI's SQ...
Github user nblintao commented on the issue: https://github.com/apache/spark/pull/14158 @HyukjinKwon Sorry for the delay. I'm busy looking for jobs these days. I'll try my best to fix it in October. Thank you for reminding me!
[GitHub] spark pull request #19135: [SPARK-21923][CORE]Avoid calling reserveUnrollMem...
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19135#discussion_r139049317
--- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -325,6 +325,10 @@ private[spark] class MemoryStore(
     // Whether there is still enough memory for us to continue unrolling this block
     var keepUnrolling = true
+    // Number of elements unrolled so far
+    var elementsUnrolled = 0L
+    // How often to check whether we need to request more memory
+    val memoryCheckPeriod = 16
--- End diff --
I have just made it configurable. I'm not sure if this writing is reasonable.
[GitHub] spark issue #19210: Fix Graphite re-connects for Graphite instances behind E...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19210 BTW, can you please create a JIRA and fix the PR title like other PRs?
[GitHub] spark issue #19133: [SPARK-21902][CORE] Uniform calling for DiskBlockManager...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19133 This is not a necessary fix. We usually don't make such changes without really fixing anything.
[GitHub] spark pull request #19135: [SPARK-21923][CORE]Avoid calling reserveUnrollMem...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19135#discussion_r139047702
--- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -325,6 +325,10 @@ private[spark] class MemoryStore(
     // Whether there is still enough memory for us to continue unrolling this block
     var keepUnrolling = true
+    // Number of elements unrolled so far
+    var elementsUnrolled = 0L
+    // How often to check whether we need to request more memory
+    val memoryCheckPeriod = 16
--- End diff --
It's hard-coded in `putIteratorAsValues` too; we can improve it later.
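[Editor's note] The `memoryCheckPeriod` idea discussed in this thread can be sketched as follows (illustrative Scala only, not the real `MemoryStore`; `unrollAll` is a hypothetical helper): instead of asking the memory manager for more memory on every element, the loop only re-checks every `memoryCheckPeriod` elements, which is what makes the constant worth tuning.

```scala
import scala.collection.mutable.ArrayBuffer

// Sketch of periodic memory checking while unrolling an iterator. The real
// code would call reserveUnrollMemoryForThisTask(...) at each check point;
// here the check is a stand-in that always succeeds.
def unrollAll[T](values: Iterator[T], memoryCheckPeriod: Int = 16): Long = {
  var elementsUnrolled = 0L          // number of elements unrolled so far
  var keepUnrolling = true           // whether there is still enough memory
  val buffer = ArrayBuffer.empty[T]
  while (values.hasNext && keepUnrolling) {
    buffer += values.next()
    elementsUnrolled += 1
    if (elementsUnrolled % memoryCheckPeriod == 0) {
      // Placeholder for a real memory reservation request.
      keepUnrolling = true
    }
  }
  elementsUnrolled
}
```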
[GitHub] spark issue #19191: [SPARK-21958][ML] Word2VecModel save: transform data in ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19191 Can one of the admins verify this patch?
[GitHub] spark issue #19210: Fix Graphite re-connects for Graphite instances behind E...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19210 **[Test build #81802 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81802/testReport)** for PR 19210 at commit [`0458123`](https://github.com/apache/spark/commit/0458123c1b3827f8b4b55eeb8bd5f7dbc749a4aa).
[GitHub] spark pull request #19227: [SPARK-20060][CORE] Support accessing secure Hado...
Github user jerryshao closed the pull request at: https://github.com/apache/spark/pull/19227
[GitHub] spark issue #19227: [SPARK-20060][CORE] Support accessing secure Hadoop clus...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19227 I see, so based on your comments: 1. Mesos should not honor the principal/keytab configuration. Instead of renaming them, we should remove the `MESOS` here: ``` if (clusterManager == YARN || clusterManager == LOCAL || clusterManager == MESOS) { ``` 2. Spark on standalone is currently not well suited for accessing a secure cluster with tokens or keytabs. Let me close this PR and rethink some minor issues.
[GitHub] spark issue #19210: Fix Graphite re-connects for Graphite instances behind E...
Github user alexmnyc commented on the issue: https://github.com/apache/spark/pull/19210 @jerryshao done
[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19218 @gatorsmile Is it worth fixing this? If so, could you trigger tests?
[GitHub] spark issue #19133: [SPARK-21902][CORE] Uniform calling for DiskBlockManager...
Github user caneGuy commented on the issue: https://github.com/apache/spark/pull/19133 Ping @kiszk Could you help take a look at this? Thanks very much.
[GitHub] spark pull request #19210: Fix Graphite re-connects for Graphite instances b...
Github user alexmnyc commented on a diff in the pull request: https://github.com/apache/spark/pull/19210#discussion_r139046210
--- Diff: core/src/main/scala/org/apache/spark/metrics/sink/GraphiteSink.scala ---
@@ -69,7 +69,7 @@ private[spark] class GraphiteSink(val property: Properties, val registry: Metric
     val graphite = propertyToOption(GRAPHITE_KEY_PROTOCOL).map(_.toLowerCase(Locale.ROOT)) match {
       case Some("udp") => new GraphiteUDP(new InetSocketAddress(host, port))
-      case Some("tcp") | None => new Graphite(new InetSocketAddress(host, port))
+      case Some("tcp") | None => new Graphite(host, port)
--- End diff --
Looks like there is a constructor accepting hostname directly for UDP. I'll add that as well.
[GitHub] spark issue #19171: [SPARK-21902][CORE] Print root cause for BlockManager#do...
Github user caneGuy commented on the issue: https://github.com/apache/spark/pull/19171 Ping @kiszk Could you help take a look at this? Thanks very much.
[GitHub] spark issue #19227: [SPARK-20060][CORE] Support accessing secure Hadoop clus...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/19227 > Current Spark on Mesos code actually honors it: Then it really shouldn't. Those options are for long-lived applications, and Mesos doesn't yet support those.
[GitHub] spark issue #19227: [SPARK-20060][CORE] Support accessing secure Hadoop clus...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19227 > I don't think Mesos honors it (and it shouldn't be, since IIRC it hasn't implemented long-lived app support yet). Current Spark on Mesos code actually honors it:
```
// assure a keytab is available from any place in a JVM
if (clusterManager == YARN || clusterManager == LOCAL || clusterManager == MESOS) {
  if (args.principal != null) {
    if (args.keytab != null) {
      require(new File(args.keytab).exists(), s"Keytab file: ${args.keytab} does not exist")
      // Add keytab and principal configurations in sysProps to make them available
      // for later use; e.g. in spark sql, the isolated class loader used to talk
      // to HiveMetastore will use these settings. They will be set as Java system
      // properties and then loaded by SparkConf
      sysProps.put("spark.yarn.keytab", args.keytab)
      sysProps.put("spark.yarn.principal", args.principal)
      UserGroupInformation.loginUserFromKeytab(args.principal, args.keytab)
    }
  }
}
```
[GitHub] spark issue #19240: [SPARK-22018][SQL]Preserve top-level alias metadata when...
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/19240 LGTM
[GitHub] spark issue #19227: [SPARK-20060][CORE] Support accessing secure Hadoop clus...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/19227 > so that's why I rename them. What do you think? I think that should be a separate change. I don't think Mesos honors it (and it shouldn't, since IIRC it hasn't implemented long-lived app support yet). principal / keytab is for long-lived apps that are expected to outlive the token's TTL. Everybody else should be logging in to kerberos themselves, since most of those users won't even have a keytab. > Spark on Mesos also uses the same approach, does it also has same issue? Mesos, like YARN, can provide the proper user isolation that standalone cannot currently.
[GitHub] spark pull request #18685: [SPARK-21439] Support for ABCMeta in PySpark
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18685#discussion_r139045469
--- Diff: python/pyspark/cloudpickle.py ---
@@ -667,6 +667,13 @@ def save_file(self, obj):
     else:
         dispatch[file] = save_file
+# WeakSet was added in 2.7.
--- End diff --
BTW, I'd add a simple e2e test.
[GitHub] spark pull request #19210: Fix Graphite re-connects for Graphite instances b...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19210#discussion_r139045423 --- Diff: core/src/main/scala/org/apache/spark/metrics/sink/GraphiteSink.scala --- @@ -69,7 +69,7 @@ private[spark] class GraphiteSink(val property: Properties, val registry: Metric val graphite = propertyToOption(GRAPHITE_KEY_PROTOCOL).map(_.toLowerCase(Locale.ROOT)) match { case Some("udp") => new GraphiteUDP(new InetSocketAddress(host, port)) -case Some("tcp") | None => new Graphite(new InetSocketAddress(host, port)) +case Some("tcp") | None => new Graphite(host, port) --- End diff -- Yes, that's what I mean, I'm not sure if it is required since I'm not familiar with this Graphite sink.
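[Editor's note] The reason the `new Graphite(host, port)` form fixes re-connects behind an ELB can be illustrated with standard `java.net` behavior (illustrative Scala only, independent of the Graphite client itself): a `java.net.InetSocketAddress` resolves its hostname once at construction, so a long-lived reporter built from it keeps talking to a stale IP after DNS changes, whereas an unresolved address (or passing host/port so the client resolves per connect) allows re-resolution.

```scala
import java.net.InetSocketAddress

// Resolved eagerly: the hostname is looked up exactly once, here, at
// construction time. A long-lived sink holding this keeps the original IP.
val resolvedOnce = new InetSocketAddress("localhost", 2003)

// Deferred: no lookup happens yet, analogous to letting the Graphite client
// re-resolve the hostname on each (re)connect.
val deferred = InetSocketAddress.createUnresolved("localhost", 2003)

println(resolvedOnce.isUnresolved) // false for a resolvable host
println(deferred.isUnresolved)     // true
```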
[GitHub] spark issue #18685: [SPARK-21439] Support for ABCMeta in PySpark
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18685 gentle ping @maver1ck.
[GitHub] spark issue #19227: [SPARK-20060][CORE] Support accessing secure Hadoop clus...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19227 The purpose of changing the configuration names is that these configurations are not only used by yarn mode in `SparkSubmit`; Mesos and local will also honor them, so that's why I renamed them. What do you think? Regarding "impersonate the user you stole the delegation tokens from": yes, it is possible, but I do see Spark on Mesos also uses the same approach; does it also have the same issue?
[GitHub] spark issue #19227: [SPARK-20060][CORE] Support accessing secure Hadoop clus...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/19227 If you're not touching keytabs, why are you changing the property names, which would be an unrelated change? You don't need principal / keytab to get delegation tokens. You just log in with `kinit`. But shipping delegation tokens in standalone mode has similar issues; it's a little harder, but not much, to get them, and they have a TTL, but during that period you can pretty much impersonate the user you stole the delegation tokens from. I could hold my nose and let delegation token support be added to standalone if it's stuck behind a scary-looking config like `spark.standalone.enableInsecureKerberosSupport`; but I'm really not comfortable with adding any kind of support for shipping keytabs.
[GitHub] spark issue #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json support c...
Github user goldmedal commented on the issue: https://github.com/apache/spark/pull/19223 OK, I got it. Thanks :)
[GitHub] spark issue #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json support c...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19223 Yes, when there are some changes in https://github.com/apache/spark/blob/828fab03567ecc245a65c4d295a677ce0ba26c19/appveyor.yml#L29-L35, it should run the R tests on Windows via AppVeyor.