[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7505#issuecomment-122637767

[Test build #1113 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1113/console) for PR 7505 at commit [`d09321c`](https://github.com/apache/spark/commit/d09321c7f3a4d5127c357fe15e7d6ab9531719d9).

* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-9175] [MLlib] BLAS.gemm fails to update...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7503#issuecomment-122638017

Merged build triggered.
[GitHub] spark pull request: [SPARK-9109] [GraphX] Keep the cached edge in ...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/7469#issuecomment-122638035

Thanks @ankurdave -- you can follow this by resolving the issue (done already now)
[GitHub] spark pull request: [SPARK-9175] [MLlib] BLAS.gemm fails to update...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7503#issuecomment-122638019

Merged build started.
[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/7505
[GitHub] spark pull request: [SQL] Make date/time functions more consistent...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/7506
[GitHub] spark pull request: [SPARK-9094] [PARENT] Increased io.dropwizard....
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/7493
[GitHub] spark pull request: [SQL] Make date/time functions more consistent...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/7506#issuecomment-122638167

I've merged this.
[GitHub] spark pull request: [SPARK-6761][SQL] Approximate quantile for Dat...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6042#issuecomment-122638558

Build triggered.
[GitHub] spark pull request: [SPARK-6761][SQL] Approximate quantile for Dat...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6042#issuecomment-122638561

Build started.
[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7504#issuecomment-122638622

[Test build #37761 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37761/console) for PR 7504 at commit [`dda1021`](https://github.com/apache/spark/commit/dda1021891cbfea0c6859542f3270a5ae8c20486).

* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class ConcatWs(children: Seq[Expression])`
[GitHub] spark pull request: [SPARK-6761][SQL] Approximate quantile for Dat...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6042#issuecomment-122638906

[Test build #37769 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37769/consoleFull) for PR 6042 at commit [`1086537`](https://github.com/apache/spark/commit/10865378c3aba5e639c352bded61a616933a5f1c).
[GitHub] spark pull request: [SPARK-9178][SQL] Add an empty string constant...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7509#issuecomment-122639078

Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-9178][SQL] Add an empty string constant...
GitHub user tarekauel opened a pull request: https://github.com/apache/spark/pull/7509

[SPARK-9178][SQL] Add an empty string constant to UTF8String

Jira: https://issues.apache.org/jira/browse/SPARK-9178

In order to avoid calls of `UTF8String.fromString("")`, this PR adds an `EMPTY_STRING` constant to `UTF8String`. A `UTF8String` is immutable, so we can use a shared constant, can't we? I searched for current usages of `UTF8String.fromString()` with `grep -R UTF8String.fromString(\\) .`

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tarekauel/spark SPARK-9178

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/7509.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #7509

commit 748b87a38575664fcfc877ccc575678ba54a9df6
Author: Tarek Auel tarek.a...@googlemail.com
Date: 2015-07-19T08:22:43Z

    [SPARK-9178] Add empty string constant to UTF8String
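The idea in the PR description is easy to sketch outside of Spark: because instances of the string type are immutable, one shared empty instance is safe to hand out everywhere. The class below is a minimal illustrative model, not Spark's actual `UTF8String` implementation; only the `fromString` and `EMPTY_STRING` names follow the PR.

```scala
import java.nio.charset.StandardCharsets

// Minimal stand-in for an immutable UTF-8 string wrapper. Because the
// byte array is never mutated after construction, a single EMPTY_STRING
// instance can be shared by all callers instead of allocating a fresh
// object via fromString("") on every use.
class UTF8String private (private val bytes: Array[Byte]) {
  override def toString: String = new String(bytes, StandardCharsets.UTF_8)
  def numBytes: Int = bytes.length
}

object UTF8String {
  def fromString(s: String): UTF8String =
    new UTF8String(s.getBytes(StandardCharsets.UTF_8))

  // The proposed shared constant: safe to reuse because it is immutable.
  val EMPTY_STRING: UTF8String = fromString("")
}
```

Callers would then write `UTF8String.EMPTY_STRING` in place of `UTF8String.fromString("")`, trading a method call and an allocation for a field read.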
[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7504#issuecomment-122638627

Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-9128][Core] Get outerclasses and object...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7459#issuecomment-122640067

Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-9128][Core] Get outerclasses and object...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7459#issuecomment-122640048

[Test build #37762 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37762/console) for PR 7459 at commit [`7c9858d`](https://github.com/apache/spark/commit/7c9858db0f8374c8f124b4a964190ad2ff5ad898).

* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [HOTFIX] [SQL] Fixes compilation error introdu...
GitHub user liancheng opened a pull request: https://github.com/apache/spark/pull/7510

[HOTFIX] [SQL] Fixes compilation error introduced by PR #7506

PR #7506 breaks the master build because of a compilation error. Note that #7506 itself looks good, but it seems that `git merge` did something stupid.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/liancheng/spark hotfix-for-pr-7506

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/7510.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #7510

commit 7ea7e89818e529e43afdb9c18e4a68ba33acdd13
Author: Cheng Lian l...@databricks.com
Date: 2015-07-19T09:06:07Z

    Fixes compilation error
[GitHub] spark pull request: [HOTFIX] [SQL] Fixes compilation error introdu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7510#issuecomment-122641202

Merged build triggered.
[GitHub] spark pull request: [HOTFIX] [SQL] Fixes compilation error introdu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7510#issuecomment-122641210

Merged build started.
[GitHub] spark pull request: [SPARK-9172][SQL] Make DecimalPrecision suppor...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7511#issuecomment-122642015

Merged build started.
[GitHub] spark pull request: [SPARK-9172][SQL] Make DecimalPrecision suppor...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7511#issuecomment-122643246

[Test build #37771 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37771/consoleFull) for PR 7511 at commit [`9fb0d49`](https://github.com/apache/spark/commit/9fb0d490f4244963138e0fcaddba82ad066b0a3f).
[GitHub] spark pull request: [SPARK-9172][SQL] Make DecimalPrecision suppor...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7511#issuecomment-122643791

[Test build #37771 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37771/console) for PR 7511 at commit [`9fb0d49`](https://github.com/apache/spark/commit/9fb0d490f4244963138e0fcaddba82ad066b0a3f).

* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-9172][SQL] Make DecimalPrecision suppor...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7511#issuecomment-122643795

Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-8951][SparkR] support Unicode character...
Github user CHOIJAEHONG1 commented on the pull request: https://github.com/apache/spark/pull/7494#issuecomment-122643950

I am not sure about `readString`, but the test case, which verifies that Unicode characters survive intact when a native R data frame makes a round trip to Spark's DataFrame, failed. There is something going on underneath.

```
1. Failure(@test_sparkSQL.R#438): collect() support Unicode characters
- collect(where(df2, df2$name == \346\202\250\345\245\275))[[2]] not equal to \346\202\250\345\245\275
1 string mismatches:
x[1]: \346\202\250\345\245\275
y[1]: e682a8e5a5bd
```
[GitHub] spark pull request: Spark 8695
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/7397#issuecomment-122644170

(Ah: this continues a discussion in https://github.com/apache/spark/pull/7168 -- it should have been mentioned in this PR.)

@piganesh do you mind closing this if you're not going to follow up? Otherwise, please make the `.toDouble` change and correctly title this PR.
[GitHub] spark pull request: [MESOS][SPARK-8798] Allow additional uris to b...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7195#issuecomment-122635356

[Test build #37757 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37757/console) for PR 7195 at commit [`42e2ee2`](https://github.com/apache/spark/commit/42e2ee29ecd034b93c6a705fc9c8d4297de6362b).

* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-8255][SPARK-8256][SQL]Add regex_extract...
Github user tarekauel commented on a diff in the pull request: https://github.com/apache/spark/pull/7468#discussion_r34955908

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala ---

```scala
@@ -673,6 +673,110 @@ case class Encode(value: Expression, charset: Expression)
}

/**
 * Replace all substrings of str that match regexp with rep.
 */
case class RegExpReplace(subject: Expression, regexp: Expression, rep: Expression)
  extends Expression with ImplicitCastInputTypes {

  // last regex in string; we will update the pattern iff the regexp value changed
  @transient private var lastRegex: UTF8String = _
  // last regex pattern; we cache it for performance
  @transient private var pattern: Pattern = _
  // last replacement string; we don't want to convert a UTF8String => java.lang.String every time
  @transient private var lastReplacement: String = _
  @transient private var lastReplacementInUTF8: UTF8String = _
  // result buffer written by Matcher
  @transient private val result: StringBuffer = new StringBuffer

  override def nullable: Boolean = children.foldLeft(false)(_ || _.nullable)
  override def foldable: Boolean = children.foldLeft(true)(_ && _.foldable)

  override def eval(input: InternalRow): Any = {
    val s = subject.eval(input)
    if (null != s) {
      val p = regexp.eval(input)
      if (null != p) {
        val r = rep.eval(input)
        if (null != r) {
          if (!p.equals(lastRegex)) {
            // regex value changed
            lastRegex = p.asInstanceOf[UTF8String]
            pattern = Pattern.compile(lastRegex.toString)
          }
          if (!r.equals(lastReplacementInUTF8)) {
            // replacement string changed
            lastReplacementInUTF8 = r.asInstanceOf[UTF8String]
            lastReplacement = lastReplacementInUTF8.toString
          }
          val m = pattern.matcher(s.toString())
          result.delete(0, result.length())

          while (m.find) {
            m.appendReplacement(result, lastReplacement)
          }
          m.appendTail(result)

          return UTF8String.fromString(result.toString)
        }
      }
    }

    null
  }

  override def dataType: DataType = StringType
  override def inputTypes: Seq[AbstractDataType] = Seq(StringType, StringType, StringType)
  override def children: Seq[Expression] = subject :: regexp :: rep :: Nil
  override def prettyName: String = "regexp_replace"
}

/**
 * UDF to extract a specific (idx) group identified by a Java regex.
 */
case class RegExpExtract(subject: Expression, regexp: Expression, idx: Expression)
  extends Expression with ImplicitCastInputTypes {

  def this(s: Expression, r: Expression) = this(s, r, Literal(1))

  // last regex in string; we will update the pattern iff the regexp value changed
  @transient private var lastRegex: UTF8String = _
  // last regex pattern; we cache it for performance
  @transient private var pattern: Pattern = _

  override def nullable: Boolean = children.foldLeft(false)(_ || _.nullable)
  override def foldable: Boolean = children.foldLeft(true)(_ && _.foldable)

  override def eval(input: InternalRow): Any = {
    val s = subject.eval(input)
    if (null != s) {
      val p = regexp.eval(input)
      if (null != p) {
        val r = idx.eval(input)
        if (null != r) {
          if (!p.equals(lastRegex)) {
            // regex value changed
            lastRegex = p.asInstanceOf[UTF8String]
            pattern = Pattern.compile(lastRegex.toString)
          }
          val m = pattern.matcher(s.toString())
          if (m.find) {
            val mr: MatchResult = m.toMatchResult
            return UTF8String.fromString(mr.group(r.asInstanceOf[Int]))
          }
          return UTF8String.fromString("")
```

--- End diff --

Okay. I am going to create a JIRA and check the code for existing empty strings.
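The caching trick in the diff above (recompiling the `Pattern` only when the regex operand actually changes) can be sketched independently of Spark's expression tree. The class and method names below are illustrative only; the point is that for the common case of a constant regex, compilation happens once and every subsequent call reuses the cached `Pattern`.

```scala
import java.util.regex.Pattern

// Illustrative sketch of the pattern-caching idiom from the diff:
// remember the last regex seen and its compiled Pattern, and recompile
// only when the regex value changes between calls.
class CachedReplacer {
  private var lastRegex: String = _
  private var pattern: Pattern = _

  def replaceAll(subject: String, regex: String, rep: String): String = {
    if (regex != lastRegex) {
      // regex value changed: pay the compilation cost once
      lastRegex = regex
      pattern = Pattern.compile(regex)
    }
    pattern.matcher(subject).replaceAll(rep)
  }
}
```

In Spark's per-row `eval`, the regex is almost always a literal, so this turns N compilations into one; the fields are `@transient` in the diff because a compiled `Pattern` should not be serialized with the expression.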
[GitHub] spark pull request: [MESOS][SPARK-8798] Allow additional uris to b...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7195#issuecomment-122635407

Merged build finished. Test FAILed.
[GitHub] spark pull request: [SQL] Make date/time functions more consistent...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7506#issuecomment-122637538

Merged build finished. Test PASSed.
[GitHub] spark pull request: [SQL] Make date/time functions more consistent...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7506#issuecomment-122637521

[Test build #37759 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37759/console) for PR 7506 at commit [`e44a4a0`](https://github.com/apache/spark/commit/e44a4a0579ea65093fdb7ca39749855be3a50fcd).

* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class Hour(child: Expression) extends UnaryExpression with ImplicitCastInputTypes`
  * `case class Minute(child: Expression) extends UnaryExpression with ImplicitCastInputTypes`
  * `case class Second(child: Expression) extends UnaryExpression with ImplicitCastInputTypes`
  * `case class DayOfYear(child: Expression) extends UnaryExpression with ImplicitCastInputTypes`
  * `case class Year(child: Expression) extends UnaryExpression with ImplicitCastInputTypes`
  * `case class Quarter(child: Expression) extends UnaryExpression with ImplicitCastInputTypes`
  * `case class Month(child: Expression) extends UnaryExpression with ImplicitCastInputTypes`
  * `case class DayOfMonth(child: Expression) extends UnaryExpression with ImplicitCastInputTypes`
  * `case class WeekOfYear(child: Expression) extends UnaryExpression with ImplicitCastInputTypes`
  * `case class DateFormatClass(left: Expression, right: Expression) extends BinaryExpression`
[GitHub] spark pull request: [SPARK-9179] [BUILD] Allows committers to spec...
GitHub user liancheng opened a pull request: https://github.com/apache/spark/pull/7508

[SPARK-9179] [BUILD] Allows committers to specify primary author of the PR to be merged

It's a common case that a contributor submits the initial version of a feature/bugfix, and later other people (mostly committers) fork it and add more improvements. When merging these PRs, we probably want to specify the original author as the primary author. Currently we can only do this by running

    $ git commit --amend --author=name email

manually right before the merge script pushes to the Apache Git repo. It would be nice if the script accepted user-specified primary author information.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/liancheng/spark spark-9179

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/7508.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #7508

commit 218d88e7c74d4cb1dae085ae4a6d1a6221acb90f
Author: Cheng Lian l...@databricks.com
Date: 2015-07-19T08:05:01Z

    Allows committers to specify primary author of the PR to be merged
[GitHub] spark pull request: [SPARK-9175] [MLlib] BLAS.gemm fails to update...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/7503#issuecomment-122637915

LGTM
[GitHub] spark pull request: [SPARK-9175] [MLlib] BLAS.gemm fails to update...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/7503#issuecomment-122637916 ok to test
[GitHub] spark pull request: [SPARK-9179] [BUILD] Allows committers to spec...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7508#issuecomment-122637898 Merged build triggered.
[GitHub] spark pull request: [SPARK-9179] [BUILD] Allows committers to spec...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7508#issuecomment-122637901 Merged build started.
[GitHub] spark pull request: changes with lambda (closure)
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/7502#issuecomment-122637954 I think you opened this by mistake? Do you mind closing this PR?
[GitHub] spark pull request: [SPARK-9179] [BUILD] Allows committers to spec...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7508#issuecomment-122637941 [Test build #37767 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37767/consoleFull) for PR 7508 at commit [`218d88e`](https://github.com/apache/spark/commit/218d88e7c74d4cb1dae085ae4a6d1a6221acb90f).
[GitHub] spark pull request: [SPARK-9179] [BUILD] Allows committers to spec...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/7508#issuecomment-122637944 LGTM. Several superfluous whitespace changes, but hey.
[GitHub] spark pull request: [SPARK-9175] [MLlib] BLAS.gemm fails to update...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7503#issuecomment-122638160 [Test build #37768 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37768/consoleFull) for PR 7503 at commit [`fce199c`](https://github.com/apache/spark/commit/fce199c3419b36a8c6d69d7b9eb293c7d4185b59).
[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/7505#issuecomment-122638158 Thanks - I've merged this.
[GitHub] spark pull request: [SPARK-9094] [PARENT] Increased io.dropwizard....
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/7493#issuecomment-122638146 Merged into master/1.4
[GitHub] spark pull request: [SPARK-9178][SQL] Add an empty string constant...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7509#issuecomment-122639980 Can one of the admins verify this patch?
[GitHub] spark pull request: [HOTFIX] [SQL] Fixes compilation error introdu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7510#issuecomment-122641318 [Test build #37770 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37770/consoleFull) for PR 7510 at commit [`7ea7e89`](https://github.com/apache/spark/commit/7ea7e89818e529e43afdb9c18e4a68ba33acdd13).
[GitHub] spark pull request: [SPARK-9172][SQL] Make DecimalPrecision suppor...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/7511 [SPARK-9172][SQL] Make DecimalPrecision support for Intersect and Except JIRA: https://issues.apache.org/jira/browse/SPARK-9172 This simply makes `DecimalPrecision` support `Intersect` and `Except` in addition to `Union`, and adds unit tests for `DecimalPrecision` as well. You can merge this pull request into a Git repository by running: $ git pull https://github.com/viirya/spark-1 more_decimalprecieion Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/7511.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #7511 commit 9fb0d490f4244963138e0fcaddba82ad066b0a3f Author: Liang-Chi Hsieh vii...@appier.com Date: 2015-07-19T09:22:53Z Make DecimalPrecision support for Intersect and Except.
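For context, the precision widening that such an analyzer rule applies to the two sides of a set operation can be sketched as follows. This is a simplified model following the common SQL convention for combining decimal types, not Spark's actual `DecimalPrecision` code:

```python
# Simplified model of widening two decimal types (precision, scale) to a
# common type, e.g. for the two sides of an INTERSECT/EXCEPT/UNION:
#   scale     = max(s1, s2)
#   precision = max(p1 - s1, p2 - s2) + scale   (integer digits + scale)
def widen_decimal(p1, s1, p2, s2):
    scale = max(s1, s2)
    int_digits = max(p1 - s1, p2 - s2)
    return (int_digits + scale, scale)

# DECIMAL(10,2) combined with DECIMAL(12,4):
print(widen_decimal(10, 2, 12, 4))  # (12, 4)
```

Both operands are then cast to the widened type, so the set operation compares values of a single decimal type.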
[GitHub] spark pull request: [SPARK-9172][SQL] Make DecimalPrecision suppor...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7511#issuecomment-122641980 Merged build triggered.
[GitHub] spark pull request: [SPARK-9179] [BUILD] Allows committers to spec...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/7508#issuecomment-122641804 @srowen Thanks for the review :) Just couldn't help removing those trailing spaces... I'm merging this to master then.
[GitHub] spark pull request: [SPARK-9179] [BUILD] Allows committers to spec...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/7508
[GitHub] spark pull request: [SPARK-9172][SQL] Make DecimalPrecision suppor...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/7511#issuecomment-122644123 Wait for #7510 to solve the compilation error.
[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7504#issuecomment-122632691 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...
GitHub user tarekauel opened a pull request: https://github.com/apache/spark/pull/7505 [SPARK-8199][SQL] follow up; revert change in test @rxin / @davies Sorry for that unnecessary change. And thanks again for all your support! You can merge this pull request into a Git repository by running: $ git pull https://github.com/tarekauel/spark SPARK-8199-FollowUp Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/7505.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #7505 commit 67acfe6ff366e2050a72069842b088935d81e2ef Author: Tarek Auel tarek.a...@googlemail.com Date: 2015-07-19T06:01:02Z [SPARK-8199] follow up; revert change in test
[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7505#issuecomment-122632766 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7505#issuecomment-122632788 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/7505#issuecomment-122632899 @tarekauel If the Calendar is created inside the for-loop (over `i`), we should use `i`; otherwise use 1. Is that correct?
[GitHub] spark pull request: [SQL] Make date/time functions more consistent...
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/7506 [SQL] Make date/time functions more consistent with other database systems. This renames some of the functions that were just merged in order to be more consistent with other databases. Also did some small cleanups. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rxin/spark datetime Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/7506.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #7506 commit 9c08fdc73d601bb856a16fca4c8c700dc29f3717 Author: Reynold Xin r...@databricks.com Date: 2015-07-19T06:12:08Z [SQL] Make date/time functions more consistent with other database systems. This renames some of the functions that are just merged in order to be more consistent with other databases. Also did some small cleanups.
[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...
Github user tarekauel commented on the pull request: https://github.com/apache/spark/pull/7505#issuecomment-122632995 Now it's right, isn't it?
[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/7505#issuecomment-122632950 There are four more places (below that) that need to be fixed.
[GitHub] spark pull request: [SQL] Make date/time functions more consistent...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/7506#discussion_r34955144 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala --- @@ -1748,182 +1748,6 @@ object functions { */ def length(columnName: String): Column = length(Column(columnName)) - // --- End diff -- note that this previously cut right into the middle of string functions so I moved them
[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/7505#issuecomment-122632959 After this rush, you should get some more rest, and so should I. :)
[GitHub] spark pull request: [SPARK-8935][SQL] Implement code generation fo...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7365#issuecomment-122633040 [Test build #37752 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37752/console) for PR 7365 at commit [`5de0a95`](https://github.com/apache/spark/commit/5de0a951371a23cd198a6cf69b9fcb238f792f0e). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SQL] Make date/time functions more consistent...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7506#issuecomment-122633011 Merged build triggered.
[GitHub] spark pull request: [SPARK-8935][SQL] Implement code generation fo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7365#issuecomment-122633043 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SQL] Make date/time functions more consistent...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7506#issuecomment-122633012 Merged build started.
[GitHub] spark pull request: [SQL] Make date/time functions more consistent...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/7506#issuecomment-122633048 cc @tarekauel and @davies
[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/7504#issuecomment-122633202 Jenkins, retest this please.
[GitHub] spark pull request: [SQL] Make date/time functions more consistent...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7506#issuecomment-122633104 Merged build started.
[GitHub] spark pull request: [SQL] Make date/time functions more consistent...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7506#issuecomment-122633099 Merged build triggered.
[GitHub] spark pull request: [SQL] Make date/time functions more consistent...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7506#issuecomment-122633264 Merged build started.
[GitHub] spark pull request: [SPARK-9128][Core] Get outerclasses and object...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7459#issuecomment-122633267 Merged build started.
[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7504#issuecomment-122633261 Merged build triggered.
[GitHub] spark pull request: [SQL] Make date/time functions more consistent...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7506#issuecomment-122633237 [Test build #37759 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37759/consoleFull) for PR 7506 at commit [`e44a4a0`](https://github.com/apache/spark/commit/e44a4a0579ea65093fdb7ca39749855be3a50fcd).
[GitHub] spark pull request: [SQL] Make date/time functions more consistent...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7506#issuecomment-122633291 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7504#issuecomment-122633265 Merged build started.
[GitHub] spark pull request: [SPARK-9128][Core] Get outerclasses and object...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7459#issuecomment-122633262 Merged build triggered.
[GitHub] spark pull request: [SQL] Make date/time functions more consistent...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7506#issuecomment-122633260 Merged build triggered.
[GitHub] spark pull request: [SPARK-9128][Core] Get outerclasses and object...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7459#issuecomment-122633283 [Test build #37762 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37762/consoleFull) for PR 7459 at commit [`7c9858d`](https://github.com/apache/spark/commit/7c9858db0f8374c8f124b4a964190ad2ff5ad898).
[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7504#issuecomment-12262 [Test build #37761 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37761/consoleFull) for PR 7504 at commit [`dda1021`](https://github.com/apache/spark/commit/dda1021891cbfea0c6859542f3270a5ae8c20486).
[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7505#issuecomment-12267 [Test build #1113 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1113/consoleFull) for PR 7505 at commit [`d09321c`](https://github.com/apache/spark/commit/d09321c7f3a4d5127c357fe15e7d6ab9531719d9).
[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/7505#issuecomment-122633326 LGTM.
[GitHub] spark pull request: [SQL] Make date/time functions more consistent...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/7506#issuecomment-122633426 LGTM
[GitHub] spark pull request: [SQL] Make date/time functions more consistent...
Github user tarekauel commented on the pull request: https://github.com/apache/spark/pull/7506#issuecomment-122633436 @rxin Could you do this little fix as well? https://github.com/apache/spark/pull/7505/files Why do we switch from day_of_month to dayofmonth? Most SQL implementations use underscores: [MySQL](https://dev.mysql.com/doc/refman/5.0/en/func-op-summary-ref.html) [SAP HANA](http://help.sap.com/saphelp_hanaplatform/helpdata/en/20/9f228975191014baed94f1b69693ae/content.htm?frameset=/en/20/9ddefe75191014ac249bf78ba2a1e9/frameset.htm&current_toc=/en/2e/1ef8b4f4554739959886e55d4c127b/plain.htm&node_id=91&show_children=false) [Oracle](http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions001.htm#i88891) I would prefer underscores because they improve readability when you write all SQL keywords in caps, e.g.: `SELECT name, age, DAY_OF_MONTH(birthday) AS birthday FROM people WHERE age > 15` compared to `SELECT name, age, DAYOFMONTH(birthday) AS birthday FROM people WHERE age > 15` I'm not a Python pro, but I thought that underscores are 'pythonic', aren't they?
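The "pythonic" point above matches PEP 8, which recommends lowercase names with underscores (snake_case) for functions. A minimal, hypothetical Python helper illustrating the underscore spelling under discussion (this is not Spark's actual API, just a sketch of the naming style):

```python
from datetime import date

# Hypothetical helper named in snake_case, the style PEP 8 recommends
# for Python functions; Spark's real Python API is not assumed here.
def day_of_month(d: date) -> int:
    """Return the day-of-month component of a date."""
    return d.day

day_of_month(date(2015, 7, 18))  # 18
```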
[GitHub] spark pull request: [SQL] Make date/time functions more consistent...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/7506#issuecomment-122633378 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/7504#issuecomment-122633444 cc @davies
[GitHub] spark pull request: [SQL] Make date/time functions more consistent...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/7506#issuecomment-122633558 Both MySQL and HANA use dayofmonth, without the underscore?
[GitHub] spark pull request: [SPARK-9019][YARN] Add RM delegation token to ...
Github user bolkedebruin commented on the pull request: https://github.com/apache/spark/pull/7489#issuecomment-122633600 Before adding the code I also grepped for getRMDelegationToken, which is the API call. It is not in Spark. Additionally, you are required to add it to the launch context, so where else can it be? This was tested on an HDP 2.2.0 setup with FreeIPA as KDC/LDAP. It was also tested on an HDP 2.2.6 clean install with Kerberos activated from Ambari. We also tested with Spark 1.3.1. All show this behavior. I might be able to test it with a CDH 5.3 cluster, but would you be able to share a debug log yourself from the container to confirm that an RM token is generated in your case and/or that the behavior is different? I would like to get to the bottom of this, and as I said I was surprised it wasn't in there before, but until now my data says so and I'm starting to run out of options to provide more evidence. On 19 Jul 2015, at 02:24, Hari Shreedharan notificati...@github.com wrote: I have not seen this issue. I am not saying RM is not running. I am saying there might be a config issue somewhere. I don't have access to the code right now, but I am fairly sure there is code that adds the RM tokens already. On Saturday, July 18, 2015, bolkedebruin notificati...@github.com wrote: @harishreedharan https://github.com/harishreedharan We tested it and the issue is with both keytab and kinit. Why do you think the RM is not running? As it is actually running (the cluster is/was not doing much). The connection refused message happens *after* the SASL negotiation fails and is a bit misleading. See below for the same job but with my patch included (will add it in a minute). -- Reply to this email directly or view it on GitHub https://github.com/apache/spark/pull/7489#issuecomment-122580670. -- Thanks, Hari -- Reply to this email directly or view it on GitHub.
[GitHub] spark pull request: [SPARK-8638] [SQL] Window Function Performance...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/7057#issuecomment-122633626 @hvanhovell Overall looks good. I am merging it to master. I will leave a few comments for minor changes. Can you submit a follow-up PR to address them?
[GitHub] spark pull request: [SPARK-8638] [SQL] Window Function Performance...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/7057
[GitHub] spark pull request: [SPARK-8240] [SPARK-8241] [SQL] string functio...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/6775#issuecomment-122633649 @adrian-wang I had some time this weekend and added concat/concat_ws with code gen. I'm going to close this one. Thanks a lot.
[GitHub] spark pull request: [SPARK-8638] [SQL] Window Function Performance...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/7057#discussion_r3497 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala --- @@ -38,443 +84,667 @@ case class Window( child: SparkPlan) extends UnaryNode { - override def output: Seq[Attribute] = - (projectList ++ windowExpression).map(_.toAttribute) + override def output: Seq[Attribute] = projectList ++ windowExpression.map(_.toAttribute) - override def requiredChildDistribution: Seq[Distribution] = + override def requiredChildDistribution: Seq[Distribution] = { if (windowSpec.partitionSpec.isEmpty) { - // This operator will be very expensive. + // Only show warning when the number of bytes is larger than 100 MB? + logWarning("No Partition Defined for Window operation! Moving all data to a single " + + "partition, this can cause serious performance degradation.") AllTuples :: Nil - } else { - ClusteredDistribution(windowSpec.partitionSpec) :: Nil - } - - // Since window functions are adding columns to the input rows, the child's outputPartitioning - // is preserved. - override def outputPartitioning: Partitioning = child.outputPartitioning - - override def requiredChildOrdering: Seq[Seq[SortOrder]] = { - // The required child ordering has two parts. - // The first part is the expressions in the partition specification. - // We add these expressions to the required ordering to make sure input rows are grouped - // based on the partition specification. So, we only need to process a single partition - // at a time. - // The second part is the expressions specified in the ORDER BY clause. - // Basically, we first use sort to group rows based on partition specifications and then sort - // Rows in a group based on the order specification.
- (windowSpec.partitionSpec.map(SortOrder(_, Ascending)) ++ windowSpec.orderSpec) :: Nil + } else ClusteredDistribution(windowSpec.partitionSpec) :: Nil } - // Since window functions basically add columns to input rows, this operator - // will not change the ordering of input rows. + override def requiredChildOrdering: Seq[Seq[SortOrder]] = + Seq(windowSpec.partitionSpec.map(SortOrder(_, Ascending)) ++ windowSpec.orderSpec) + override def outputOrdering: Seq[SortOrder] = child.outputOrdering - case class ComputedWindow( - unbound: WindowExpression, - windowFunction: WindowFunction, - resultAttribute: AttributeReference) - - // A list of window functions that need to be computed for each group. - private[this] val computedWindowExpressions = windowExpression.flatMap { window => - window.collect { - case w: WindowExpression => - ComputedWindow( - w, - BindReferences.bindReference(w.windowFunction, child.output), - AttributeReference(s"windowResult:$w", w.dataType, w.nullable)()) + /** + * Create a bound ordering object for a given frame type and offset. A bound ordering object is + * used to determine which input row lies within the frame boundaries of an output row. + * + * This method uses Code Generation. It can only be used on the executor side. + * + * @param frameType to evaluate. This can either be Row or Range based. + * @param offset with respect to the row. + * @return a bound ordering object. + */ + private[this] def createBoundOrdering(frameType: FrameType, offset: Int): BoundOrdering = { + frameType match { + case RangeFrame => + val (exprs, current, bound) = if (offset == 0) { + // Use the entire order expression when the offset is 0. + val exprs = windowSpec.orderSpec.map(_.child) + val projection = newMutableProjection(exprs, child.output) + (windowSpec.orderSpec, projection(), projection()) + } + else if (windowSpec.orderSpec.size == 1) { --- End diff -- `} else if`
[GitHub] spark pull request: [SPARK-8240] [SPARK-8241] [SQL] string functio...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/6775 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-8638] [SQL] Window Function Performance...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/7057#discussion_r3496 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala --- + val sortExpr = windowSpec.orderSpec.head + val expr = sortExpr.child + // Create the projection which returns the current 'value'. + val current = newMutableProjection(expr :: Nil, child.output)() + // Flip the sign of the offset when processing the order is descending + val boundOffset = if (sortExpr.direction == Descending) -offset + else offset --- End diff -- ``` val boundOffset = if (sortExpr.direction == Descending) -offset else offset ```
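The sign flip quoted in the diff above (negate the offset when the sort direction is descending, so that a RANGE frame boundary still points the right way) can be sketched outside of Spark. The helper names below are illustrative assumptions, not Spark's API:

```python
# Sketch of the RANGE-frame bound idea from the quoted diff: the frame bound
# for the current row is its order value shifted by the offset, with the
# offset's sign flipped when the ordering is descending. Names are
# illustrative only.
def bound_offset(offset: int, descending: bool) -> int:
    # Flip the sign of the offset when the ordering is descending.
    return -offset if descending else offset

def reaches_lower_bound(row_value: int, current_value: int,
                        offset: int, descending: bool) -> bool:
    # A row is at or past the lower frame bound when its order value has
    # reached current_value + bound_offset in the sort direction.
    bound = current_value + bound_offset(offset, descending)
    return row_value <= bound if descending else row_value >= bound
```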
[GitHub] spark pull request: [SPARK-8638] [SQL] Window Function Performance...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/7057#discussion_r34955549 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/WindowSuite.scala --- @@ -0,0 +1,79 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive.execution + +import org.apache.spark.sql.{Row, QueryTest} +import org.apache.spark.sql.expressions.Window +import org.apache.spark.sql.functions._ +import org.apache.spark.sql.hive.test.TestHive.implicits._ + +/** + * Window expressions are tested extensively by the following test suites: + * [[org.apache.spark.sql.hive.HiveDataFrameWindowSuite]] + * [[org.apache.spark.sql.hive.execution.HiveWindowFunctionQueryWithoutCodeGenSuite]] + * [[org.apache.spark.sql.hive.execution.HiveWindowFunctionQueryFileWithoutCodeGenSuite]] + * However these suites do not cover all possible (i.e. more exotic) settings. This suite fills + * this gap. + * + * TODO Move this class to the sql/core project when we move to Native Spark UDAFs. + */ +class WindowSuite extends QueryTest { --- End diff -- Seems we do not need to create a new suite, right? We can just use `HiveDataFrameWindowSuite`.
[GitHub] spark pull request: [SPARK-6910] [SQL] Support for pushing predica...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/7492#issuecomment-122633732 @marmbrus I think it's probably OK to merge this one first. But I still haven't got any clue about the root cause mentioned in https://github.com/apache/spark/pull/7421#issuecomment-122527391 yet.
[GitHub] spark pull request: [SPARK-8638] [SQL] Window Function Performance...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/7057#discussion_r3498 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala --- + val sortExpr = windowSpec.orderSpec.head + val expr = sortExpr.child + // Create the projection which returns the current 'value'. + val current = newMutableProjection(expr :: Nil, child.output)() + // Flip the sign of the offset when processing the order is descending + val boundOffset = if (sortExpr.direction == Descending) -offset + else offset + // Create the projection which returns the current 'value' modified by adding the offset. + val boundExpr = Add(expr, Cast(Literal.create(boundOffset, IntegerType), expr.dataType)) +
[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/7504#discussion_r34955580 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala --- @@ -56,15 +51,76 @@ case class Concat(children: Seq[Expression]) extends Expression with ImplicitCas override protected def genCode(ctx: CodeGenContext, ev: GeneratedExpressionCode): String = { val evals = children.map(_.gen(ctx)) - val inputs = evals.map { eval => s"${eval.isNull} ? null : ${eval.primitive}" }.mkString(", ") + val inputs = evals.map { eval => + s"${eval.isNull} ? (UTF8String)null : ${eval.primitive}" + }.mkString(", ") evals.map(_.code).mkString("\n") + s""" boolean ${ev.isNull} = false; UTF8String ${ev.primitive} = UTF8String.concat($inputs); + if (${ev.primitive} == null) { + ${ev.isNull} = true; + } """ } } +/** + * An expression that concatenates multiple input strings or array of strings into a single string, + * using a given separator (the first child). + * + * Returns null if the separator is null. Otherwise, concat_ws skips all null values. + */ +case class ConcatWs(children: Seq[Expression]) + extends Expression with ImplicitCastInputTypes with CodegenFallback { + + require(children.nonEmpty, s"$prettyName requires at least one argument.") + + override def prettyName: String = "concat_ws" + + /** The 1st child (separator) is str, and rest are either str or array of str. */ + override def inputTypes: Seq[AbstractDataType] = { + val arrayOrStr = TypeCollection(ArrayType(StringType), StringType) + StringType +: Seq.fill(children.size - 1)(arrayOrStr) + } + + override def dataType: DataType = StringType + + override def nullable: Boolean = children.head.nullable + override def foldable: Boolean = children.forall(_.foldable) + + override def eval(input: InternalRow): Any = { + val flatInputs = children.flatMap { child => + child.eval(input) match { + case s: UTF8String => Iterator(s) + case arr: Seq[_] => arr.asInstanceOf[Seq[UTF8String]] + case null => Iterator(null.asInstanceOf[UTF8String]) --- End diff -- minor: Can we just ignore the `null`?
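The concat_ws semantics stated in the quoted doc comment (a null separator returns null; otherwise null values are skipped; arguments may be strings or arrays of strings) can be sketched in plain Python. This is an illustrative model of the described behavior, not Spark's implementation:

```python
# Model of the concat_ws contract described in the diff's doc comment:
# - a None (null) separator yields None,
# - None values among the arguments are skipped,
# - list arguments (arrays of strings) are flattened into the result.
def concat_ws(sep, *args):
    if sep is None:
        return None
    flat = []
    for a in args:
        if a is None:
            continue  # concat_ws skips null values
        if isinstance(a, list):
            flat.extend(x for x in a if x is not None)
        else:
            flat.append(a)
    return sep.join(flat)

concat_ws("-", "a", None, ["b", "c"])  # "a-b-c"
```

This also illustrates the review question above: if null inputs are simply skipped during evaluation, there is no need to keep a typed null placeholder in the flattened sequence.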
[GitHub] spark pull request: [SPARK-8756][SQL] Keep cached information and ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/7154#issuecomment-122633884 cc @liancheng
[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/7504#discussion_r34955583

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala ---

```diff
+  override def nullable: Boolean = children.head.nullable
+  override def foldable: Boolean = children.forall(_.foldable)
```

--- End diff --

existing: we could use this as the default one for Expression.
[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/7504#discussion_r34955606

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala ---

```diff
+  override def eval(input: InternalRow): Any = {
+    val flatInputs = children.flatMap { child =>
+      child.eval(input) match {
+        case s: UTF8String => Iterator(s)
+        case arr: Seq[_] => arr.asInstanceOf[Seq[UTF8String]]
+        case null => Iterator(null.asInstanceOf[UTF8String])
```

--- End diff --

How? It won't match `s`. Also, this thing doesn't compile if I do a wildcard match on `s`, e.g.

```scala
case s: _ => ...
```
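As background (a standalone sketch with hypothetical names, not part of the patch): `case s: _` is indeed rejected by the compiler, since `_` is not a valid standalone type pattern, whereas a concrete typed pattern or a bare binding compiles; note also that a typed pattern never matches `null`:

```scala
// Hypothetical standalone sketch; not Spark code.
object MatchForms {
  def describe(x: Any): String = x match {
    // case s: _ => ...              // does not compile: `_` is not a legal type pattern here
    case s: String => s"string: $s"  // typed pattern: binds s, never matches null
    case null      => "null"         // null must be matched explicitly
    case other     => s"other: $other" // untyped binding: matches anything remaining
  }
}
```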
[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/7504#discussion_r34955617

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala ---

```diff
+  override def eval(input: InternalRow): Any = {
+    val flatInputs = children.flatMap { child =>
+      child.eval(input) match {
+        case s: UTF8String => Iterator(s)
+        case arr: Seq[_] => arr.asInstanceOf[Seq[UTF8String]]
+        case null => Iterator(null.asInstanceOf[UTF8String])
```

--- End diff --

I mean we can return an empty Iterator.
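The suggestion can be sketched outside Spark (a minimal, hypothetical example using plain `String` in place of `UTF8String`): returning `Iterator.empty` in the `null` branch drops null children during the flatten, instead of inserting a null placeholder that a later step must skip.

```scala
// Hypothetical standalone sketch; String stands in for Spark's UTF8String.
object FlattenSketch {
  // As in the diff: a null child becomes a null element in the flattened sequence.
  def flattenKeepingNulls(inputs: Seq[Any]): Seq[String] = inputs.flatMap {
    case s: String   => Iterator(s)
    case arr: Seq[_] => arr.asInstanceOf[Seq[String]]
    case null        => Iterator(null.asInstanceOf[String])
  }

  // The suggested alternative: null children simply disappear from the result.
  def flattenIgnoringNulls(inputs: Seq[Any]): Seq[String] = inputs.flatMap {
    case s: String   => Iterator(s)
    case arr: Seq[_] => arr.asInstanceOf[Seq[String]]
    case null        => Iterator.empty
  }
}
```

With `Seq("a", null, Seq("b", "c"))` as input, the first form yields `List(a, null, b, c)` and the second `List(a, b, c)`. A null separator (the first child) still has to make the whole result null, which may be why the patch keeps the placeholder rather than dropping nulls at this point.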
[GitHub] spark pull request: [SPARK-9166][SQL][PYSPARK] Capture and hide Il...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/7497#issuecomment-122634105

[Test build #37763 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37763/consoleFull) for PR 7497 at commit [`9ace67d`](https://github.com/apache/spark/commit/9ace67dede05115cfed7f4794867cd9dabe370d8).