[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...

2015-07-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7505#issuecomment-122637767
  
  [Test build #1113 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1113/console)
 for   PR 7505 at commit 
[`d09321c`](https://github.com/apache/spark/commit/d09321c7f3a4d5127c357fe15e7d6ab9531719d9).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9175] [MLlib] BLAS.gemm fails to update...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7503#issuecomment-122638017
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-9109] [GraphX] Keep the cached edge in ...

2015-07-19 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/7469#issuecomment-122638035
  
Thanks @ankurdave -- you can follow this by resolving the issue (done 
already now)





[GitHub] spark pull request: [SPARK-9175] [MLlib] BLAS.gemm fails to update...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7503#issuecomment-122638019
  
Merged build started.





[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...

2015-07-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/7505





[GitHub] spark pull request: [SQL] Make date/time functions more consistent...

2015-07-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/7506





[GitHub] spark pull request: [SPARK-9094] [PARENT] Increased io.dropwizard....

2015-07-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/7493





[GitHub] spark pull request: [SQL] Make date/time functions more consistent...

2015-07-19 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/7506#issuecomment-122638167
  
I've merged this.






[GitHub] spark pull request: [SPARK-6761][SQL] Approximate quantile for Dat...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6042#issuecomment-122638558
  
 Build triggered.





[GitHub] spark pull request: [SPARK-6761][SQL] Approximate quantile for Dat...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6042#issuecomment-122638561
  
Build started.





[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.

2015-07-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7504#issuecomment-122638622
  
  [Test build #37761 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37761/console)
 for   PR 7504 at commit 
[`dda1021`](https://github.com/apache/spark/commit/dda1021891cbfea0c6859542f3270a5ae8c20486).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class ConcatWs(children: Seq[Expression])`






[GitHub] spark pull request: [SPARK-6761][SQL] Approximate quantile for Dat...

2015-07-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6042#issuecomment-122638906
  
  [Test build #37769 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37769/consoleFull)
 for   PR 6042 at commit 
[`1086537`](https://github.com/apache/spark/commit/10865378c3aba5e639c352bded61a616933a5f1c).





[GitHub] spark pull request: [SPARK-9178][SQL] Add an empty string constant...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7509#issuecomment-122639078
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-9178][SQL] Add an empty string constant...

2015-07-19 Thread tarekauel
GitHub user tarekauel opened a pull request:

https://github.com/apache/spark/pull/7509

[SPARK-9178][SQL] Add an empty string constant to UTF8String

Jira: https://issues.apache.org/jira/browse/SPARK-9178

To avoid repeated calls to `UTF8String.fromString("")`, this PR adds an 
`EMPTY_STRING` constant to `UTF8String`. A `UTF8String` is immutable, so it 
should be safe to share a single constant, right?

I searched for current usages of `UTF8String.fromString("")` with 
`grep -R "UTF8String.fromString(\"\")" .`
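
As a rough illustration of the idea (not the code in this PR), here is a 
minimal Java sketch of an immutable string wrapper that shares one empty 
instance. `Utf8Text` and all of its members are hypothetical stand-ins, not 
Spark's `UTF8String` API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for an immutable UTF-8 string type; NOT Spark's UTF8String.
final class Utf8Text {
    // One shared empty instance. Reuse is safe because no method mutates `bytes`.
    static final Utf8Text EMPTY = new Utf8Text(new byte[0]);

    private final byte[] bytes;

    private Utf8Text(byte[] bytes) {
        this.bytes = bytes;
    }

    // Assumed factory: hands back the shared constant instead of allocating for "".
    static Utf8Text fromString(String s) {
        return s.isEmpty() ? EMPTY : new Utf8Text(s.getBytes(StandardCharsets.UTF_8));
    }

    int numBytes() {
        return bytes.length;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Utf8Text && Arrays.equals(bytes, ((Utf8Text) other).bytes);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(bytes);
    }

    @Override
    public String toString() {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Both calls yield the very same object, so no per-call allocation happens.
        System.out.println(Utf8Text.fromString("") == Utf8Text.EMPTY);
        System.out.println(Utf8Text.fromString("spark").numBytes());
    }
}
```

Sharing is only safe as long as the type stays immutable; if a mutating 
method were ever added, every holder of the constant would observe the change.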

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tarekauel/spark SPARK-9178

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/7509.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #7509


commit 748b87a38575664fcfc877ccc575678ba54a9df6
Author: Tarek Auel tarek.a...@googlemail.com
Date:   2015-07-19T08:22:43Z

[SPARK-9178] Add empty string constant to UTF8String







[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7504#issuecomment-122638627
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-9128][Core] Get outerclasses and object...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7459#issuecomment-122640067
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-9128][Core] Get outerclasses and object...

2015-07-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7459#issuecomment-122640048
  
  [Test build #37762 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37762/console)
 for   PR 7459 at commit 
[`7c9858d`](https://github.com/apache/spark/commit/7c9858db0f8374c8f124b4a964190ad2ff5ad898).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [HOTFIX] [SQL] Fixes compilation error introdu...

2015-07-19 Thread liancheng
GitHub user liancheng opened a pull request:

https://github.com/apache/spark/pull/7510

[HOTFIX] [SQL] Fixes compilation error introduced by PR #7506

PR #7506 breaks the master build because of a compilation error. Note that #7506 
itself looks good, but it seems that `git merge` did something stupid.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/liancheng/spark hotfix-for-pr-7506

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/7510.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #7510


commit 7ea7e89818e529e43afdb9c18e4a68ba33acdd13
Author: Cheng Lian l...@databricks.com
Date:   2015-07-19T09:06:07Z

Fixes compilation error







[GitHub] spark pull request: [HOTFIX] [SQL] Fixes compilation error introdu...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7510#issuecomment-122641202
  
 Merged build triggered.





[GitHub] spark pull request: [HOTFIX] [SQL] Fixes compilation error introdu...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7510#issuecomment-122641210
  
Merged build started.





[GitHub] spark pull request: [SPARK-9172][SQL] Make DecimalPrecision suppor...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7511#issuecomment-122642015
  
Merged build started.





[GitHub] spark pull request: [SPARK-9172][SQL] Make DecimalPrecision suppor...

2015-07-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7511#issuecomment-122643246
  
  [Test build #37771 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37771/consoleFull)
 for   PR 7511 at commit 
[`9fb0d49`](https://github.com/apache/spark/commit/9fb0d490f4244963138e0fcaddba82ad066b0a3f).





[GitHub] spark pull request: [SPARK-9172][SQL] Make DecimalPrecision suppor...

2015-07-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7511#issuecomment-122643791
  
  [Test build #37771 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37771/console)
 for   PR 7511 at commit 
[`9fb0d49`](https://github.com/apache/spark/commit/9fb0d490f4244963138e0fcaddba82ad066b0a3f).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-9172][SQL] Make DecimalPrecision suppor...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7511#issuecomment-122643795
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-8951][SparkR] support Unicode character...

2015-07-19 Thread CHOIJAEHONG1
Github user CHOIJAEHONG1 commented on the pull request:

https://github.com/apache/spark/pull/7494#issuecomment-122643950
  
I am not sure about `readString`, but the test case that verifies Unicode 
characters survive a round trip from a native data frame to Spark's DataFrame 
and back failed. Something underneath is going wrong.

```
1. Failure(@test_sparkSQL.R#438): collect() support Unicode characters 
-
collect(where(df2, df2$name == \346\202\250\345\245\275))[[2]] not equal 
to \346\202\250\345\245\275
1 string mismatches:
x[1]: \346\202\250\345\245\275
y[1]: e682a8e5a5bd
```
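
For what it's worth, the two values in the mismatch look like the same six 
bytes written two ways: `x[1]` as octal escapes and `y[1]` as hex. A small 
Java sketch that checks this reading (the class and method names are made up 
for illustration):

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for comparing the two spellings from the test output.
final class EscapedBytesSketch {
    // Parses a run of backslash-octal escapes such as \346\202 into raw bytes.
    static byte[] fromOctalEscapes(String escaped) {
        String[] parts = escaped.split("\\\\"); // split on a single backslash
        int count = 0;
        for (String part : parts) {
            if (!part.isEmpty()) {
                count++;
            }
        }
        byte[] out = new byte[count];
        int i = 0;
        for (String part : parts) {
            if (!part.isEmpty()) {
                out[i++] = (byte) Integer.parseInt(part, 8); // base-8 digits
            }
        }
        return out;
    }

    // Renders bytes as lowercase hex, two digits per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] bytes = fromOctalEscapes("\\346\\202\\250\\345\\245\\275");
        System.out.println(toHex(bytes));
        // Decoding as UTF-8 gives the two characters U+60A8 U+597D.
        System.out.println(new String(bytes, StandardCharsets.UTF_8).length());
    }
}
```

If that reading is right, the bytes themselves survive the round trip and 
the difference lies in how the string is marked or printed, though this 
sketch cannot show where that happens.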





[GitHub] spark pull request: Spark 8695

2015-07-19 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/7397#issuecomment-122644170
  
(Ah: this continues a discussion in 
https://github.com/apache/spark/pull/7168 -- should have been mentioned in this 
PR.)

@piganesh do you mind closing this if you're not going to follow up?
Otherwise, please make the `.toDouble` change and correctly title this PR.





[GitHub] spark pull request: [MESOS][SPARK-8798] Allow additional uris to b...

2015-07-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7195#issuecomment-122635356
  
  [Test build #37757 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37757/console)
 for   PR 7195 at commit 
[`42e2ee2`](https://github.com/apache/spark/commit/42e2ee29ecd034b93c6a705fc9c8d4297de6362b).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-8255][SPARK-8256][SQL]Add regex_extract...

2015-07-19 Thread tarekauel
Github user tarekauel commented on a diff in the pull request:

https://github.com/apache/spark/pull/7468#discussion_r34955908
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
 ---
@@ -673,6 +673,110 @@ case class Encode(value: Expression, charset: 
Expression)
 }
 
 /**
+ * Replace all substrings of str that match regexp with rep
+ */
+case class RegExpReplace(subject: Expression, regexp: Expression, rep: 
Expression)
+  extends Expression with ImplicitCastInputTypes {
+
+  // last regex in string, we will update the pattern iff regexp value 
changed.
+  @transient private var lastRegex: UTF8String = _
+  // last regex pattern, we cache it for performance concern
+  @transient private var pattern: Pattern = _
+  // last replacement string, we don't want to convert a UTF8String = 
java.langString every time.
+  @transient private var lastReplacement: String = _
+  @transient private var lastReplacementInUTF8: UTF8String = _
+  // result buffer write by Matcher
+  @transient private val result: StringBuffer = new StringBuffer
+
+  override def nullable: Boolean = children.foldLeft(false)(_ || 
_.nullable)
+  override def foldable: Boolean = children.foldLeft(true)(_ && _.foldable)
+
+  override def eval(input: InternalRow): Any = {
+val s = subject.eval(input)
+if (null != s) {
+  val p = regexp.eval(input)
+  if (null != p) {
+val r = rep.eval(input)
+if (null != r) {
+  if (!p.equals(lastRegex)) {
+// regex value changed
+lastRegex = p.asInstanceOf[UTF8String]
+pattern = Pattern.compile(lastRegex.toString)
+  }
+  if (!r.equals(lastReplacementInUTF8)) {
+// replacement string changed
+lastReplacementInUTF8 = r.asInstanceOf[UTF8String]
+lastReplacement = lastReplacementInUTF8.toString
+  }
+  val m = pattern.matcher(s.toString())
+  result.delete(0, result.length())
+
+  while (m.find) {
+m.appendReplacement(result, lastReplacement)
+  }
+  m.appendTail(result)
+
+  return UTF8String.fromString(result.toString)
+}
+  }
+}
+
+null
+  }
+
+  override def dataType: DataType = StringType
+  override def inputTypes: Seq[AbstractDataType] = Seq(StringType, 
StringType, StringType)
+  override def children: Seq[Expression] = subject :: regexp :: rep :: Nil
+  override def prettyName: String = "regexp_replace"
+}
+
+/**
+ * UDF to extract a specific(idx) group identified by a java regex.
+ */
+case class RegExpExtract(subject: Expression, regexp: Expression, idx: 
Expression)
+  extends Expression with ImplicitCastInputTypes {
+  def this(s: Expression, r: Expression) = this(s, r, Literal(1))
+
+  // last regex in string, we will update the pattern iff regexp value 
changed.
+  @transient private var lastRegex: UTF8String = _
+  // last regex pattern, we cache it for performance concern
+  @transient private var pattern: Pattern = _
+
+  override def nullable: Boolean = children.foldLeft(false)(_ || 
_.nullable)
+  override def foldable: Boolean = children.foldLeft(true)(_ && _.foldable)
+
+  override def eval(input: InternalRow): Any = {
+val s = subject.eval(input)
+if (null != s) {
+  val p = regexp.eval(input)
+  if (null != p) {
+val r = idx.eval(input)
+if (null != r) {
+  if (!p.equals(lastRegex)) {
+// regex value changed
+lastRegex = p.asInstanceOf[UTF8String]
+pattern = Pattern.compile(lastRegex.toString)
+  }
+  val m = pattern.matcher(s.toString())
+  if (m.find) {
+val mr: MatchResult = m.toMatchResult
+return UTF8String.fromString(mr.group(r.asInstanceOf[Int]))
+  }
+  return UTF8String.fromString("")
--- End diff --

Okay. I am going to create a Jira and check the coding for existing empty 
strings
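
For readers skimming the diff above: both expressions recompile the regex 
only when its text changes, and the replace variant rebuilds the result with 
`appendReplacement` / `appendTail`. A minimal Java sketch of that pattern, 
using only `java.util.regex`; the class, its methods, and the `compileCount` 
counter are hypothetical and not part of the PR:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of caching a compiled pattern between calls.
final class CachedRegexSketch {
    private String lastRegex;   // regex text seen on the previous call
    private Pattern pattern;    // compiled form of lastRegex
    int compileCount = 0;       // illustration only: counts recompilations

    private Pattern patternFor(String regex) {
        if (!regex.equals(lastRegex)) {
            // Regex text changed, so recompile and remember it.
            lastRegex = regex;
            pattern = Pattern.compile(regex);
            compileCount++;
        }
        return pattern;
    }

    // Replaces every match of regex in subject; null in, null out.
    String replaceAll(String subject, String regex, String replacement) {
        if (subject == null || regex == null || replacement == null) {
            return null;
        }
        Matcher m = patternFor(regex).matcher(subject);
        StringBuffer result = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(result, replacement);
        }
        m.appendTail(result);
        return result.toString();
    }

    // Returns the requested group of the first match, or "" when nothing matches.
    String extract(String subject, String regex, int group) {
        if (subject == null || regex == null) {
            return null;
        }
        Matcher m = patternFor(regex).matcher(subject);
        return m.find() ? m.group(group) : "";
    }

    public static void main(String[] args) {
        CachedRegexSketch s = new CachedRegexSketch();
        System.out.println(s.replaceAll("100-200", "(\\d+)", "num"));
        System.out.println(s.extract("100-200", "(\\d+)-(\\d+)", 2));
    }
}
```

The no-match branch of `extract` is where an empty string is produced on 
every miss, which is the spot a shared empty-string constant would cover.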





[GitHub] spark pull request: [MESOS][SPARK-8798] Allow additional uris to b...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7195#issuecomment-122635407
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SQL] Make date/time functions more consistent...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7506#issuecomment-122637538
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SQL] Make date/time functions more consistent...

2015-07-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7506#issuecomment-122637521
  
  [Test build #37759 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37759/console)
 for   PR 7506 at commit 
[`e44a4a0`](https://github.com/apache/spark/commit/e44a4a0579ea65093fdb7ca39749855be3a50fcd).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Hour(child: Expression) extends UnaryExpression with 
ImplicitCastInputTypes `
  * `case class Minute(child: Expression) extends UnaryExpression with 
ImplicitCastInputTypes `
  * `case class Second(child: Expression) extends UnaryExpression with 
ImplicitCastInputTypes `
  * `case class DayOfYear(child: Expression) extends UnaryExpression with 
ImplicitCastInputTypes `
  * `case class Year(child: Expression) extends UnaryExpression with 
ImplicitCastInputTypes `
  * `case class Quarter(child: Expression) extends UnaryExpression with 
ImplicitCastInputTypes `
  * `case class Month(child: Expression) extends UnaryExpression with 
ImplicitCastInputTypes `
  * `case class DayOfMonth(child: Expression) extends UnaryExpression with 
ImplicitCastInputTypes `
  * `case class WeekOfYear(child: Expression) extends UnaryExpression with 
ImplicitCastInputTypes `
  * `case class DateFormatClass(left: Expression, right: Expression) 
extends BinaryExpression`






[GitHub] spark pull request: [SPARK-9179] [BUILD] Allows committers to spec...

2015-07-19 Thread liancheng
GitHub user liancheng opened a pull request:

https://github.com/apache/spark/pull/7508

[SPARK-9179] [BUILD] Allows committers to specify primary author of the PR 
to be merged

It's a common case that some contributor contributes an initial version of 
a feature/bugfix, and later on some other people (mostly committers) fork and 
add more improvements. When merging these PRs, we probably want to specify the 
original author as the primary author. Currently we can only do this by running

```
$ git commit --amend --author="name <email>"
```

manually right before the merge script pushes to the Apache Git repo. It would 
be nice if the script accepted user-specified primary author information.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/liancheng/spark spark-9179

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/7508.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #7508


commit 218d88e7c74d4cb1dae085ae4a6d1a6221acb90f
Author: Cheng Lian l...@databricks.com
Date:   2015-07-19T08:05:01Z

Allows committers to specify primary author of the PR to be merged







[GitHub] spark pull request: [SPARK-9175] [MLlib] BLAS.gemm fails to update...

2015-07-19 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/7503#issuecomment-122637915
  
LGTM





[GitHub] spark pull request: [SPARK-9175] [MLlib] BLAS.gemm fails to update...

2015-07-19 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/7503#issuecomment-122637916
  
ok to test





[GitHub] spark pull request: [SPARK-9179] [BUILD] Allows committers to spec...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7508#issuecomment-122637898
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-9179] [BUILD] Allows committers to spec...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7508#issuecomment-122637901
  
Merged build started.





[GitHub] spark pull request: changes with lambda (closure)

2015-07-19 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/7502#issuecomment-122637954
  
I think you opened this by mistake? Do you mind closing this PR?





[GitHub] spark pull request: [SPARK-9179] [BUILD] Allows committers to spec...

2015-07-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7508#issuecomment-122637941
  
  [Test build #37767 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37767/consoleFull) for PR 7508 at commit [`218d88e`](https://github.com/apache/spark/commit/218d88e7c74d4cb1dae085ae4a6d1a6221acb90f).





[GitHub] spark pull request: [SPARK-9179] [BUILD] Allows committers to spec...

2015-07-19 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/7508#issuecomment-122637944
  
LGTM. Several superfluous whitespace changes, but hey.





[GitHub] spark pull request: [SPARK-9175] [MLlib] BLAS.gemm fails to update...

2015-07-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7503#issuecomment-122638160
  
  [Test build #37768 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37768/consoleFull) for PR 7503 at commit [`fce199c`](https://github.com/apache/spark/commit/fce199c3419b36a8c6d69d7b9eb293c7d4185b59).





[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...

2015-07-19 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/7505#issuecomment-122638158
  
Thanks - I've merged this.






[GitHub] spark pull request: [SPARK-9094] [PARENT] Increased io.dropwizard....

2015-07-19 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/7493#issuecomment-122638146
  
Merged into master/1.4





[GitHub] spark pull request: [SPARK-9178][SQL] Add an empty string constant...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7509#issuecomment-122639980
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [HOTFIX] [SQL] Fixes compilation error introdu...

2015-07-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7510#issuecomment-122641318
  
  [Test build #37770 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37770/consoleFull) for PR 7510 at commit [`7ea7e89`](https://github.com/apache/spark/commit/7ea7e89818e529e43afdb9c18e4a68ba33acdd13).





[GitHub] spark pull request: [SPARK-9172][SQL] Make DecimalPrecision suppor...

2015-07-19 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/7511

[SPARK-9172][SQL] Make DecimalPrecision support for Intersect and Except

JIRA: https://issues.apache.org/jira/browse/SPARK-9172

Make `DecimalPrecision` support `Intersect` and `Except` in addition to 
`Union`.

Also add a unit test for `DecimalPrecision`.
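
To illustrate the kind of widening a set operation needs when the two sides have different decimal types, here is a small Python sketch. It assumes the rule documented for `Union` (scale = max of the scales, integral digits = max of the integral digits) carries over to `Intersect` and `Except`. `widen_decimal` is a hypothetical name, and any maximum-precision cap is deliberately left out.

```python
def widen_decimal(left, right):
    """Return a (precision, scale) pair wide enough to hold both inputs.

    Each argument is a (precision, scale) tuple. Assumed rule:
    scale = max(s1, s2); precision = max(p1 - s1, p2 - s2) + scale.
    """
    p1, s1 = left
    p2, s2 = right
    scale = max(s1, s2)
    # Digits to the left of the decimal point that must be preserved.
    integral_digits = max(p1 - s1, p2 - s2)
    return (integral_digits + scale, scale)
```

Both children of the set operation would then be cast to the returned type so that their output schemas line up.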

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 more_decimalprecieion

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/7511.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #7511


commit 9fb0d490f4244963138e0fcaddba82ad066b0a3f
Author: Liang-Chi Hsieh vii...@appier.com
Date:   2015-07-19T09:22:53Z

Make DecimalPrecision support for Intersect and Except.







[GitHub] spark pull request: [SPARK-9172][SQL] Make DecimalPrecision suppor...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7511#issuecomment-122641980
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-9179] [BUILD] Allows committers to spec...

2015-07-19 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/7508#issuecomment-122641804
  
@srowen Thanks for the review :) Just couldn't help removing those 
trailing spaces... I'm merging this to master then.





[GitHub] spark pull request: [SPARK-9179] [BUILD] Allows committers to spec...

2015-07-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/7508





[GitHub] spark pull request: [SPARK-9172][SQL] Make DecimalPrecision suppor...

2015-07-19 Thread viirya
Github user viirya commented on the pull request:

https://github.com/apache/spark/pull/7511#issuecomment-122644123
  
Waiting for #7510 to fix the compilation error.





[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7504#issuecomment-122632691
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...

2015-07-19 Thread tarekauel
GitHub user tarekauel opened a pull request:

https://github.com/apache/spark/pull/7505

[SPARK-8199][SQL] follow up; revert change in test

@rxin / @davies 

Sorry for that unnecessary change. And thanks again for all your support!

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tarekauel/spark SPARK-8199-FollowUp

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/7505.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #7505


commit 67acfe6ff366e2050a72069842b088935d81e2ef
Author: Tarek Auel tarek.a...@googlemail.com
Date:   2015-07-19T06:01:02Z

[SPARK-8199] follow up; revert change in test







[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7505#issuecomment-122632766
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7505#issuecomment-122632788
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...

2015-07-19 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/7505#issuecomment-122632899
  
@tarekauel If the Calendar is created inside the for-loop (over `i`), we 
should use `i`; otherwise we should use 1. Is that correct?





[GitHub] spark pull request: [SQL] Make date/time functions more consistent...

2015-07-19 Thread rxin
GitHub user rxin opened a pull request:

https://github.com/apache/spark/pull/7506

[SQL] Make date/time functions more consistent with other database systems.

This renames some of the functions that were just merged in order to be more 
consistent with other databases. Also did some small cleanups.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rxin/spark datetime

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/7506.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #7506


commit 9c08fdc73d601bb856a16fca4c8c700dc29f3717
Author: Reynold Xin r...@databricks.com
Date:   2015-07-19T06:12:08Z

[SQL] Make date/time functions more consistent with other database systems.

This renames some of the functions that were just merged in order to be more 
consistent with other databases.

Also did some small cleanups.







[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...

2015-07-19 Thread tarekauel
Github user tarekauel commented on the pull request:

https://github.com/apache/spark/pull/7505#issuecomment-122632995
  
Now it's right, isn't it?





[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...

2015-07-19 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/7505#issuecomment-122632950
  
There are four more places (below that) that need to be fixed.





[GitHub] spark pull request: [SQL] Make date/time functions more consistent...

2015-07-19 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/7506#discussion_r34955144
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -1748,182 +1748,6 @@ object functions {
*/
   def length(columnName: String): Column = length(Column(columnName))
 
-  
//
--- End diff --

Note that this previously cut right into the middle of the string functions, 
so I moved them.





[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...

2015-07-19 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/7505#issuecomment-122632959
  
After this rush, you should get more rest, and so should I. :)





[GitHub] spark pull request: [SPARK-8935][SQL] Implement code generation fo...

2015-07-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7365#issuecomment-122633040
  
  [Test build #37752 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37752/console) for PR 7365 at commit [`5de0a95`](https://github.com/apache/spark/commit/5de0a951371a23cd198a6cf69b9fcb238f792f0e).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SQL] Make date/time functions more consistent...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7506#issuecomment-122633011
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-8935][SQL] Implement code generation fo...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7365#issuecomment-122633043
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SQL] Make date/time functions more consistent...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7506#issuecomment-122633012
  
Merged build started.





[GitHub] spark pull request: [SQL] Make date/time functions more consistent...

2015-07-19 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/7506#issuecomment-122633048
  
cc @tarekauel and @davies 





[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.

2015-07-19 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/7504#issuecomment-122633202
  
Jenkins, retest this please.





[GitHub] spark pull request: [SQL] Make date/time functions more consistent...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7506#issuecomment-122633104
  
Merged build started.





[GitHub] spark pull request: [SQL] Make date/time functions more consistent...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7506#issuecomment-122633099
  
 Merged build triggered.





[GitHub] spark pull request: [SQL] Make date/time functions more consistent...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7506#issuecomment-122633264
  
Merged build started.





[GitHub] spark pull request: [SPARK-9128][Core] Get outerclasses and object...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7459#issuecomment-122633267
  
Merged build started.





[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7504#issuecomment-122633261
  
 Merged build triggered.





[GitHub] spark pull request: [SQL] Make date/time functions more consistent...

2015-07-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7506#issuecomment-122633237
  
  [Test build #37759 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37759/consoleFull) for PR 7506 at commit [`e44a4a0`](https://github.com/apache/spark/commit/e44a4a0579ea65093fdb7ca39749855be3a50fcd).





[GitHub] spark pull request: [SQL] Make date/time functions more consistent...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7506#issuecomment-122633291
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7504#issuecomment-122633265
  
Merged build started.





[GitHub] spark pull request: [SPARK-9128][Core] Get outerclasses and object...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7459#issuecomment-122633262
  
 Merged build triggered.





[GitHub] spark pull request: [SQL] Make date/time functions more consistent...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7506#issuecomment-122633260
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-9128][Core] Get outerclasses and object...

2015-07-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7459#issuecomment-122633283
  
  [Test build #37762 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37762/consoleFull)
 for   PR 7459 at commit 
[`7c9858d`](https://github.com/apache/spark/commit/7c9858db0f8374c8f124b4a964190ad2ff5ad898).





[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.

2015-07-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7504#issuecomment-12262
  
  [Test build #37761 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37761/consoleFull)
 for   PR 7504 at commit 
[`dda1021`](https://github.com/apache/spark/commit/dda1021891cbfea0c6859542f3270a5ae8c20486).





[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...

2015-07-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7505#issuecomment-12267
  
  [Test build #1113 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1113/consoleFull)
 for   PR 7505 at commit 
[`d09321c`](https://github.com/apache/spark/commit/d09321c7f3a4d5127c357fe15e7d6ab9531719d9).





[GitHub] spark pull request: [SPARK-8199][SQL] follow up; revert change in ...

2015-07-19 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/7505#issuecomment-122633326
  
LGTM.





[GitHub] spark pull request: [SQL] Make date/time functions more consistent...

2015-07-19 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/7506#issuecomment-122633426
  
LGTM





[GitHub] spark pull request: [SQL] Make date/time functions more consistent...

2015-07-19 Thread tarekauel
Github user tarekauel commented on the pull request:

https://github.com/apache/spark/pull/7506#issuecomment-122633436
  
@rxin Could you do this little fix as well?
https://github.com/apache/spark/pull/7505/files

Why do we switch from day_of_month to dayofmonth? Most SQL implementations 
use underscores:
[MySQL](https://dev.mysql.com/doc/refman/5.0/en/func-op-summary-ref.html) 
[SAP 
HANA](http://help.sap.com/saphelp_hanaplatform/helpdata/en/20/9f228975191014baed94f1b69693ae/content.htm?frameset=/en/20/9ddefe75191014ac249bf78ba2a1e9/frameset.htmcurrent_toc=/en/2e/1ef8b4f4554739959886e55d4c127b/plain.htmnode_id=91show_children=false)
 
[Oracle](http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions001.htm#i88891)
I would prefer underscores, because they improve the readability, if you 
write all SQL stuff in caps, like:
`SELECT name, age, DAY_OF_MONTH(birthday) AS birthday FROM people WHERE age 
 15` compared to `SELECT name, age, DAYOFMONTH(birthday) AS birthday FROM 
people WHERE age  15`
I'm not a Python pro, but I thought that underscores are 'pythonic', aren't 
they?





[GitHub] spark pull request: [SQL] Make date/time functions more consistent...

2015-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7506#issuecomment-122633378
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.

2015-07-19 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/7504#issuecomment-122633444
  
cc @davies 





[GitHub] spark pull request: [SQL] Make date/time functions more consistent...

2015-07-19 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/7506#issuecomment-122633558
  
Both MySQL and HANA use dayofmonth, without the underscore?





[GitHub] spark pull request: [SPARK-9019][YARN] Add RM delegation token to ...

2015-07-19 Thread bolkedebruin
Github user bolkedebruin commented on the pull request:

https://github.com/apache/spark/pull/7489#issuecomment-122633600
  
Before adding the code I also grepped for getRMDelegationToken which is the 
API call. It is not in Spark. Additionally, you are required to add it to the 
launch context so where else can it be?

This was tested on a HDP 2.2.0 setup with FreeIPA as KDC/LDAP. It was also 
tested on a HDP 2.2.6 clean install with Kerberos activated from
Ambari. We also tested with Spark 1.3.1. All are showing this behavior. 

I might be able to test it with a CDH 5.3 cluster, but would you be able to 
share a debug log yourself from the container to confirm that a RM token is 
generated in your case and/or that behavior is different? 

I would like to get to the bottom of this and, as I said, I was surprised it 
wasn't in there before, but so far my data says so and I am starting to run 
out of options to provide more evidence. 


 On 19 jul. 2015, at 02:24, Hari Shreedharan notificati...@github.com 
wrote:
 
 I have not seen this issue. I am not saying RM is not running. I am saying
 there might be a config issue somewhere. I don't have access to the code
 right now, but I am fairly sure there is code that adds the RM tokens
 already.
 
 On Saturday, July 18, 2015, bolkedebruin notificati...@github.com wrote:
 
  @harishreedharan https://github.com/harishreedharan We tested it and
  the issue is with both keytab and kinit. Why do you think the RM is not
  running? As it is actually running (cluster is/was not doing much). The
  connection refused message happens *after* the SASL negotiation fails
  and is a bit misleading.
 
  See below for the same job but then with my patch included (will add in one
  minute).
 
  —
  Reply to this email directly or view it on GitHub
  https://github.com/apache/spark/pull/7489#issuecomment-122580670.
 
 
 
 -- 
 
 Thanks,
 Hari
 —
 Reply to this email directly or view it on GitHub.
 






[GitHub] spark pull request: [SPARK-8638] [SQL] Window Function Performance...

2015-07-19 Thread yhuai
Github user yhuai commented on the pull request:

https://github.com/apache/spark/pull/7057#issuecomment-122633626
  
@hvanhovell Overall looks good. I am merging it to master. I will leave a 
few comments for minor changes. Can you submit a follow-up PR to address them?





[GitHub] spark pull request: [SPARK-8638] [SQL] Window Function Performance...

2015-07-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/7057





[GitHub] spark pull request: [SPARK-8240] [SPARK-8241] [SQL] string functio...

2015-07-19 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/6775#issuecomment-122633649
  
@adrian-wang I had some time this weekend and added concat/concat_ws with 
code gen. I'm going to close this one. Thanks a lot.
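
For readers unfamiliar with the function, here is a rough sketch of the null handling usually documented for `concat_ws`: a null separator gives a null result, and null inputs are skipped. It is a plain-Scala model, not the code-generated expression from the PR; `ConcatWsSketch` and `concatWs` are hypothetical names and `Option` stands in for SQL NULL.

```scala
// Hypothetical model of concat_ws semantics; not Spark's implementation.
object ConcatWsSketch {
  // None models SQL NULL. A null separator yields null; null inputs are skipped.
  def concatWs(sep: Option[String], inputs: Seq[Option[String]]): Option[String] =
    sep.map(s => inputs.flatten.mkString(s))

  def main(args: Array[String]): Unit = {
    println(concatWs(Some(","), Seq(Some("a"), None, Some("b"))))
    println(concatWs(None, Seq(Some("a"))))
  }
}
```

The first call joins only the non-null inputs, the second returns `None` because the separator itself is null.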






[GitHub] spark pull request: [SPARK-8638] [SQL] Window Function Performance...

2015-07-19 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/7057#discussion_r3497
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -38,443 +84,667 @@ case class Window(
 child: SparkPlan)
   extends UnaryNode {
 
-  override def output: Seq[Attribute] =
-(projectList ++ windowExpression).map(_.toAttribute)
+  override def output: Seq[Attribute] = projectList ++ 
windowExpression.map(_.toAttribute)
 
-  override def requiredChildDistribution: Seq[Distribution] =
+  override def requiredChildDistribution: Seq[Distribution] = {
 if (windowSpec.partitionSpec.isEmpty) {
-  // This operator will be very expensive.
+  // Only show warning when the number of bytes is larger than 100 MB?
+  logWarning("No Partition Defined for Window operation! Moving all data to a single "
+    + "partition, this can cause serious performance degradation.")
   AllTuples :: Nil
-} else {
-  ClusteredDistribution(windowSpec.partitionSpec) :: Nil
-}
-
-  // Since window functions are adding columns to the input rows, the 
child's outputPartitioning
-  // is preserved.
-  override def outputPartitioning: Partitioning = child.outputPartitioning
-
-  override def requiredChildOrdering: Seq[Seq[SortOrder]] = {
-// The required child ordering has two parts.
-// The first part is the expressions in the partition specification.
-// We add these expressions to the required ordering to make sure 
input rows are grouped
-// based on the partition specification. So, we only need to process a 
single partition
-// at a time.
-// The second part is the expressions specified in the ORDER BY cluase.
-// Basically, we first use sort to group rows based on partition 
specifications and then sort
-// Rows in a group based on the order specification.
-(windowSpec.partitionSpec.map(SortOrder(_, Ascending)) ++ 
windowSpec.orderSpec) :: Nil
+} else ClusteredDistribution(windowSpec.partitionSpec) :: Nil
   }
 
-  // Since window functions basically add columns to input rows, this 
operator
-  // will not change the ordering of input rows.
+  override def requiredChildOrdering: Seq[Seq[SortOrder]] =
+Seq(windowSpec.partitionSpec.map(SortOrder(_, Ascending)) ++ 
windowSpec.orderSpec)
+
   override def outputOrdering: Seq[SortOrder] = child.outputOrdering
 
-  case class ComputedWindow(
-unbound: WindowExpression,
-windowFunction: WindowFunction,
-resultAttribute: AttributeReference)
-
-  // A list of window functions that need to be computed for each group.
-  private[this] val computedWindowExpressions = windowExpression.flatMap { window =>
-    window.collect {
-      case w: WindowExpression =>
-        ComputedWindow(
-          w,
-          BindReferences.bindReference(w.windowFunction, child.output),
-          AttributeReference(s"windowResult:$w", w.dataType, w.nullable)())
+  /**
+   * Create a bound ordering object for a given frame type and offset. A 
bound ordering object is
+   * used to determine which input row lies within the frame boundaries of 
an output row.
+   *
+   * This method uses Code Generation. It can only be used on the executor 
side.
+   *
+   * @param frameType to evaluate. This can either be Row or Range based.
+   * @param offset with respect to the row.
+   * @return a bound ordering object.
+   */
+  private[this] def createBoundOrdering(frameType: FrameType, offset: Int): BoundOrdering = {
+frameType match {
+  case RangeFrame =>
+val (exprs, current, bound) = if (offset == 0) {
+  // Use the entire order expression when the offset is 0.
+  val exprs = windowSpec.orderSpec.map(_.child)
+  val projection = newMutableProjection(exprs, child.output)
+  (windowSpec.orderSpec, projection(), projection())
+}
+else if (windowSpec.orderSpec.size == 1) {
--- End diff --

`} else if`





[GitHub] spark pull request: [SPARK-8240] [SPARK-8241] [SQL] string functio...

2015-07-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/6775





[GitHub] spark pull request: [SPARK-8638] [SQL] Window Function Performance...

2015-07-19 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/7057#discussion_r3496
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -38,443 +84,667 @@ case class Window(
 child: SparkPlan)
   extends UnaryNode {
 
-  override def output: Seq[Attribute] =
-(projectList ++ windowExpression).map(_.toAttribute)
+  override def output: Seq[Attribute] = projectList ++ 
windowExpression.map(_.toAttribute)
 
-  override def requiredChildDistribution: Seq[Distribution] =
+  override def requiredChildDistribution: Seq[Distribution] = {
 if (windowSpec.partitionSpec.isEmpty) {
-  // This operator will be very expensive.
+  // Only show warning when the number of bytes is larger than 100 MB?
+  logWarning("No Partition Defined for Window operation! Moving all data to a single "
+    + "partition, this can cause serious performance degradation.")
   AllTuples :: Nil
-} else {
-  ClusteredDistribution(windowSpec.partitionSpec) :: Nil
-}
-
-  // Since window functions are adding columns to the input rows, the 
child's outputPartitioning
-  // is preserved.
-  override def outputPartitioning: Partitioning = child.outputPartitioning
-
-  override def requiredChildOrdering: Seq[Seq[SortOrder]] = {
-// The required child ordering has two parts.
-// The first part is the expressions in the partition specification.
-// We add these expressions to the required ordering to make sure 
input rows are grouped
-// based on the partition specification. So, we only need to process a 
single partition
-// at a time.
-// The second part is the expressions specified in the ORDER BY cluase.
-// Basically, we first use sort to group rows based on partition 
specifications and then sort
-// Rows in a group based on the order specification.
-(windowSpec.partitionSpec.map(SortOrder(_, Ascending)) ++ 
windowSpec.orderSpec) :: Nil
+} else ClusteredDistribution(windowSpec.partitionSpec) :: Nil
   }
 
-  // Since window functions basically add columns to input rows, this 
operator
-  // will not change the ordering of input rows.
+  override def requiredChildOrdering: Seq[Seq[SortOrder]] =
+Seq(windowSpec.partitionSpec.map(SortOrder(_, Ascending)) ++ 
windowSpec.orderSpec)
+
   override def outputOrdering: Seq[SortOrder] = child.outputOrdering
 
-  case class ComputedWindow(
-unbound: WindowExpression,
-windowFunction: WindowFunction,
-resultAttribute: AttributeReference)
-
-  // A list of window functions that need to be computed for each group.
-  private[this] val computedWindowExpressions = windowExpression.flatMap { window =>
-    window.collect {
-      case w: WindowExpression =>
-        ComputedWindow(
-          w,
-          BindReferences.bindReference(w.windowFunction, child.output),
-          AttributeReference(s"windowResult:$w", w.dataType, w.nullable)())
+  /**
+   * Create a bound ordering object for a given frame type and offset. A 
bound ordering object is
+   * used to determine which input row lies within the frame boundaries of 
an output row.
+   *
+   * This method uses Code Generation. It can only be used on the executor 
side.
+   *
+   * @param frameType to evaluate. This can either be Row or Range based.
+   * @param offset with respect to the row.
+   * @return a bound ordering object.
+   */
+  private[this] def createBoundOrdering(frameType: FrameType, offset: Int): BoundOrdering = {
+frameType match {
+  case RangeFrame =>
+val (exprs, current, bound) = if (offset == 0) {
+  // Use the entire order expression when the offset is 0.
+  val exprs = windowSpec.orderSpec.map(_.child)
+  val projection = newMutableProjection(exprs, child.output)
+  (windowSpec.orderSpec, projection(), projection())
+}
+else if (windowSpec.orderSpec.size == 1) {
+  // Use only the first order expression when the offset is 
non-null.
+  val sortExpr = windowSpec.orderSpec.head
+  val expr = sortExpr.child
+  // Create the projection which returns the current 'value'.
+  val current = newMutableProjection(expr :: Nil, child.output)()
+  // Flip the sign of the offset when processing the order is 
descending
+  val boundOffset = if (sortExpr.direction == Descending) -offset
+  else offset
--- End diff --

```
val boundOffset =
  if (sortExpr.direction == Descending)
    -offset
  else
    offset
```
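
As a self-contained illustration of the sign flip discussed in this diff, the sketch below uses stand-in direction objects rather than Catalyst's own classes; `SortDirection`, `BoundOffsetSketch` and the sample values are assumptions made for the example only. The idea, as the diff comment puts it, is that the offset changes sign when the ordering is descending.

```scala
// Illustrative stand-ins for the sort directions; not the Catalyst classes.
sealed trait SortDirection
case object Ascending extends SortDirection
case object Descending extends SortDirection

object BoundOffsetSketch {
  // Flip the sign of the offset when the ordering is descending.
  def boundOffset(direction: SortDirection, offset: Int): Int =
    if (direction == Descending)
      -offset
    else
      offset

  def main(args: Array[String]): Unit = {
    println(boundOffset(Ascending, 3))  // offset kept as-is
    println(boundOffset(Descending, 3)) // offset negated
  }
}
```

Written this way, the condition and both branches each sit on their own line, which is the layout the review comment asks for.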



[GitHub] spark pull request: [SPARK-8638] [SQL] Window Function Performance...

2015-07-19 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/7057#discussion_r34955549
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/WindowSuite.scala 
---
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive.execution
+
+import org.apache.spark.sql.{Row, QueryTest}
+import org.apache.spark.sql.expressions.Window
+import org.apache.spark.sql.functions._
+import org.apache.spark.sql.hive.test.TestHive.implicits._
+
+/**
+ * Window expressions are tested extensively by the following test suites:
+ * [[org.apache.spark.sql.hive.HiveDataFrameWindowSuite]]
+ * 
[[org.apache.spark.sql.hive.execution.HiveWindowFunctionQueryWithoutCodeGenSuite]]
+ * 
[[org.apache.spark.sql.hive.execution.HiveWindowFunctionQueryFileWithoutCodeGenSuite]]
+ * However these suites do not cover all possible (i.e. more exotic) 
settings. This suite fill
+ * this gap.
+ *
+ * TODO Move this class to the sql/core project when we move to Native 
Spark UDAFs.
+ */
+class WindowSuite extends QueryTest {
--- End diff --

Seems we do not need to create a new suite, right? We can just use 
`HiveDataFrameWindowSuite`.





[GitHub] spark pull request: [SPARK-6910] [SQL] Support for pushing predica...

2015-07-19 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/7492#issuecomment-122633732
  
@marmbrus I think it's probably OK to merge this one first. But I still 
haven't got any clue about the root cause mentioned in 
https://github.com/apache/spark/pull/7421#issuecomment-122527391 yet.





[GitHub] spark pull request: [SPARK-8638] [SQL] Window Function Performance...

2015-07-19 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/7057#discussion_r3498
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -38,443 +84,667 @@ case class Window(
 child: SparkPlan)
   extends UnaryNode {
 
-  override def output: Seq[Attribute] =
-(projectList ++ windowExpression).map(_.toAttribute)
+  override def output: Seq[Attribute] = projectList ++ 
windowExpression.map(_.toAttribute)
 
-  override def requiredChildDistribution: Seq[Distribution] =
+  override def requiredChildDistribution: Seq[Distribution] = {
 if (windowSpec.partitionSpec.isEmpty) {
-  // This operator will be very expensive.
+  // Only show warning when the number of bytes is larger than 100 MB?
+  logWarning("No Partition Defined for Window operation! Moving all data to a single "
+    + "partition, this can cause serious performance degradation.")
   AllTuples :: Nil
-} else {
-  ClusteredDistribution(windowSpec.partitionSpec) :: Nil
-}
-
-  // Since window functions are adding columns to the input rows, the 
child's outputPartitioning
-  // is preserved.
-  override def outputPartitioning: Partitioning = child.outputPartitioning
-
-  override def requiredChildOrdering: Seq[Seq[SortOrder]] = {
-// The required child ordering has two parts.
-// The first part is the expressions in the partition specification.
-// We add these expressions to the required ordering to make sure 
input rows are grouped
-// based on the partition specification. So, we only need to process a 
single partition
-// at a time.
-// The second part is the expressions specified in the ORDER BY cluase.
-// Basically, we first use sort to group rows based on partition 
specifications and then sort
-// Rows in a group based on the order specification.
-(windowSpec.partitionSpec.map(SortOrder(_, Ascending)) ++ 
windowSpec.orderSpec) :: Nil
+} else ClusteredDistribution(windowSpec.partitionSpec) :: Nil
   }
 
-  // Since window functions basically add columns to input rows, this 
operator
-  // will not change the ordering of input rows.
+  override def requiredChildOrdering: Seq[Seq[SortOrder]] =
+Seq(windowSpec.partitionSpec.map(SortOrder(_, Ascending)) ++ 
windowSpec.orderSpec)
+
   override def outputOrdering: Seq[SortOrder] = child.outputOrdering
 
-  case class ComputedWindow(
-unbound: WindowExpression,
-windowFunction: WindowFunction,
-resultAttribute: AttributeReference)
-
-  // A list of window functions that need to be computed for each group.
-  private[this] val computedWindowExpressions = windowExpression.flatMap { window =>
-    window.collect {
-      case w: WindowExpression =>
-        ComputedWindow(
-          w,
-          BindReferences.bindReference(w.windowFunction, child.output),
-          AttributeReference(s"windowResult:$w", w.dataType, w.nullable)())
+  /**
+   * Create a bound ordering object for a given frame type and offset. A 
bound ordering object is
+   * used to determine which input row lies within the frame boundaries of 
an output row.
+   *
+   * This method uses Code Generation. It can only be used on the executor 
side.
+   *
+   * @param frameType to evaluate. This can either be Row or Range based.
+   * @param offset with respect to the row.
+   * @return a bound ordering object.
+   */
+  private[this] def createBoundOrdering(frameType: FrameType, offset: 
Int): BoundOrdering = {
+frameType match {
+  case RangeFrame =
+val (exprs, current, bound) = if (offset == 0) {
+  // Use the entire order expression when the offset is 0.
+  val exprs = windowSpec.orderSpec.map(_.child)
+  val projection = newMutableProjection(exprs, child.output)
+  (windowSpec.orderSpec, projection(), projection())
+}
+else if (windowSpec.orderSpec.size == 1) {
+  // Use only the first order expression when the offset is 
non-null.
+  val sortExpr = windowSpec.orderSpec.head
+  val expr = sortExpr.child
+  // Create the projection which returns the current 'value'.
+  val current = newMutableProjection(expr :: Nil, child.output)()
+  // Flip the sign of the offset when processing the order is 
descending
+  val boundOffset = if (sortExpr.direction == Descending) -offset
+  else offset
+  // Create the projection which returns the current 'value' 
modified by adding the offset.
+  val boundExpr = Add(expr, Cast(Literal.create(boundOffset, 
IntegerType), expr.dataType))
+ 

[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.

2015-07-19 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/7504#discussion_r34955580
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
 ---
@@ -56,15 +51,76 @@ case class Concat(children: Seq[Expression]) extends Expression with ImplicitCas
 
   override protected def genCode(ctx: CodeGenContext, ev: GeneratedExpressionCode): String = {
     val evals = children.map(_.gen(ctx))
-    val inputs = evals.map { eval => s"${eval.isNull} ? null : ${eval.primitive}" }.mkString(", ")
+    val inputs = evals.map { eval =>
+      s"${eval.isNull} ? (UTF8String)null : ${eval.primitive}"
+    }.mkString(", ")
     evals.map(_.code).mkString("\n") + s"""
       boolean ${ev.isNull} = false;
       UTF8String ${ev.primitive} = UTF8String.concat($inputs);
+      if (${ev.primitive} == null) {
+        ${ev.isNull} = true;
+      }
     """
   }
 }
 
 
+/**
+ * An expression that concatenates multiple input strings or array of strings into a single string,
+ * using a given separator (the first child).
+ *
+ * Returns null if the separator is null. Otherwise, concat_ws skips all null values.
+ */
+case class ConcatWs(children: Seq[Expression])
+  extends Expression with ImplicitCastInputTypes with CodegenFallback {
+
+  require(children.nonEmpty, s"$prettyName requires at least one argument.")
+
+  override def prettyName: String = "concat_ws"
+
+  /** The 1st child (separator) is str, and rest are either str or array of str. */
+  override def inputTypes: Seq[AbstractDataType] = {
+    val arrayOrStr = TypeCollection(ArrayType(StringType), StringType)
+    StringType +: Seq.fill(children.size - 1)(arrayOrStr)
+  }
+
+  override def dataType: DataType = StringType
+
+  override def nullable: Boolean = children.head.nullable
+  override def foldable: Boolean = children.forall(_.foldable)
+
+  override def eval(input: InternalRow): Any = {
+    val flatInputs = children.flatMap { child =>
+      child.eval(input) match {
+        case s: UTF8String => Iterator(s)
+        case arr: Seq[_] => arr.asInstanceOf[Seq[UTF8String]]
+        case null => Iterator(null.asInstanceOf[UTF8String])
--- End diff --

minor: Can we just ignore the `null`?





[GitHub] spark pull request: [SPARK-8756][SQL] Keep cached information and ...

2015-07-19 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/7154#issuecomment-122633884
  
cc @liancheng 





[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.

2015-07-19 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/7504#discussion_r34955583
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
 ---
@@ -56,15 +51,76 @@ case class Concat(children: Seq[Expression]) extends Expression with ImplicitCas
 
   override protected def genCode(ctx: CodeGenContext, ev: GeneratedExpressionCode): String = {
     val evals = children.map(_.gen(ctx))
-    val inputs = evals.map { eval => s"${eval.isNull} ? null : ${eval.primitive}" }.mkString(", ")
+    val inputs = evals.map { eval =>
+      s"${eval.isNull} ? (UTF8String)null : ${eval.primitive}"
+    }.mkString(", ")
     evals.map(_.code).mkString("\n") + s"""
       boolean ${ev.isNull} = false;
       UTF8String ${ev.primitive} = UTF8String.concat($inputs);
+      if (${ev.primitive} == null) {
+        ${ev.isNull} = true;
+      }
     """
   }
 }
 
 
+/**
+ * An expression that concatenates multiple input strings or array of strings into a single string,
+ * using a given separator (the first child).
+ *
+ * Returns null if the separator is null. Otherwise, concat_ws skips all null values.
+ */
+case class ConcatWs(children: Seq[Expression])
+  extends Expression with ImplicitCastInputTypes with CodegenFallback {
+
+  require(children.nonEmpty, s"$prettyName requires at least one argument.")
+
+  override def prettyName: String = "concat_ws"
+
+  /** The 1st child (separator) is str, and rest are either str or array of str. */
+  override def inputTypes: Seq[AbstractDataType] = {
+    val arrayOrStr = TypeCollection(ArrayType(StringType), StringType)
+    StringType +: Seq.fill(children.size - 1)(arrayOrStr)
+  }
+
+  override def dataType: DataType = StringType
+
+  override def nullable: Boolean = children.head.nullable
+  override def foldable: Boolean = children.forall(_.foldable)
--- End diff --

existing: we could use this as the default one for Expression.
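
A minimal sketch of what such a default could look like, using a hypothetical miniature expression tree rather than Catalyst's actual classes (`Expr`, `Lit`, and `Attr` are stand-ins invented for illustration). One thing to watch: `forall` on an empty child list is vacuously true, so under this assumption non-constant leaf nodes would need to override the default.

```scala
object FoldableSketch {
  // Hypothetical stand-in for an expression base type; not Spark's Expression.
  sealed trait Expr {
    def children: Seq[Expr]
    // Proposed default: an expression is foldable when all of its children are.
    def foldable: Boolean = children.forall(_.foldable)
  }

  // A constant leaf: no children, so the default evaluates to true.
  case class Lit(value: Any) extends Expr {
    val children: Seq[Expr] = Nil
  }

  // A column reference leaf: must opt out, because forall on Nil is true.
  case class Attr(name: String) extends Expr {
    val children: Seq[Expr] = Nil
    override def foldable: Boolean = false
  }

  // Inherits the default without declaring foldable itself.
  case class ConcatWs(children: Seq[Expr]) extends Expr
}
```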





[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.

2015-07-19 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/7504#discussion_r34955606
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
 ---
@@ -56,15 +51,76 @@ case class Concat(children: Seq[Expression]) extends Expression with ImplicitCas
 
   override protected def genCode(ctx: CodeGenContext, ev: GeneratedExpressionCode): String = {
     val evals = children.map(_.gen(ctx))
-    val inputs = evals.map { eval => s"${eval.isNull} ? null : ${eval.primitive}" }.mkString(", ")
+    val inputs = evals.map { eval =>
+      s"${eval.isNull} ? (UTF8String)null : ${eval.primitive}"
+    }.mkString(", ")
     evals.map(_.code).mkString("\n") + s"""
       boolean ${ev.isNull} = false;
       UTF8String ${ev.primitive} = UTF8String.concat($inputs);
+      if (${ev.primitive} == null) {
+        ${ev.isNull} = true;
+      }
     """
   }
 }
 
 
+/**
+ * An expression that concatenates multiple input strings or array of strings into a single string,
+ * using a given separator (the first child).
+ *
+ * Returns null if the separator is null. Otherwise, concat_ws skips all null values.
+ */
+case class ConcatWs(children: Seq[Expression])
+  extends Expression with ImplicitCastInputTypes with CodegenFallback {
+
+  require(children.nonEmpty, s"$prettyName requires at least one argument.")
+
+  override def prettyName: String = "concat_ws"
+
+  /** The 1st child (separator) is str, and rest are either str or array of str. */
+  override def inputTypes: Seq[AbstractDataType] = {
+    val arrayOrStr = TypeCollection(ArrayType(StringType), StringType)
+    StringType +: Seq.fill(children.size - 1)(arrayOrStr)
+  }
+
+  override def dataType: DataType = StringType
+
+  override def nullable: Boolean = children.head.nullable
+  override def foldable: Boolean = children.forall(_.foldable)
+
+  override def eval(input: InternalRow): Any = {
+    val flatInputs = children.flatMap { child =>
+      child.eval(input) match {
+        case s: UTF8String => Iterator(s)
+        case arr: Seq[_] => arr.asInstanceOf[Seq[UTF8String]]
+        case null => Iterator(null.asInstanceOf[UTF8String])
--- End diff --

How? It won't match s. Also this thing doesn't compile if I do a wildcard match on s, e.g.
```scala
case s: _ => ...
```
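
For reference, a small self-contained sketch of the matching behaviour in question, with plain `String` standing in for `UTF8String` (the `describe` helper is hypothetical). A type pattern such as `case s: String` never matches `null`, which is why the explicit `case null` branch is needed; an untyped variable pattern (`case s =>`) or `case _` would match it.

```scala
object NullMatchSketch {
  def describe(value: Any): String = value match {
    // Type pattern: only matches non-null instances of String.
    case s: String => s"string: $s"
    // null is not matched by the type pattern above, so it needs its own case.
    case null => "null"
    // Wildcard pattern: matches anything not handled earlier.
    case _ => "other"
  }
}
```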





[GitHub] spark pull request: [SPARK-8241][SQL] string function: concat_ws.

2015-07-19 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/7504#discussion_r34955617
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
 ---
@@ -56,15 +51,76 @@ case class Concat(children: Seq[Expression]) extends Expression with ImplicitCas
 
   override protected def genCode(ctx: CodeGenContext, ev: GeneratedExpressionCode): String = {
     val evals = children.map(_.gen(ctx))
-    val inputs = evals.map { eval => s"${eval.isNull} ? null : ${eval.primitive}" }.mkString(", ")
+    val inputs = evals.map { eval =>
+      s"${eval.isNull} ? (UTF8String)null : ${eval.primitive}"
+    }.mkString(", ")
     evals.map(_.code).mkString("\n") + s"""
       boolean ${ev.isNull} = false;
       UTF8String ${ev.primitive} = UTF8String.concat($inputs);
+      if (${ev.primitive} == null) {
+        ${ev.isNull} = true;
+      }
     """
   }
 }
 
 
+/**
+ * An expression that concatenates multiple input strings or array of strings into a single string,
+ * using a given separator (the first child).
+ *
+ * Returns null if the separator is null. Otherwise, concat_ws skips all null values.
+ */
+case class ConcatWs(children: Seq[Expression])
+  extends Expression with ImplicitCastInputTypes with CodegenFallback {
+
+  require(children.nonEmpty, s"$prettyName requires at least one argument.")
+
+  override def prettyName: String = "concat_ws"
+
+  /** The 1st child (separator) is str, and rest are either str or array of str. */
+  override def inputTypes: Seq[AbstractDataType] = {
+    val arrayOrStr = TypeCollection(ArrayType(StringType), StringType)
+    StringType +: Seq.fill(children.size - 1)(arrayOrStr)
+  }
+
+  override def dataType: DataType = StringType
+
+  override def nullable: Boolean = children.head.nullable
+  override def foldable: Boolean = children.forall(_.foldable)
+
+  override def eval(input: InternalRow): Any = {
+    val flatInputs = children.flatMap { child =>
+      child.eval(input) match {
+        case s: UTF8String => Iterator(s)
+        case arr: Seq[_] => arr.asInstanceOf[Seq[UTF8String]]
+        case null => Iterator(null.asInstanceOf[UTF8String])
--- End diff --

I mean we can return an empty Iterator
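
Roughly, the idea can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: plain `String` stands in for `UTF8String`, and `concatWs` is a hypothetical helper that takes already-evaluated arguments. A null argument contributes an empty iterator, so it simply drops out of the flattened inputs.

```scala
object ConcatWsSketch {
  // args.head is the separator; the rest are strings, sequences of strings, or null.
  def concatWs(args: Seq[Any]): String = {
    require(args.nonEmpty, "concat_ws requires at least one argument.")
    args.head match {
      // A null separator yields a null result.
      case null => null
      case sep: String =>
        val flat = args.tail.flatMap {
          case s: String => Iterator(s)
          // Keep only the non-null strings inside an array argument.
          case arr: Seq[_] => arr.iterator.collect { case s: String => s }
          // The suggestion: contribute nothing for a null input.
          case null => Iterator.empty
          case other =>
            throw new IllegalArgumentException(s"unsupported argument: $other")
        }
        flat.mkString(sep)
      case other =>
        throw new IllegalArgumentException(s"unsupported separator: $other")
    }
  }
}
```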





[GitHub] spark pull request: [SPARK-9166][SQL][PYSPARK] Capture and hide Il...

2015-07-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7497#issuecomment-122634105
  
  [Test build #37763 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37763/consoleFull) for PR 7497 at commit [`9ace67d`](https://github.com/apache/spark/commit/9ace67dede05115cfed7f4794867cd9dabe370d8).




