[GitHub] spark pull request: [Spark-8530] [ML] add python API for MinMaxSca...

2015-06-30 Thread hhbyyh
GitHub user hhbyyh opened a pull request:

https://github.com/apache/spark/pull/7150

[Spark-8530] [ML] add python API for MinMaxScaler

jira: https://issues.apache.org/jira/browse/SPARK-8530

As titled.
jira for MinMaxScaler: https://issues.apache.org/jira/browse/SPARK-7514

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hhbyyh/spark pythonMinMax

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/7150.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #7150


commit 77f57ef41d12c2b5fc061305b71932f39207b93f
Author: Yuhao Yang 
Date:   2015-07-01T06:52:02Z

add python API for MinMaxScaler




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8747][SQL] fix EqualNullSafe for binary...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7143#issuecomment-117489137
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-8747][SQL] fix EqualNullSafe for binary...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7143#issuecomment-117488735
  
[Test build #36227 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36227/console) for PR 7143 at commit [`d19e9c0`](https://github.com/apache/spark/commit/d19e9c0599bb222b405c2382f7fa0ba0dbd24099).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-8591][CORE]Block failed to unroll to me...

2015-06-30 Thread dibbhatt
Github user dibbhatt commented on a diff in the pull request:

https://github.com/apache/spark/pull/6990#discussion_r33652840
  
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
@@ -833,8 +833,10 @@ private[spark] class BlockManager(
 logDebug("Put block %s locally took %s".format(blockId, Utils.getUsedTimeMs(startTimeMs)))
 
 // Either we're storing bytes and we asynchronously started replication, or we're storing
-// values and need to serialize and replicate them now:
-if (putLevel.replication > 1) {
+// values and need to serialize and replicate them now.
+// Should not replicate the block if its StorageLevel is StorageLevel.NONE or
+// putting it to local is failed.
+if (!putBlockInfo.isFailed && putLevel.replication > 1) {
--- End diff --

Also, as I read the replication code, if replicating to one remote peer fails,
it tries another remote peer to reach the desired replication factor of 2, so a
retry mechanism for replicating blocks is already in place. Thus, if storing to
local fails and we skip replication, the client has a better chance to retry
the same block and reach the desired replication factor.





[GitHub] spark pull request: [SPARK-8227][SQL][WIP]Add function unhex

2015-06-30 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/7113#discussion_r33652508
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala ---
@@ -354,6 +356,35 @@ case class Pow(left: Expression, right: Expression)
   }
 }
 
+/**
+ * Performs the inverse operation of HEX.
+ * Resulting characters are returned as a byte array.
+ */
+case class UnHex(child: Expression)
+  extends UnaryExpression with AutoCastInputTypes with Serializable  {
+
+  override def expectedChildTypes: Seq[DataType] = Seq(StringType)
+
+  override def dataType: DataType = BinaryType
+
+  override def eval(input: InternalRow): Any = {
+val num = child.eval(input)
+if (num == null) {
+  null
+} else {
+  unhex(num.asInstanceOf[UTF8String])
+}
+  }
+
+  private def unhex(utf8Str: UTF8String): Array[Byte] = {
+try {
+  new org.apache.commons.codec.binary.Hex(StandardCharsets.UTF_8).decode(utf8Str.getBytes)
--- End diff --

`toDigit()` could be implemented with a lookup table:
```
val arr = Array[Byte](0, 1, 2, 3, 4, 5, 6, 7, 8, 9, -1, -1, -1, -1, -1, -1,
  -1, 10, 11, 12, 13, 14, 15, -1, ..., 10, 11, 12, 13, 14, 15)
def toDigit(b: Byte): Byte = {
  // arr is indexed from '0' (48) up to 'f' (102)
  if (b >= 48 && b <= 102) arr(b - 48) else -1
}
```
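
A minimal, runnable sketch of the lookup-table idea above. It is an illustration, not the code that was merged: the object name `HexDigits` is hypothetical, and the table is built programmatically so that no entries have to be elided.

```scala
object HexDigits {
  // Lookup table indexed by (byte - '0'), covering ASCII '0' (48) through 'f' (102).
  // Slots that do not correspond to a hex digit hold -1.
  private val table: Array[Byte] = {
    val t = Array.fill[Byte]('f' - '0' + 1)(-1)
    for (i <- 0 to 9) t(i) = i.toByte            // '0'..'9' -> 0..9
    for (i <- 0 to 5) {
      t('A' - '0' + i) = (10 + i).toByte         // 'A'..'F' -> 10..15
      t('a' - '0' + i) = (10 + i).toByte         // 'a'..'f' -> 10..15
    }
    t
  }

  /** Returns the value of an ASCII hex digit, or -1 if `b` is not a hex digit. */
  def toDigit(b: Byte): Byte =
    if (b >= '0' && b <= 'f') table(b - '0') else -1
}
```

The range check comes before the table access, so bytes outside `'0'..'f'` (including negative bytes) never index the array.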





[GitHub] spark pull request: [SPARK-8621] [SQL] support empty string as col...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7149#issuecomment-117474555
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-8621] [SQL] support empty string as col...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7149#issuecomment-117474448
  
[Test build #36233 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36233/console) for PR 7149 at commit [`a9c6c1d`](https://github.com/apache/spark/commit/a9c6c1d06ec4bc386b08a08cf0d8947e85021a2b).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-8591][CORE]Block failed to unroll to me...

2015-06-30 Thread dibbhatt
Github user dibbhatt commented on a diff in the pull request:

https://github.com/apache/spark/pull/6990#discussion_r33652324
  
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
@@ -833,8 +833,10 @@ private[spark] class BlockManager(
 logDebug("Put block %s locally took %s".format(blockId, Utils.getUsedTimeMs(startTimeMs)))
 
 // Either we're storing bytes and we asynchronously started replication, or we're storing
-// values and need to serialize and replicate them now:
-if (putLevel.replication > 1) {
+// values and need to serialize and replicate them now.
+// Should not replicate the block if its StorageLevel is StorageLevel.NONE or
+// putting it to local is failed.
+if (!putBlockInfo.isFailed && putLevel.replication > 1) {
--- End diff --

I agree with you @squito. What I have observed while working with Spark
Streaming is that there is a higher probability of getting a
BlockNotFoundException if I use the MEMORY_ONLY setting and blocks get evicted
from memory. MEMORY_ONLY_2 can help to some extent: even if a block gets
dropped from one node, it is still available on the other.

This fix will help reduce BlockNotFoundException when storing a block to local
fails, because the Receiver gets the exception and can retry the store at some
point in the future. If we allow the block to replicate even though the local
store failed, which reduces the effective replication factor, the Receiver has
no opportunity to retry. The cases where the local store succeeds and the
remote one fails, or where both succeed but the block is eventually evicted,
are not under the Receiver's control.

So out of three possible cases (block failed locally but stored remotely; block
stored locally but failed remotely; block stored both remotely and locally but
evicted some time later), this fix addresses the BlockNotFoundException for one
of the three, by giving the Receiver the chance to retry storing the block and
reach the desired replication factor of 2.

Do let me know whether this argument is valid.
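
The guard discussed in this comment can be modelled in a few lines. This is a simplified sketch, not Spark's `BlockManager`: the object `ReplicationSketch`, the method `shouldReplicate`, and its parameter names are hypothetical stand-ins for `putBlockInfo.isFailed` and `putLevel.replication` in the diff.

```scala
object ReplicationSketch {
  // Hypothetical model of the changed condition: replicate only when the local
  // put succeeded and the storage level asks for more than one replica
  // (for example MEMORY_ONLY_2, which has a replication factor of 2).
  def shouldReplicate(localPutFailed: Boolean, replication: Int): Boolean =
    !localPutFailed && replication > 1
}
```

Under this model a failed local put never triggers replication, so the failure surfaces to the caller, which is the retry opportunity the comment argues for.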





[GitHub] spark pull request: [SPARK-8227][SQL][WIP]Add function unhex

2015-06-30 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/7113#discussion_r33652200
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala ---
@@ -354,6 +356,35 @@ case class Pow(left: Expression, right: Expression)
   }
 }
 
+/**
+ * Performs the inverse operation of HEX.
+ * Resulting characters are returned as a byte array.
+ */
+case class UnHex(child: Expression)
+  extends UnaryExpression with AutoCastInputTypes with Serializable  {
+
+  override def expectedChildTypes: Seq[DataType] = Seq(StringType)
+
+  override def dataType: DataType = BinaryType
+
+  override def eval(input: InternalRow): Any = {
+val num = child.eval(input)
+if (num == null) {
+  null
+} else {
+  unhex(num.asInstanceOf[UTF8String])
+}
+  }
+
+  private def unhex(utf8Str: UTF8String): Array[Byte] = {
+try {
+  new org.apache.commons.codec.binary.Hex(StandardCharsets.UTF_8).decode(utf8Str.getBytes)
--- End diff --

The best approach is something like `decode(char[])`, but working on `byte[]`.
We need to implement `toDigit(Byte): Byte`, which returns 10 for `97` ('a').





[GitHub] spark pull request: [SPARK-8750][SQL] Remove the closure in functi...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7148#issuecomment-117468147
  
[Test build #36234 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36234/consoleFull) for PR 7148 at commit [`00df372`](https://github.com/apache/spark/commit/00df372326a60ba1263e33132d197570d2e062e9).





[GitHub] spark pull request: [SPARK-8750][SQL] Remove the closure in functi...

2015-06-30 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/7148#discussion_r33652131
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -1829,7 +1829,15 @@ object functions {
*/
   @deprecated("Use callUDF", "1.5.0")
   def callUdf(udfName: String, cols: Column*): Column = {
- UnresolvedFunction(udfName, cols.map(_.expr))
+// Note: we avoid using closures here because on file systems that are case-insensitive, the
+// compiled class file for the closure here will conflict with the one in callUDF (upper case).
+val exprs = new scala.collection.mutable.ArrayBuffer[Expression](cols.size)
--- End diff --

We can just use an Array here, since we know the size ahead of time.
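
A minimal sketch of that suggestion: fill a pre-sized `Array` with a `while` loop so that no closure class is generated. `Expression` and `Column` below are hypothetical stand-ins for the Spark classes, and `exprsOf` is an illustrative name, not a method in `functions.scala`.

```scala
object CallUdfSketch {
  // Hypothetical stand-ins for Spark's Expression and Column, for illustration only.
  final case class Expression(name: String)
  final case class Column(expr: Expression)

  /** Collects the expressions of `cols` into a pre-sized Array without a closure. */
  def exprsOf(cols: Column*): Array[Expression] = {
    val exprs = new Array[Expression](cols.size)
    val it = cols.iterator
    var i = 0
    while (it.hasNext) {
      exprs(i) = it.next().expr
      i += 1
    }
    exprs
  }
}
```

Compared with an `ArrayBuffer`, the fixed-size array avoids the growable wrapper, and the explicit loop keeps the property the diff comment asks for: no anonymous function class is compiled for this method.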





[GitHub] spark pull request: [SPARK-8750][SQL] Remove the closure in functi...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7148#issuecomment-117463889
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-8750][SQL] Remove the closure in functi...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7148#issuecomment-117464005
  
Merged build started.





[GitHub] spark pull request: [SPARK-3444] [core] Restore INFO level after l...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7140#issuecomment-117462143
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-8638] [SQL] Window Function Performance...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7057#issuecomment-117461905
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-3444] [core] Restore INFO level after l...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7140#issuecomment-117461182
  
[Test build #36226 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36226/console) for PR 7140 at commit [`de14836`](https://github.com/apache/spark/commit/de1483629bc6f88cf2ff64b0eb1f4c51ffbb6e65).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-8638] [SQL] Window Function Performance...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7057#issuecomment-117461635
  
**[Test build #36225 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36225/console)** for PR 7057 at commit [`27b0329`](https://github.com/apache/spark/commit/27b03293c51ab20b572958fa207071360ed2418c) after a configured wait of `175m`.





[GitHub] spark pull request: [SPARK-8621] [SQL] support empty string as col...

2015-06-30 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/7149#issuecomment-117454730
  
cc @rxin 





[GitHub] spark pull request: [SPARK-8621] [SQL] support empty string as col...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7149#issuecomment-117455412
  
[Test build #36233 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36233/consoleFull) for PR 7149 at commit [`a9c6c1d`](https://github.com/apache/spark/commit/a9c6c1d06ec4bc386b08a08cf0d8947e85021a2b).





[GitHub] spark pull request: [SPARK-8621] [SQL] crosstab exception when one...

2015-06-30 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/7117#issuecomment-117454503
  
Hi @animeshbaranawal, thanks for fixing this!
Spark SQL should allow any kind of string as a column name, and it's my bad
that I mistakenly rejected the empty string when I wrote `parseAttributeName`.
I have opened https://github.com/apache/spark/pull/7149 to fix this with fewer
changes and an added test. Do you mind having a look?





[GitHub] spark pull request: [SPARK-8621] [SQL] support empty string as col...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7149#issuecomment-117451392
  
Merged build started.





[GitHub] spark pull request: [SPARK-8621] [SQL] support empty string as col...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7149#issuecomment-117451243
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-8621] [SQL] support empty string as col...

2015-06-30 Thread cloud-fan
GitHub user cloud-fan opened a pull request:

https://github.com/apache/spark/pull/7149

[SPARK-8621] [SQL] support empty string as column name

Improve the empty check in `parseAttributeName` so that an empty string is
allowed as a column name.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cloud-fan/spark 8621

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/7149.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #7149


commit a9c6c1d06ec4bc386b08a08cf0d8947e85021a2b
Author: Wenchen Fan 
Date:   2015-07-01T06:08:19Z

support empty string







[GitHub] spark pull request: [SPARK-8227][SQL][WIP]Add function unhex

2015-06-30 Thread zhichao-li
Github user zhichao-li commented on a diff in the pull request:

https://github.com/apache/spark/pull/7113#discussion_r33651510
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala ---
@@ -354,6 +356,35 @@ case class Pow(left: Expression, right: Expression)
   }
 }
 
+/**
+ * Performs the inverse operation of HEX.
+ * Resulting characters are returned as a byte array.
+ */
+case class UnHex(child: Expression)
+  extends UnaryExpression with AutoCastInputTypes with Serializable  {
+
+  override def expectedChildTypes: Seq[DataType] = Seq(StringType)
+
+  override def dataType: DataType = BinaryType
+
+  override def eval(input: InternalRow): Any = {
+val num = child.eval(input)
+if (num == null) {
+  null
+} else {
+  unhex(num.asInstanceOf[UTF8String])
+}
+  }
+
+  private def unhex(utf8Str: UTF8String): Array[Byte] = {
+try {
+  new org.apache.commons.codec.binary.Hex(StandardCharsets.UTF_8).decode(utf8Str.getBytes)
--- End diff --

How about I revert to the previous version, which used
`utf8String.toString()` to convert the byte array to a char sequence? The
Apache Commons library does this conversion internally anyway:
``` java
public byte[] decode(final byte[] array) throws DecoderException {
    return decodeHex(new String(array, getCharset()).toCharArray());
}
```
That library also has a limitation when the input length is odd: it throws an
exception instead of padding a 0 at the front, which is what Hive does.

``` java
public static byte[] decodeHex(final char[] data) throws DecoderException {

    final int len = data.length;

    if ((len & 0x01) != 0) {
        throw new DecoderException("Odd number of characters.");
    }
```
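
For comparison, here is a sketch of a decoder that works directly on bytes and treats an odd-length input as if it had a leading `0`, the behaviour attributed to Hive above. It is an illustration only: `UnhexSketch` is a hypothetical name, and returning `None` for invalid input is an assumption of this sketch rather than the behaviour of the pull request.

```scala
object UnhexSketch {
  // Value of one ASCII hex digit, or -1 when the byte is not a hex digit.
  private def toDigit(b: Byte): Int = {
    if (b >= '0' && b <= '9') b - '0'
    else if (b >= 'a' && b <= 'f') b - 'a' + 10
    else if (b >= 'A' && b <= 'F') b - 'A' + 10
    else -1
  }

  /** Decodes ASCII hex bytes; returns None if any byte is not a hex digit. */
  def unhex(bytes: Array[Byte]): Option[Array[Byte]] = {
    val odd = bytes.length & 1                 // 1 when an implicit leading '0' is needed
    val out = new Array[Byte]((bytes.length + 1) / 2)
    var i = 0
    if (odd == 1) {
      // The first digit stands alone, as the low nibble of the first output byte.
      val lo = toDigit(bytes(0))
      if (lo < 0) return None
      out(0) = lo.toByte
      i = 1
    }
    while (i < bytes.length) {
      val hi = toDigit(bytes(i))
      val lo = toDigit(bytes(i + 1))
      if (hi < 0 || lo < 0) return None
      out((i + odd) / 2) = ((hi << 4) | lo).toByte
      i += 2
    }
    Some(out)
  }
}
```

Because the input is never converted to a `String` or `char[]`, this avoids the intermediate allocation that `Hex.decode(byte[])` performs.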







[GitHub] spark pull request: [SPARK-8750][SQL] Remove the closure in functi...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7148#issuecomment-117446964
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-8750][SQL] Remove the closure in functi...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7148#issuecomment-117446950
  
[Test build #36232 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36232/console) for PR 7148 at commit [`4beba76`](https://github.com/apache/spark/commit/4beba76f77b83631860a13eaec1aad4b11359fc2).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `// compiled class file for the closure here will conflict with the one in callUDF (upper case).`






[GitHub] spark pull request: [DOCS] The default value of sql config is wron...

2015-06-30 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/7142#issuecomment-117446839
  
I'd like to delay this change until the release (at least until after QA). We
should also update the description, which may no longer be accurate.





[GitHub] spark pull request: [DOCS] The default value of sql config is wron...

2015-06-30 Thread sarutak
Github user sarutak commented on the pull request:

https://github.com/apache/spark/pull/7142#issuecomment-117446374
  
I noticed that we should also modify the doc for
`spark.sql.planner.externalSort`, right?





[GitHub] spark pull request: [SPARK-8748][SQL] Move castability test out fr...

2015-06-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/7145





[GitHub] spark pull request: [SPARK-8750][SQL] Remove the closure in functi...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7148#issuecomment-117444366
  
[Test build #36232 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36232/consoleFull) for PR 7148 at commit [`4beba76`](https://github.com/apache/spark/commit/4beba76f77b83631860a13eaec1aad4b11359fc2).





[GitHub] spark pull request: [SPARK-8750][SQL] Remove the closure in functi...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7148#issuecomment-117444273
  
Merged build started.





[GitHub] spark pull request: [SPARK-8750][SQL] Remove the closure in functi...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7148#issuecomment-117444265
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-8750][SQL] Remove the closure in functi...

2015-06-30 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/7148#issuecomment-117444245
  
LGTM





[GitHub] spark pull request: [SPARK-8750][SQL] Remove the closure in functi...

2015-06-30 Thread rxin
GitHub user rxin opened a pull request:

https://github.com/apache/spark/pull/7148

[SPARK-8750][SQL] Remove the closure in functions.callUdf.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rxin/spark calludf-closure

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/7148.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #7148


commit 4beba76f77b83631860a13eaec1aad4b11359fc2
Author: Reynold Xin 
Date:   2015-07-01T06:00:00Z

[SPARK-8750][SQL] Remove the closure in functions.callUdf.







[GitHub] spark pull request: [SPARK-8749][SQL] Remove HiveTypeCoercion trai...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7147#issuecomment-117443232
  
  [Test build #36231 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36231/consoleFull)
 for   PR 7147 at commit 
[`c1c6dc0`](https://github.com/apache/spark/commit/c1c6dc02a07d4c56d3140a0431a70596e54ea088).





[GitHub] spark pull request: [SPARK-8749][SQL] Remove HiveTypeCoercion trai...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7147#issuecomment-117442810
  
Merged build started.





[GitHub] spark pull request: [SPARK-8749][SQL] Remove HiveTypeCoercion trai...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7147#issuecomment-117442802
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-8749][SQL] Remove HiveTypeCoercion trai...

2015-06-30 Thread rxin
GitHub user rxin opened a pull request:

https://github.com/apache/spark/pull/7147

[SPARK-8749][SQL] Remove HiveTypeCoercion trait.

Moved all the rules into the companion object.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rxin/spark SPARK-8749

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/7147.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #7147


commit c1c6dc02a07d4c56d3140a0431a70596e54ea088
Author: Reynold Xin 
Date:   2015-07-01T05:46:56Z

[SPARK-8749][SQL] Remove HiveTypeCoercion trait.

Moved all the rules into the companion object.







[GitHub] spark pull request: [DOCS] The default value of sql config is wron...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7142#issuecomment-117441760
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [DOCS] The default value of sql config is wron...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7142#issuecomment-117441734
  
  [Test build #36228 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36228/console)
 for   PR 7142 at commit 
[`d624cf8`](https://github.com/apache/spark/commit/d624cf8f60531936d598917403318eb6f04d71c3).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SQL][minor] remove internalRowRDD in DataFram...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7116#issuecomment-117441706
  
  [Test build #36230 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36230/consoleFull)
 for   PR 7116 at commit 
[`24756ca`](https://github.com/apache/spark/commit/24756caaf72e1ee455bda79e5b90656adc78484c).





[GitHub] spark pull request: [SPARK-8227][SQL][WIP]Add function unhex

2015-06-30 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/7113#discussion_r33649592
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala
 ---
@@ -354,6 +356,35 @@ case class Pow(left: Expression, right: Expression)
   }
 }
 
+/**
+ * Performs the inverse operation of HEX.
+ * Resulting characters are returned as a byte array.
+ */
+case class UnHex(child: Expression)
+  extends UnaryExpression with AutoCastInputTypes with Serializable  {
+
+  override def expectedChildTypes: Seq[DataType] = Seq(StringType)
--- End diff --

hex() returns StringType, so unhex() should take StringType, not BinaryType.
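
To illustrate the type symmetry, here is a minimal standalone sketch in plain Scala. It is not the Catalyst implementation: `HexSketch`, `hex`, and `unhex` are hypothetical helpers, and the input to `unhex` is assumed to be an even-length string of valid hex digits. Because hex maps bytes to a String, its inverse takes a String and returns a byte array.

```scala
object HexSketch {
  // Hypothetical helper: encode bytes as an upper-case hex string.
  def hex(bytes: Array[Byte]): String =
    bytes.map(b => f"${b & 0xff}%02X").mkString

  // Hypothetical helper: decode a hex string back into bytes.
  // Assumes an even-length string containing only valid hex digits.
  def unhex(s: String): Array[Byte] =
    s.grouped(2).map(pair => Integer.parseInt(pair, 16).toByte).toArray
}
```

Under these assumptions, `HexSketch.unhex(HexSketch.hex(bytes))` should give back the original bytes.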





[GitHub] spark pull request: [SPARK-8647][MLlib] Potential issue with const...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7146#issuecomment-117441591
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SQL][minor] remove internalRowRDD in DataFram...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7116#issuecomment-117441605
  
Merged build started.





[GitHub] spark pull request: [SQL][minor] remove internalRowRDD in DataFram...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7116#issuecomment-117441595
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-8748][SQL] Move castability test out fr...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7145#issuecomment-117441280
  
  [Test build #36229 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36229/consoleFull)
 for   PR 7145 at commit 
[`cd086a9`](https://github.com/apache/spark/commit/cd086a961ce26af6653889f325a78b467227ffa3).





[GitHub] spark pull request: [SPARK-8647][MLlib] Potential issue with const...

2015-06-30 Thread aloknsingh
GitHub user aloknsingh opened a pull request:

https://github.com/apache/spark/pull/7146

[SPARK-8647][MLlib] Potential issue with constant hashCode

I added the following code:
  // See [SPARK-8647]: this achieves the needed constant hash code without a
  // hard-coded constant number.
  override def hashCode(): Int = this.getClass.getName.hashCode()

This provides the constant hash code requested in the JIRA.
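
As a minimal sketch of the idea (the class `ExampleParam` and its `equals` rule are hypothetical and not part of the patch): deriving the hash code from the class name gives every instance of the class the same value without hard-coding a number.

```scala
// Hypothetical class used only to illustrate the pattern.
class ExampleParam(val value: Int) {
  // Assumption for this sketch: all instances compare equal,
  // so they must also share one hash code.
  override def equals(other: Any): Boolean = other.isInstanceOf[ExampleParam]

  // Constant per class: derived from the class name rather than a literal number.
  override def hashCode(): Int = this.getClass.getName.hashCode()
}
```

Two instances built with different values still report the same hash code, which keeps `hashCode` consistent with the `equals` assumed above.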

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aloknsingh/spark aloknsingh_SPARK-8647

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/7146.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #7146


commit 43cdb89786d3d3e47c2df2e01322fe544fc08aaf
Author: Alok  Singh 
Date:   2015-07-01T05:18:12Z

[SPARK-8647][MLlib] Potential issue with constant hashCode







[GitHub] spark pull request: [SPARK-8748][SQL] Move castability test out fr...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7145#issuecomment-117439140
  
Merged build started.





[GitHub] spark pull request: [SPARK-8748][SQL] Move castability test out fr...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7145#issuecomment-117439133
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-8748][SQL] Move castability test out fr...

2015-06-30 Thread rxin
GitHub user rxin opened a pull request:

https://github.com/apache/spark/pull/7145

[SPARK-8748][SQL] Move castability test out from Cast case class into Cast 
object.

This patch moved resolve function in Cast case class into the companion 
object, and renamed it canCast. We can then use this in the analyzer without a 
Cast expr.
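
A minimal sketch of the pattern, using simplified stand-in types rather than the real Catalyst classes (`SimpleType`, `SimpleCast`, and the castability rules below are hypothetical): once the check lives in the companion object, a caller can test castability from two types alone, without constructing a cast expression.

```scala
// Simplified stand-ins for data types; not the Catalyst DataType hierarchy.
sealed trait SimpleType
case object IntT extends SimpleType
case object StringT extends SimpleType
case object BinaryT extends SimpleType

// Stand-in for the Cast case class: it only carries the expression data.
case class SimpleCast(childType: SimpleType, targetType: SimpleType)

object SimpleCast {
  // Castability check in the companion object; the rules are illustrative only.
  def canCast(from: SimpleType, to: SimpleType): Boolean = (from, to) match {
    case (a, b) if a == b => true  // identical types always cast
    case (_, StringT)     => true  // anything can be rendered as a string here
    case (StringT, IntT)  => true  // strings may be parsed as ints here
    case _                => false
  }
}
```

An analyzer-style caller would then write `SimpleCast.canCast(from, to)` directly.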

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rxin/spark cast

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/7145.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #7145


commit 4d2d98913e30c6f24ba6296f200b3a9089da45f3
Author: Reynold Xin 
Date:   2015-07-01T05:21:53Z

[SPARK-8748][SQL] Move castability test out from Cast case class into Cast 
object.

This patch moved resolve function in Cast case class into the companion 
object,
and renamed it canCast. We can then use this in the analyzer without a Cast 
expr.







[GitHub] spark pull request: [SPARK-7944][SPARK-8013] Remove most of the Sp...

2015-06-30 Thread ScrapCodes
Github user ScrapCodes commented on the pull request:

https://github.com/apache/spark/pull/6903#issuecomment-117438365
  
Yes, I am fairly sure that they are unrelated failures. As for the MiMa 
checks, they should pick the right artifact version according to the Scala 
version. I have not had a chance to look at it yet. But if it is not doing 
the right thing, that sounds like a bug to me.

About this patch, I am okay with it. But somehow I cannot get SBT to build 
it: it is missing the genjavadoc plugin for 2.11.7. How are you building it?





[GitHub] spark pull request: [DOCS] The default value of sql config is wron...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7142#issuecomment-117438330
  
  [Test build #36228 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36228/consoleFull)
 for   PR 7142 at commit 
[`d624cf8`](https://github.com/apache/spark/commit/d624cf8f60531936d598917403318eb6f04d71c3).





[GitHub] spark pull request: [DOCS] The default value of sql config is wron...

2015-06-30 Thread sarutak
Github user sarutak commented on the pull request:

https://github.com/apache/spark/pull/7142#issuecomment-117438135
  
Good catch, @KaiXinXiaoLei . Could you also address @tarekauel 's comment?
@liancheng does this change make sense?





[GitHub] spark pull request: [DOCS] The default value of sql config is wron...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7142#issuecomment-117437913
  
Merged build started.





[GitHub] spark pull request: [SPARK-8160][SQL]Support using external sortin...

2015-06-30 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/6875#issuecomment-117438111
  
@lianhuiwang Aggregation is different from join, because aggregation could 
reduce the data size, but join cannot. The optimizer could figure out whether 
to use broadcast join or sort merge join based on the size of the table, but 
it's very hard to guess what the memory consumption will be for aggregation 
(which is determined by the number of unique groups and the aggregation 
algorithm).

All the aggregations happen within a partition, so no shuffling is needed. 
Usually, two aggregations happen, one before and one after shuffling.
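
The two-phase pattern can be sketched with plain Scala collections standing in for partitions. This is a toy model, not Spark's execution path, and `partialAgg`, `shuffle`, `finalAgg`, and `sumByKey` are hypothetical names. Each partition is aggregated locally first, which shrinks the data, and a second aggregation merges the partial results once they have been grouped by key.

```scala
object TwoPhaseAggSketch {
  type Partition = Seq[(String, Int)]

  // Phase 1: aggregate within a single partition; no data movement is needed.
  def partialAgg(p: Partition): Map[String, Int] =
    p.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).sum }

  // Toy "shuffle": bring the partial results for the same key together.
  def shuffle(partials: Seq[Map[String, Int]]): Map[String, Seq[Int]] =
    partials.flatten.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2) }

  // Phase 2: merge the partial sums for each key.
  def finalAgg(grouped: Map[String, Seq[Int]]): Map[String, Int] =
    grouped.map { case (k, vs) => k -> vs.sum }

  // Full pipeline: partial aggregation, shuffle, final aggregation.
  def sumByKey(partitions: Seq[Partition]): Map[String, Int] =
    finalAgg(shuffle(partitions.map(partialAgg)))
}
```

In this model the shuffle moves at most one record per key per partition, which is the data-size reduction that a join cannot offer.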





[GitHub] spark pull request: [DOCS] The default value of sql config is wron...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7142#issuecomment-117437845
  
 Merged build triggered.





[GitHub] spark pull request: [DOCS] The default value of sql config is wron...

2015-06-30 Thread sarutak
Github user sarutak commented on the pull request:

https://github.com/apache/spark/pull/7142#issuecomment-117437504
  
ok to test.





[GitHub] spark pull request: [SPARK-5562][MLlib] LDA should handle empty do...

2015-06-30 Thread aloknsingh
Github user aloknsingh commented on the pull request:

https://github.com/apache/spark/pull/7064#issuecomment-117437417
  
Jenkins, retest this please





[GitHub] spark pull request: [SPARK-8746][SQL] update download link for Hiv...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7144#issuecomment-117436988
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-8747][SQL] fix EqualNullSafe for binary...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7143#issuecomment-117436862
  
  [Test build #36227 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36227/consoleFull)
 for   PR 7143 at commit 
[`d19e9c0`](https://github.com/apache/spark/commit/d19e9c0599bb222b405c2382f7fa0ba0dbd24099).





[GitHub] spark pull request: [SPARK-8746][SQL] update download link for Hiv...

2015-06-30 Thread ckadner
GitHub user ckadner opened a pull request:

https://github.com/apache/spark/pull/7144

[SPARK-8746][SQL] update download link for Hive 0.13.1

updated the [Hive 0.13.1](https://archive.apache.org/dist/hive/hive-0.13.1) 
download link in `sql/README.md`

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ckadner/spark SPARK-8746

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/7144.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #7144


commit 65d80f78ac1de3511f5e00f27e5f76e8639b6433
Author: Christian Kadner 
Date:   2015-07-01T05:02:05Z

[SPARK-8746][SQL] update download link for Hive 0.13.1







[GitHub] spark pull request: [SPARK-8747][SQL] fix EqualNullSafe for binary...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7143#issuecomment-117436555
  
Merged build started.





[GitHub] spark pull request: [SPARK-8747][SQL] fix EqualNullSafe for binary...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7143#issuecomment-117436518
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-8160][SQL]Support using external sortin...

2015-06-30 Thread lianhuiwang
Github user lianhuiwang commented on the pull request:

https://github.com/apache/spark/pull/6875#issuecomment-117436199
  
@davies If we cannot hold all of them in memory and then switch to sort-based 
aggregation, it has to re-shuffle the data to do the sort, so its computation 
cost is very high. I think the choice should be determined by statistics 
before physical plan execution. This problem is similar to choosing between 
hash join and sort-merge join. Currently sort merge join is controlled by 
spark.sql.planner.sortMergeJoin (default is false). Like sort merge join, the 
sort-based aggregation of this PR is also controlled by 
spark.sql.planner.sortMergeAggregate (default is false).





[GitHub] spark pull request: [SPARK-8747][SQL] fix EqualNullSafe for binary...

2015-06-30 Thread cloud-fan
GitHub user cloud-fan opened a pull request:

https://github.com/apache/spark/pull/7143

[SPARK-8747][SQL] fix EqualNullSafe for binary type

also improve tests for binary comparison.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cloud-fan/spark binary

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/7143.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #7143


commit d19e9c0599bb222b405c2382f7fa0ba0dbd24099
Author: Wenchen Fan 
Date:   2015-07-01T05:03:21Z

fix equalNullSafe







[GitHub] spark pull request: [SPARK-8235][SPARK-8236][SQL] misc functions: ...

2015-06-30 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/6970#issuecomment-117435531
  
Forgot to follow up on this one, sorry. Next time, it would be better if you 
guys could comment on the JIRA to avoid clashing with each other. Thank you all 
for the hard work, anyway!





[GitHub] spark pull request: [SPARK-8378][Streaming]Add the Python API for ...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6830#issuecomment-117434623
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-8688][YARN]Bug fix: disable the cache f...

2015-06-30 Thread harishreedharan
Github user harishreedharan commented on the pull request:

https://github.com/apache/spark/pull/7069#issuecomment-117434721
  
I am on vacation till Sunday. I can take a look after. If you want to merge
it before that, please do. I can look at it and file follow up jiras if
required.

On Tuesday, June 30, 2015, UCB AMPLab  wrote:

> Merged build finished. Test PASSed.


-- 

Thanks,
Hari






[GitHub] spark pull request: [SPARK-8378][Streaming]Add the Python API for ...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6830#issuecomment-117434415
  
[Test build #36224 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36224/console) for PR 6830 at commit [`78dfdac`](https://github.com/apache/spark/commit/78dfdac7216c3e68a42c6534f497289212a31a8a).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class FlumeUtils(object):`






[GitHub] spark pull request: [SPARK-8533][Streaming] Upgrade Flume to 1.6.0

2015-06-30 Thread harishreedharan
Github user harishreedharan commented on the pull request:

https://github.com/apache/spark/pull/6939#issuecomment-117434018
  
I am on vacation till Sunday. Please feel free to merge this - if I have
any concerns at that time I will make some noise.

On Tuesday, June 30, 2015, Sean Owen  wrote:

> Ping @harishreedharan I am sure you
> wouldn't propose this otherwise, but just wanted to understand what, if any,
> potential incompatibilities this entails, particularly regarding
> dependencies. Do you know of any risks?


-- 

Thanks,
Hari






[GitHub] spark pull request: [SPARK-6602][Core]Remove unnecessary synchroni...

2015-06-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/7141





[GitHub] spark pull request: [SPARK-6602][Core]Remove unnecessary synchroni...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7141#issuecomment-117433758
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-6602][Core]Remove unnecessary synchroni...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7141#issuecomment-117433699
  
[Test build #36222 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36222/console) for PR 7141 at commit [`fcf7b50`](https://github.com/apache/spark/commit/fcf7b5077d4afef07ae1f715e4cc51128200bf18).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Heartbeat(workerId: String, worker: RpcEndpointRef) extends DeployMessage`
  * `case class RegisteredWorker(master: RpcEndpointRef, masterWebUiUrl: String) extends DeployMessage`
  * `case class RegisterApplication(appDescription: ApplicationDescription, driver: RpcEndpointRef)`
  * `case class RegisteredApplication(appId: String, master: RpcEndpointRef) extends DeployMessage`
  * `case class SubmitDriverResponse(`
  * `case class KillDriverResponse(`
  * `case class MasterChanged(master: RpcEndpointRef, masterWebUiUrl: String)`






[GitHub] spark pull request: [SPARK-8227][SQL][WIP]Add function unhex

2015-06-30 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/7113#discussion_r33648302
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala ---
@@ -354,6 +356,35 @@ case class Pow(left: Expression, right: Expression)
   }
 }
 
+/**
+ * Performs the inverse operation of HEX.
+ * Resulting characters are returned as a byte array.
+ */
+case class UnHex(child: Expression)
+  extends UnaryExpression with AutoCastInputTypes with Serializable  {
--- End diff --

`unhex` only works on StringType, so we should not use `AutoCastInputTypes` 
here; we should use `checkInputDataTypes` instead.
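
To illustrate the distinction, a toy sketch under stated assumptions: the `DataType` hierarchy and `checkUnhexInput` below are hypothetical stand-ins defined only for this example, not Spark's Catalyst types or traits. The idea is that a strict check rejects a non-string argument with an error rather than silently casting it.

```scala
object TypeCheckSketch {
  // Toy type hierarchy for illustration only (not Catalyst's DataType).
  sealed trait DataType
  case object StringType extends DataType
  case object IntegerType extends DataType
  case object BinaryType extends DataType

  // Strict check: accept only StringType and report anything else as an error,
  // instead of inserting an implicit cast to string.
  def checkUnhexInput(childType: DataType): Either[String, Unit] = childType match {
    case StringType => Right(())
    case other => Left(s"unhex requires a string argument, got $other")
  }
}
```

With an auto-cast approach, something like `unhex(123)` would be coerced to a string first; with the strict check it surfaces as an analysis-time error.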





[GitHub] spark pull request: [SPARK-8227][SQL][WIP]Add function unhex

2015-06-30 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/7113#discussion_r33648015
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala ---
@@ -354,6 +356,35 @@ case class Pow(left: Expression, right: Expression)
   }
 }
 
+/**
+ * Performs the inverse operation of HEX.
+ * Resulting characters are returned as a byte array.
+ */
+case class UnHex(child: Expression)
+  extends UnaryExpression with AutoCastInputTypes with Serializable  {
+
+  override def expectedChildTypes: Seq[DataType] = Seq(StringType)
+
+  override def dataType: DataType = BinaryType
+
+  override def eval(input: InternalRow): Any = {
+    val num = child.eval(input)
+    if (num == null) {
+      null
+    } else {
+      unhex(num.asInstanceOf[UTF8String])
+    }
+  }
+
+  private def unhex(utf8Str: UTF8String): Array[Byte] = {
+    try {
+      new org.apache.commons.codec.binary.Hex(StandardCharsets.UTF_8).decode(utf8Str.getBytes)
--- End diff --

Just realized that the implementation of `decode(byte[])` is not 
optimized; it will do multiple allocations. Would you like to optimize it 
now, or leave it for later? Either works for me.
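
A minimal sketch of what a single-allocation decode might look like, assuming the input is ASCII hex of even length. `UnhexSketch` and `unhexBytes` are hypothetical names, and returning null on malformed input is an assumption of this sketch rather than the PR's behavior.

```scala
object UnhexSketch {
  // Maps an ASCII hex digit to its value, or -1 if the byte is not a hex digit.
  private def digit(b: Byte): Int = {
    if (b >= '0' && b <= '9') b - '0'
    else if (b >= 'a' && b <= 'f') b - 'a' + 10
    else if (b >= 'A' && b <= 'F') b - 'A' + 10
    else -1
  }

  // Decodes ASCII hex bytes directly into one output array (the only allocation).
  // Returns null for odd-length or non-hex input (an assumption of this sketch).
  def unhexBytes(bytes: Array[Byte]): Array[Byte] = {
    if ((bytes.length & 1) != 0) return null
    val out = new Array[Byte](bytes.length / 2)
    var i = 0
    while (i < out.length) {
      val hi = digit(bytes(2 * i))
      val lo = digit(bytes(2 * i + 1))
      if (hi < 0 || lo < 0) return null
      out(i) = ((hi << 4) | lo).toByte
      i += 1
    }
    out
  }
}
```

Working on the raw bytes avoids the intermediate string and char array that a generic decoder may create along the way.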





[GitHub] spark pull request: [SPARK-8688][YARN]Bug fix: disable the cache f...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7069#issuecomment-117432101
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-8688][YARN]Bug fix: disable the cache f...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7069#issuecomment-117432075
  
[Test build #36221 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36221/console) for PR 7069 at commit [`f94cd0b`](https://github.com/apache/spark/commit/f94cd0bbc46440355c9e534c717ba4bc9011b608).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-6602][Core]Remove unnecessary synchroni...

2015-06-30 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/7141#issuecomment-117431805
  
lgtm





[GitHub] spark pull request: [SPARK-8018][MLlib]KMeans should accept initia...

2015-06-30 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request:

https://github.com/apache/spark/pull/6737#issuecomment-117431033
  
@jkbradley Thank you for the review. I will make changes soon. I would like 
to know why pyspark unit tests are failing in this PR. It seems tests in 
PowerIterationClusteringModel are failing. Could you please tell me why this 
happens?






[GitHub] spark pull request: [SPARK-3444] [core] Restore INFO level after l...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7140#issuecomment-117430686
  
[Test build #36226 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36226/consoleFull) for PR 7140 at commit [`de14836`](https://github.com/apache/spark/commit/de1483629bc6f88cf2ff64b0eb1f4c51ffbb6e65).





[GitHub] spark pull request: [SPARK-3444] [core] Restore INFO level after l...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7140#issuecomment-117430470
  
Merged build started.





[GitHub] spark pull request: [SPARK-3444] [core] Restore INFO level after l...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7140#issuecomment-117430460
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-3444] [core] Restore INFO level after l...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7140#issuecomment-117430321
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-3444] [core] Restore INFO level after l...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7140#issuecomment-117430268
  
[Test build #36220 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36220/console) for PR 7140 at commit [`6cff13a`](https://github.com/apache/spark/commit/6cff13a53eed560da434b625efc6b077f7265c12).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `trait ExtractValue`
  * `abstract class ExtractValueWithStruct extends UnaryExpression with ExtractValue`
  * `abstract class ExtractValueWithOrdinal extends BinaryExpression with ExtractValue`






[GitHub] spark pull request: [SPARK-8235][SPARK-8236][SQL] misc functions: ...

2015-06-30 Thread qiansl127
Github user qiansl127 closed the pull request at:

https://github.com/apache/spark/pull/6970





[GitHub] spark pull request: [SPARK-8235][SPARK-8236][SQL] misc functions: ...

2015-06-30 Thread qiansl127
Github user qiansl127 commented on the pull request:

https://github.com/apache/spark/pull/6970#issuecomment-117429617
  
Yes, please close this PR.





[GitHub] spark pull request: [SPARK-8235][SPARK-8236][SQL] misc functions: ...

2015-06-30 Thread tarekauel
Github user tarekauel commented on the pull request:

https://github.com/apache/spark/pull/6970#issuecomment-117429124
  
I guess this one can be closed: #6963 (sha1) #7108 (crc32)





[GitHub] spark pull request: [SPARK-8374] [YARN] Job frequently hangs after...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7128#issuecomment-117428246
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-8374] [YARN] Job frequently hangs after...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7128#issuecomment-117428173
  
[Test build #36219 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36219/console) for PR 7128 at commit [`c1fa754`](https://github.com/apache/spark/commit/c1fa7541c5b74eabcde9ab1bafa542ee1bb4c51f).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Heartbeat(workerId: String, worker: RpcEndpointRef) extends DeployMessage`
  * `case class RegisteredWorker(master: RpcEndpointRef, masterWebUiUrl: String) extends DeployMessage`
  * `case class RegisterApplication(appDescription: ApplicationDescription, driver: RpcEndpointRef)`
  * `case class RegisteredApplication(appId: String, master: RpcEndpointRef) extends DeployMessage`
  * `case class SubmitDriverResponse(`
  * `case class KillDriverResponse(`
  * `case class MasterChanged(master: RpcEndpointRef, masterWebUiUrl: String)`






[GitHub] spark pull request: [SPARK-8227][SQL][WIP]Add function unhex

2015-06-30 Thread tarekauel
Github user tarekauel commented on a diff in the pull request:

https://github.com/apache/spark/pull/7113#discussion_r33646498
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala ---
@@ -354,6 +356,35 @@ case class Pow(left: Expression, right: Expression)
   }
 }
 
+/**
+ * Performs the inverse operation of HEX.
+ * Resulting characters are returned as a byte array.
+ */
+case class UnHex(child: Expression)
+  extends UnaryExpression with AutoCastInputTypes with Serializable  {
+
+  override def expectedChildTypes: Seq[DataType] = Seq(StringType)
--- End diff --

@davies this could be changed to `ByteType`, couldn't it? This avoids a 
cast, if someone calls this with a byte array.





[GitHub] spark pull request: [DOCS] The default value of sql config is wron...

2015-06-30 Thread tarekauel
Github user tarekauel commented on a diff in the pull request:

https://github.com/apache/spark/pull/7142#discussion_r33646314
  
--- Diff: docs/sql-programming-guide.md ---
@@ -1890,7 +1890,7 @@ that these options will be deprecated in future release as more optimizations ar
   
   
 spark.sql.codegen
-false
+true
 
   When true, code will be dynamically generated at runtime for expression evaluation in a specific
   query.  For some queries with complicated expression this option can lead to significant speed-ups.
--- End diff --

If you change this doc, you could remove the duplicated space as well 
(before `For`).





[GitHub] spark pull request: [SPARK-8638] [SQL] Window Function Performance...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7057#issuecomment-117426446
  
[Test build #36225 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36225/consoleFull) for PR 7057 at commit [`27b0329`](https://github.com/apache/spark/commit/27b03293c51ab20b572958fa207071360ed2418c).





[GitHub] spark pull request: [SPARK-8535][PySpark]PySpark : Can't create Da...

2015-06-30 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/7124#issuecomment-117426318
  
Merged into master, 1.3 and 1.4 branch, thanks!





[GitHub] spark pull request: [SPARK-8378][Streaming]Add the Python API for ...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6830#issuecomment-117426309
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-8378][Streaming]Add the Python API for ...

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6830#issuecomment-117426260
  
[Test build #36217 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36217/console) for PR 6830 at commit [`78dfdac`](https://github.com/apache/spark/commit/78dfdac7216c3e68a42c6534f497289212a31a8a).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class FlumeUtils(object):`






[GitHub] spark pull request: [SPARK-8638] [SQL] Window Function Performance...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7057#issuecomment-117426229
  
Merged build started.





[GitHub] spark pull request: [SPARK-8638] [SQL] Window Function Performance...

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7057#issuecomment-117426218
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-8535][PySpark]PySpark : Can't create Da...

2015-06-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/7124





[GitHub] spark pull request: [WIP][SPARK-8313] R Spark packages support

2015-06-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7139#issuecomment-117425683
  
[Test build #36218 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36218/console) for PR 7139 at commit [`8810beb`](https://github.com/apache/spark/commit/8810beb215e8b3b646e0ae32794a16dd5da1eb65).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [WIP][SPARK-8313] R Spark packages support

2015-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7139#issuecomment-117425768
  
Merged build finished. Test PASSed.




