[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15505 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73520/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17080: [SPARK-19739][CORE] propagate S3 session token to cluster
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17080 Merged build finished. Test PASSed.
[GitHub] spark issue #17083: [SPARK-19750][UI][branch-2.1] Fix redirect issue from ht...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17083 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73498/ Test PASSed.
[GitHub] spark issue #17083: [SPARK-19750][UI][branch-2.1] Fix redirect issue from ht...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17083 Merged build finished. Test PASSed.
[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15505 **[Test build #73520 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73520/testReport)** for PR 15505 at commit [`1917b61`](https://github.com/apache/spark/commit/1917b616d8e33241ec763ac583b9e938873a1c7f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17080: [SPARK-19739][CORE] propagate S3 session token to cluster
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17080 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73501/ Test PASSed.
[GitHub] spark issue #17083: [SPARK-19750][UI][branch-2.1] Fix redirect issue from ht...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17083 **[Test build #73498 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73498/testReport)** for PR 17083 at commit [`5408005`](https://github.com/apache/spark/commit/5408005912c1e369cbf3d77ea490b88f621ee047). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17080: [SPARK-19739][CORE] propagate S3 session token to cluster
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17080 **[Test build #73501 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73501/testReport)** for PR 17080 at commit [`0ae5aa7`](https://github.com/apache/spark/commit/0ae5aa73c70ae2f46a2d16087b5c55652d1e0282). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17031: [SPARK-19702][MESOS] Add suppress/revive support to the ...
Github user skonto commented on the issue: https://github.com/apache/spark/pull/17031 OK, like the Cassandra case, you mean?
[GitHub] spark issue #17086: [SPARK-18693][ML][MLLIB] ML Evaluators should use weight...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17086 Merged build finished. Test PASSed.
[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16478 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73519/ Test PASSed.
[GitHub] spark issue #17086: [SPARK-18693][ML][MLLIB] ML Evaluators should use weight...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17086 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73529/ Test PASSed.
[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16478 Merged build finished. Test PASSed.
[GitHub] spark issue #17086: [SPARK-18693][ML][MLLIB] ML Evaluators should use weight...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17086 **[Test build #73529 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73529/testReport)** for PR 17086 at commit [`cf6a5ab`](https://github.com/apache/spark/commit/cf6a5aba61716dcb11ef3ca7b1f3b803bf99ef33). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class MulticlassMetrics @Since("1.1.0") (predAndLabelsWithOptWeight: RDD[_])`
[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16478 **[Test build #73519 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73519/testReport)** for PR 16478 at commit [`e6b01f0`](https://github.com/apache/spark/commit/e6b01f07947da06be2fc3114793d7793a0f7406a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #17031: [SPARK-19702][MESOS] Add suppress/revive support ...
Github user skonto commented on a diff in the pull request: https://github.com/apache/spark/pull/17031#discussion_r103287098

--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosFineGrainedSchedulerBackend.scala ---

```diff
@@ -24,6 +24,7 @@
 import scala.collection.JavaConverters._
 import scala.collection.mutable.{HashMap, HashSet}
```

--- End diff --

ok cool!
[GitHub] spark issue #16990: [SPARK-19660][CORE][SQL] Replace the configuration prope...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16990 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73511/ Test PASSed.
[GitHub] spark issue #16990: [SPARK-19660][CORE][SQL] Replace the configuration prope...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16990 Merged build finished. Test PASSed.
[GitHub] spark issue #16990: [SPARK-19660][CORE][SQL] Replace the configuration prope...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16990 **[Test build #73511 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73511/testReport)** for PR 16990 at commit [`21956db`](https://github.com/apache/spark/commit/21956db9ac1908807bbe7761c815980309c35ac8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17031: [SPARK-19702][MESOS] Add suppress/revive support to the ...
Github user mgummelt commented on the issue: https://github.com/apache/spark/pull/17031 Given the concerns about the dispatcher being stuck in a suppressed state, I'm going to solve this a different way. I'm going to increase the default offer decline timeout to 120s and make it configurable, just like it is in the driver. This will make it so that the offer will be offered to other frameworks for 120s before circling back to the dispatcher, rather than the default 5s. I'll also keep the explicit revive calls when a new driver is submitted or an existing one fails, which immediately causes offers to be re-offered to the dispatcher. This removes the risk that the driver gets stuck in a suppressed state, because the dispatcher never suppresses itself.
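A minimal sketch of the mechanism described above, using the public Mesos Java scheduler API (not the actual PR code; the class name and default value here are illustrative assumptions):

```scala
import org.apache.mesos.Protos.{Filters, OfferID}
import org.apache.mesos.SchedulerDriver

// Sketch: decline unused offers with a configurable refuse timeout, and
// revive explicitly when new work arrives, so the framework never has to
// suppress itself.
class OfferDeclinePolicy(driver: SchedulerDriver, refuseSeconds: Double = 120.0) {

  // Decline an offer we cannot use. Mesos will not re-offer these resources
  // to this framework for `refuseSeconds`, so in the meantime they circulate
  // to other frameworks instead of bouncing straight back.
  def declineUnused(offerId: OfferID): Unit = {
    val filters = Filters.newBuilder().setRefuseSeconds(refuseSeconds).build()
    driver.declineOffer(offerId, filters)
  }

  // Called when a new driver is submitted or a supervised driver must retry:
  // clears any outstanding decline filters so offers come back immediately
  // rather than waiting out the timeout.
  def onNewWork(): Unit = driver.reviveOffers()
}
```

The design trade-off is that a longer `refuse_seconds` reduces offer churn without the stuck-framework risk that `suppressOffers` carries, since a declined framework still receives offers once the filter expires.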
[GitHub] spark issue #14299: Ensure broadcasted variables are destroyed even in case ...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/14299 You should also file a bug and reference it from the PR title.
[GitHub] spark issue #17084: [SPARK-18693][ML][MLLIB] ML Evaluators should use weight...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17084 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73526/ Test PASSed.
[GitHub] spark issue #17084: [SPARK-18693][ML][MLLIB] ML Evaluators should use weight...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17084 **[Test build #73526 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73526/testReport)** for PR 17084 at commit [`98652cf`](https://github.com/apache/spark/commit/98652cfb6c92eed90deff61bc83ef66b9096df20). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class BinaryClassificationMetrics @Since("2.2.0") (`
[GitHub] spark issue #17084: [SPARK-18693][ML][MLLIB] ML Evaluators should use weight...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17084 Merged build finished. Test PASSed.
[GitHub] spark issue #17039: [SPARK-19710][SQL][TESTS] Fix ordering of rows in query ...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/17039 How about a more pragmatic approach? I think relational algebra only guarantees ordering when an ORDER BY is the top-level operation. Why not just check for that and, if we find one, add all output columns to the ORDER BY?
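The suggestion above can be sketched with a toy plan ADT (illustrative only; the real implementation would use Catalyst's `Sort` and `LogicalPlan` classes, and the names here are assumptions):

```scala
// Sketch: if (and only if) the top-level operation of a query plan is a Sort,
// append the remaining output columns to its sort keys so that row order in
// the test result becomes fully deterministic.
sealed trait Plan { def output: Seq[String] }
case class Relation(output: Seq[String]) extends Plan
case class Sort(keys: Seq[String], child: Plan) extends Plan {
  def output: Seq[String] = child.output
}

def makeOrderDeterministic(plan: Plan): Plan = plan match {
  case Sort(keys, child) =>
    // Keep the user's keys first; add the rest of the output as tie-breakers.
    Sort(keys ++ child.output.filterNot(keys.contains), child)
  case other =>
    other // no top-level ORDER BY: the result order was never guaranteed
}
```

For example, `makeOrderDeterministic(Sort(Seq("c1"), Relation(Seq("c1", "c2", "c3"))))` would yield a sort on `c1, c2, c3`, leaving the user-specified prefix intact.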
[GitHub] spark issue #17001: [SPARK-19667][SQL]create table with hiveenabled in defau...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17001 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73510/ Test PASSed.
[GitHub] spark issue #17001: [SPARK-19667][SQL]create table with hiveenabled in defau...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17001 Merged build finished. Test PASSed.
[GitHub] spark issue #17001: [SPARK-19667][SQL]create table with hiveenabled in defau...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17001 **[Test build #73510 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73510/testReport)** for PR 17001 at commit [`13245e4`](https://github.com/apache/spark/commit/13245e4474115b41880224d43cd7b4b8613bd6ac). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17071: [SPARK-15615][SQL][BUILD][FOLLOW-UP] Replace deprecated ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17071 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73506/ Test PASSed.
[GitHub] spark issue #17071: [SPARK-15615][SQL][BUILD][FOLLOW-UP] Replace deprecated ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17071 Merged build finished. Test PASSed.
[GitHub] spark pull request #17031: [SPARK-19702][MESOS] Add suppress/revive support ...
Github user skonto commented on a diff in the pull request: https://github.com/apache/spark/pull/17031#discussion_r103281854

--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala ---

```diff
@@ -582,141 +688,33 @@ private[spark] class MesosClusterScheduler(
     }
   }

-  override def resourceOffers(driver: SchedulerDriver, offers: JList[Offer]): Unit = {
-    logTrace(s"Received offers from Mesos: \n${offers.asScala.mkString("\n")}")
-    val tasks = new mutable.HashMap[OfferID, ArrayBuffer[TaskInfo]]()
-    val currentTime = new Date()
-
-    val currentOffers = offers.asScala.map {
-      o => new ResourceOffer(o.getId, o.getSlaveId, o.getResourcesList)
-    }.toList
-
-    stateLock.synchronized {
-      // We first schedule all the supervised drivers that are ready to retry.
-      // This list will be empty if none of the drivers are marked as supervise.
-      val driversToRetry = pendingRetryDrivers.filter { d =>
-        d.retryState.get.nextRetry.before(currentTime)
-      }
-
-      scheduleTasks(
-        copyBuffer(driversToRetry),
-        removeFromPendingRetryDrivers,
-        currentOffers,
-        tasks)
-
-      // Then we walk through the queued drivers and try to schedule them.
-      scheduleTasks(
-        copyBuffer(queuedDrivers),
-        removeFromQueuedDrivers,
-        currentOffers,
-        tasks)
-    }
-    tasks.foreach { case (offerId, taskInfos) =>
-      driver.launchTasks(Collections.singleton(offerId), taskInfos.asJava)
-    }
-
-    for (o <- currentOffers if !tasks.contains(o.offerId)) {
-      driver.declineOffer(o.offerId)
-    }
-  }
-
-  private def copyBuffer(
-      buffer: ArrayBuffer[MesosDriverDescription]): ArrayBuffer[MesosDriverDescription] = {
-    val newBuffer = new ArrayBuffer[MesosDriverDescription](buffer.size)
-    buffer.copyToBuffer(newBuffer)
-    newBuffer
-  }
-
-  def getSchedulerState(): MesosClusterSchedulerState = {
-    stateLock.synchronized {
-      new MesosClusterSchedulerState(
-        frameworkId,
-        masterInfo.map(m => s"http://${m.getIp}:${m.getPort}"),
-        copyBuffer(queuedDrivers),
-        launchedDrivers.values.map(_.copy()).toList,
-        finishedDrivers.map(_.copy()).toList,
-        copyBuffer(pendingRetryDrivers))
-    }
-  }
-
-  override def offerRescinded(driver: SchedulerDriver, offerId: OfferID): Unit = {}
-  override def disconnected(driver: SchedulerDriver): Unit = {}
-  override def reregistered(driver: SchedulerDriver, masterInfo: MasterInfo): Unit = {
-    logInfo(s"Framework re-registered with master ${masterInfo.getId}")
-  }
-  override def slaveLost(driver: SchedulerDriver, slaveId: SlaveID): Unit = {}
-  override def error(driver: SchedulerDriver, error: String): Unit = {
-    logError("Error received: " + error)
-    markErr()
-  }
+  private def createTaskInfo(desc: MesosDriverDescription, offer: ResourceOffer): TaskInfo = {
+    val taskId = TaskID.newBuilder().setValue(desc.submissionId).build()

-  /**
-   * Check if the task state is a recoverable state that we can relaunch the task.
-   * Task state like TASK_ERROR are not relaunchable state since it wasn't able
-   * to be validated by Mesos.
-   */
-  private def shouldRelaunch(state: MesosTaskState): Boolean = {
-    state == MesosTaskState.TASK_FAILED ||
-      state == MesosTaskState.TASK_LOST
-  }
+    val (remainingResources, cpuResourcesToUse) =
+      partitionResources(offer.resources, "cpus", desc.cores)
+    val (finalResources, memResourcesToUse) =
+      partitionResources(remainingResources.asJava, "mem", desc.mem)
+    offer.resources = finalResources.asJava

-  override def statusUpdate(driver: SchedulerDriver, status: TaskStatus): Unit = {
-    val taskId = status.getTaskId.getValue
-    stateLock.synchronized {
-      if (launchedDrivers.contains(taskId)) {
-        if (status.getReason == Reason.REASON_RECONCILIATION &&
-          !pendingRecover.contains(taskId)) {
-          // Task has already received update and no longer requires reconciliation.
-          return
-        }
-        val state = launchedDrivers(taskId)
-        // Check if the driver is supervise enabled and can be relaunched.
-        if (state.driverDescription.supervise && shouldRelaunch(status.getState)) {
-          removeFromLaunchedDrivers(taskId)
-          state.finishDate = Some(new Date())
-          val retryState: Option[MesosClusterRetryState] = state.driverDescription.retryState
-          val (retries, waitTimeSec) = retryState
```
[GitHub] spark issue #17071: [SPARK-15615][SQL][BUILD][FOLLOW-UP] Replace deprecated ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17071 **[Test build #73506 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73506/testReport)** for PR 17071 at commit [`6f35ee3`](https://github.com/apache/spark/commit/6f35ee3d07892743b318ea8dd23276e881873d2b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16809: [SPARK-19463][SQL]refresh cache after the InsertIntoHado...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16809 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73516/ Test PASSed.
[GitHub] spark issue #16809: [SPARK-19463][SQL]refresh cache after the InsertIntoHado...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16809 Merged build finished. Test PASSed.
[GitHub] spark issue #16809: [SPARK-19463][SQL]refresh cache after the InsertIntoHado...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16809 **[Test build #73516 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73516/testReport)** for PR 16809 at commit [`f8ccc2f`](https://github.com/apache/spark/commit/f8ccc2fe54c29f69adc730a7078590540b1b4b5e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17039: [SPARK-19710][SQL][TESTS] Fix ordering of rows in query ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17039 The major issue is that we do not know the original intent of the user's query. The query might purposely check whether the result set is sorted. Thus, the existing test suite design is conservative: it avoids adding any sort as long as the user specifies an ORDER BY clause. For example, ```SQL SELECT c1, c2, sum(c1) FROM tab1 GROUP BY c1, c2 ORDER BY c1, c2 ``` In the above example, although the ORDER BY clause does not contain all the output columns, the result set is always sorted. Thus, our test suite should not sort it.
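The conservative policy described above can be sketched as a small helper (hypothetical name and shape; the actual logic lives in Spark's SQL query test suite): only canonicalize the row order when the query itself does not specify one.

```python
def normalize_result(rows, query):
    # Hypothetical sketch of the test-suite policy described above:
    # only sort a query's result rows when the query has no ORDER BY
    # clause, since an ORDER BY may be exactly what the test intends
    # to check.
    if "order by" in query.lower():
        return rows        # preserve the query's own ordering
    return sorted(rows)    # ordering is unspecified, so canonicalize
```

A query whose ORDER BY covers all the grouping columns, as in the example, already yields a deterministic order and passes through untouched.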
[GitHub] spark pull request #17031: [SPARK-19702][MESOS] Add suppress/revive support ...
Github user skonto commented on a diff in the pull request: https://github.com/apache/spark/pull/17031#discussion_r103281366 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala --- @@ -737,13 +735,75 @@ private[spark] class MesosClusterScheduler( if (index != -1) { pendingRetryDrivers.remove(index) pendingRetryDriversState.expunge(id) + suppressOrRevive() true } else { false } } - def getQueuedDriversSize: Int = queuedDrivers.size - def getLaunchedDriversSize: Int = launchedDrivers.size - def getPendingRetryDriversSize: Int = pendingRetryDrivers.size + private def copyBuffer(buffer: ArrayBuffer[MesosDriverDescription]): + ArrayBuffer[MesosDriverDescription] = { +val newBuffer = new ArrayBuffer[MesosDriverDescription](buffer.size) +buffer.copyToBuffer(newBuffer) +newBuffer + } + + /** + * Check if the task state is a recoverable state that we can relaunch the task. + * Task state like TASK_ERROR are not relaunchable state since it wasn't able + * to be validated by Mesos. + */ + private def isFailure(state: MesosTaskState): Boolean = { +state == MesosTaskState.TASK_FAILED || + state == MesosTaskState.TASK_LOST + } + + private def shouldSuppress: Boolean = { +return queuedDrivers.isEmpty && pendingRetryDrivers.isEmpty --- End diff -- return is redundant.
[GitHub] spark issue #13326: [SPARK-15560] [Mesos] Queued/Supervise drivers waiting f...
Github user devaraj-kavali commented on the issue: https://github.com/apache/spark/pull/13326 @mgummelt /@tnachen, can you have a look into this?
[GitHub] spark issue #13143: [SPARK-15359] [Mesos] Mesos dispatcher should handle DRI...
Github user devaraj-kavali commented on the issue: https://github.com/apache/spark/pull/13143 @mgummelt /@tnachen, can you have a look into this?
[GitHub] spark pull request #17031: [SPARK-19702][MESOS] Add suppress/revive support ...
Github user skonto commented on a diff in the pull request: https://github.com/apache/spark/pull/17031#discussion_r103280797 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala --- @@ -737,13 +735,75 @@ private[spark] class MesosClusterScheduler( if (index != -1) { pendingRetryDrivers.remove(index) pendingRetryDriversState.expunge(id) + suppressOrRevive() true } else { false } } - def getQueuedDriversSize: Int = queuedDrivers.size - def getLaunchedDriversSize: Int = launchedDrivers.size - def getPendingRetryDriversSize: Int = pendingRetryDrivers.size + private def copyBuffer(buffer: ArrayBuffer[MesosDriverDescription]): + ArrayBuffer[MesosDriverDescription] = { +val newBuffer = new ArrayBuffer[MesosDriverDescription](buffer.size) +buffer.copyToBuffer(newBuffer) +newBuffer + } + + /** + * Check if the task state is a recoverable state that we can relaunch the task. + * Task state like TASK_ERROR are not relaunchable state since it wasn't able + * to be validated by Mesos. + */ + private def isFailure(state: MesosTaskState): Boolean = { +state == MesosTaskState.TASK_FAILED || + state == MesosTaskState.TASK_LOST + } + + private def shouldSuppress: Boolean = { +return queuedDrivers.isEmpty && pendingRetryDrivers.isEmpty + } + + private def suppressOrRevive(): Unit = { +if (shouldSuppress && !isSuppressed) { + logInfo("Suppressing Offers.") + driver.suppressOffers() + isSuppressed = true +} else if (!shouldSuppress && isSuppressed) { + logInfo("Reviving Offers.") + driver.reviveOffers() + isSuppressed = false +} + } + + /** + * Escape args for Unix-like shells, unless already quoted by the user. 
+ * Based on: http://www.gnu.org/software/bash/manual/html_node/Double-Quotes.html + * and http://www.grymoire.com/Unix/Quote.html + * + * @param value argument + * @return escaped argument + */ + private[scheduler] def shellEscape(value: String): String = { +val WrappedInQuotes = """^(".+"|'.+')$""".r +val ShellSpecialChars = (""".*([ '<>&|\?\*;!#\\(\)"$`]).*""").r --- End diff -- Parentheses are redundant.
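For readers unfamiliar with the escaping logic under review, here is a rough Python analogue (a sketch only; the real implementation is the Scala `shellEscape` in the diff above): leave arguments the user already quoted untouched, and double-quote anything containing shell-special characters, escaping the few characters that stay special inside double quotes.

```python
import re

# Rough Python analogue of the shellEscape logic in the diff above.
# Note: no capturing parentheses around the character class, per the
# review comment that they are redundant.
WRAPPED_IN_QUOTES = re.compile(r"""^(".+"|'.+')$""")
SHELL_SPECIAL_CHARS = re.compile(r""".*[ '<>&|?*;!#\\()"$`].*""")

def shell_escape(value: str) -> str:
    if WRAPPED_IN_QUOTES.match(value):
        return value  # user already quoted it; leave as-is
    if SHELL_SPECIAL_CHARS.match(value):
        # escape the characters that remain special inside double quotes
        escaped = (value.replace("\\", "\\\\")
                        .replace('"', '\\"')
                        .replace("$", "\\$")
                        .replace("`", "\\`"))
        return '"%s"' % escaped
    return value
```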
[GitHub] spark pull request #17031: [SPARK-19702][MESOS] Add suppress/revive support ...
Github user skonto commented on a diff in the pull request: https://github.com/apache/spark/pull/17031#discussion_r103280133 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/deploy/mesos/ui/MesosClusterPage.scala --- @@ -32,7 +32,7 @@ private[mesos] class MesosClusterPage(parent: MesosClusterUI) extends WebUIPage( private val historyServerURL = parent.conf.get(HISTORY_SERVER_URL) def render(request: HttpServletRequest): Seq[Node] = { -val state = parent.scheduler.getSchedulerState() +val state = parent.scheduler.getSchedulerState val driverHeader = Seq("Driver ID") val historyHeader = historyServerURL.map(url => Seq("History")).getOrElse(Nil) --- End diff -- Since you are refactoring the code s/url/_.
[GitHub] spark issue #16782: [SPARK-19348][PYTHON][WIP] PySpark keyword_only decorato...
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/16782 Also, using the `inspect` module it would be possible to check whether the wrapped function is a method, so we wouldn't need to just make that assumption.
[GitHub] spark issue #17039: [SPARK-19710][SQL][TESTS] Fix ordering of rows in query ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17039 Merged build finished. Test PASSed.
[GitHub] spark issue #17039: [SPARK-19710][SQL][TESTS] Fix ordering of rows in query ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17039 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73508/ Test PASSed.
[GitHub] spark issue #17079: [SPARK-19748][SQL]refresh function has a wrong order to ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17079 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73502/ Test PASSed.
[GitHub] spark issue #17079: [SPARK-19748][SQL]refresh function has a wrong order to ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17079 Merged build finished. Test PASSed.
[GitHub] spark issue #17039: [SPARK-19710][SQL][TESTS] Fix ordering of rows in query ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17039 **[Test build #73508 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73508/testReport)** for PR 17039 at commit [`4a4d7ad`](https://github.com/apache/spark/commit/4a4d7ad4b349e49dc4cb81235f796e360dd183f8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16782: [SPARK-19348][PYTHON][WIP] PySpark keyword_only decorato...
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/16782 Thanks @jkbradley and @davies for reviewing. This fix still seems a little hacky to me and you could still possibly run into trouble if you call a nested wrapped function and don't consume the `_input_kwargs` right away. But it is the best solution I could think of without being overly complicated and it is a little better than it was before. If you guys give the go ahead, I can update the other uses in pyspark.ml and try to add a test also.
[GitHub] spark issue #17079: [SPARK-19748][SQL]refresh function has a wrong order to ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17079 **[Test build #73502 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73502/testReport)** for PR 17079 at commit [`fd3bb21`](https://github.com/apache/spark/commit/fd3bb21597809409e7f33796589c9178744063c5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field
Github user tejasapatil commented on the issue: https://github.com/apache/spark/pull/16476 @gczsjdy : I had one comment in the past about `genIfElseStructure`, but after giving it more thought I was not able to think of a better way to do it. I am fine with the current version of the code you have.
[GitHub] spark pull request #17083: [SPARK-19750][UI][branch-2.1] Fix redirect issue ...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/17083#discussion_r103278713 --- Diff: core/src/main/scala/org/apache/spark/ui/JettyUtils.scala --- @@ -378,7 +378,8 @@ private[spark] object JettyUtils extends Logging { server.getHandler().asInstanceOf[ContextHandlerCollection]) } - private def createRedirectHttpsHandler(securePort: Int, scheme: String): ContextHandler = { + private def createRedirectHttpsHandler( + httpsConnector: ServerConnector, scheme: String): ContextHandler = { --- End diff -- nit: one argument per line when using multiple lines. But instead of changing this, why not pass the correct port from the caller in the first place?
[GitHub] spark issue #17082: [SPARK-19749][SS] Name socket source with a meaningful n...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17082 Merged build finished. Test PASSed.
[GitHub] spark pull request #17083: [SPARK-19750][UI][branch-2.1] Fix redirect issue ...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/17083#discussion_r103279006 --- Diff: core/src/test/scala/org/apache/spark/ui/UISuite.scala --- @@ -267,8 +267,11 @@ class UISuite extends SparkFunSuite { s"$scheme://localhost:$port/test1/root", s"$scheme://localhost:$port/test2/root") urls.foreach { url => - val rc = TestUtils.httpResponseCode(new URL(url)) - assert(rc === expected, s"Unexpected status $rc for $url") + val rc = TestUtils.httpResponseCodeAndURL(new URL(url)) --- End diff -- `val (rc, redirectUrl) = ...`
[GitHub] spark issue #17082: [SPARK-19749][SS] Name socket source with a meaningful n...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17082 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73499/ Test PASSed.
[GitHub] spark pull request #16784: [SPARK-19382][ML]:Test sparse vectors in LinearSV...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/16784#discussion_r103278706 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala --- @@ -220,12 +246,13 @@ object LinearSVCSuite { "aggregationDepth" -> 3 ) -// Generate noisy input of the form Y = signum(x.dot(weights) + intercept + noise) + // Generate noisy input of the form Y = signum(x.dot(weights) + intercept + noise) def generateSVMInput( --- End diff -- This API is strange, where the caller expects numFeatures = weights.size, but really numFeatures = 10 * weights.size if isDense=false. Please update it to construct a random dense or sparse vector first (both of length weights.size) and then compute y to make the API more consistent.
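The restructuring suggested in the review above can be sketched as follows (in Python with hypothetical names; the real code is the Scala `generateSVMInput` in LinearSVCSuite): build the feature vector first, always of length `len(weights)`, dense or sparse, and only then compute the label from that same vector.

```python
import random

def generate_svm_point(weights, intercept, is_dense, rnd):
    # Build the feature vector first, always of length len(weights),
    # so numFeatures == weights.size regardless of density.
    x = [rnd.uniform(-1.0, 1.0) for _ in weights]
    if not is_dense:
        # zero out most entries to mimic a sparse vector of the same length
        x = [xi if rnd.random() < 0.1 else 0.0 for xi in x]
    # Label: y = signum(x . weights + intercept + noise), as in the suite
    margin = (sum(w * xi for w, xi in zip(weights, x))
              + intercept + 0.01 * rnd.gauss(0.0, 1.0))
    return (1.0 if margin > 0 else 0.0), x
```

This keeps the caller-visible contract (`numFeatures == weights.size`) consistent across the dense and sparse paths.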
[GitHub] spark pull request #16784: [SPARK-19382][ML]:Test sparse vectors in LinearSV...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/16784#discussion_r103278715 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala --- @@ -234,7 +261,12 @@ object LinearSVCSuite { val yD = new BDV(xi).dot(weightsMat) + intercept + 0.01 * rnd.nextGaussian() if (yD > 0) 1.0 else 0.0 } -y.zip(x).map(p => LabeledPoint(p._1, Vectors.dense(p._2))) +val index = (0 to weights.length - 1).toArray --- End diff -- Move inside if-then to branch where it is used.
[GitHub] spark issue #17082: [SPARK-19749][SS] Name socket source with a meaningful n...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17082 **[Test build #73499 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73499/testReport)** for PR 17082 at commit [`68349fa`](https://github.com/apache/spark/commit/68349facee3b33fd5975e90c74c882f3d922). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #16784: [SPARK-19382][ML]:Test sparse vectors in LinearSV...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/16784#discussion_r103278729 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala --- @@ -203,6 +227,8 @@ class LinearSVCSuite extends SparkFunSuite with MLlibTestSparkContext with Defau val svm = new LinearSVC() testEstimatorAndModelReadWrite(svm, smallBinaryDataset, LinearSVCSuite.allParamSettings, checkModelData) +testEstimatorAndModelReadWrite(svm, smallSparseBinaryDataset, LinearSVCSuite.allParamSettings, --- End diff -- No need for this. Once the model has been fit, its training data is irrelevant.
[GitHub] spark pull request #16782: [SPARK-19348][PYTHON][WIP] PySpark keyword_only d...
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/16782#discussion_r103277884 --- Diff: python/pyspark/__init__.py --- @@ -96,9 +96,11 @@ def keyword_only(func): """ @wraps(func) def wrapper(*args, **kwargs): +# NOTE - this assumes we are wrapping a method and args[0] will be 'self' if len(args) > 1: raise TypeError("Method %s forces keyword arguments." % func.__name__) wrapper._input_kwargs = kwargs --- End diff -- Yeah, that is what I was suggesting, only that removing it would require changes everywhere it is used in ml. So I just wanted to check with you guys first.
[GitHub] spark pull request #16782: [SPARK-19348][PYTHON][WIP] PySpark keyword_only d...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/16782#discussion_r103276349 --- Diff: python/pyspark/__init__.py --- @@ -96,9 +96,11 @@ def keyword_only(func): """ @wraps(func) def wrapper(*args, **kwargs): +# NOTE - this assumes we are wrapping a method and args[0] will be 'self' if len(args) > 1: raise TypeError("Method %s forces keyword arguments." % func.__name__) wrapper._input_kwargs = kwargs --- End diff -- If the assumption is correct, should we always use 'self' to hold the kwargs? (remove this line and update all the functions that use `keyword_only`)?
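A minimal sketch of the pattern under discussion (hypothetical class name; not the actual pyspark.ml code), with the kwargs stashed on the instance as suggested above rather than on the shared `wrapper` function:

```python
import functools

def keyword_only(func):
    """Reject positional arguments (beyond self) and record the kwargs."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        if args:
            raise TypeError("Method %s forces keyword arguments." % func.__name__)
        # Store on the instance, not on wrapper._input_kwargs, so calls on
        # different objects cannot clobber each other's kwargs.
        self._input_kwargs = kwargs
        return func(self, **kwargs)
    return wrapper

class Estimator:  # hypothetical stand-in for a pyspark.ml class
    @keyword_only
    def setParams(self, maxIter=10, tol=1e-6):
        return self._input_kwargs
```

Because `wrapper` is shared by every instance of the class, the original `wrapper._input_kwargs` attribute is effectively global state; keeping it on `self` scopes it to one object.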
[GitHub] spark pull request #17064: [SPARK-19736][SQL] refreshByPath should clear all...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17064#discussion_r103277557 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala --- @@ -168,15 +168,16 @@ class CacheManager extends Logging { (fs, path.makeQualified(fs.getUri, fs.getWorkingDirectory)) } -cachedData.foreach { - case data if data.plan.find(lookupAndRefresh(_, fs, qualifiedPath)).isDefined => -val dataIndex = cachedData.indexWhere(cd => data.plan.sameResult(cd.plan)) -if (dataIndex >= 0) { - data.cachedRepresentation.cachedColumnBuffers.unpersist(blocking = true) - cachedData.remove(dataIndex) -} - sparkSession.sharedState.cacheManager.cacheQuery(Dataset.ofRows(sparkSession, data.plan)) - case _ => // Do Nothing +cachedData.filter { --- End diff -- Why doesn't the previous one work?
[GitHub] spark issue #17077: [SPARK-16931][PYTHON][SQL] Add Python wrapper for bucket...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17077 Merged build finished. Test FAILed.
[GitHub] spark issue #17077: [SPARK-16931][PYTHON][SQL] Add Python wrapper for bucket...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17077 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73527/ Test FAILed.
[GitHub] spark issue #17077: [SPARK-16931][PYTHON][SQL] Add Python wrapper for bucket...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17077 **[Test build #73527 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73527/testReport)** for PR 17077 at commit [`18c709c`](https://github.com/apache/spark/commit/18c709c4bf77fc6db5530e00a9e5bba0e1ab0250). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17086: [SPARK-18693][ML][MLLIB] ML Evaluators should use weight...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17086 **[Test build #73529 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73529/testReport)** for PR 17086 at commit [`cf6a5ab`](https://github.com/apache/spark/commit/cf6a5aba61716dcb11ef3ca7b1f3b803bf99ef33).
[GitHub] spark issue #16557: [SPARK-18693][ML][MLLIB] ML Evaluators should use weight...
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16557 I've created 3 PRs, located here: https://github.com/apache/spark/pull/17084 https://github.com/apache/spark/pull/17085 https://github.com/apache/spark/pull/17086
[GitHub] spark pull request #17086: [SPARK-18693][ML][MLLIB] ML Evaluators should use...
GitHub user imatiach-msft opened a pull request: https://github.com/apache/spark/pull/17086 [SPARK-18693][ML][MLLIB] ML Evaluators should use weight column - added weight column for multiclass classification evaluator ## What changes were proposed in this pull request? The evaluators BinaryClassificationEvaluator, RegressionEvaluator, and MulticlassClassificationEvaluator and the corresponding metrics classes BinaryClassificationMetrics, RegressionMetrics and MulticlassMetrics should use sample weight data. I've closed the PR: https://github.com/apache/spark/pull/16557 as recommended, in favor of creating three pull requests, one for each of the evaluators (binary/regression/multiclass), to make them easier to review/update. ## How was this patch tested? I added tests to the metrics class. You can merge this pull request into a Git repository by running: $ git pull https://github.com/imatiach-msft/spark ilmat/multiclass-evaluate Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17086.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17086 commit cf6a5aba61716dcb11ef3ca7b1f3b803bf99ef33 Author: Ilya Matiach Date: 2017-02-27T18:28:08Z Added weight column for multiclass classification evaluator
[GitHub] spark pull request #17085: [SPARK-18693][ML][MLLIB] ML Evaluators should use...
GitHub user imatiach-msft opened a pull request: https://github.com/apache/spark/pull/17085 [SPARK-18693][ML][MLLIB] ML Evaluators should use weight column - added weight column for regression evaluator ## What changes were proposed in this pull request? The evaluators BinaryClassificationEvaluator, RegressionEvaluator, and MulticlassClassificationEvaluator and the corresponding metrics classes BinaryClassificationMetrics, RegressionMetrics and MulticlassMetrics should use sample weight data. I've closed the PR: https://github.com/apache/spark/pull/16557 as recommended, in favor of creating three pull requests, one for each of the evaluators (binary/regression/multiclass), to make them easier to review/update. The updates to the regression metrics were based on (and updated with new changes based on comments) https://issues.apache.org/jira/browse/SPARK-11520 ("RegressionMetrics should support instance weights"), but that pull request was closed as the changes were never checked in. ## How was this patch tested? I added tests to the metrics class. You can merge this pull request into a Git repository by running: $ git pull https://github.com/imatiach-msft/spark ilmat/regression-evaluate Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17085.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17085 commit 48800eb91b27a713232ab27f05bb8cef92129852 Author: Ilya Matiach Date: 2017-02-27T18:20:44Z Added weight column for regression evaluator
[GitHub] spark pull request #17076: [SPARK-19745][ML] SVCAggregator captures coeffici...
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/17076#discussion_r103276904 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -440,19 +440,9 @@ private class LinearSVCAggregator( private val numFeatures: Int = bcFeaturesStd.value.length private val numFeaturesPlusIntercept: Int = if (fitIntercept) numFeatures + 1 else numFeatures - private val coefficients: Vector = bcCoefficients.value private var weightSum: Double = 0.0 private var lossSum: Double = 0.0 - require(numFeaturesPlusIntercept == coefficients.size, s"Dimension mismatch. Coefficients " + -s"length ${coefficients.size}, FeaturesStd length ${numFeatures}, fitIntercept: $fitIntercept") - - private val coefficientsArray = coefficients match { -case dv: DenseVector => dv.values -case _ => - throw new IllegalArgumentException( -s"coefficients only supports dense vector but got type ${coefficients.getClass}.") - } - private val gradientSumArray = Array.fill[Double](coefficientsArray.length)(0) + private lazy val gradientSumArray = new Array[Double](numFeaturesPlusIntercept) --- End diff -- Actually this question is slightly different than what I was referring to above. We don't use `@transient` here because we do need to serialize this when we send the gradient updates back to the driver. The reason for making it lazy is that we don't need to serialize the array of zeros: we can just initialize it on the workers and avoid the communication cost.
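The trade-off sethah describes for the Scala `lazy val` — don't pay to ship a zero-filled buffer to the workers, but do serialize it once it holds gradient updates for the driver — can be sketched with a lazily allocated buffer in Python's `pickle`. This is an illustrative analogy, not Spark's actual `LinearSVCAggregator`; all names here are hypothetical:

```python
import pickle

class Aggregator:
    """Hypothetical sketch: the gradient buffer is allocated lazily on first
    use, so a freshly constructed aggregator pickles without it (cheap to
    ship inside a task closure), while an aggregator that has accumulated
    updates pickles with its contents (needed when merging on the driver)."""

    def __init__(self, num_features: int):
        self.num_features = num_features
        self._gradient_sum = None  # not allocated until first add()

    @property
    def gradient_sum(self):
        # lazy initialization: the zero array is built where it is first used
        if self._gradient_sum is None:
            self._gradient_sum = [0.0] * self.num_features
        return self._gradient_sum

    def add(self, gradients):
        for i, g in enumerate(gradients):
            self.gradient_sum[i] += g

fresh = Aggregator(1000)           # buffer never touched, never allocated
used = Aggregator(1000)
used.add([1.0] * 1000)             # buffer materialized by real updates

# The unused buffer stays None, so the serialized form is much smaller.
print(len(pickle.dumps(fresh)) < len(pickle.dumps(used)))
```

The same effect motivates the `lazy val` in the diff: laziness (rather than `@transient`) keeps the round trip cheap in one direction only.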
[GitHub] spark issue #17085: [SPARK-18693][ML][MLLIB] ML Evaluators should use weight...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17085 **[Test build #73528 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73528/testReport)** for PR 17085 at commit [`48800eb`](https://github.com/apache/spark/commit/48800eb91b27a713232ab27f05bb8cef92129852).
[GitHub] spark issue #17075: [SPARK-19727][SQL] Fix for round function that modifies ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17075 I think we should fix `changePrecision` to return a new instance instead of updating itself.
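The fix cloud-fan suggests — return a fresh value rather than mutate the receiver, so that a round operation cannot corrupt a value shared by other expressions — can be sketched with Python's immutable `decimal.Decimal` (an illustrative sketch, not Spark's `Decimal` class; the function name is hypothetical):

```python
from decimal import Decimal, ROUND_HALF_UP

def change_precision(value: Decimal, scale: int) -> Decimal:
    # quantize returns a brand-new Decimal; `value` itself is untouched,
    # so callers holding a reference to it see no side effect.
    return value.quantize(Decimal(1).scaleb(-scale), rounding=ROUND_HALF_UP)

x = Decimal("3.14159")
rounded = change_precision(x, 2)
print(rounded)  # 3.14
print(x)        # 3.14159 -- the original operand is unchanged
```

With an in-place `changePrecision`, the second print would show the already-rounded value, which is exactly the class of bug SPARK-19727 describes.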
[GitHub] spark pull request #16290: [SPARK-18817] [SPARKR] [SQL] Set default warehous...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/16290#discussion_r103276097 --- Diff: R/pkg/R/sparkR.R --- @@ -376,6 +377,12 @@ sparkR.session <- function( overrideEnvs(sparkConfigMap, paramMap) } + # NOTE(shivaram): Set default warehouse dir to tmpdir to meet CRAN requirements + # See SPARK-18817 for more details + if (!exists("spark.sql.default.warehouse.dir", envir = sparkConfigMap)) { --- End diff -- Ah I see - I will try to use `SessionState` and see if it can avoid having to create a new option
[GitHub] spark issue #16330: [SPARK-18817][SPARKR][SQL] change derby log output and m...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/16330 It's a bit tricky to ask users for permission during installation (actually I'm not sure how we can create such an option?) -- I think a viable option could be to add a `logWarning` that shows where SparkSQL data is going to be stored, with a pointer to how the location can be changed. @felixcheung I think it's worth a shot to try the CRAN submission process with such a warning, and then revisit this if we still have a problem?
[GitHub] spark issue #17077: [SPARK-16931][PYTHON][SQL] Add Python wrapper for bucket...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17077 **[Test build #73527 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73527/testReport)** for PR 17077 at commit [`18c709c`](https://github.com/apache/spark/commit/18c709c4bf77fc6db5530e00a9e5bba0e1ab0250).
[GitHub] spark issue #17084: [SPARK-18693][ML][MLLIB] ML Evaluators should use weight...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17084 **[Test build #73526 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73526/testReport)** for PR 17084 at commit [`98652cf`](https://github.com/apache/spark/commit/98652cfb6c92eed90deff61bc83ef66b9096df20).
[GitHub] spark pull request #16842: [SPARK-19304] [Streaming] [Kinesis] fix kinesis s...
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/16842#discussion_r103274162 --- Diff: external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisBackedBlockRDD.scala --- @@ -212,7 +214,7 @@ class KinesisSequenceRangeIterator( val getRecordsRequest = new GetRecordsRequest getRecordsRequest.setRequestCredentials(credentials) getRecordsRequest.setShardIterator(shardIterator) -getRecordsRequest.setLimit(recordCount) +getRecordsRequest.setLimit(Math.max(recordCount, this.maxGetRecordsLimit)) --- End diff -- this should be a `min` not a `max`
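The bug brkyvz flags in the diff above is worth spelling out: to cap a per-request record count at a configured service limit, the smaller of the two values must win; `max` does the opposite and would ask Kinesis for more than the limit allows. A minimal sketch (function and parameter names are illustrative, not the Spark or AWS API):

```python
def capped_limit(record_count: int, max_get_records_limit: int) -> int:
    # min() enforces the ceiling: never request more than the service allows,
    # but pass small requests through untouched. max() would break both cases.
    return min(record_count, max_get_records_limit)

print(capped_limit(50_000, 10_000))  # 10000 -- large request capped
print(capped_limit(500, 10_000))     # 500   -- small request unchanged
```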
[GitHub] spark issue #16910: [SPARK-19575][SQL]Reading from or writing to a hive serd...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16910 Merged build finished. Test PASSed.
[GitHub] spark issue #16910: [SPARK-19575][SQL]Reading from or writing to a hive serd...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16910 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73513/ Test PASSed.
[GitHub] spark pull request #17084: [SPARK-18693][ML][MLLIB] ML Evaluators should use...
GitHub user imatiach-msft opened a pull request: https://github.com/apache/spark/pull/17084 [SPARK-18693][ML][MLLIB] ML Evaluators should use weight column - added weight column for binary classification evaluator ## What changes were proposed in this pull request? The evaluators BinaryClassificationEvaluator, RegressionEvaluator, and MulticlassClassificationEvaluator and the corresponding metrics classes BinaryClassificationMetrics, RegressionMetrics and MulticlassMetrics should use sample weight data. I've closed the PR: https://github.com/apache/spark/pull/16557 as recommended, in favor of creating three pull requests, one for each of the evaluators (binary/regression/multiclass), to make them easier to review/update. ## How was this patch tested? I added tests to the metrics and evaluators classes. You can merge this pull request into a Git repository by running: $ git pull https://github.com/imatiach-msft/spark ilmat/binary-evalute Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17084.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17084
[GitHub] spark issue #16910: [SPARK-19575][SQL]Reading from or writing to a hive serd...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16910 **[Test build #73513 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73513/testReport)** for PR 16910 at commit [`92d1067`](https://github.com/apache/spark/commit/92d10679b5a07b34f6d5cfdb8cd27279165c95e3). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #16557: [SPARK-18693][ML][MLLIB] ML Evaluators should use...
Github user imatiach-msft closed the pull request at: https://github.com/apache/spark/pull/16557
[GitHub] spark issue #16557: [SPARK-18693][ML][MLLIB] ML Evaluators should use weight...
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16557 ok, I will close this and create three new PRs, one for each of the evaluators
[GitHub] spark issue #17052: [SPARK-19690][SS] Join a streaming DataFrame with a batc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17052 Merged build finished. Test FAILed.
[GitHub] spark issue #17052: [SPARK-19690][SS] Join a streaming DataFrame with a batc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17052 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73507/ Test FAILed.
[GitHub] spark issue #17052: [SPARK-19690][SS] Join a streaming DataFrame with a batc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17052 **[Test build #73507 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73507/testReport)** for PR 17052 at commit [`38e3a14`](https://github.com/apache/spark/commit/38e3a14b609373f2fae21fcd70a14669cfc96aa1). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16557: [SPARK-18693][ML][MLLIB] ML Evaluators should use weight...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16557 **[Test build #73525 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73525/testReport)** for PR 16557 at commit [`a0fc4c3`](https://github.com/apache/spark/commit/a0fc4c3ddb9e9e62e78b4dff59e65d7ae4387054).
[GitHub] spark pull request #16971: [SPARK-19573][SQL] Make NaN/null handling consist...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16971#discussion_r103270963 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/stat/StatFunctions.scala --- @@ -91,7 +100,13 @@ object StatFunctions extends Logging { } val summaries = df.select(columns: _*).rdd.aggregate(emptySummaries)(apply, merge) -summaries.map { summary => probabilities.map(summary.query) } +summaries.map { summary => + try { +probabilities.map(summary.query) + } catch { +case e: SparkException => Seq.empty[Double] --- End diff -- Please do not use exception handling for this purpose. Instead, you can return None.
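The style point gatorsmile raises — signal "no answer" with an optional value rather than catching an exception and substituting an empty result — can be sketched in Python with `Optional` standing in for Scala's `Option`. This is a hedged illustration of the pattern, not Spark's `StatFunctions` code; the quantile logic here is a deliberately simplified stand-in:

```python
from typing import Optional, Sequence

def query_quantile(sorted_values: Sequence[float], p: float) -> Optional[float]:
    """Return an approximate p-quantile, or None when no answer exists.

    The caller gets an explicit Optional to inspect, instead of the callee
    raising an exception that is then swallowed for control flow.
    """
    if not sorted_values or not 0.0 <= p <= 1.0:
        return None  # explicit "no result" -- no exception-based control flow
    idx = min(int(p * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[idx]

summary = [1.0, 2.0, 3.0, 4.0]
results = [query_quantile(summary, p) for p in (0.25, 0.5)]
empty = query_quantile([], 0.5)
print(results, empty)
```

The caller can now distinguish "summary had no data" (`None`) from "quantile is zero", which the `catch`-and-return-empty version in the diff conflates.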
[GitHub] spark issue #16793: [SPARK-19454][PYTHON][SQL] DataFrame.replace improvement...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16793 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73524/ Test PASSed.
[GitHub] spark issue #16793: [SPARK-19454][PYTHON][SQL] DataFrame.replace improvement...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16793 **[Test build #73524 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73524/testReport)** for PR 16793 at commit [`17e6820`](https://github.com/apache/spark/commit/17e68205ef639893902c65c0394c8aa4406191be). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16793: [SPARK-19454][PYTHON][SQL] DataFrame.replace improvement...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16793 Merged build finished. Test PASSed.
[GitHub] spark issue #16929: [SPARK-19595][SQL] Support json array in from_json
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/16929 Thanks @HyukjinKwon, took a pass. My comments are mainly: 1. We don't need to support APIs for both `StructType` and `ArrayType`. I would rather just add an API for `DataType` and `require` that the `DataType` is either `StructType` or `ArrayType`. 2. If a user specifies the schema as an `Array` but one of the rows has a JSON object, we should still consider it an Array of records. No need to separate `Array support` and `Object support`.
[GitHub] spark issue #17077: [SPARK-16931][PYTHON][SQL] Add Python wrapper for bucket...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17077 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73523/ Test FAILed.
[GitHub] spark issue #16842: [SPARK-19304] [Streaming] [Kinesis] fix kinesis slow che...
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/16842 @Gauravshah Can you please comment on how much this PR improved your recovery time?
[GitHub] spark issue #17077: [SPARK-16931][PYTHON][SQL] Add Python wrapper for bucket...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17077 Merged build finished. Test FAILed.
[GitHub] spark issue #17077: [SPARK-16931][PYTHON][SQL] Add Python wrapper for bucket...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17077 **[Test build #73523 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73523/testReport)** for PR 17077 at commit [`9fde39f`](https://github.com/apache/spark/commit/9fde39fa2174e9e67d6045b890f8cc0fc76cd61b). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #16929: [SPARK-19595][SQL] Support json array in from_jso...
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/16929#discussion_r103268734 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala --- @@ -39,7 +39,12 @@ private[sql] class SparkSQLJsonProcessingException(msg: String) extends RuntimeE */ class JacksonParser( schema: StructType, -options: JSONOptions) extends Logging { +options: JSONOptions, +arraySupport: Boolean = true, --- End diff -- as I commented above, I don't think we need this
[GitHub] spark pull request #16929: [SPARK-19595][SQL] Support json array in from_jso...
Github user brkyvz commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16929#discussion_r103268655

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
    @@ -480,36 +480,79 @@ case class JsonTuple(children: Seq[Expression])
     }

     /**
    - * Converts an json input string to a [[StructType]] with the specified schema.
    + * Converts an json input string to a [[StructType]] or [[ArrayType]] with the specified schema.
      */
     case class JsonToStruct(
    -    schema: StructType,
    +    schema: DataType,
         options: Map[String, String],
         child: Expression,
         timeZoneId: Option[String] = None)
       extends UnaryExpression with TimeZoneAwareExpression with CodegenFallback with ExpectsInputTypes {

       override def nullable: Boolean = true

    -  def this(schema: StructType, options: Map[String, String], child: Expression) =
    +  def this(schema: DataType, options: Map[String, String], child: Expression) =
         this(schema, options, child, None)

    +  override def checkInputDataTypes(): TypeCheckResult = schema match {
    +    case _: StructType | ArrayType(_: StructType, _) =>
    +      super.checkInputDataTypes()
    +    case _ => TypeCheckResult.TypeCheckFailure(
    +      s"Input schema ${schema.simpleString} must be a struct or an array of structs.")
    +  }
    +
    +  @transient
    +  lazy val rowSchema = schema match {
    +    case st: StructType => st
    +    case ArrayType(st: StructType, _) => st
    +  }
    +
    +  // This converts parsed rows to the desired output by the given schema.
    +  @transient
    +  lazy val converter = schema match {
    +    case _: StructType =>
    +      // These are always produced from json objects by `objectSupport` in `JacksonParser`.
    +      (rows: Seq[InternalRow]) => rows.head
    +
    +    case ArrayType(_: StructType, _) =>
    +      // These are always produced from json arrays by `arraySupport` in `JacksonParser`.
    +      (rows: Seq[InternalRow]) => new GenericArrayData(rows)
    +  }
    +
       @transient
       lazy val parser =
         new JacksonParser(
    -      schema,
    -      new JSONOptions(options + ("mode" -> ParseModes.FAIL_FAST_MODE), timeZoneId.get))
    +      rowSchema,
    +      new JSONOptions(options + ("mode" -> ParseModes.FAIL_FAST_MODE), timeZoneId.get),
    +      objectSupport = schema.isInstanceOf[StructType],
    --- End diff --

    Do you think we need the `objectSupport` and `arraySupport`? I would rather not add them. If someone specifies an `ArrayType` but the row contains just an object, let's still return it as an `ArrayType`. I think users would appreciate this.
[GitHub] spark pull request #16929: [SPARK-19595][SQL] Support json array in from_jso...
Github user brkyvz commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16929#discussion_r103268156

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
    @@ -480,36 +480,79 @@ case class JsonTuple(children: Seq[Expression])
     }

     /**
    - * Converts an json input string to a [[StructType]] with the specified schema.
    + * Converts an json input string to a [[StructType]] or [[ArrayType]] with the specified schema.
      */
     case class JsonToStruct(
    -    schema: StructType,
    +    schema: DataType,
         options: Map[String, String],
         child: Expression,
         timeZoneId: Option[String] = None)
       extends UnaryExpression with TimeZoneAwareExpression with CodegenFallback with ExpectsInputTypes {

       override def nullable: Boolean = true

    -  def this(schema: StructType, options: Map[String, String], child: Expression) =
    +  def this(schema: DataType, options: Map[String, String], child: Expression) =
         this(schema, options, child, None)

    +  override def checkInputDataTypes(): TypeCheckResult = schema match {
    --- End diff --

    why not just override: `override def inputTypes = new TypeCollection(ArrayType, StructType) :: Nil`
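The trade-off between the two validation styles under discussion can be modeled in plain Scala. All names below (`DType`, `checkExplicit`, `checkCoarse`, etc.) are hypothetical stand-ins, not Catalyst's API: the explicit pattern match mirrors the `checkInputDataTypes` override in the diff, while the coarse check mirrors what an `inputTypes`-based `TypeCollection(ArrayType, StructType)` would express. The difference is that the coarse check accepts any array type, so e.g. an array of ints would only fail later, at parse time.

```scala
// Hypothetical model of the two validation styles (not Catalyst's classes).
sealed trait DType
case object StructT extends DType
case object IntT extends DType
case class ArrayT(element: DType) extends DType

// Style 1: explicit match, as in the diff -- rejects arrays of non-structs
// up front, with a targeted error message.
def checkExplicit(s: DType): Either[String, Unit] = s match {
  case StructT | ArrayT(StructT) => Right(())
  case other => Left(s"Input schema $other must be a struct or an array of structs.")
}

// Style 2: coarse "struct or any array" check, as the inputTypes suggestion --
// shorter, but ArrayT(IntT) slips through to fail at runtime instead.
def checkCoarse(s: DType): Boolean = s match {
  case StructT | ArrayT(_) => true
  case _ => false
}

checkExplicit(ArrayT(IntT)).isLeft  // true: caught at analysis time
checkCoarse(ArrayT(IntT))           // true: accepted, deferred to parse time
```

Which style is preferable depends on whether a precise analysis-time error for non-struct array elements is worth the extra override.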
[GitHub] spark issue #17076: [SPARK-19745][ML] SVCAggregator captures coefficients in...
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17076

    Merged build finished. Test PASSed.
[GitHub] spark issue #17076: [SPARK-19745][ML] SVCAggregator captures coefficients in...
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17076

    Test PASSed. Refer to this link for build results (access rights to CI server needed):
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73505/

    Test PASSed.