[GitHub] spark pull request #19525: [SPARK-22289] [ML] Add JSON support for Matrix pa...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/19525#discussion_r149530436 --- Diff: mllib/src/main/scala/org/apache/spark/ml/linalg/JsonMatrixConverter.scala --- @@ -0,0 +1,79 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark.ml.linalg + +import org.json4s.DefaultFormats +import org.json4s.JsonDSL._ +import org.json4s.jackson.JsonMethods.{compact, parse => parseJson, render} + +private[ml] object JsonMatrixConverter { + + /** Unique class name for identifying JSON object encoded by this class. */ + val className = "org.apache.spark.ml.linalg.Matrix" --- End diff -- I'd suggest a shorter string (or an integer) to identify that this is a matrix; it would be a huge burden to store such a long metadata string for a matrix with only a few elements. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19525: [SPARK-22289] [ML] Add JSON support for Matrix pa...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/19525#discussion_r149532602 --- Diff: mllib/src/main/scala/org/apache/spark/ml/linalg/JsonMatrixConverter.scala --- @@ -0,0 +1,79 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark.ml.linalg + +import org.json4s.DefaultFormats +import org.json4s.JsonDSL._ +import org.json4s.jackson.JsonMethods.{compact, parse => parseJson, render} + +private[ml] object JsonMatrixConverter { + + /** Unique class name for identifying JSON object encoded by this class. */ + val className = "org.apache.spark.ml.linalg.Matrix" --- End diff -- Or could we just use a ```type``` field to distinguish vectors and matrices? For example, ```type``` values less than 10 could be reserved for vectors and values of 10 or more for matrices. What do you think?
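The short-tag idea above can be sketched like this (a hypothetical encoding, not Spark's actual JSON format; the tag values and helper names are assumptions): a compact integer `type` field replaces the long class-name string, with values below 10 reserved for vectors and 10 or above for matrices.

```scala
// Hypothetical sketch: encode a dense matrix with a compact integer type tag
// instead of the full class name "org.apache.spark.ml.linalg.Matrix".
// Tag values are assumptions: 1 = dense vector, 10 = dense matrix.
object CompactTags {
  val DenseVectorTag = 1
  val DenseMatrixTag = 10

  // Build the JSON by hand to keep the sketch dependency-free.
  def denseMatrixJson(numRows: Int, numCols: Int, values: Array[Double]): String =
    s"""{"type":$DenseMatrixTag,"numRows":$numRows,"numCols":$numCols,""" +
      s""""values":[${values.mkString(",")}]}"""
}
```

For a matrix with only a handful of elements, the metadata then costs a few bytes instead of a ~40-character class name per object.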
[GitHub] spark pull request #19525: [SPARK-22289] [ML] Add JSON support for Matrix pa...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/19525#discussion_r149534129 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LogisticRegressionSuite.scala --- @@ -2769,6 +2769,20 @@ class LogisticRegressionSuite LogisticRegressionSuite.allParamSettings, checkModelData) } + test("read/write with BoundsOnCoefficients") { +def checkModelData(model: LogisticRegressionModel, model2: LogisticRegressionModel): Unit = { + assert(model.getLowerBoundsOnCoefficients === model2.getLowerBoundsOnCoefficients) + assert(model.getUpperBoundsOnCoefficients === model2.getUpperBoundsOnCoefficients) --- End diff -- Or we could merge this test case with the existing read/write test.
[GitHub] spark pull request #19525: [SPARK-22289] [ML] Add JSON support for Matrix pa...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/19525#discussion_r149522834 --- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Matrices.scala --- @@ -827,6 +831,11 @@ class SparseMatrix @Since("2.0.0") ( @Since("2.0.0") object SparseMatrix { + @Since("2.3.0") + private[ml] def unapply( --- End diff -- Ditto
[GitHub] spark pull request #19681: [SPARK-20652][sql] Store SQL UI data in the new a...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/19681#discussion_r149537039 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListener.scala --- @@ -0,0 +1,353 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark.sql.execution.ui + +import java.util.Date +import java.util.concurrent.ConcurrentHashMap + +import scala.collection.JavaConverters._ +import scala.collection.mutable.HashMap + +import org.apache.spark.{JobExecutionStatus, SparkConf} +import org.apache.spark.internal.Logging +import org.apache.spark.scheduler._ +import org.apache.spark.sql.execution.SQLExecution +import org.apache.spark.sql.execution.metric._ +import org.apache.spark.sql.internal.StaticSQLConf._ +import org.apache.spark.status.LiveEntity +import org.apache.spark.status.config._ +import org.apache.spark.ui.SparkUI +import org.apache.spark.util.kvstore.KVStore + +private[sql] class SQLAppStatusListener( +conf: SparkConf, +kvstore: KVStore, +live: Boolean, +ui: Option[SparkUI] = None) + extends SparkListener with Logging { + + // How often to flush intermediate stage of a live execution to the store. 
When replaying logs, + // never flush (only do the very last write). + private val liveUpdatePeriodNs = if (live) conf.get(LIVE_ENTITY_UPDATE_PERIOD) else -1L + + private val liveExecutions = new HashMap[Long, LiveExecutionData]() + private val stageMetrics = new HashMap[Int, LiveStageMetrics]() + + private var uiInitialized = false + + override def onJobStart(event: SparkListenerJobStart): Unit = { +val executionIdString = event.properties.getProperty(SQLExecution.EXECUTION_ID_KEY) +if (executionIdString == null) { + // This is not a job created by SQL + return +} + +val executionId = executionIdString.toLong +val jobId = event.jobId +val exec = getOrCreateExecution(executionId) + +// Record the accumulator IDs for the stages of this job, so that the code that keeps +// track of the metrics knows which accumulators to look at. +val accumIds = exec.metrics.map(_.accumulatorId).sorted.toList +event.stageIds.foreach { id => + stageMetrics.put(id, new LiveStageMetrics(id, 0, accumIds.toArray, new ConcurrentHashMap())) +} + +exec.jobs = exec.jobs + (jobId -> JobExecutionStatus.RUNNING) +exec.stages = event.stageIds +update(exec) + } + + override def onStageSubmitted(event: SparkListenerStageSubmitted): Unit = { +if (!isSQLStage(event.stageInfo.stageId)) { + return +} + +// Reset the metrics tracking object for the new attempt. 
+stageMetrics.get(event.stageInfo.stageId).foreach { metrics => + metrics.taskMetrics.clear() + metrics.attemptId = event.stageInfo.attemptId +} + } + + override def onJobEnd(event: SparkListenerJobEnd): Unit = { +liveExecutions.values.foreach { exec => + if (exec.jobs.contains(event.jobId)) { +val result = event.jobResult match { + case JobSucceeded => JobExecutionStatus.SUCCEEDED + case _ => JobExecutionStatus.FAILED +} +exec.jobs = exec.jobs + (event.jobId -> result) +exec.endEvents += 1 +update(exec) + } +} + } + + override def onExecutorMetricsUpdate(event: SparkListenerExecutorMetricsUpdate): Unit = { +event.accumUpdates.foreach { case (taskId, stageId, attemptId, accumUpdates) => + updateStageMetrics(stageId, attemptId, taskId, accumUpdates, false) +} + } + + override def onTaskEnd(event: SparkListenerTaskEnd): Unit = { +if (!isSQLStage(event.stageId)) { + return +} + +val info = event.taskInfo +// SPARK-20342. If processing events from a live ap
[GitHub] spark issue #19657: [SPARK-22344][SPARKR] clean up install dir if running te...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19657 Will take a look today.
[GitHub] spark issue #19681: [SPARK-20652][sql] Store SQL UI data in the new app stat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19681 **[Test build #83570 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83570/testReport)** for PR 19681 at commit [`197dd8f`](https://github.com/apache/spark/commit/197dd8fe645d3672c6e0c0ac0f52144a84b91dc5).
[GitHub] spark issue #19565: [SPARK-22111][MLLIB] OnlineLDAOptimizer should filter ou...
Github user akopich commented on the issue: https://github.com/apache/spark/pull/19565 ping @WeichenXu123 , @srowen , @hhbyyh Further comments?
[GitHub] spark issue #19682: [SPARK-22464] [SQL] No pushdown for Hive metastore parti...
Github user mallman commented on the issue: https://github.com/apache/spark/pull/19682 Thanks for the fix!
[GitHub] spark pull request #19663: [SPARK-22463][YARN][SQL][Hive]add hadoop/hive/hba...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/19663#discussion_r149544676 --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -687,6 +687,20 @@ private[spark] class Client( private def createConfArchive(): File = { val hadoopConfFiles = new HashMap[String, File]() +// SPARK_CONF_DIR shows up in the classpath before HADOOP_CONF_DIR/YARN_CONF_DIR +val localConfDir = System.getProperty("SPARK_CONF_DIR", --- End diff -- `SPARK_CONF_DIR` is set by Spark's launch scripts, so you should just be able to do: ``` sys.env.get("SPARK_CONF_DIR").foreach { ... } ```
[GitHub] spark pull request #19663: [SPARK-22463][YARN][SQL][Hive]add hadoop/hive/hba...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/19663#discussion_r149544880 --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -687,6 +687,20 @@ private[spark] class Client( private def createConfArchive(): File = { val hadoopConfFiles = new HashMap[String, File]() +// SPARK_CONF_DIR shows up in the classpath before HADOOP_CONF_DIR/YARN_CONF_DIR +val localConfDir = System.getProperty("SPARK_CONF_DIR", + System.getProperty("SPARK_HOME") + File.separator + "conf") +val dir = new File(localConfDir) +if (dir.isDirectory) { + val files = dir.listFiles(new FileFilter { +override def accept(pathname: File): Boolean = { + pathname.isFile && pathname.getName.endsWith("xml") +} + }) + files.foreach { f => hadoopConfFiles(f.getName) = f } +} + +// Ensure HADOOP_CONF_DIR/YARN_CONF_DIR not overriding existing files --- End diff -- This comment doesn't make a lot of sense, at least not in this position. What are you trying to say?
[GitHub] spark pull request #19663: [SPARK-22463][YARN][SQL][Hive]add hadoop/hive/hba...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/19663#discussion_r149544716 --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -687,6 +687,20 @@ private[spark] class Client( private def createConfArchive(): File = { val hadoopConfFiles = new HashMap[String, File]() +// SPARK_CONF_DIR shows up in the classpath before HADOOP_CONF_DIR/YARN_CONF_DIR +val localConfDir = System.getProperty("SPARK_CONF_DIR", + System.getProperty("SPARK_HOME") + File.separator + "conf") +val dir = new File(localConfDir) +if (dir.isDirectory) { + val files = dir.listFiles(new FileFilter { +override def accept(pathname: File): Boolean = { + pathname.isFile && pathname.getName.endsWith("xml") --- End diff -- `".xml"`
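Putting the reviewer's three suggestions together, a standalone sketch might look like the following (the helper name and the `Map`-based environment parameter are illustrative assumptions, not the actual `Client.scala` code): read `SPARK_CONF_DIR` from the environment rather than system properties, fall back to `$SPARK_HOME/conf`, and match on the full `".xml"` suffix.

```scala
import java.io.File

// Hypothetical standalone version of the snippet under review: resolve the
// local conf directory from the environment and collect its *.xml files.
object ConfDirs {
  def confXmlFiles(env: Map[String, String]): Seq[File] = {
    val confDir = env.get("SPARK_CONF_DIR")
      .orElse(env.get("SPARK_HOME").map(_ + File.separator + "conf"))
    confDir.map(new File(_)).filter(_.isDirectory).toSeq
      .flatMap(_.listFiles().filter(f => f.isFile && f.getName.endsWith(".xml")))
  }
}
```

In the real code one would pass `sys.env`; taking the environment as a parameter just keeps the sketch testable.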
[GitHub] spark issue #19681: [SPARK-20652][sql] Store SQL UI data in the new app stat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19681 Merged build finished. Test FAILed.
[GitHub] spark issue #19681: [SPARK-20652][sql] Store SQL UI data in the new app stat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19681 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83570/ Test FAILed.
[GitHub] spark issue #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark DataFra...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19459 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83569/ Test FAILed.
[GitHub] spark issue #19678: [SPARK-20646][core] Port executors page to new UI backen...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19678 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83568/ Test FAILed.
[GitHub] spark issue #19678: [SPARK-20646][core] Port executors page to new UI backen...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19678 Merged build finished. Test FAILed.
[GitHub] spark issue #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark DataFra...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19459 Merged build finished. Test FAILed.
[GitHub] spark issue #19678: [SPARK-20646][core] Port executors page to new UI backen...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/19678 retest this please
[GitHub] spark issue #19681: [SPARK-20652][sql] Store SQL UI data in the new app stat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19681 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83567/ Test FAILed.
[GitHub] spark issue #19681: [SPARK-20652][sql] Store SQL UI data in the new app stat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19681 Merged build finished. Test FAILed.
[GitHub] spark pull request #19687: [SPARK-19644][SQL]Clean up Scala reflection garba...
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/19687 [SPARK-19644][SQL]Clean up Scala reflection garbage after creating Encoder ## What changes were proposed in this pull request? Because of the memory leak issue in `scala.reflect.api.Types.TypeApi.<:<` (https://github.com/scala/bug/issues/8302), creating an encoder may leak memory. This PR adds `cleanUpReflectionObjects` to clean up these leaking objects for methods calling `scala.reflect.api.Types.TypeApi.<:<`. ## How was this patch tested? The updated unit tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/zsxwing/spark SPARK-19644 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/19687.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #19687 commit c03811ff006058987fa8d5fb9f7d097b9acc9ac5 Author: Shixiong Zhu Date: 2017-11-08T00:33:55Z Clean up Scala reflection garbage after creating Encoder
[GitHub] spark issue #19687: [SPARK-19644][SQL]Clean up Scala reflection garbage afte...
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/19687 cc @cloud-fan
[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/19433 CC @dbtsai in case you're interested b/c of Sequoia forests
[GitHub] spark pull request #19272: [Spark-21842][Mesos] Support Kerberos ticket rene...
Github user ArtRand commented on a diff in the pull request: https://github.com/apache/spark/pull/19272#discussion_r149549953 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala --- @@ -213,6 +216,14 @@ private[spark] class MesosCoarseGrainedSchedulerBackend( sc.conf.getOption("spark.mesos.driver.frameworkId").map(_ + suffix) ) +// check that the credentials are defined, even though it's likely that auth would have failed +// already if you've made it this far, then start the token renewer +if (hadoopDelegationTokens.isDefined) { --- End diff -- I agree that I shouldn't need to use the conditional `hadoopDelegationTokens.isDefined`, however there will need to be some check (`UserGroupInformation.isSecurityEnabled` or similar) to pass the `driverEndpoint` to the renewer/manager here. When the initial tokens are generated `driverEndpoint` is still `None` because `start()` hasn't been called yet.
[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19433 **[Test build #3983 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3983/testReport)** for PR 19433 at commit [`b7e6e40`](https://github.com/apache/spark/commit/b7e6e40976612546b81d9775c194b274c146dc85).
[GitHub] spark issue #19687: [SPARK-19644][SQL]Clean up Scala reflection garbage afte...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19687 **[Test build #83571 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83571/testReport)** for PR 19687 at commit [`c03811f`](https://github.com/apache/spark/commit/c03811ff006058987fa8d5fb9f7d097b9acc9ac5).
[GitHub] spark issue #19678: [SPARK-20646][core] Port executors page to new UI backen...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19678 **[Test build #83572 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83572/testReport)** for PR 19678 at commit [`c7123d9`](https://github.com/apache/spark/commit/c7123d9c8d3934c482cd89ea820b2958f4dbbe0a).
[GitHub] spark issue #19634: [SPARK-22412][SQL] Fix incorrect comment in DataSourceSc...
Github user vgankidi commented on the issue: https://github.com/apache/spark/pull/19634 @gatorsmile I also wanted to discuss whether we should consider other bin packing algorithms. According to http://www.math.unl.edu/~s-sjessie1/203Handouts/Bin%20Packing.pdf, next fit decreasing is the least efficient of all, but it is the easiest to implement and the packing pass has O(N) run time.
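For reference, next fit decreasing can be sketched in a few lines (a generic illustration of the algorithm mentioned, not Spark code): sort the items by decreasing size, then place each item into the current bin if it fits, otherwise open a new bin. The packing pass itself is O(N); the initial sort adds O(N log N).

```scala
import scala.collection.mutable.ArrayBuffer

// Next fit decreasing: sort items by decreasing size, then put each item in
// the current (last) bin if it fits, otherwise open a new bin. Only the last
// bin is ever considered, which is what keeps the packing pass linear.
object NextFitDecreasing {
  def pack(items: Seq[Double], capacity: Double): Seq[Seq[Double]] = {
    val bins = ArrayBuffer.empty[ArrayBuffer[Double]]
    var remaining = 0.0
    for (item <- items.sortBy(-_)) {
      if (bins.isEmpty || item > remaining) {
        bins += ArrayBuffer(item)   // open a new bin
        remaining = capacity - item
      } else {
        bins.last += item           // fits in the current bin
        remaining -= item
      }
    }
    bins.map(_.toSeq).toSeq
  }
}
```

Because closed bins are never revisited, next fit typically uses more bins than first fit or best fit, which is the efficiency trade-off discussed above.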
[GitHub] spark issue #17436: [SPARK-20101][SQL] Use OffHeapColumnVector when "spark.m...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/17436 Jenkins, retest this please
[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19433 **[Test build #3983 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3983/testReport)** for PR 19433 at commit [`b7e6e40`](https://github.com/apache/spark/commit/b7e6e40976612546b81d9775c194b274c146dc85). * This patch **fails to generate documentation**. * This patch **does not merge cleanly**. * This patch adds no public classes.
[GitHub] spark issue #17436: [SPARK-20101][SQL] Use OffHeapColumnVector when "spark.m...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17436 **[Test build #83573 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83573/testReport)** for PR 17436 at commit [`9ce6fc0`](https://github.com/apache/spark/commit/9ce6fc0b0ad2c4c97236f0519db07b5a3600bb81).
[GitHub] spark pull request #19661: [SPARK-22450][Core][Mllib]safely register class f...
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19661#discussion_r149553694 --- Diff: core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala --- @@ -178,10 +178,40 @@ class KryoSerializer(conf: SparkConf) kryo.register(Utils.classForName("scala.collection.immutable.Map$EmptyMap$")) kryo.register(classOf[ArrayBuffer[Any]]) +// We can't load those class directly in order to avoid unnecessary jar dependencies. +// We load them safely, ignore it if the class not found. +Seq("org.apache.spark.mllib.linalg.Vector", + "org.apache.spark.mllib.linalg.DenseVector", + "org.apache.spark.mllib.linalg.SparseVector", + "org.apache.spark.mllib.linalg.Matrix", + "org.apache.spark.mllib.linalg.DenseMatrix", + "org.apache.spark.mllib.linalg.SparseMatrix", + "org.apache.spark.ml.linalg.Vector", + "org.apache.spark.ml.linalg.DenseVector", + "org.apache.spark.ml.linalg.SparseVector", + "org.apache.spark.ml.linalg.Matrix", + "org.apache.spark.ml.linalg.DenseMatrix", + "org.apache.spark.ml.linalg.SparseMatrix", + "org.apache.spark.ml.feature.Instance", + "org.apache.spark.ml.feature.OffsetInstance" +).flatMap(safeClassLoader(_)).foreach(kryo.register(_)) --- End diff -- Hi @cloud-fan , I tried the following code: ```scala flatMap(cn => Try{Utils.classForName(cn)}.toOption).foreach(kryo.register(_)) ``` and ```scala flatMap{ cn => try { val clazz = Utils.classForName(cn) Some(clazz) } catch { case _: ClassNotFoundException => None } }.foreach(kryo.register(_)) ``` Both reported the same error: ``` Error:(198, 18) type mismatch; found : String => Iterable[Class[_$2]]( forSome { type _$2 }) required: String => scala.collection.GenTraversableOnce[B] ).flatMap{cn => Option(Utils.classForName(cn))}.foreach(kryo.register(_)) ```
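The error comes from type inference over the existential `Class[_]`, not from `Option` itself; an `Option` works fine as a `flatMap` result normally. Ascribing the element type explicitly is one plausible workaround (an assumption on my part, shown here with plain `Class.forName` instead of Spark's `Utils.classForName`, and with a hypothetical missing class name):

```scala
import scala.util.Try

// Safely load optional classes, skipping any that are absent. The
// Option[Class[_]] ascription pins down flatMap's element type so the
// inferred existential Class[_$2] from the error message never appears.
val loaded: Seq[Class[_]] = Seq(
  "java.lang.String",
  "com.example.DoesNotExist" // hypothetical name, expected to be absent
).flatMap { cn =>
  Try(Class.forName(cn)).toOption: Option[Class[_]]
}
```

With the element type fixed, the trailing `.foreach(kryo.register(_))` from the snippet under review would type-check as well.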
[GitHub] spark pull request #19685: [SPARK-19759][ML] not using blas in ALSModel.pred...
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19685#discussion_r149554146 --- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala --- @@ -289,9 +289,11 @@ class ALSModel private[ml] ( private val predict = udf { (featuresA: Seq[Float], featuresB: Seq[Float]) => if (featuresA != null && featuresB != null) { - // TODO(SPARK-19759): try dot-producting on Seqs or another non-converted type for - // potential optimization. - blas.sdot(rank, featuresA.toArray, 1, featuresB.toArray, 1) + var dotProduct = 0.0f + for(i <- 0 until rank) { --- End diff -- You should use `while` instead of `for`.
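The suggestion matters because a Scala `for (i <- 0 until rank)` comprehension compiles to a closure passed to `Range.foreach`, while a `while` loop compiles to a plain bytecode loop; in a per-row prediction UDF that overhead adds up. A minimal sketch of the `while` version (a standalone illustration, not the actual `ALSModel` code):

```scala
// While-loop dot product over Float arrays, as the reviewer suggests:
// no Range allocation and no closure invocation per element.
def dot(a: Array[Float], b: Array[Float], rank: Int): Float = {
  var sum = 0.0f
  var i = 0
  while (i < rank) {
    sum += a(i) * b(i)
    i += 1
  }
  sum
}
```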
[GitHub] spark issue #19685: [SPARK-19759][ML] not using blas in ALSModel.predict for...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19685 Have you run any tests to check the performance difference here?
[GitHub] spark issue #19156: [SPARK-19634][SQL][ML][FOLLOW-UP] Improve interface of d...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19156 **[Test build #83574 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83574/testReport)** for PR 19156 at commit [`480e80d`](https://github.com/apache/spark/commit/480e80dbb0392bebe96dc1620195a39b54f75740).
[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19285 **[Test build #83575 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83575/testReport)** for PR 19285 at commit [`bc3ad4e`](https://github.com/apache/spark/commit/bc3ad4ea11e49b19ef4199642dbc4488f202d928).
[GitHub] spark pull request #19638: [SPARK-22422][ML] Add Adjusted R2 to RegressionMe...
Github user tengpeng commented on a diff in the pull request: https://github.com/apache/spark/pull/19638#discussion_r149558607 --- Diff: mllib/src/test/scala/org/apache/spark/ml/regression/LinearRegressionSuite.scala --- @@ -764,13 +764,17 @@ class LinearRegressionSuite (Intercept) 6.3022157 0.00186003388 <2e-16 *** V2 4.6982442 0.00118053980 <2e-16 *** V3 7.1994344 0.00090447961 <2e-16 *** + + # R code for r2adj --- End diff -- @srowen it's fine in terms of functioning.
[GitHub] spark pull request #19638: [SPARK-22422][ML] Add Adjusted R2 to RegressionMe...
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19638#discussion_r149559666 --- Diff: mllib/src/test/scala/org/apache/spark/ml/regression/LinearRegressionSuite.scala --- @@ -764,13 +764,17 @@ class LinearRegressionSuite (Intercept) 6.3022157 0.00186003388 <2e-16 *** V2 4.6982442 0.00118053980 <2e-16 *** V3 7.1994344 0.00090447961 <2e-16 *** + + # R code for r2adj --- End diff -- There may be some confusion. If you type that code, "as-is", into an R shell, it will not work. It references a variable called `X1`, which is never defined. When we provide R code in comments like this, we intend for it to be copied and pasted into a shell and just work. So, it does not function.
[GitHub] spark issue #19666: [SPARK-22451][ML] Reduce decision tree aggregate size fo...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19666 @facaiy Thanks for your review! I added more explanation of the design purpose of `traverseUnorderedSplits`. But if you have a better solution, don't hesitate to tell me!
[GitHub] spark pull request #19638: [SPARK-22422][ML] Add Adjusted R2 to RegressionMe...
Github user tengpeng commented on a diff in the pull request: https://github.com/apache/spark/pull/19638#discussion_r149560345 --- Diff: mllib/src/test/scala/org/apache/spark/ml/regression/LinearRegressionSuite.scala --- @@ -764,13 +764,17 @@ class LinearRegressionSuite (Intercept) 6.3022157 0.00186003388 <2e-16 *** V2 4.6982442 0.00118053980 <2e-16 *** V3 7.1994344 0.00090447961 <2e-16 *** + + # R code for r2adj --- End diff -- Thanks for the clarification. Do you think changing `x1` to `V1` would help?
[GitHub] spark issue #19666: [SPARK-22451][ML] Reduce decision tree aggregate size fo...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19666 Also cc @smurching Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19666: [SPARK-22451][ML] Reduce decision tree aggregate ...
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19666#discussion_r149561550 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala --- @@ -741,17 +678,43 @@ private[spark] object RandomForest extends Logging { (splits(featureIndex)(bestFeatureSplitIndex), bestFeatureGainStats) } else if (binAggregates.metadata.isUnordered(featureIndex)) { // Unordered categorical feature - val leftChildOffset = binAggregates.getFeatureOffset(featureIndexIdx) - val (bestFeatureSplitIndex, bestFeatureGainStats) = -Range(0, numSplits).map { splitIndex => - val leftChildStats = binAggregates.getImpurityCalculator(leftChildOffset, splitIndex) - val rightChildStats = binAggregates.getParentImpurityCalculator() -.subtract(leftChildStats) + val numBins = binAggregates.metadata.numBins(featureIndex) + val featureOffset = binAggregates.getFeatureOffset(featureIndexIdx) + + val binStatsArray = Array.tabulate(numBins) { binIndex => +binAggregates.getImpurityCalculator(featureOffset, binIndex) + } + val parentStats = binAggregates.getParentImpurityCalculator() + + var bestGain = Double.NegativeInfinity + var bestSet: BitSet = null + var bestLeftChildStats: ImpurityCalculator = null + var bestRightChildStats: ImpurityCalculator = null + + traverseUnorderedSplits[ImpurityCalculator](numBins, null, +(stats, binIndex) => { + val binStats = binStatsArray(binIndex) + if (stats == null) { +binStats + } else { +stats.copy.add(binStats) + } +}, +(set, leftChildStats) => { + val rightChildStats = parentStats.copy.subtract(leftChildStats) gainAndImpurityStats = calculateImpurityStats(gainAndImpurityStats, leftChildStats, rightChildStats, binAggregates.metadata) - (splitIndex, gainAndImpurityStats) -}.maxBy(_._2.gain) - (splits(featureIndex)(bestFeatureSplitIndex), bestFeatureGainStats) + if (gainAndImpurityStats.gain > bestGain) { +bestGain = gainAndImpurityStats.gain +bestSet = set | new BitSet(numBins) // copy set --- End 
diff -- The class does not support `copy` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
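For readers following the `traverseUnorderedSplits` discussion above: an unordered categorical feature with k categories admits 2^(k-1) - 1 candidate splits, because a left-child subset and its complement describe the same split. A small Python sketch of that enumeration (illustrative only — Spark's implementation walks bins incrementally with a `BitSet` rather than materializing subsets):

```python
from itertools import combinations

def unordered_splits(categories):
    """Enumerate candidate left-child category subsets for an
    unordered categorical feature, counting each split once."""
    cats = list(categories)
    first, rest = cats[0], cats[1:]
    splits = []
    # Fixing the first category on the left side avoids counting
    # a subset and its complement as two distinct splits.
    for r in range(len(rest) + 1):
        for combo in combinations(rest, r):
            left = {first, *combo}
            if len(left) < len(cats):  # exclude the full set (no split)
                splits.append(left)
    return splits

print(len(unordered_splits([0, 1, 2])))  # 3, i.e. 2^(3-1) - 1
```

This is why the aggregate size matters: the number of candidate splits grows exponentially in the category count.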
[GitHub] spark pull request #19688: [SPARK-22466][Spark Submit]export SPARK_CONF_DIR ...
GitHub user yaooqinn opened a pull request: https://github.com/apache/spark/pull/19688 [SPARK-22466][Spark Submit]export SPARK_CONF_DIR while conf is default ## What changes were proposed in this pull request? ### Before ``` Kent@KentsMacBookPro ~/Documents/spark-packages/spark-2.3.0-SNAPSHOT-bin-master bin/spark-shell --master local Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties Setting default log level to "WARN". To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel). 17/11/08 10:28:44 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 17/11/08 10:28:45 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041. Spark context Web UI available at http://169.254.168.63:4041 Spark context available as 'sc' (master = local, app id = local-1510108125770). Spark session available as 'spark'. Welcome to __ / __/__ ___ _/ /__ _\ \/ _ \/ _ `/ __/ '_/ /___/ .__/\_,_/_/ /_/\_\ version 2.3.0-SNAPSHOT /_/ Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_65) Type in expressions to have them evaluated. Type :help for more information. scala> sys.env.get("SPARK_CONF_DIR") res0: Option[String] = None ``` ### After ``` scala> sys.env.get("SPARK_CONF_DIR") res0: Option[String] = Some(/Users/Kent/Documents/spark/conf) ``` ## How was this patch tested? 
@vanzin You can merge this pull request into a Git repository by running: $ git pull https://github.com/yaooqinn/spark SPARK-22466 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/19688.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #19688 commit 19ac61cd6d8b4cca295a1f0d2f2988ee3ac20d8c Author: Kent Yao Date: 2017-11-08T02:30:01Z export SPARK_CONF_DIR while conf is default --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
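The fallback this PR describes — honor `SPARK_CONF_DIR` when it is set, otherwise default to `$SPARK_HOME/conf` — can be sketched as follows (a hypothetical helper for illustration, not the actual `spark-submit` script logic):

```python
import os

def resolve_conf_dir(env):
    # SPARK_CONF_DIR wins when present; otherwise fall back to
    # $SPARK_HOME/conf, matching the behavior described in the PR.
    return env.get("SPARK_CONF_DIR") or os.path.join(env["SPARK_HOME"], "conf")

print(resolve_conf_dir({"SPARK_HOME": "/opt/spark"}))  # /opt/spark/conf
```

Exporting the resolved value means child processes (like the launched shell) can read it from `sys.env`, which is exactly what the Before/After transcripts above demonstrate.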
[GitHub] spark pull request #19663: [SPARK-22463][YARN][SQL][Hive]add hadoop/hive/hba...
Github user yaooqinn commented on a diff in the pull request: https://github.com/apache/spark/pull/19663#discussion_r149561888 --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -687,6 +687,20 @@ private[spark] class Client( private def createConfArchive(): File = { val hadoopConfFiles = new HashMap[String, File]() +// SPARK_CONF_DIR shows up in the classpath before HADOOP_CONF_DIR/YARN_CONF_DIR +val localConfDir = System.getProperty("SPARK_CONF_DIR", + System.getProperty("SPARK_HOME") + File.separator + "conf") +val dir = new File(localConfDir) +if (dir.isDirectory) { + val files = dir.listFiles(new FileFilter { +override def accept(pathname: File): Boolean = { + pathname.isFile && pathname.getName.endsWith("xml") --- End diff -- ok --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
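The `FileFilter` in the diff above selects regular files whose names end in "xml" from the local conf directory. The same selection can be sketched in Python (hypothetical helper name; not the Spark code):

```python
import os
import tempfile

def conf_xml_files(conf_dir):
    # Collect regular files ending in "xml", keyed by file name --
    # mirroring the HashMap[String, File] built in the diff above.
    result = {}
    for name in os.listdir(conf_dir):
        path = os.path.join(conf_dir, name)
        if os.path.isfile(path) and name.endswith("xml"):
            result[name] = path
    return result

d = tempfile.mkdtemp()
open(os.path.join(d, "core-site.xml"), "w").close()
open(os.path.join(d, "log4j.properties"), "w").close()
print(sorted(conf_xml_files(d)))  # ['core-site.xml']
```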
[GitHub] spark issue #19565: [SPARK-22111][MLLIB] OnlineLDAOptimizer should filter ou...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19565 OK, I agree with this change. @jkbradley Can you take a look? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19663: [SPARK-22463][YARN][SQL][Hive]add hadoop/hive/hba...
Github user yaooqinn commented on a diff in the pull request: https://github.com/apache/spark/pull/19663#discussion_r149561877 --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -687,6 +687,20 @@ private[spark] class Client( private def createConfArchive(): File = { val hadoopConfFiles = new HashMap[String, File]() +// SPARK_CONF_DIR shows up in the classpath before HADOOP_CONF_DIR/YARN_CONF_DIR +val localConfDir = System.getProperty("SPARK_CONF_DIR", --- End diff -- not exactly till now, please check https://github.com/apache/spark/pull/19688 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19663: [SPARK-22463][YARN][SQL][Hive]add hadoop/hive/hba...
Github user yaooqinn commented on a diff in the pull request: https://github.com/apache/spark/pull/19663#discussion_r149561925 --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -687,6 +687,20 @@ private[spark] class Client( private def createConfArchive(): File = { val hadoopConfFiles = new HashMap[String, File]() +// SPARK_CONF_DIR shows up in the classpath before HADOOP_CONF_DIR/YARN_CONF_DIR +val localConfDir = System.getProperty("SPARK_CONF_DIR", + System.getProperty("SPARK_HOME") + File.separator + "conf") +val dir = new File(localConfDir) +if (dir.isDirectory) { + val files = dir.listFiles(new FileFilter { +override def accept(pathname: File): Boolean = { + pathname.isFile && pathname.getName.endsWith("xml") +} + }) + files.foreach { f => hadoopConfFiles(f.getName) = f } +} + +// Ensure HADOOP_CONF_DIR/YARN_CONF_DIR not overriding existing files --- End diff -- ok, i'd remove it --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19688: [SPARK-22466][Spark Submit]export SPARK_CONF_DIR while c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19688 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19663: [SPARK-22463][YARN][SQL][Hive]add hadoop/hive/hbase/etc ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19663 **[Test build #83576 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83576/testReport)** for PR 19663 at commit [`f8c1f63`](https://github.com/apache/spark/commit/f8c1f63944c602a00802356f94788464320ffa3f). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19156: [SPARK-19634][SQL][ML][FOLLOW-UP] Improve interface of d...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19156 **[Test build #83574 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83574/testReport)** for PR 19156 at commit [`480e80d`](https://github.com/apache/spark/commit/480e80dbb0392bebe96dc1620195a39b54f75740). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19156: [SPARK-19634][SQL][ML][FOLLOW-UP] Improve interface of d...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19156 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83574/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19156: [SPARK-19634][SQL][ML][FOLLOW-UP] Improve interface of d...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19156 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior of time...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19607 **[Test build #83578 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83578/testReport)** for PR 19607 at commit [`4adb073`](https://github.com/apache/spark/commit/4adb073f8d1454fbea0742a16b6d7662e063b37a). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19662: [SPARK-22446][SQL][ML] Declare StringIndexerModel indexe...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19662 **[Test build #83577 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83577/testReport)** for PR 19662 at commit [`dd672ac`](https://github.com/apache/spark/commit/dd672ac815038f8dfd89fecb1f5b3d4668158752). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19662: [SPARK-22446][SQL][ML] Declare StringIndexerModel indexe...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19662 @WeichenXu123 I did a scan. Currently I only found that `VectorAssembler`'s UDF may have a similar issue. Fixed it and added a test for it too. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19272: [Spark-21842][Mesos] Support Kerberos ticket rene...
Github user ArtRand commented on a diff in the pull request: https://github.com/apache/spark/pull/19272#discussion_r149564294 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala --- @@ -213,6 +216,14 @@ private[spark] class MesosCoarseGrainedSchedulerBackend( sc.conf.getOption("spark.mesos.driver.frameworkId").map(_ + suffix) ) +// check that the credentials are defined, even though it's likely that auth would have failed +// already if you've made it this far, then start the token renewer +if (hadoopDelegationTokens.isDefined) { --- End diff -- I may have spoken too soon; there might be a way... --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19664: [SPARK-22442][SQL] ScalaReflection should produce...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19664#discussion_r149564330 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -214,11 +215,13 @@ case class Invoke( override def eval(input: InternalRow): Any = throw new UnsupportedOperationException("Only code-generated evaluation is supported.") + private lazy val encodedFunctionName = TermName(functionName).encodedName.toString --- End diff -- Maybe, although I don't have a concrete case causing the issue for now. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19664: [SPARK-22442][SQL] ScalaReflection should produce...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19664#discussion_r149564523 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ScalaReflectionSuite.scala --- @@ -335,4 +338,17 @@ class ScalaReflectionSuite extends SparkFunSuite { assert(linkedHashMapDeserializer.dataType == ObjectType(classOf[LHMap[_, _]])) } + test("SPARK-22442: Generate correct field names for special characters") { +val serializer = serializerFor[SpecialCharAsFieldData](BoundReference( + 0, ObjectType(classOf[SpecialCharAsFieldData]), nullable = false)) +val deserializer = deserializerFor[SpecialCharAsFieldData] +assert(serializer.dataType(0).name == "field.1") +assert(serializer.dataType(1).name == "field 2") + +val argumentsFields = deserializer.asInstanceOf[NewInstance].arguments.flatMap { _.collect { + case UpCast(u: UnresolvedAttribute, _, _) => u.name +}} +assert(argumentsFields(0) == "`field.1`") --- End diff -- We need to deliberately wrap backticks around a field name such as `field.1` because of the dot character. Otherwise `UnresolvedAttribute` will parse it as two name parts `Seq("field", "1")` and then fail resolving later. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
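The quoting rule described in this comment — wrap a field name in backticks when it contains a dot, so the attribute parser treats it as a single name part instead of splitting it into `Seq("field", "1")` — can be sketched as follows (illustrative only, not the Catalyst implementation):

```python
def quote_if_needed(name):
    # Backtick-quote names containing a dot so a downstream parser
    # reads them as one name part rather than nested parts.
    return f"`{name}`" if "." in name else name

print(quote_if_needed("field.1"))  # `field.1`
print(quote_if_needed("field 2"))  # field 2
```

This matches the test's expectation: only the dotted name needs backticks, while a name with a space passes through unchanged.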
[GitHub] spark issue #19663: [SPARK-22463][YARN][SQL][Hive]add hadoop/hive/hbase/etc ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19663 **[Test build #83576 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83576/testReport)** for PR 19663 at commit [`f8c1f63`](https://github.com/apache/spark/commit/f8c1f63944c602a00802356f94788464320ffa3f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19663: [SPARK-22463][YARN][SQL][Hive]add hadoop/hive/hbase/etc ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19663 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19663: [SPARK-22463][YARN][SQL][Hive]add hadoop/hive/hbase/etc ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19663 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83576/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19664: [SPARK-22442][SQL] ScalaReflection should produce...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19664#discussion_r149565144 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -214,11 +215,13 @@ case class Invoke( override def eval(input: InternalRow): Any = throw new UnsupportedOperationException("Only code-generated evaluation is supported.") + private lazy val encodedFunctionName = TermName(functionName).encodedName.toString --- End diff -- Since we use `Invoke` to access field(s) in an object, this can be an issue. I didn't find `StaticInvoke` used similarly, so it should be fine. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark DataFra...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19459 Jenkins, retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark DataFra...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19459 **[Test build #83579 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83579/testReport)** for PR 19459 at commit [`99ce1e4`](https://github.com/apache/spark/commit/99ce1e44f57c411af95b1c9d9c95f35f2c1652e1). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19666: [SPARK-22451][ML] Reduce decision tree aggregate ...
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19666#discussion_r149567340 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -631,6 +614,42 @@ class RandomForestSuite extends SparkFunSuite with MLlibTestSparkContext { val expected = Map(0 -> 1.0 / 3.0, 2 -> 2.0 / 3.0) assert(mapToVec(map.toMap) ~== mapToVec(expected) relTol 0.01) } + + test("traverseUnorderedSplits") { + --- End diff -- So how do we test all possible splits to make sure the generated splits are all correct? If the tree is generated, only the best split remains. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19662: [SPARK-22446][SQL][ML] Declare StringIndexerModel...
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19662#discussion_r149567769 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/VectorAssemblerSuite.scala --- @@ -126,4 +126,25 @@ class VectorAssemblerSuite .setOutputCol("myOutputCol") testDefaultReadWrite(t) } + + test("VectorAssembler's UDF should not apply on filtered data") { --- End diff -- Mark the test name with [SPARK-22446]. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19662: [SPARK-22446][SQL][ML] Declare StringIndexerModel...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19662#discussion_r149568133 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/VectorAssemblerSuite.scala --- @@ -126,4 +126,25 @@ class VectorAssemblerSuite .setOutputCol("myOutputCol") testDefaultReadWrite(t) } + + test("VectorAssembler's UDF should not apply on filtered data") { --- End diff -- Ok. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19662: [SPARK-22446][SQL][ML] Declare StringIndexerModel indexe...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19662 **[Test build #83580 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83580/testReport)** for PR 19662 at commit [`d2ac83e`](https://github.com/apache/spark/commit/d2ac83e5b1c74abd422e436752f1cf91127e388a). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19687: [SPARK-19644][SQL]Clean up Scala reflection garbage afte...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19687 **[Test build #83571 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83571/testReport)** for PR 19687 at commit [`c03811f`](https://github.com/apache/spark/commit/c03811ff006058987fa8d5fb9f7d097b9acc9ac5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19687: [SPARK-19644][SQL]Clean up Scala reflection garbage afte...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19687 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19687: [SPARK-19644][SQL]Clean up Scala reflection garbage afte...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19687 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83571/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19689: [SPARK-22462][SQL] Make rdd-based actions in Data...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/19689 [SPARK-22462][SQL] Make rdd-based actions in Dataset trackable in SQL UI ## What changes were proposed in this pull request? For the few Dataset actions such as `foreach`, currently no SQL metrics are visible in the SQL tab of SparkUI. It is because it binds wrongly to Dataset's `QueryExecution`. As the actions directly evaluate on the RDD which has individual `QueryExecution`, to show correct SQL metrics on UI, we should bind to RDD's `QueryExecution`. ## How was this patch tested? Manually test. Screenshot is attached in the PR. You can merge this pull request into a Git repository by running: $ git pull https://github.com/viirya/spark-1 SPARK-22462 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/19689.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #19689 commit ac539cd0e761193d9a665d8ccb19a8fba5dd504b Author: Liang-Chi Hsieh Date: 2017-11-07T10:54:14Z Make rdd-based actions trackable in UI. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19689: [SPARK-22462][SQL] Make rdd-based actions in Dataset tra...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19689 The screenshot for running `sql("select * from range(10)").foreach(a => Unit)` on spark-shell: https://user-images.githubusercontent.com/68855/32531135-1e60d544-c47d-11e7-88d6-627ef77d0b80.png --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19689: [SPARK-22462][SQL] Make rdd-based actions in Dataset tra...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19689 **[Test build #83581 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83581/testReport)** for PR 19689 at commit [`ac539cd`](https://github.com/apache/spark/commit/ac539cd0e761193d9a665d8ccb19a8fba5dd504b). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19648: [SPARK-14516][ML][FOLLOW-UP] Move ClusteringEvaluatorSui...
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/19648 Merged into master, thanks. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19648: [SPARK-14516][ML][FOLLOW-UP] Move ClusteringEvalu...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19648 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19678: [SPARK-20646][core] Port executors page to new UI backen...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19678 **[Test build #83572 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83572/testReport)** for PR 19678 at commit [`c7123d9`](https://github.com/apache/spark/commit/c7123d9c8d3934c482cd89ea820b2958f4dbbe0a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19678: [SPARK-20646][core] Port executors page to new UI backen...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19678 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83572/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19678: [SPARK-20646][core] Port executors page to new UI backen...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19678 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17436: [SPARK-20101][SQL] Use OffHeapColumnVector when "spark.m...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17436 **[Test build #83573 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83573/testReport)** for PR 17436 at commit [`9ce6fc0`](https://github.com/apache/spark/commit/9ce6fc0b0ad2c4c97236f0519db07b5a3600bb81). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17436: [SPARK-20101][SQL] Use OffHeapColumnVector when "spark.m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17436 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17436: [SPARK-20101][SQL] Use OffHeapColumnVector when "spark.m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17436 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83573/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19689: [SPARK-22462][SQL] Make rdd-based actions in Dataset tra...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19689 **[Test build #83581 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83581/testReport)** for PR 19689 at commit [`ac539cd`](https://github.com/apache/spark/commit/ac539cd0e761193d9a665d8ccb19a8fba5dd504b). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19689: [SPARK-22462][SQL] Make rdd-based actions in Dataset tra...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19689 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19689: [SPARK-22462][SQL] Make rdd-based actions in Dataset tra...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19689 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83581/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19285 **[Test build #83575 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83575/testReport)** for PR 19285 at commit [`bc3ad4e`](https://github.com/apache/spark/commit/bc3ad4ea11e49b19ef4199642dbc4488f202d928). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19657: [SPARK-22344][SPARKR] clean up install dir if running te...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19657 Yup, I just checked it too and was writing a comment .. The current change should pass :). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19657: [SPARK-22344][SPARKR] clean up install dir if running te...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19657 **[Test build #83582 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83582/testReport)** for PR 19657 at commit [`18e238a`](https://github.com/apache/spark/commit/18e238a62d53de5a73283a741c1a9bb8230f4484). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19620: [SPARK-22327][SPARKR][TEST][BACKPORT-2.1] check f...
Github user felixcheung closed the pull request at: https://github.com/apache/spark/pull/19620 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19620: [SPARK-22327][SPARKR][TEST][BACKPORT-2.1] check for vers...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19620 merged
[GitHub] spark issue #19619: [SPARK-22327][SPARKR][TEST][BACKPORT-2.2] check for vers...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19619 merged
[GitHub] spark pull request #19619: [SPARK-22327][SPARKR][TEST][BACKPORT-2.2] check f...
Github user felixcheung closed the pull request at: https://github.com/apache/spark/pull/19619
[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19285 Merged build finished. Test PASSed.
[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19285 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83575/
[GitHub] spark issue #19557: [SPARK-22281][SPARKR] Handle R method breaking signature...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19557 merged to master/2.2
[GitHub] spark pull request #19557: [SPARK-22281][SPARKR] Handle R method breaking si...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19557
[GitHub] spark issue #19678: [SPARK-20646][core] Port executors page to new UI backen...
Github user squito commented on the issue: https://github.com/apache/spark/pull/19678 merged to master
[GitHub] spark pull request #19678: [SPARK-20646][core] Port executors page to new UI...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19678
[GitHub] spark issue #13206: [SPARK-15420] [SQL] Add repartition and sort to prepare ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13206 **[Test build #83583 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83583/consoleFull)** for PR 13206 at commit [`a64be8a`](https://github.com/apache/spark/commit/a64be8a91ddadcd7acbbd08956f214b3c40f0dca).