[GitHub] spark issue #16209: [WIP][SPARK-10849][SQL] Adds option to the JDBC data sou...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16209
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74721/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16209: [WIP][SPARK-10849][SQL] Adds option to the JDBC data sou...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16209
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16209: [WIP][SPARK-10849][SQL] Adds option to the JDBC data sou...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16209
  
**[Test build #74721 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74721/testReport)** for PR 16209 at commit [`e76b7e0`](https://github.com/apache/spark/commit/e76b7e0b6fab0adf30f2be7ea7be50298196ac72).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #17295: [SPARK-19556][core] Do not encrypt block manager ...

2017-03-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17295#discussion_r106587687
  
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
@@ -1235,7 +1251,7 @@ private[spark] class BlockManager(
   peer.port,
   peer.executorId,
   blockId,
-  new NettyManagedBuffer(data.toNetty),
+  new BlockManagerManagedBuffer(blockInfoManager, blockId, data.toManagedBuffer()),
--- End diff --

why this change?





[GitHub] spark issue #17242: [SPARK-19902][SQL] Add optimization rule to simplify exp...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17242
  
**[Test build #74725 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74725/testReport)** for PR 17242 at commit [`93b83ef`](https://github.com/apache/spark/commit/93b83ef0b15c453adddc459f57cccb36269e4e08).





[GitHub] spark issue #17329: [SPARK-19991]FileSegmentManagedBuffer performance improv...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17329
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74715/
Test FAILed.





[GitHub] spark pull request #17170: [SPARK-19825][R][ML] spark.ml R API for FPGrowth

2017-03-16 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17170#discussion_r106587332
  
--- Diff: R/pkg/R/mllib_fpm.R ---
@@ -0,0 +1,152 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# mllib_fpm.R: Provides methods for MLlib frequent pattern mining algorithms integration
+
+#' S4 class that represents a FPGrowthModel
+#'
+#' @param jobj a Java object reference to the backing Scala FPGrowthModel
+#' @export
+#' @note FPGrowthModel since 2.2.0
+setClass("FPGrowthModel", slots = list(jobj = "jobj"))
+
+#' FPGrowth
+#' 
+#' A parallel FP-growth algorithm to mine frequent itemsets. The algorithm is described in
+#' Li et al., PFP: Parallel FP-Growth for Query
+#' Recommendation <\url{http://dx.doi.org/10.1145/1454008.1454027}>. PFP distributes computation in such a way that each worker executes an
+#' independent group of mining tasks. The FP-Growth algorithm is described in
+#' Han et al., Mining frequent patterns without
+#' candidate generation <\url{http://dx.doi.org/10.1145/335191.335372}>.
+#'
+#' @param data A SparkDataFrame for training.
+#' @param minSupport Minimal support level.
+#' @param minConfidence Minimal confidence level.
+#' @param featuresCol Features column name.
+#' @param predictionCol Prediction column name.
+#' @param numPartitions Number of partitions used for fitting.
+#' @param ... additional argument(s) passed to the method.
+#' @return \code{spark.fpGrowth} returns a fitted FPGrowth model.
+#' 
+#' @rdname spark.fpGrowth
+#' @name spark.fpGrowth
+#' @aliases spark.fpGrowth,SparkDataFrame-method
+#' @export
+#' @examples
+#' \dontrun{
+#' raw_data <- read.df(
+#'   "data/mllib/sample_fpgrowth.txt",
+#'   source = "csv",
+#'   schema = structType(structField("raw_features", "string")))
+#'
+#' data <- selectExpr(raw_data, "split(raw_features, ' ') as features")
+#' model <- spark.fpGrowth(data)
+#'
+#' # Show frequent itemsets
+#' frequent_itemsets <- spark.freqItemsets(model)
+#' showDF(frequent_itemsets)
+#'
+#' # Show association rules
+#' association_rules <- spark.associationRules(model)
+#' showDF(association_rules)
+#'
+#' # Predict on new data
+#' new_itemsets <- data.frame(features = c("t", "t,s"))
+#' new_data <- selectExpr(createDataFrame(new_itemsets), "split(features, ',') as features")
+#' predict(model, new_data)
+#'
+#' # Save and load model
+#' path <- "/path/to/model"
+#' write.ml(model, path)
+#' read.ml(path)
+#'
+#' # Optional arguments
+#' baskets_data <- selectExpr(createDataFrame(new_itemsets), "split(features, ',') as baskets")
+#' another_model <- spark.fpGrowth(baskets_data, minSupport = 0.1, minConfidence = 0.5,
+#'                                 featuresCol = "baskets", predictionCol = "predicted",
+#'                                 numPartitions = 10)
+#' }
+#' @references \url{http://en.wikipedia.org/wiki/Association_rule_learning}
+#' @note spark.fpGrowth since 2.2.0
+setMethod("spark.fpGrowth", signature(data = "SparkDataFrame"),
+  function(data, minSupport = 0.3, minConfidence = 0.8,
+   featuresCol = "features", predictionCol = "prediction",
+   numPartitions = -1) {
+if (!is.numeric(minSupport) || minSupport < 0 || minSupport > 
1) {
+  stop("minSupport should be a number [0, 1].")
+}
+if (!is.numeric(minConfidence) || minConfidence < 0 || 
minConfidence > 1) {
+  stop("minConfidence should be a number [0, 1].")
+}
+
+jobj <- callJStatic("org.apache.spark.ml.r.FPGrowthWrapper", 
"fit",
+data@sdf, as.numeric(minSupport), 
as.numeric(minConfidence),
+featuresCol, predictionCol, 
as.integer(numPartitions))
+new("FPGrowthModel", jobj = jobj)
+  })
+
  

[GitHub] spark issue #17329: [SPARK-19991]FileSegmentManagedBuffer performance improv...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17329
  
Merged build finished. Test FAILed.





[GitHub] spark pull request #17170: [SPARK-19825][R][ML] spark.ml R API for FPGrowth

2017-03-16 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17170#discussion_r106587261
  
--- Diff: R/pkg/R/mllib_fpm.R ---

[GitHub] spark pull request #17170: [SPARK-19825][R][ML] spark.ml R API for FPGrowth

2017-03-16 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17170#discussion_r106587357
  
--- Diff: R/pkg/R/mllib_fpm.R ---

[GitHub] spark pull request #17170: [SPARK-19825][R][ML] spark.ml R API for FPGrowth

2017-03-16 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17170#discussion_r106587496
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/r/FPGrowthWrapper.scala ---
@@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ml.r
+
+import org.apache.hadoop.fs.Path
+import org.json4s.JsonDSL._
+import org.json4s.jackson.JsonMethods._
+
+import org.apache.spark.ml.fpm.{FPGrowth, FPGrowthModel}
+import org.apache.spark.ml.util._
+import org.apache.spark.sql.{DataFrame, Dataset}
+
+private[r] class FPGrowthWrapper private (val fpGrowthModel: FPGrowthModel) extends MLWritable {
+  def freqItemsets: DataFrame = fpGrowthModel.freqItemsets
+  def associationRules: DataFrame = fpGrowthModel.associationRules
+
+  def transform(dataset: Dataset[_]): DataFrame = {
+fpGrowthModel.transform(dataset)
+  }
+
+  override def write: MLWriter = new FPGrowthWrapper.FPGrowthWrapperWriter(this)
+}
+
+private[r] object FPGrowthWrapper extends MLReadable[FPGrowthWrapper] {
+
+  def fit(
+ data: DataFrame,
+ minSupport: Double,
+ minConfidence: Double,
+ featuresCol: String,
+ predictionCol: String,
+ numPartitions: Integer): FPGrowthWrapper = {
+val fpGrowth = new FPGrowth()
+  .setMinSupport(minSupport)
+  .setMinConfidence(minConfidence)
+  .setPredictionCol(predictionCol)
+
+if (numPartitions != null && numPartitions > 0) {
+  fpGrowth.setNumPartitions(numPartitions)
+}
+
+val fpGrowthModel = fpGrowth.fit(data)
+
+new FPGrowthWrapper(fpGrowthModel)
+  }
+
+  override def read: MLReader[FPGrowthWrapper] = new FPGrowthWrapperReader
+
+  class FPGrowthWrapperReader extends MLReader[FPGrowthWrapper] {
+override def load(path: String): FPGrowthWrapper = {
+  val modelPath = new Path(path, "model").toString
+  val fPGrowthModel = FPGrowthModel.load(modelPath)
+
+  new FPGrowthWrapper(fPGrowthModel)
+}
+  }
+
+class FPGrowthWrapperWriter(instance: FPGrowthWrapper) extends MLWriter {
--- End diff --

Indentation seems incorrect here and on the line above.





[GitHub] spark pull request #17170: [SPARK-19825][R][ML] spark.ml R API for FPGrowth

2017-03-16 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17170#discussion_r106587413
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/r/FPGrowthWrapper.scala ---
@@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ml.r
+
+import org.apache.hadoop.fs.Path
+import org.json4s.JsonDSL._
+import org.json4s.jackson.JsonMethods._
+
+import org.apache.spark.ml.fpm.{FPGrowth, FPGrowthModel}
+import org.apache.spark.ml.util._
+import org.apache.spark.sql.{DataFrame, Dataset}
+
+private[r] class FPGrowthWrapper private (val fpGrowthModel: FPGrowthModel) extends MLWritable {
+  def freqItemsets: DataFrame = fpGrowthModel.freqItemsets
+  def associationRules: DataFrame = fpGrowthModel.associationRules
+
+  def transform(dataset: Dataset[_]): DataFrame = {
+fpGrowthModel.transform(dataset)
+  }
+
+  override def write: MLWriter = new FPGrowthWrapper.FPGrowthWrapperWriter(this)
+}
+
+private[r] object FPGrowthWrapper extends MLReadable[FPGrowthWrapper] {
+
+  def fit(
+ data: DataFrame,
--- End diff --

alignment
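For concreteness, a small formatting sketch of what these two style comments (the parameter alignment here, and the nested-class indentation flagged in the previous comment) are asking for, assuming Spark's usual Scala conventions and using toy types rather than the PR's code:

```scala
object StyleSketch {
  // A declaration that does not fit on one line indents its parameters
  // four spaces past the start of the `def` line.
  def fit(
      data: Seq[String],
      minSupport: Double,
      minConfidence: Double,
      numPartitions: Integer): Int = {
    data.length
  }

  // A nested member sits at the enclosing object's two-space indent,
  // like any other member.
  class Writer(instance: String) {
    def save(path: String): Unit = println(s"saving $instance to $path")
  }
}
```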





[GitHub] spark pull request #17170: [SPARK-19825][R][ML] spark.ml R API for FPGrowth

2017-03-16 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17170#discussion_r106587130
  
--- Diff: R/pkg/R/mllib_fpm.R ---
@@ -0,0 +1,152 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# mllib_fpm.R: Provides methods for MLlib frequent pattern mining algorithms integration
+
+#' S4 class that represents a FPGrowthModel
+#'
+#' @param jobj a Java object reference to the backing Scala FPGrowthModel
+#' @export
+#' @note FPGrowthModel since 2.2.0
+setClass("FPGrowthModel", slots = list(jobj = "jobj"))
+
+#' FPGrowth
+#' 
+#' A parallel FP-growth algorithm to mine frequent itemsets. The algorithm is described in
+#' Li et al., PFP: Parallel FP-Growth for Query
+#' Recommendation <\url{http://dx.doi.org/10.1145/1454008.1454027}>. PFP distributes computation in such a way that each worker executes an
+#' independent group of mining tasks. The FP-Growth algorithm is described in
+#' Han et al., Mining frequent patterns without
+#' candidate generation <\url{http://dx.doi.org/10.1145/335191.335372}>.
+#'
+#' @param data A SparkDataFrame for training.
+#' @param minSupport Minimal support level.
+#' @param minConfidence Minimal confidence level.
+#' @param featuresCol Features column name.
+#' @param predictionCol Prediction column name.
+#' @param numPartitions Number of partitions used for fitting.
+#' @param ... additional argument(s) passed to the method.
+#' @return \code{spark.fpGrowth} returns a fitted FPGrowth model.
+#' 
--- End diff --

Other APIs do not have a blank line here. I think we should be consistent.





[GitHub] spark pull request #17170: [SPARK-19825][R][ML] spark.ml R API for FPGrowth

2017-03-16 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17170#discussion_r106587054
  
--- Diff: R/pkg/R/mllib_fpm.R ---
@@ -0,0 +1,152 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# mllib_fpm.R: Provides methods for MLlib frequent pattern mining algorithms integration
+
+#' S4 class that represents a FPGrowthModel
+#'
+#' @param jobj a Java object reference to the backing Scala FPGrowthModel
+#' @export
+#' @note FPGrowthModel since 2.2.0
+setClass("FPGrowthModel", slots = list(jobj = "jobj"))
+
+#' FPGrowth
+#' 
+#' A parallel FP-growth algorithm to mine frequent itemsets. The algorithm is described in
+#' Li et al., PFP: Parallel FP-Growth for Query
+#' Recommendation <\url{http://dx.doi.org/10.1145/1454008.1454027}>. PFP distributes computation in such a way that each worker executes an
--- End diff --

This line seems to exceed the length limit.





[GitHub] spark pull request #17170: [SPARK-19825][R][ML] spark.ml R API for FPGrowth

2017-03-16 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17170#discussion_r106587315
  
--- Diff: R/pkg/R/mllib_fpm.R ---

[GitHub] spark pull request #17170: [SPARK-19825][R][ML] spark.ml R API for FPGrowth

2017-03-16 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17170#discussion_r106587292
  
--- Diff: R/pkg/R/mllib_fpm.R ---

[GitHub] spark issue #17329: [SPARK-19991]FileSegmentManagedBuffer performance improv...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17329
  
**[Test build #74715 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74715/testReport)** for PR 17329 at commit [`dc5fd8d`](https://github.com/apache/spark/commit/dc5fd8dda4717b485d4c4b2dfcdc5d115abf811c).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #17295: [SPARK-19556][core] Do not encrypt block manager ...

2017-03-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17295#discussion_r106587428
  
--- Diff: core/src/main/scala/org/apache/spark/security/CryptoStreamUtils.scala ---
@@ -102,4 +150,34 @@ private[spark] object CryptoStreamUtils extends Logging {
 }
 iv
   }
+
+  /**
+   * This class is a workaround for CRYPTO-125, that forces all bytes to be written to the
+   * underlying channel. Since the callers of this API are using blocking I/O, there are no
+   * concerns with regards to CPU usage here.
--- End diff --

is it a separate bug fix?
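For readers following the thread: the quoted doc comment refers to the fact that `WritableByteChannel.write(ByteBuffer)` may consume only part of the buffer in a single call, so a wrapper has to loop until nothing remains. A minimal sketch of that pattern, with illustrative names rather than the PR's class:

```scala
import java.nio.ByteBuffer
import java.nio.channels.{Channels, WritableByteChannel}

// Keep calling write() until the whole buffer is consumed; any single call
// is allowed to write fewer bytes than src.remaining().
def writeFully(channel: WritableByteChannel, src: ByteBuffer): Unit = {
  while (src.hasRemaining) {
    channel.write(src)
  }
}

// Example: wrap an OutputStream as a channel and push 1 KiB through it.
val out = new java.io.ByteArrayOutputStream()
writeFully(Channels.newChannel(out), ByteBuffer.wrap(new Array[Byte](1024)))
```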





[GitHub] spark pull request #17295: [SPARK-19556][core] Do not encrypt block manager ...

2017-03-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17295#discussion_r106587322
  
--- Diff: core/src/main/scala/org/apache/spark/security/CryptoStreamUtils.scala ---
@@ -63,12 +83,40 @@ private[spark] object CryptoStreamUtils extends Logging {
   is: InputStream,
   sparkConf: SparkConf,
   key: Array[Byte]): InputStream = {
-val properties = toCryptoConf(sparkConf)
 val iv = new Array[Byte](IV_LENGTH_IN_BYTES)
-is.read(iv, 0, iv.length)
-val transformationStr = sparkConf.get(IO_CRYPTO_CIPHER_TRANSFORMATION)
-new CryptoInputStream(transformationStr, properties, is,
-  new SecretKeySpec(key, "AES"), new IvParameterSpec(iv))
+var read = 0
+while (read < iv.length) {
--- End diff --

what does this while loop do?
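Context for the question above: `InputStream.read(buf, off, len)` may return fewer than `len` bytes, so a single call is not guaranteed to pull in the whole IV; the loop keeps reading until the buffer is full. A minimal sketch of the same read-fully pattern, with illustrative names rather than the PR's code:

```scala
import java.io.{ByteArrayInputStream, EOFException, InputStream}

// Read exactly buf.length bytes, looping because read() may return a short count.
def readFully(is: InputStream, buf: Array[Byte]): Unit = {
  var read = 0
  while (read < buf.length) {
    val n = is.read(buf, read, buf.length - read)
    if (n < 0) throw new EOFException(s"stream ended after $read of ${buf.length} bytes")
    read += n
  }
}

// Example: recover a 16-byte IV even if the stream hands it back in pieces.
val iv = new Array[Byte](16)
readFully(new ByteArrayInputStream(new Array[Byte](20)), iv)
```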





[GitHub] spark issue #17242: [SPARK-19902][SQL] Add optimization rule to simplify exp...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17242
  
**[Test build #74722 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74722/testReport)** for PR 17242 at commit [`f4e771d`](https://github.com/apache/spark/commit/f4e771d85a33ff465d793e74ff4401453eaf0f3b).





[GitHub] spark issue #16971: [SPARK-19573][SQL] Make NaN/null handling consistent in ...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16971
  
**[Test build #74724 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74724/testReport)** for PR 16971 at commit [`ed6dacd`](https://github.com/apache/spark/commit/ed6dacdb3e3bdfd4e9ccb5c57bf8b4118636b0c6).





[GitHub] spark issue #17192: [SPARK-19849][SQL] Support ArrayType in to_json to produ...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17192
  
**[Test build #74723 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74723/testReport)** for PR 17192 at commit [`703a6cb`](https://github.com/apache/spark/commit/703a6cb36ea920e87a3536f16572020c11197345).





[GitHub] spark issue #17329: [SPARK-19991]FileSegmentManagedBuffer performance improv...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17329
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74714/
Test FAILed.





[GitHub] spark issue #17329: [SPARK-19991]FileSegmentManagedBuffer performance improv...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17329
  
Merged build finished. Test FAILed.





[GitHub] spark issue #17329: [SPARK-19991]FileSegmentManagedBuffer performance improv...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17329
  
**[Test build #74714 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74714/testReport)** for PR 17329 at commit [`abcfc79`](https://github.com/apache/spark/commit/abcfc79991ecd1d5cef2cd1e275b872695ba19d9).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #17192: [SPARK-19849][SQL] Support ArrayType in to_json to produ...

2017-03-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17192
  
retest this please





[GitHub] spark issue #17192: [SPARK-19849][SQL] Support ArrayType in to_json to produ...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17192
  
Merged build finished. Test FAILed.





[GitHub] spark issue #17192: [SPARK-19849][SQL] Support ArrayType in to_json to produ...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17192
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74718/
Test FAILed.





[GitHub] spark issue #17192: [SPARK-19849][SQL] Support ArrayType in to_json to produ...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17192
  
**[Test build #74718 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74718/testReport)** for PR 17192 at commit [`8c97406`](https://github.com/apache/spark/commit/8c97406b984ab68b74df2116547c1dbedb675785).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class JsonToStructs(`
  * `case class StructsToJson(`





[GitHub] spark issue #17088: [SPARK-19753][CORE] Un-register all shuffle output on a ...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17088
  
Merged build finished. Test PASSed.





[GitHub] spark issue #17088: [SPARK-19753][CORE] Un-register all shuffle output on a ...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17088
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74710/
Test PASSed.





[GitHub] spark issue #17088: [SPARK-19753][CORE] Un-register all shuffle output on a ...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17088
  
**[Test build #74710 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74710/testReport)** for PR 17088 at commit [`8787db1`](https://github.com/apache/spark/commit/8787db1679c5b468afa3d2ede64eee53908fa5de).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16971: [SPARK-19573][SQL] Make NaN/null handling consistent in ...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16971
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16971: [SPARK-19573][SQL] Make NaN/null handling consistent in ...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16971
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74717/
Test FAILed.





[GitHub] spark issue #16971: [SPARK-19573][SQL] Make NaN/null handling consistent in ...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16971
  
**[Test build #74717 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74717/testReport)** for PR 16971 at commit [`00d67f7`](https://github.com/apache/spark/commit/00d67f71e8c3eb254eabb63a53efdf675689aeb3).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #17320: [SPARK-19967][SQL] Add from_json in FunctionRegistry

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17320
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74719/
Test FAILed.





[GitHub] spark issue #17320: [SPARK-19967][SQL] Add from_json in FunctionRegistry

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17320
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17320: [SPARK-19967][SQL] Add from_json in FunctionRegistry

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17320
  
**[Test build #74719 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74719/testReport)**
 for PR 17320 at commit 
[`ce39a9d`](https://github.com/apache/spark/commit/ce39a9dae6d322d0b800b260b9a4822d9e0e1f1d).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17192: [SPARK-19849][SQL] Support ArrayType in to_json to produ...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17192
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74720/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17192: [SPARK-19849][SQL] Support ArrayType in to_json to produ...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17192
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17192: [SPARK-19849][SQL] Support ArrayType in to_json to produ...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17192
  
**[Test build #74720 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74720/testReport)**
 for PR 17192 at commit 
[`703a6cb`](https://github.com/apache/spark/commit/703a6cb36ea920e87a3536f16572020c11197345).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17302: [SPARK-19959][SQL] Fix to throw NullPointerException in ...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17302
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74716/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17302: [SPARK-19959][SQL] Fix to throw NullPointerException in ...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17302
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17302: [SPARK-19959][SQL] Fix to throw NullPointerException in ...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17302
  
**[Test build #74716 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74716/testReport)**
 for PR 17302 at commit 
[`43678e7`](https://github.com/apache/spark/commit/43678e793148521b44713b4373d89d8db0bb2e66).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17095: [SPARK-19763][SQL]qualified external datasource table lo...

2017-03-16 Thread kayousterhout
Github user kayousterhout commented on the issue:

https://github.com/apache/spark/pull/17095
  
Sounds like this was caused by a different PR (see the comment on the JIRA) 
and is now being fixed, so never mind here!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17295: [SPARK-19556][core] Do not encrypt block manager data in...

2017-03-16 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/17295
  
makes sense. one more question, ideally, shall we also transfer shuffle 
blocks after decryption?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16209: [WIP][SPARK-10849][SQL] Adds option to the JDBC data sou...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16209
  
**[Test build #74721 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74721/testReport)**
 for PR 16209 at commit 
[`e76b7e0`](https://github.com/apache/spark/commit/e76b7e0b6fab0adf30f2be7ea7be50298196ac72).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17191: [SPARK-14471][SQL] Aliases in SELECT could be used in GR...

2017-03-16 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/17191
  
okay, I'll recheck the code.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17191: [SPARK-14471][SQL] Aliases in SELECT could be used in GR...

2017-03-16 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/17191
  
ok makes sense, let's support it


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-16 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/17286
  
LGTM except some minor comments


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17286: [SPARK-19915][SQL] Exclude cartesian product cand...

2017-03-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17286#discussion_r106583389
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/JoinReorderSuite.scala ---
@@ -187,6 +220,8 @@ class JoinReorderSuite extends PlanTest with StatsEstimationTestBase {
       case (j1: Join, j2: Join) =>
         (sameJoinPlan(j1.left, j2.left) && sameJoinPlan(j1.right, j2.right)) ||
           (sameJoinPlan(j1.left, j2.right) && sameJoinPlan(j1.right, j2.left))
+      case _ if plan1.children.nonEmpty && plan2.children.nonEmpty =>
--- End diff --

when will we hit this branch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17286: [SPARK-19915][SQL] Exclude cartesian product cand...

2017-03-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17286#discussion_r106583140
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -710,6 +710,14 @@ object SQLConf {
       .intConf
       .createWithDefault(12)
 
+  val JOIN_REORDER_CARD_WEIGHT =
+    buildConf("spark.sql.cbo.joinReorder.card.weight")
+      .doc("The weight of cardinality (number of rows) for plan cost comparison in join reorder: " +
+        "rows * weight + size * (1 - weight).")
+      .doubleConf
+      .checkValue(weight => weight >= 0 && weight <= 1, "The weight value must be in [0, 1].")
+      .createWithDefault(0.7)
--- End diff --

Is it useful to expose this config? I think most users will just 
disable join reordering if they have problems.
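
For context, a rough sketch of how the weighted measure in the doc string, `rows * weight + size * (1 - weight)`, can be used to compare two candidate plans. `PlanCost`, `cheaperThan`, and the use of ratios are illustrative assumptions here, not the actual Catalyst implementation:

```scala
// Illustrative only: compare two candidate plans with the weighted measure
// described in the config doc, rows * weight + size * (1 - weight), applied
// to ratios so row counts and byte sizes contribute on comparable scales.
case class PlanCost(rows: BigInt, size: BigInt)

def cheaperThan(candidate: PlanCost, current: PlanCost, weight: Double = 0.7): Boolean = {
  val relativeRows = candidate.rows.toDouble / current.rows.toDouble
  val relativeSize = candidate.size.toDouble / current.size.toDouble
  relativeRows * weight + relativeSize * (1 - weight) < 1
}

// Example: 10x fewer rows outweighs a 2x larger size when weight = 0.7,
// since 0.1 * 0.7 + 2.0 * 0.3 = 0.67 < 1.
val better = cheaperThan(PlanCost(rows = 100, size = 20000), PlanCost(rows = 1000, size = 10000))
```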


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17286: [SPARK-19915][SQL] Exclude cartesian product cand...

2017-03-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17286#discussion_r106583013
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala ---
@@ -203,64 +205,46 @@ object JoinReorderDP extends PredicateHelper {
   private def buildJoin(
       oneJoinPlan: JoinPlan,
       otherJoinPlan: JoinPlan,
-      conf: CatalystConf,
+      conf: SQLConf,
       conditions: Set[Expression],
-      topOutput: AttributeSet): JoinPlan = {
+      topOutput: AttributeSet): Option[JoinPlan] = {
 
     val onePlan = oneJoinPlan.plan
     val otherPlan = otherJoinPlan.plan
-    // Now both onePlan and otherPlan become intermediate joins, so the cost of the
-    // new join should also include their own cardinalities and sizes.
-    val newCost = if (isCartesianProduct(onePlan) || isCartesianProduct(otherPlan)) {
-      // We consider cartesian product very expensive, thus set a very large cost for it.
-      // This enables to plan all the cartesian products at the end, because having a cartesian
-      // product as an intermediate join will significantly increase a plan's cost, making it
-      // impossible to be selected as the best plan for the items, unless there's no other choice.
-      Cost(
-        rows = BigInt(Long.MaxValue) * BigInt(Long.MaxValue),
-        size = BigInt(Long.MaxValue) * BigInt(Long.MaxValue))
-    } else {
-      val onePlanStats = onePlan.stats(conf)
-      val otherPlanStats = otherPlan.stats(conf)
-      Cost(
-        rows = oneJoinPlan.cost.rows + onePlanStats.rowCount.get +
-          otherJoinPlan.cost.rows + otherPlanStats.rowCount.get,
-        size = oneJoinPlan.cost.size + onePlanStats.sizeInBytes +
-          otherJoinPlan.cost.size + otherPlanStats.sizeInBytes)
-    }
-
-    // Put the deeper side on the left, tend to build a left-deep tree.
-    val (left, right) = if (oneJoinPlan.itemIds.size >= otherJoinPlan.itemIds.size) {
-      (onePlan, otherPlan)
-    } else {
-      (otherPlan, onePlan)
-    }
     val joinConds = conditions
       .filterNot(l => canEvaluate(l, onePlan))
       .filterNot(r => canEvaluate(r, otherPlan))
      .filter(e => e.references.subsetOf(onePlan.outputSet ++ otherPlan.outputSet))
-    // We use inner join whether join condition is empty or not. Since cross join is
-    // equivalent to inner join without condition.
-    val newJoin = Join(left, right, Inner, joinConds.reduceOption(And))
-    val collectedJoinConds = joinConds ++ oneJoinPlan.joinConds ++ otherJoinPlan.joinConds
-    val remainingConds = conditions -- collectedJoinConds
-    val neededAttr = AttributeSet(remainingConds.flatMap(_.references)) ++ topOutput
-    val neededFromNewJoin = newJoin.outputSet.filter(neededAttr.contains)
-    val newPlan =
-      if ((newJoin.outputSet -- neededFromNewJoin).nonEmpty) {
-        Project(neededFromNewJoin.toSeq, newJoin)
+    if (joinConds.isEmpty) {
+      // Cartesian product is very expensive, so we exclude them from candidate plans.
+      // This also significantly reduces the search space.
--- End diff --

great! now we can safely apply this optimization :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17286: [SPARK-19915][SQL] Exclude cartesian product cand...

2017-03-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17286#discussion_r106582845
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala ---
@@ -272,26 +256,39 @@ object JoinReorderDP extends PredicateHelper {
    * @param itemIds Set of item ids participating in this partial plan.
    * @param plan The plan tree with the lowest cost for these items found so far.
    * @param joinConds Join conditions included in the plan.
-   * @param cost The cost of this plan is the sum of costs of all intermediate joins.
+   * @param planCost The cost of this plan tree is the sum of costs of all intermediate joins.
--- End diff --

I think `cost` is good enough, why rename it?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17286: [SPARK-19915][SQL] Exclude cartesian product cand...

2017-03-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17286#discussion_r106582668
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala ---
@@ -185,11 +184,14 @@ object JoinReorderDP extends PredicateHelper {
         // Should not join two overlapping item sets.
         if (oneSidePlan.itemIds.intersect(otherSidePlan.itemIds).isEmpty) {
           val joinPlan = buildJoin(oneSidePlan, otherSidePlan, conf, conditions, topOutput)
-          // Check if it's the first plan for the item set, or it's a better plan than
-          // the existing one due to lower cost.
-          val existingPlan = nextLevel.get(joinPlan.itemIds)
-          if (existingPlan.isEmpty || joinPlan.cost.lessThan(existingPlan.get.cost)) {
-            nextLevel.update(joinPlan.itemIds, joinPlan)
+          if (joinPlan.isDefined) {
--- End diff --

when will this condition be false?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17286: [SPARK-19915][SQL] Exclude cartesian product cand...

2017-03-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17286#discussion_r106582563
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala ---
@@ -128,38 +131,34 @@ case class CostBasedJoinReorder(conf: CatalystConf) extends Rule[LogicalPlan] wi
 object JoinReorderDP extends PredicateHelper {
 
   def search(
-      conf: CatalystConf,
+      conf: SQLConf,
       items: Seq[LogicalPlan],
       conditions: Set[Expression],
-      topOutput: AttributeSet): Option[LogicalPlan] = {
+      topOutput: AttributeSet): LogicalPlan = {
 
     // Level i maintains all found plans for i + 1 items.
     // Create the initial plans: each plan is a single item with zero cost.
-    val itemIndex = items.zipWithIndex
+    val itemIndex = items.zipWithIndex.map(_.swap).toMap
--- End diff --

looks like an unnecessary change now


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17191: [SPARK-14471][SQL] Aliases in SELECT could be used in GR...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17191
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17191: [SPARK-14471][SQL] Aliases in SELECT could be used in GR...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17191
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74713/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17191: [SPARK-14471][SQL] Aliases in SELECT could be used in GR...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17191
  
**[Test build #74713 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74713/testReport)**
 for PR 17191 at commit 
[`5d8c853`](https://github.com/apache/spark/commit/5d8c8532433fc2ebebdf506d636238e6b644b4ae).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17179: [SPARK-19067][SS] Processing-time-based timeout in MapGr...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17179
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74707/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17179: [SPARK-19067][SS] Processing-time-based timeout in MapGr...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17179
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17192: [SPARK-19849][SQL] Support ArrayType in to_json to produ...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17192
  
**[Test build #74720 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74720/testReport)**
 for PR 17192 at commit 
[`703a6cb`](https://github.com/apache/spark/commit/703a6cb36ea920e87a3536f16572020c11197345).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17179: [SPARK-19067][SS] Processing-time-based timeout in MapGr...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17179
  
**[Test build #74707 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74707/testReport)**
 for PR 17179 at commit 
[`1d0008c`](https://github.com/apache/spark/commit/1d0008cedd3e37832b31b75451eaf7a67ab832f3).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `public class KeyedStateTimeout `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17320: [SPARK-19967][SQL] Add from_json in FunctionRegistry

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17320
  
**[Test build #74719 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74719/testReport)**
 for PR 17320 at commit 
[`ce39a9d`](https://github.com/apache/spark/commit/ce39a9dae6d322d0b800b260b9a4822d9e0e1f1d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17192: [SPARK-19849][SQL] Support ArrayType in to_json to produ...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17192
  
**[Test build #74718 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74718/testReport)**
 for PR 17192 at commit 
[`8c97406`](https://github.com/apache/spark/commit/8c97406b984ab68b74df2116547c1dbedb675785).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17216: [SPARK-19873][SS] Record num shuffle partitions in offse...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17216
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74708/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17216: [SPARK-19873][SS] Record num shuffle partitions in offse...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17216
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17216: [SPARK-19873][SS] Record num shuffle partitions in offse...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17216
  
**[Test build #74708 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74708/testReport)**
 for PR 17216 at commit 
[`4733b4e`](https://github.com/apache/spark/commit/4733b4e160bff010521319a1aa61e4f7981c65d6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17327: [SPARK-19721][SS][BRANCH-2.1] Good error message for ver...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17327
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17327: [SPARK-19721][SS][BRANCH-2.1] Good error message for ver...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17327
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74704/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17327: [SPARK-19721][SS][BRANCH-2.1] Good error message for ver...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17327
  
**[Test build #74704 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74704/testReport)**
 for PR 17327 at commit 
[`daabb27`](https://github.com/apache/spark/commit/daabb27aa32cb19c157e19081f6d08ff368bb42b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16971: [SPARK-19573][SQL] Make NaN/null handling consistent in ...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16971
  
**[Test build #74717 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74717/testReport)**
 for PR 16971 at commit 
[`00d67f7`](https://github.com/apache/spark/commit/00d67f71e8c3eb254eabb63a53efdf675689aeb3).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17191: [SPARK-14471][SQL] Aliases in SELECT could be used in GR...

2017-03-16 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/17191
  
@cloud-fan In the mixed case, it seems PostgreSQL and MySQL support the 
syntax:
```
// PostgreSQL v9.5
postgres=# \d t2
      Table "public.t2"
 Column |  Type   | Modifiers
--------+---------+-----------
 gkey1  | integer |
 gkey2  | integer |
 value  | integer |

postgres=# select gkey1 AS key1, gkey2, count(value) from t2 group by key1, 2;
 key1 | gkey2 | count
------+-------+-------
    1 |     1 |     1
(1 row)

// MySQL v5.7.13
mysql> SHOW COLUMNS FROM t2;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| gkey1 | int(11) | YES  |     | NULL    |       |
| gkey2 | int(11) | YES  |     | NULL    |       |
| value | int(11) | YES  |     | NULL    |       |
+-------+---------+------+-----+---------+-------+
3 rows in set (0.00 sec)

mysql> select gkey1 AS key1, gkey2, count(value) from t2 group by key1, 2;
+------+-------+--------------+
| key1 | gkey2 | count(value) |
+------+-------+--------------+
|    1 |     1 |            1 |
+------+-------+--------------+
1 row in set (0.00 sec)
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17302: [SPARK-19959][SQL] Fix to throw NullPointerException in ...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17302
  
**[Test build #74716 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74716/testReport)**
 for PR 17302 at commit 
[`43678e7`](https://github.com/apache/spark/commit/43678e793148521b44713b4373d89d8db0bb2e66).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17329: [SPARK-19991]FileSegmentManagedBuffer performance improv...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17329
  
**[Test build #74715 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74715/testReport)**
 for PR 17329 at commit 
[`dc5fd8d`](https://github.com/apache/spark/commit/dc5fd8dda4717b485d4c4b2dfcdc5d115abf811c).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17166
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17166
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74702/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17307: [SPARK-13369] Make number of consecutive fetch failures ...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17307
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74699/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17166
  
**[Test build #74702 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74702/testReport)**
 for PR 17166 at commit 
[`8f7ffb3`](https://github.com/apache/spark/commit/8f7ffb395cae9ae7aa24a14dcdb908aaee30b710).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17307: [SPARK-13369] Make number of consecutive fetch failures ...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17307
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17307: [SPARK-13369] Make number of consecutive fetch failures ...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17307
  
**[Test build #74699 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74699/testReport)**
 for PR 17307 at commit 
[`0f95c8b`](https://github.com/apache/spark/commit/0f95c8b1ad260abb1a64d9cbd25d09a1bafeb1d8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17216: [SPARK-19873][SS] Record num shuffle partitions in offse...

2017-03-16 Thread uncleGen
Github user uncleGen commented on the issue:

https://github.com/apache/spark/pull/17216
  
Does this PR accidentally mix in some test file?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17329: [SPARK-19991]FileSegmentManagedBuffer performance improv...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17329
  
**[Test build #74714 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74714/testReport)**
 for PR 17329 at commit 
[`abcfc79`](https://github.com/apache/spark/commit/abcfc79991ecd1d5cef2cd1e275b872695ba19d9).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17329: [SPARK-19991]FileSegmentManagedBuffer performance...

2017-03-16 Thread witgo
GitHub user witgo opened a pull request:

https://github.com/apache/spark/pull/17329

[SPARK-19991]FileSegmentManagedBuffer performance improvement

FileSegmentManagedBuffer performance improvement.


## What changes were proposed in this pull request?

When the configuration items `spark.storage.memoryMapThreshold` and `spark.shuffle.io.lazyFD` are not set, each call to `FileSegmentManagedBuffer.nioByteBuffer` or `FileSegmentManagedBuffer.createInputStream` creates a `NoSuchElementException` instance, which is a relatively expensive operation.

In the use case below, this PR improves performance by about 3.5%.
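
For illustration, a minimal sketch of the two lookup patterns at play, written against `SparkConf` directly; the key, the `"2m"` default, and the object/method names are illustrative assumptions here, not the PR's actual change:

```scala
import java.util.NoSuchElementException

import org.apache.spark.SparkConf

object ConfLookupSketch {
  private val key = "spark.storage.memoryMapThreshold"

  // Costly pattern: when the key is unset, every call allocates a
  // NoSuchElementException (including its stack trace) just to fall back
  // to a default, which is the overhead the description above points at.
  def expensiveLookup(conf: SparkConf): Long =
    try conf.getSizeAsBytes(key)
    catch { case _: NoSuchElementException => 2L * 1024 * 1024 }

  // Cheaper pattern: look the value up with a default (or cache it once),
  // so no exception is constructed on the hot path.
  def cachedLookup(conf: SparkConf): Long =
    conf.getSizeAsBytes(key, "2m")
}
```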

The test code:

``` scala
(1 to 10).foreach { i =>
  val numPartition = 1
  val rdd = sc.parallelize(0 until numPartition).repartition(numPartition).flatMap { t =>
    (0 until numPartition).map(r => r * numPartition + t)
  }.repartition(numPartition)
  val serializeStart = System.currentTimeMillis()
  rdd.sum()
  val serializeFinish = System.currentTimeMillis()
  println(f"Test $i: ${(serializeFinish - serializeStart) / 1000D}%1.2f")
}
```

and `spark-defaults.conf` file:

```
spark.master                      yarn-client
spark.executor.instances          20
spark.driver.memory               64g
spark.executor.memory             30g
spark.executor.cores              5
spark.default.parallelism         100
spark.sql.shuffle.partitions      100
spark.serializer                  org.apache.spark.serializer.KryoSerializer
spark.driver.maxResultSize        0
spark.ui.enabled                  false
spark.driver.extraJavaOptions     -XX:+UseG1GC -XX:+UseStringDeduplication -XX:G1HeapRegionSize=16M -XX:MetaspaceSize=512M
spark.executor.extraJavaOptions   -XX:+UseG1GC -XX:+UseStringDeduplication -XX:G1HeapRegionSize=16M -XX:MetaspaceSize=256M
spark.cleaner.referenceTracking.blocking          true
spark.cleaner.referenceTracking.blocking.shuffle  true
```

The test results are as follows:

| [SPARK-19991](https://github.com/witgo/spark/tree/SPARK-19991) | https://github.com/apache/spark/commit/68ea290b3aa89b2a539d13ea2c18bdb5a651b2bf |
| --- | --- |
| 226.09 s | 235.21 s |

## How was this patch tested?

Existing tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/witgo/spark SPARK-19991

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17329.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17329


commit abcfc79991ecd1d5cef2cd1e275b872695ba19d9
Author: Guoqiang Li 
Date:   2017-03-17T03:19:37Z

FileSegmentManagedBuffer performance improvement




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16971: [SPARK-19573][SQL] Make NaN/null handling consistent in ...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16971
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74705/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16971: [SPARK-19573][SQL] Make NaN/null handling consistent in ...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16971
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16971: [SPARK-19573][SQL] Make NaN/null handling consistent in ...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16971
  
**[Test build #74705 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74705/testReport)**
 for PR 16971 at commit 
[`7bf7db3`](https://github.com/apache/spark/commit/7bf7db3114234dc900a9fd7a9b36615fa0bc1a3f).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16028: [SPARK-18518][ML] HasSolver supports override

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16028
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74709/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16028: [SPARK-18518][ML] HasSolver supports override

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16028
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17244: [SPARK-19889][SQL] Make TaskContext callbacks thr...

2017-03-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17244#discussion_r106577342
  
--- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala ---
@@ -52,62 +63,79 @@ private[spark] class TaskContextImpl(
   @volatile private var interrupted: Boolean = false
 
   // Whether the task has completed.
-  @volatile private var completed: Boolean = false
+  private var completed: Boolean = false
 
   // Whether the task has failed.
-  @volatile private var failed: Boolean = false
+  private var failed: Boolean = false
+
+  // Throwable that caused the task to fail
+  private var failure: Throwable = _
 
   // If there was a fetch failure in the task, we store it here, to make sure user-code doesn't
   // hide the exception.  See SPARK-19276
   @volatile private var _fetchFailedException: Option[FetchFailedException] = None
 
-  override def addTaskCompletionListener(listener: TaskCompletionListener): this.type = {
-    onCompleteCallbacks += listener
+  @GuardedBy("this")
+  override def addTaskCompletionListener(listener: TaskCompletionListener)
+    : this.type = synchronized {
+    if (completed) {
+      listener.onTaskCompletion(this)
--- End diff --

`invokeListeners` takes a list of listeners, so we are able to call only 
this listener.

I think it's better to make these listeners consistent, i.e. throw 
`TaskCompletionListenerException` when a failure happens while calling a listener.
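
To make the expectation concrete, here is a hedged sketch (not the actual `TaskContextImpl` code) of invoking a late-registered listener while keeping failure handling consistent with listeners invoked in bulk; the helper name `invokeLateListener` and the wrapping `RuntimeException` are illustrative assumptions:

```scala
import org.apache.spark.TaskContext
import org.apache.spark.util.TaskCompletionListener

// Illustrative helper: call a listener that was registered after completion and
// surface its failure the same way as listeners run through invokeListeners,
// i.e. wrapped rather than thrown raw. The RuntimeException here stands in for
// TaskCompletionListenerException, which is not accessible outside Spark.
def invokeLateListener(ctx: TaskContext, listener: TaskCompletionListener): Unit = {
  try {
    listener.onTaskCompletion(ctx)
  } catch {
    case t: Throwable =>
      throw new RuntimeException("error in TaskCompletionListener", t)
  }
}
```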


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16028: [SPARK-18518][ML] HasSolver supports override

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16028
  
**[Test build #74709 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74709/testReport)**
 for PR 16028 at commit 
[`c0eba75`](https://github.com/apache/spark/commit/c0eba75966352106fbe6ce29322dbcc860ce7acf).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17320: [SPARK-19967][SQL] Add from_json in FunctionRegistry

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17320
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74706/
Test FAILed.





[GitHub] spark issue #17320: [SPARK-19967][SQL] Add from_json in FunctionRegistry

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17320
  
Merged build finished. Test FAILed.





[GitHub] spark issue #17320: [SPARK-19967][SQL] Add from_json in FunctionRegistry

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17320
  
**[Test build #74706 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74706/testReport)**
 for PR 17320 at commit 
[`439c6f7`](https://github.com/apache/spark/commit/439c6f7c4f15eaf3b6660e9ff2d7fbdf9e953e27).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #17191: [SPARK-14471][SQL] Aliases in SELECT could be used in GR...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17191
  
**[Test build #74713 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74713/testReport)**
 for PR 17191 at commit 
[`5d8c853`](https://github.com/apache/spark/commit/5d8c8532433fc2ebebdf506d636238e6b644b4ae).





[GitHub] spark issue #16971: [SPARK-19573][SQL] Make NaN/null handling consistent in ...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16971
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74703/
Test FAILed.





[GitHub] spark issue #16971: [SPARK-19573][SQL] Make NaN/null handling consistent in ...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16971
  
Merged build finished. Test FAILed.





[GitHub] spark issue #17237: [SPARK-19852][PYSPARK][ML] Update Python API setHandleIn...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17237
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74711/
Test PASSed.





[GitHub] spark issue #16971: [SPARK-19573][SQL] Make NaN/null handling consistent in ...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16971
  
**[Test build #74703 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74703/testReport)**
 for PR 16971 at commit 
[`42c7b25`](https://github.com/apache/spark/commit/42c7b25ca0665bd5187e67e732dba0edb37c05a8).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #17237: [SPARK-19852][PYSPARK][ML] Update Python API setHandleIn...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17237
  
Merged build finished. Test PASSed.




