[GitHub] [spark] dilipbiswal commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
dilipbiswal commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
URL: https://github.com/apache/spark/pull/26437#discussion_r344433444

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala ##

@@ -171,6 +171,63 @@ case class InSubqueryExec(
   }
 }

+/**
+ * The physical node of exists-subquery. This is for support use exists in join's on condition,
+ * since some join type we can't pushdown exists condition, we plan it here
+ */
+case class ExistsExec(child: Expression,
+                      subQuery: String,
+                      plan: BaseSubqueryExec,
+                      exprId: ExprId,
+                      private var resultBroadcast: Broadcast[Boolean] = null)
+  extends ExecSubqueryExpression {
+
+  @transient private var result: Boolean = _
+
+  override def dataType: DataType = BooleanType
+  override def children: Seq[Expression] = child :: Nil
+  override def nullable: Boolean = child.nullable
+  override def toString: String = s"EXISTS ${plan.name}"
+  override def withNewPlan(plan: BaseSubqueryExec): ExistsExec = copy(plan = plan)
+
+  override def semanticEquals(other: Expression): Boolean = other match {
+    case in: ExistsExec => child.semanticEquals(in.child) && plan.sameResult(in.plan)
+    case _ => false
+  }
+
+
+  def updateResult(): Unit = {
+    result = !plan.execute().isEmpty()

Review comment:
@AngersZh You're right, sorry. I had originally written it as IN and forgot to adjust it to EXISTS :-) Yes, we need to change RewritePredicateSubquery, which handles correlated subquery rewrites. The only thing I am not sure about is the outer joins.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
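As a rough illustration of why outer joins are the awkward case, the sketch below models an uncorrelated EXISTS as a single Boolean inside a left outer join's ON condition. It uses plain Scala collections only; `ExistsInJoinSketch`, `L`, `R`, `exists`, and `leftOuterJoin` are hypothetical names and not part of Spark, and this is one reading of the doc comment in the diff rather than the PR's actual planning logic.

```scala
// Hypothetical plain-Scala model; no Spark classes are involved.
object ExistsInJoinSketch {
  final case class L(k: Int)
  final case class R(k: Int, v: String)

  // An uncorrelated EXISTS reduces to one Boolean: is the subquery result non-empty?
  def exists[A](subquery: Seq[A]): Boolean = subquery.nonEmpty

  // Models: left LEFT OUTER JOIN right ON l.k = r.k AND EXISTS (subquery)
  def leftOuterJoin(left: Seq[L], right: Seq[R], existsResult: Boolean): Seq[(L, Option[R])] =
    left.flatMap { l =>
      val matches = right.filter(r => r.k == l.k && existsResult)
      // Unmatched left rows are kept and padded, which is why the EXISTS predicate
      // cannot simply be turned into a filter on the left side.
      if (matches.isEmpty) Seq((l, Option.empty[R])) else matches.map(r => (l, Option(r)))
    }
}
```

When the EXISTS result is false, every left row still appears in the output with an empty right side, so the condition has to be evaluated as part of the join rather than pushed below it.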
[GitHub] [spark] AmplabJenkins removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552075132 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18385/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552075131 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552075131 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552075132 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18385/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
SparkQA commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552075057 **[Test build #113494 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113494/testReport)** for PR 26439 at commit [`3564d1a`](https://github.com/apache/spark/commit/3564d1ab7121f3354fd70c026cb3f7c12ba934d9).
[GitHub] [spark] AmplabJenkins removed a comment on issue #26429: [SPARK-29777][SparkR] SparkR::cleanClosure aggressively removes a function required by user function
AmplabJenkins removed a comment on issue #26429: [SPARK-29777][SparkR] SparkR::cleanClosure aggressively removes a function required by user function URL: https://github.com/apache/spark/pull/26429#issuecomment-552074660 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18384/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26429: [SPARK-29777][SparkR] SparkR::cleanClosure aggressively removes a function required by user function
AmplabJenkins removed a comment on issue #26429: [SPARK-29777][SparkR] SparkR::cleanClosure aggressively removes a function required by user function URL: https://github.com/apache/spark/pull/26429#issuecomment-552074658 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26429: [SPARK-29777][SparkR] SparkR::cleanClosure aggressively removes a function required by user function
AmplabJenkins commented on issue #26429: [SPARK-29777][SparkR] SparkR::cleanClosure aggressively removes a function required by user function URL: https://github.com/apache/spark/pull/26429#issuecomment-552074658 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26429: [SPARK-29777][SparkR] SparkR::cleanClosure aggressively removes a function required by user function
AmplabJenkins commented on issue #26429: [SPARK-29777][SparkR] SparkR::cleanClosure aggressively removes a function required by user function URL: https://github.com/apache/spark/pull/26429#issuecomment-552074660 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18384/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26429: [SPARK-29777][SparkR] SparkR::cleanClosure aggressively removes a function required by user function
SparkQA commented on issue #26429: [SPARK-29777][SparkR] SparkR::cleanClosure aggressively removes a function required by user function URL: https://github.com/apache/spark/pull/26429#issuecomment-552074577 **[Test build #113493 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113493/testReport)** for PR 26429 at commit [`10925bf`](https://github.com/apache/spark/commit/10925bfb1537a86d4773e2788739037a164d6ed3).
[GitHub] [spark] zhengruifeng commented on a change in pull request #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
zhengruifeng commented on a change in pull request #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
URL: https://github.com/apache/spark/pull/26413#discussion_r344432487

## File path: mllib/src/main/scala/org/apache/spark/ml/classification/NaiveBayes.scala ##

@@ -396,15 +545,29 @@ object NaiveBayesModel extends MLReadable[NaiveBayesModel] {
     private val className = classOf[NaiveBayesModel].getName

     override def load(path: String): NaiveBayesModel = {
+      implicit val format = DefaultFormats
       val metadata = DefaultParamsReader.loadMetadata(path, sc, className)
+      val (major, minor) = VersionUtils.majorMinorVersion(metadata.sparkVersion)
+      val modelTypeJson = metadata.getParamValue("modelType")
+      val modelType = Param.jsonDecode[String](compact(render(modelTypeJson)))

       val dataPath = new Path(path, "data").toString
       val data = sparkSession.read.parquet(dataPath)
       val vecConverted = MLUtils.convertVectorColumnsToML(data, "pi")
-      val Row(pi: Vector, theta: Matrix) = MLUtils.convertMatrixColumnsToML(vecConverted, "theta")
-        .select("pi", "theta")
-        .head()
-      val model = new NaiveBayesModel(metadata.uid, pi, theta)
+
+      val model = if (major.toInt < 3 || modelType != NaiveBayes.Gaussian) {

Review comment:
I have tested loading models saved by an older version, and it works fine.

In 2.4.4:

```scala
scala> import org.apache.spark.ml.feature._
import org.apache.spark.ml.feature._

scala> import org.apache.spark.ml.regression._
import org.apache.spark.ml.regression._

scala> import org.apache.spark.ml.classification._
import org.apache.spark.ml.classification._

scala> var df = spark.read.format("libsvm").load("/data1/Datasets/a9a/a9a")
19/11/09 15:05:36 WARN LibSVMFileFormat: 'numFeatures' option not specified, determining the number of features by going though the input. If you know the number in advance, please specify it via 'numFeatures' option to avoid the extra scan.
df: org.apache.spark.sql.DataFrame = [label: double, features: vector]

scala> df.persist()
res0: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [label: double, features: vector]

scala> df.count
res1: Long = 32561

scala> (0 until 8).foreach(_ => df = df.union(df))

scala> df.count
res3: Long = 8335616

scala> val nb = new NaiveBayes()
nb: org.apache.spark.ml.classification.NaiveBayes = nb_a87b69dac8f6

scala> val model = nb.fit(df)
[Stage 7:==> (201 + 13) / 256]19/11/09 15:06:03 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
19/11/09 15:06:03 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
model: org.apache.spark.ml.classification.NaiveBayesModel = NaiveBayesModel (uid=nb_a87b69dac8f6) with 2 classes

scala> model.save("/tmp/nbm_2.4.4")
```

In this PR:

```scala
scala> import org.apache.spark.ml.classification._
import org.apache.spark.ml.classification._

scala> val model = NaiveBayesModel.load("/tmp/nbm_2.4.4")
model: org.apache.spark.ml.classification.NaiveBayesModel = NaiveBayesModel (uid=nb_a87b69dac8f6) with 2 classes

scala> model.sigma
res0: org.apache.spark.ml.linalg.Matrix = null

scala> model.getModelType
res1: String = multinomial
```
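The version gate in the quoted `load` method can be sketched in plain Scala as follows. This is a minimal sketch under assumptions: `LoadGateSketch`, `majorMinor`, and `hasSigma` are hypothetical names, `majorMinor` is only a stand-in for `VersionUtils.majorMinorVersion`, the string `"gaussian"` is assumed to be the value of `NaiveBayes.Gaussian`, and the reading that the first branch loads only `pi` and `theta` is inferred from the transcript above (where `sigma` is `null` for a 2.4.4 model), since the diff is cut off after the condition.

```scala
object LoadGateSketch {
  // Hypothetical stand-in for VersionUtils.majorMinorVersion: "2.4.4" -> (2, 4).
  private val MajorMinor = """^(\d+)\.(\d+)(\..*)?$""".r

  def majorMinor(sparkVersion: String): (Int, Int) = sparkVersion match {
    case MajorMinor(major, minor, _) => (major.toInt, minor.toInt)
    case _ => throw new IllegalArgumentException(s"Cannot parse version: $sparkVersion")
  }

  // Negation of the condition in the diff: only Gaussian models written by
  // a 3.x (or later) release are expected to carry a sigma matrix.
  def hasSigma(sparkVersion: String, modelType: String): Boolean = {
    val (major, _) = majorMinor(sparkVersion)
    !(major < 3 || modelType != "gaussian")
  }
}
```

Under these assumptions, a multinomial model saved by 2.4.4 takes the legacy path, which matches the `sigma = null` result shown in the transcript.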
[GitHub] [spark] zhengruifeng commented on a change in pull request #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
zhengruifeng commented on a change in pull request #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
URL: https://github.com/apache/spark/pull/26413#discussion_r344432419

## File path: project/MimaExcludes.scala ##

@@ -118,6 +118,9 @@ object MimaExcludes {
     // [SPARK-26632][Core] Separate Thread Configurations of Driver and Executor
     ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.network.netty.SparkTransportConf.fromSparkConf"),

+    // [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
+    ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.ml.classification.NaiveBayesModel.this"),

Review comment:
If we do not introduce a new subclass, then I guess we cannot keep the old constructor. However, this constructor is not exposed to end users.
[GitHub] [spark] AmplabJenkins commented on issue #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
AmplabJenkins commented on issue #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier URL: https://github.com/apache/spark/pull/26413#issuecomment-552073579 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18383/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
AmplabJenkins commented on issue #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier URL: https://github.com/apache/spark/pull/26413#issuecomment-552073577 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
AmplabJenkins removed a comment on issue #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier URL: https://github.com/apache/spark/pull/26413#issuecomment-552073577 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
AmplabJenkins removed a comment on issue #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier URL: https://github.com/apache/spark/pull/26413#issuecomment-552073579 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18383/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
SparkQA commented on issue #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier URL: https://github.com/apache/spark/pull/26413#issuecomment-552073464 **[Test build #113492 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113492/testReport)** for PR 26413 at commit [`82961da`](https://github.com/apache/spark/commit/82961dae05500b865c5fe192c1d7ac1beec87861).
[GitHub] [spark] zhengruifeng commented on a change in pull request #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
zhengruifeng commented on a change in pull request #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
URL: https://github.com/apache/spark/pull/26413#discussion_r344432172

## File path: mllib/src/main/scala/org/apache/spark/ml/classification/NaiveBayes.scala ##

@@ -311,24 +432,42 @@ class NaiveBayesModel private[ml] (
         require(value == 0.0 || value == 1.0,
           s"Bernoulli naive Bayes requires 0 or 1 feature values but found $features.")
       )
-    val prob = thetaMinusNegTheta.get.multiply(features)
+    val prob = thetaMinusNegTheta.multiply(features)
     BLAS.axpy(1.0, pi, prob)
-    BLAS.axpy(1.0, negThetaSum.get, prob)
+    BLAS.axpy(1.0, negThetaSum, prob)
     prob
   }

-  override protected def predictRaw(features: Vector): Vector = {
+  private def gaussianCalculation(features: Vector) = {
+    val prob = Array.ofDim[Double](numClasses)
+    var i = 0
+    while (i < numClasses) {
+      var s = 0.0
+      var j = 0
+      while (j < numFeatures) {
+        val d = features(j) - theta(i, j)
+        s += d * d / sigma(i, j)
+        j += 1
+      }
+      prob(i) = pi(i) - (s + logVarSum(i)) / 2
+      i += 1
+    }
+    Vectors.dense(prob)
+  }
+
+  @transient private lazy val predictRawFunc = {

Review comment:
I marked it transient because I guess the precomputed matrices are captured in this closure, and its size may be large in high-dimensional cases.
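For readers following the `gaussianCalculation` hunk above, here is a self-contained sketch of the same per-class score using plain arrays instead of Spark's `Vector` and `Matrix`. The object and method names are hypothetical, and the definition of `logVarSum` is an assumption (the sum over features of `log(2 * Pi * variance)`), because the diff does not show how the PR computes it.

```scala
object GaussianNBSketch {
  // pi(i): log prior of class i; theta(i)(j): mean of feature j in class i;
  // sigma(i)(j): variance of feature j in class i.
  def rawPrediction(
      pi: Array[Double],
      theta: Array[Array[Double]],
      sigma: Array[Array[Double]],
      features: Array[Double]): Array[Double] = {
    // Assumed definition: logVarSum(i) = sum_j log(2 * Pi * sigma(i)(j)).
    val logVarSum = sigma.map(_.map(v => math.log(2.0 * math.Pi * v)).sum)
    Array.tabulate(pi.length) { i =>
      var s = 0.0
      var j = 0
      while (j < features.length) {
        val d = features(j) - theta(i)(j)
        s += d * d / sigma(i)(j)
        j += 1
      }
      // log prior plus the sum of per-feature Gaussian log densities
      pi(i) - (s + logVarSum(i)) / 2
    }
  }
}
```

With that assumption, each score equals the log prior plus the sum of the per-feature Gaussian log densities, so the largest entry corresponds to the most likely class.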
[GitHub] [spark] zhengruifeng commented on a change in pull request #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
zhengruifeng commented on a change in pull request #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
URL: https://github.com/apache/spark/pull/26413#discussion_r344431951

## File path: mllib/src/main/scala/org/apache/spark/ml/classification/NaiveBayes.scala ##

@@ -248,19 +344,24 @@ object NaiveBayes extends DefaultParamsReadable[NaiveBayes] {
 /**
  * Model produced by [[NaiveBayes]]
- * @param pi log of class priors, whose dimension is C (number of classes)
+ *
+ * @param pi    log of class priors, whose dimension is C (number of classes)
  * @param theta log of class conditional probabilities, whose dimension is C (number of classes)
  *              by D (number of features)
+ * @param sigma variance of each feature, whose dimension is C (number of classes)
+ *              by D (number of features). This matrix is only available when modelType
+ *              is set Gaussian.
  */
 @Since("1.5.0")
 class NaiveBayesModel private[ml] (
     @Since("1.5.0") override val uid: String,
     @Since("2.0.0") val pi: Vector,
-    @Since("2.0.0") val theta: Matrix)
+    @Since("2.0.0") val theta: Matrix,
+    @Since("3.0.0") val sigma: Matrix)

Review comment:
1. I tried making a subclass `GaussianNaiveBayesModel`, but I think it adds too much complexity in both usage and implementation: we would have to explicitly specify the subclass `GaussianNaiveBayesModel` in some way whenever we need the sigma matrix. I am also not sure whether to define a new object `object GaussianNaiveBayesModel extends MLReadable[GaussianNaiveBayesModel]` and implement the write/read methods in it.

```scala
val model: NaiveBayesModel = new NaiveBayes().setModelType("gaussian").fit(df)
val sigma = model.asInstanceOf[GaussianNaiveBayesModel].sigma
```

2. It is OK to use `Option[Matrix]` here; however, it adds a little complexity for PySpark. I have to define a helper function on the Scala side:

```scala
// helper function for pyspark, since Python does not have an Option type.
private[spark] def pySigma: Matrix = sigma.get
```

and on the Python side:

```python
@property
@since("3.0.0")
def sigma(self):
    """
    variance of each feature.
    """
    if self.getModelType() == "gaussian":
        return self._call_java("pySigma")
    else:
        return None
```

That is because it seems that Scala's `Option` type cannot be converted to a Python object automatically.

3. Otherwise, we may create an empty matrix for Multinomial & Bernoulli.

4. Just continue to use a `null` sigma. Scala's `null` will be automatically converted to Python's `None`.

Among the above 4 approaches, I prefer 3 and 4. What do you think?
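A possible variation on approach 2, sketched here in plain Scala, is to keep `Option[Matrix]` on the Scala side and expose a nullable accessor with `Option.orNull`, so the Python wrapper would not need to branch on the model type. This is only a sketch: `SigmaOptionSketch`, its `Matrix` case class, and `Model` are hypothetical stand-ins rather than the Spark classes, and it relies on the statement in the comment above that a Scala `null` arrives in Python as `None`.

```scala
object SigmaOptionSketch {
  // Hypothetical stand-in for org.apache.spark.ml.linalg.Matrix.
  final case class Matrix(numRows: Int, numCols: Int, values: Array[Double])

  final class Model(val modelType: String, val sigma: Option[Matrix]) {
    // Option.orNull returns the wrapped value, or null when the Option is empty,
    // so this accessor never throws for non-Gaussian models (unlike sigma.get).
    def pySigma: Matrix = sigma.orNull
  }
}
```

The trade-off is that the `null` still exists at the language boundary; it is simply confined to one helper instead of being the public field's value.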
[GitHub] [spark] gatorsmile closed pull request #26445: [HOT-FIX] Fix the SQLBase.g4
gatorsmile closed pull request #26445: [HOT-FIX] Fix the SQLBase.g4 URL: https://github.com/apache/spark/pull/26445
[GitHub] [spark] gatorsmile commented on issue #26445: [HOT-FIX] Fix the SQLBase.g4
gatorsmile commented on issue #26445: [HOT-FIX] Fix the SQLBase.g4 URL: https://github.com/apache/spark/pull/26445#issuecomment-552071941 The build passed. I am merging it now
[GitHub] [spark] kevinyu98 commented on a change in pull request #25573: [SPARK-28833][DOCS][SQL] Document ALTER VIEW command
kevinyu98 commented on a change in pull request #25573: [SPARK-28833][DOCS][SQL] Document ALTER VIEW command
URL: https://github.com/apache/spark/pull/25573#discussion_r344431503

## File path: docs/sql-ref-syntax-ddl-alter-view.md ##

@@ -19,4 +19,78 @@ license: |
   limitations under the License.
 ---

-**This page is under construction**
+### Description
+The `ALTER VIEW` statement changes various auxiliary properties of a view.
+
+
+ Rename view
+Rename the existing view. If the view name already exists in the database, an exception is thrown. This operation does
+support moving the views cross databases.
+# Syntax
+{% highlight sql %}
+ALTER VIEW viewIdentifier RENAME TO viewIdentifier
+viewIdentifier:= [db_name.]view_name
+{% endhighlight %}
+
+
+ Set view properties
+Set one or more properties of an existing view. The properties are the key value pairs. If the properties' keys exist,
+the values are replaced with the new values. If the properties' keys does not exist, the key value pairs are added into
+the properties.
+# Syntax
+{% highlight sql %}
+ALTER VIEW viewIdentifier SET TBLPROPERTIES (key1=val1, key2=val2, ...)

Review comment:
[GitHub] [spark] zhengruifeng commented on a change in pull request #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
zhengruifeng commented on a change in pull request #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
URL: https://github.com/apache/spark/pull/26413#discussion_r344431188

## File path: mllib/src/main/scala/org/apache/spark/ml/classification/NaiveBayes.scala ##

@@ -311,24 +432,42 @@ class NaiveBayesModel private[ml] (
         require(value == 0.0 || value == 1.0,
           s"Bernoulli naive Bayes requires 0 or 1 feature values but found $features.")
       )
-    val prob = thetaMinusNegTheta.get.multiply(features)
+    val prob = thetaMinusNegTheta.multiply(features)
     BLAS.axpy(1.0, pi, prob)
-    BLAS.axpy(1.0, negThetaSum.get, prob)
+    BLAS.axpy(1.0, negThetaSum, prob)
     prob
   }

-  override protected def predictRaw(features: Vector): Vector = {
+  private def gaussianCalculation(features: Vector) = {
+    val prob = Array.ofDim[Double](numClasses)
+    var i = 0
+    while (i < numClasses) {
+      var s = 0.0
+      var j = 0
+      while (j < numFeatures) {
+        val d = features(j) - theta(i, j)
+        s += d * d / sigma(i, j)
+        j += 1
+      }
+      prob(i) = pi(i) - (s + logVarSum(i)) / 2
+      i += 1
+    }
+    Vectors.dense(prob)
+  }
+
+  @transient private lazy val predictRawFunc = {

Review comment:
I found that VectorIndexerModel also marks [transformFunc](https://github.com/apache/spark/blob/ed12b61784e2ce5a1779c162bde1e16e9a9a0135/mllib/src/main/scala/org/apache/spark/ml/feature/VectorIndexer.scala#L351) as lazy.
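One common reason for declaring a member like this `lazy` is initialization order: a value that depends on a setting applied after construction cannot be computed eagerly in the constructor. The sketch below shows that effect in plain Scala. `LazyInitSketch`, `Model`, and `EagerModel` are hypothetical classes, not Spark's param machinery, and the error message is only borrowed for flavor.

```scala
object LazyInitSketch {
  // Hypothetical mini "params" holder: modelType is set after construction.
  class Model {
    private var modelType: Option[String] = None
    def setModelType(value: String): this.type = { modelType = Some(value); this }
    def getModelType: String =
      modelType.getOrElse(
        throw new NoSuchElementException("Failed to find a default value for modelType"))

    // Evaluated on first access, so it sees the value set after construction.
    lazy val predictFunc: String => String = {
      val t = getModelType
      x => s"$t:$x"
    }
  }

  // Same shape, but the member is eager: it runs inside the constructor,
  // before anyone has had a chance to set modelType, and therefore throws.
  class EagerModel {
    private var modelType: Option[String] = None
    def getModelType: String =
      modelType.getOrElse(
        throw new NoSuchElementException("Failed to find a default value for modelType"))

    val predictFunc: String => String = {
      val t = getModelType
      x => s"$t:$x"
    }
  }
}
```

Constructing `EagerModel` fails immediately, while `Model` works once `setModelType` has been called before the first use of `predictFunc`.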
[GitHub] [spark] zhengruifeng commented on a change in pull request #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
zhengruifeng commented on a change in pull request #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier URL: https://github.com/apache/spark/pull/26413#discussion_r344431152 ## File path: mllib/src/main/scala/org/apache/spark/ml/classification/NaiveBayes.scala ## @@ -311,24 +432,42 @@ class NaiveBayesModel private[ml] ( require(value == 0.0 || value == 1.0, s"Bernoulli naive Bayes requires 0 or 1 feature values but found $features.") ) -val prob = thetaMinusNegTheta.get.multiply(features) +val prob = thetaMinusNegTheta.multiply(features) BLAS.axpy(1.0, pi, prob) -BLAS.axpy(1.0, negThetaSum.get, prob) +BLAS.axpy(1.0, negThetaSum, prob) prob } - override protected def predictRaw(features: Vector): Vector = { + private def gaussianCalculation(features: Vector) = { +val prob = Array.ofDim[Double](numClasses) +var i = 0 +while (i < numClasses) { + var s = 0.0 + var j = 0 + while (j < numFeatures) { +val d = features(j) - theta(i, j) +s += d * d / sigma(i, j) +j += 1 + } + prob(i) = pi(i) - (s + logVarSum(i)) / 2 + i += 1 +} +Vectors.dense(prob) + } + + @transient private lazy val predictRawFunc = { Review comment: Oh, it should be lazy; otherwise it will cause: ```scala java.util.NoSuchElementException: Failed to find a default value for modelType [info] at org.apache.spark.ml.param.Params.$anonfun$getOrDefault$2(params.scala:780) [info] at scala.Option.getOrElse(Option.scala:189) [info] at org.apache.spark.ml.param.Params.getOrDefault(params.scala:780) [info] at org.apache.spark.ml.param.Params.getOrDefault$(params.scala:777) [info] at org.apache.spark.ml.PipelineStage.getOrDefault(Pipeline.scala:43) [info] at org.apache.spark.ml.param.Params.$(params.scala:786) [info] at org.apache.spark.ml.param.Params.$$(params.scala:786) [info] at org.apache.spark.ml.PipelineStage.$(Pipeline.scala:43) [info] at org.apache.spark.ml.classification.NaiveBayesModel.<init>(NaiveBayes.scala:466) ``` This happens because `NaiveBayesModel` should not contain `setDefault(modelType -> NaiveBayes.Multinomial)`, so an eagerly initialized field would read `modelType` before it has a value.
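The failure mode described in this comment can be reproduced without Spark. The following is a minimal sketch under stated assumptions: `ParamHolderSketch`, `EagerModelSketch`, and `LazyModelSketch` are hypothetical stand-ins, not Spark classes, and the parameter store is a plain mutable map rather than Spark's `Params`. It only illustrates why a field that reads a parameter must be `lazy` when the parameter has no default at construction time.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a Params-like class: values live in a map that
// is only populated after the object has been constructed.
class ParamHolderSketch {
  private val params = mutable.Map.empty[String, String]

  def set(name: String, value: String): this.type = { params(name) = value; this }

  // Mirrors the error in the stack trace: no value and no default.
  def get(name: String): String =
    params.getOrElse(name, throw new NoSuchElementException(
      s"Failed to find a default value for $name"))
}

// Eager: the val is evaluated inside the constructor, before set() can run.
class EagerModelSketch extends ParamHolderSketch {
  val predictKind: String = get("modelType")
}

// Lazy: evaluation is deferred until first access, after set() has run.
class LazyModelSketch extends ParamHolderSketch {
  lazy val predictKind: String = get("modelType")
}

object LazyInitSketch {
  // Constructing the eager variant throws, so the Try is a Failure.
  def eagerFails(): Boolean =
    scala.util.Try(new EagerModelSketch).isFailure

  // The lazy variant can be constructed, configured, and then read.
  def lazyWorks(): String =
    new LazyModelSketch().set("modelType", "gaussian").predictKind
}
```

The design point is that the lazy variant moves the parameter lookup from construction time to first use, which is presumably why the reviewer wants `predictRawFunc` to stay lazy.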
[GitHub] [spark] kevinyu98 commented on a change in pull request #25573: [SPARK-28833][DOCS][SQL] Document ALTER VIEW command
kevinyu98 commented on a change in pull request #25573: [SPARK-28833][DOCS][SQL] Document ALTER VIEW command URL: https://github.com/apache/spark/pull/25573#discussion_r344431095 ## File path: docs/sql-ref-syntax-ddl-alter-view.md ## @@ -19,4 +19,78 @@ license: | limitations under the License. --- -**This page is under construction** +### Description +The `ALTER VIEW` statement changes various auxiliary properties of a view. + + + Rename view +Rename the existing view. If the view name already exists in the database, an exception is thrown. This operation does +support moving the views cross databases. +# Syntax +{% highlight sql %} +ALTER VIEW viewIdentifier RENAME TO viewIdentifier +viewIdentifier:= [db_name.]view_name +{% endhighlight %} + + + Set view properties +Set one or more properties of an existing view. The properties are the key value pairs. If the properties' keys exist, +the values are replaced with the new values. If the properties' keys does not exist, the key value pairs are added into Review comment:
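The replace-or-add behaviour that the quoted documentation describes for `SET TBLPROPERTIES` can be modelled as a map merge. This is only a sketch of the documented semantics, assuming properties behave like a string-to-string map; `ViewPropertiesSketch` and `setProperties` are hypothetical names and do not correspond to Spark's catalog code.

```scala
object ViewPropertiesSketch {
  // Keys present in both maps take the new value; keys only in `updates`
  // are added; keys only in `existing` are left untouched.
  def setProperties(
      existing: Map[String, String],
      updates: Map[String, String]): Map[String, String] =
    existing ++ updates
}
```

For example, merging `Map("b" -> "3", "c" -> "4")` into `Map("a" -> "1", "b" -> "2")` replaces the value of `b` and adds `c`.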
[GitHub] [spark] kevinyu98 commented on a change in pull request #25573: [SPARK-28833][DOCS][SQL] Document ALTER VIEW command
kevinyu98 commented on a change in pull request #25573: [SPARK-28833][DOCS][SQL] Document ALTER VIEW command URL: https://github.com/apache/spark/pull/25573#discussion_r344431070 ## File path: docs/sql-ref-syntax-ddl-alter-view.md ## @@ -19,4 +19,78 @@ license: | limitations under the License. --- -**This page is under construction** +### Description +The `ALTER VIEW` statement changes various auxiliary properties of a view. + + + Rename view +Rename the existing view. If the view name already exists in the database, an exception is thrown. This operation does +support moving the views cross databases. +# Syntax +{% highlight sql %} +ALTER VIEW viewIdentifier RENAME TO viewIdentifier +viewIdentifier:= [db_name.]view_name +{% endhighlight %} + + + Set view properties Review comment: @huaxingao
[GitHub] [spark] kevinyu98 commented on a change in pull request #25573: [SPARK-28833][DOCS][SQL] Document ALTER VIEW command
kevinyu98 commented on a change in pull request #25573: [SPARK-28833][DOCS][SQL] Document ALTER VIEW command URL: https://github.com/apache/spark/pull/25573#discussion_r344431040 ## File path: docs/sql-ref-syntax-ddl-alter-view.md ## @@ -19,4 +19,78 @@ license: | limitations under the License. --- -**This page is under construction** +### Description +The `ALTER VIEW` statement changes various auxiliary properties of a view. + + + Rename view +Rename the existing view. If the view name already exists in the database, an exception is thrown. This operation does Review comment: @srowen, I changed this section to the following; let me know if you had something else in mind. ` Rename VIEW Renames the existing view. If the view name already exists in the source database, an exception is thrown. This operation does not support moving the views cross databases. `
[GitHub] [spark] AmplabJenkins commented on issue #26445: [HOT-FIX] Fix the SQLBase.g4
AmplabJenkins commented on issue #26445: [HOT-FIX] Fix the SQLBase.g4 URL: https://github.com/apache/spark/pull/26445#issuecomment-552069208 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26445: [HOT-FIX] Fix the SQLBase.g4
AmplabJenkins commented on issue #26445: [HOT-FIX] Fix the SQLBase.g4 URL: https://github.com/apache/spark/pull/26445#issuecomment-552069211 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18382/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26445: [HOT-FIX] Fix the SQLBase.g4
SparkQA commented on issue #26445: [HOT-FIX] Fix the SQLBase.g4 URL: https://github.com/apache/spark/pull/26445#issuecomment-552069150 **[Test build #113491 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113491/testReport)** for PR 26445 at commit [`d42bc0f`](https://github.com/apache/spark/commit/d42bc0ff2660c371dce07df6c00a6ac00988).
[GitHub] [spark] gatorsmile commented on issue #26445: [HOT-FIX] Fix the SQLBase.g4
gatorsmile commented on issue #26445: [HOT-FIX] Fix the SQLBase.g4 URL: https://github.com/apache/spark/pull/26445#issuecomment-552069052 cc @cloud-fan
[GitHub] [spark] gatorsmile opened a new pull request #26445: [HOT-FIX] Fix the SQLBase.g4
gatorsmile opened a new pull request #26445: [HOT-FIX] Fix the SQLBase.g4 URL: https://github.com/apache/spark/pull/26445 ### What changes were proposed in this pull request? Remove the duplicate code ### Why are the changes needed? Fix the compilation ### Does this PR introduce any user-facing change? No ### How was this patch tested? The existing tests
[GitHub] [spark] zhengruifeng commented on a change in pull request #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
zhengruifeng commented on a change in pull request #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier URL: https://github.com/apache/spark/pull/26413#discussion_r344430184 ## File path: mllib/src/main/scala/org/apache/spark/ml/classification/NaiveBayes.scala ## @@ -396,15 +545,29 @@ object NaiveBayesModel extends MLReadable[NaiveBayesModel] { private val className = classOf[NaiveBayesModel].getName override def load(path: String): NaiveBayesModel = { + implicit val format = DefaultFormats val metadata = DefaultParamsReader.loadMetadata(path, sc, className) + val (major, minor) = VersionUtils.majorMinorVersion(metadata.sparkVersion) + val modelTypeJson = metadata.getParamValue("modelType") + val modelType = Param.jsonDecode[String](compact(render(modelTypeJson))) val dataPath = new Path(path, "data").toString val data = sparkSession.read.parquet(dataPath) val vecConverted = MLUtils.convertVectorColumnsToML(data, "pi") - val Row(pi: Vector, theta: Matrix) = MLUtils.convertMatrixColumnsToML(vecConverted, "theta") -.select("pi", "theta") -.head() - val model = new NaiveBayesModel(metadata.uid, pi, theta) + + val model = if (major.toInt < 3 || modelType != NaiveBayes.Gaussian) { Review comment: yes, it is also used in: [LogisticRegressionModel](https://github.com/apache/spark/blob/5853e8b3301fd7b0bff721d5a47139afb17bfd2b/mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala#L1229) [HashingTF](https://github.com/apache/spark/blob/4664a082c2c7ac989e818958c465c72833d3ccfe/mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala#L147) [LinearRegressionModel](https://github.com/apache/spark/blob/5853e8b3301fd7b0bff721d5a47139afb17bfd2b/mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala#L769) and so on
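The version gate in the quoted `load` method can be illustrated in isolation. The sketch below is an assumption-laden stand-in: `ModelLoadSketch`, `majorMinor`, and `usesLegacyLayout` are hypothetical names, the helper does not call Spark's `VersionUtils`, and the string `"gaussian"` is assumed to be the value behind `NaiveBayes.Gaussian`.

```scala
object ModelLoadSketch {
  // Matches "3.0.0", "2.4", "3.0.0-SNAPSHOT"; the third group is optional.
  private val VersionPattern = """^(\d+)\.(\d+)(\..*)?$""".r

  // Hypothetical helper that extracts (major, minor) from a version string.
  def majorMinor(version: String): (Int, Int) = version match {
    case VersionPattern(major, minor, _) => (major.toInt, minor.toInt)
    case _ => throw new IllegalArgumentException(s"Unrecognized version: $version")
  }

  // Mirrors the condition in the diff: models saved before 3.x, or any
  // non-Gaussian model, are read with only the pi and theta columns.
  def usesLegacyLayout(sparkVersion: String, modelType: String): Boolean = {
    val (major, _) = majorMinor(sparkVersion)
    major < 3 || modelType != "gaussian"
  }
}
```

Gating on the saved version keeps models written by older releases loadable, since their data files would not contain the extra `sigma` column.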
[GitHub] [spark] zhengruifeng commented on a change in pull request #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
zhengruifeng commented on a change in pull request #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier URL: https://github.com/apache/spark/pull/26413#discussion_r344430097 ## File path: mllib/src/main/scala/org/apache/spark/ml/classification/NaiveBayes.scala ## @@ -248,19 +344,24 @@ object NaiveBayes extends DefaultParamsReadable[NaiveBayes] { /** * Model produced by [[NaiveBayes]] - * @param pi log of class priors, whose dimension is C (number of classes) + * + * @param pi log of class priors, whose dimension is C (number of classes) * @param theta log of class conditional probabilities, whose dimension is C (number of classes) * by D (number of features) + * @param sigma variance of each feature, whose dimension is C (number of classes) + * by D (number of features). This matrix is only available when modelType + * is set Gaussian. */ @Since("1.5.0") class NaiveBayesModel private[ml] ( @Since("1.5.0") override val uid: String, @Since("2.0.0") val pi: Vector, -@Since("2.0.0") val theta: Matrix) +@Since("2.0.0") val theta: Matrix, +@Since("3.0.0") val sigma: Matrix) Review comment: I will try making a subclass, `GaussianNaiveBayesModel`.
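To make the role of `sigma` concrete, here is a minimal sketch of a Gaussian raw prediction using plain arrays instead of Spark's `Vector` and `Matrix`. It follows the shape of the `gaussianCalculation` loop quoted elsewhere in this thread, with two assumptions that the thread does not confirm: `theta` holds per-class feature means in the Gaussian case, and `logVarSum(i)` is the sum of `log(sigma(i)(j))` over features (any constant terms are dropped). `GaussianNbSketch` is a hypothetical name.

```scala
object GaussianNbSketch {
  // pi(i): log prior of class i.
  // theta(i)(j): assumed per-class mean of feature j.
  // sigma(i)(j): per-class variance of feature j, assumed strictly positive.
  def rawPrediction(
      pi: Array[Double],
      theta: Array[Array[Double]],
      sigma: Array[Array[Double]],
      features: Array[Double]): Array[Double] =
    Array.tabulate(pi.length) { i =>
      var s = 0.0          // sum of squared deviations scaled by variance
      var logVarSum = 0.0  // assumed definition: sum of log variances
      var j = 0
      while (j < features.length) {
        val d = features(j) - theta(i)(j)
        s += d * d / sigma(i)(j)
        logVarSum += math.log(sigma(i)(j))
        j += 1
      }
      // Unnormalized log posterior for class i.
      pi(i) - (s + logVarSum) / 2
    }
}
```

With equal priors and unit variances, the class whose mean is closest to the input receives the largest raw score.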
[GitHub] [spark] zhengruifeng commented on a change in pull request #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
zhengruifeng commented on a change in pull request #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier URL: https://github.com/apache/spark/pull/26413#discussion_r344429910 ## File path: mllib/src/main/scala/org/apache/spark/ml/classification/NaiveBayes.scala ## @@ -280,15 +381,35 @@ class NaiveBayesModel private[ml] ( * This precomputes log(1.0 - exp(theta)) and its sum which are used for the linear algebra * application of this condition (in predict function). */ - private lazy val (thetaMinusNegTheta, negThetaSum) = $(modelType) match { -case Multinomial => (None, None) + @transient private lazy val (thetaMinusNegTheta, negThetaSum) = $(modelType) match { case Bernoulli => val negTheta = theta.map(value => math.log1p(-math.exp(value))) val ones = new DenseVector(Array.fill(theta.numCols) {1.0}) val thetaMinusNegTheta = theta.map { value => value - math.log1p(-math.exp(value)) } - (Option(thetaMinusNegTheta), Option(negTheta.multiply(ones))) + (thetaMinusNegTheta, negTheta.multiply(ones)) +case Multinomial => (null, null) +case Gaussian => (null, null) Review comment: I think causing an error is OK; `thetaMinusNegTheta` and `negThetaSum` are only available in the Bernoulli case and should only be referenced there.
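The Bernoulli precomputation in the diff above can be sketched with plain arrays. This is an illustration only: `BernoulliNbSketch`, `Precomputed`, `precompute`, and `rawPrediction` are hypothetical names, and the arithmetic simply mirrors the quoted lines (`log1p(-exp(theta))`, its row sums, and the `multiply` plus two `axpy` calls from the prediction code).

```scala
object BernoulliNbSketch {
  final case class Precomputed(
      thetaMinusNegTheta: Array[Array[Double]],
      negThetaSum: Array[Double])

  // theta(i)(j) is log P(feature j = 1 | class i), so every entry is negative.
  def precompute(theta: Array[Array[Double]]): Precomputed = {
    // negTheta(i)(j) = log(1 - exp(theta(i)(j))) = log P(feature j = 0 | class i)
    val negTheta = theta.map(_.map(v => math.log1p(-math.exp(v))))
    val diff = theta.zip(negTheta).map { case (t, n) =>
      t.zip(n).map { case (a, b) => a - b }
    }
    Precomputed(diff, negTheta.map(_.sum))
  }

  // raw(i) = pi(i) + negThetaSum(i) + thetaMinusNegTheta(i) . features
  def rawPrediction(
      pi: Array[Double],
      pre: Precomputed,
      features: Array[Double]): Array[Double] =
    pi.indices.map { i =>
      val dot = pre.thetaMinusNegTheta(i).zip(features).map { case (w, x) => w * x }.sum
      pi(i) + pre.negThetaSum(i) + dot
    }.toArray
}
```

Because these two values only make sense for 0/1 features, leaving them unset for the other model types (as the diff does with `null`) means any accidental use fails fast instead of silently producing a number.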
[GitHub] [spark] AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552067319 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
SparkQA commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552067313 **[Test build #113490 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113490/testReport)** for PR 26439 at commit [`8c0be4f`](https://github.com/apache/spark/commit/8c0be4f3d7d0019e412d7a6bbc070a1cd8d03d9d). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552067321 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113490/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552067174 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18381/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552067172 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
SparkQA commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552067093 **[Test build #113490 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113490/testReport)** for PR 26439 at commit [`8c0be4f`](https://github.com/apache/spark/commit/8c0be4f3d7d0019e412d7a6bbc070a1cd8d03d9d).
[GitHub] [spark] zhengruifeng commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
zhengruifeng commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552066730 @huaxingao Yes, the py side should be updated too.
[GitHub] [spark] zhengruifeng commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
zhengruifeng commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552066653 retest this please
[GitHub] [spark] zhengruifeng commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
zhengruifeng commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552066603 @srowen Yes, I updated them in the second commit. BTW, I added a `toString` method for the evaluator, tuning, and feature implementations.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552066390 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113489/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method
SparkQA removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552066168 **[Test build #113489 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113489/testReport)** for PR 26439 at commit [`8c0be4f`](https://github.com/apache/spark/commit/8c0be4f3d7d0019e412d7a6bbc070a1cd8d03d9d).
[GitHub] [spark] AmplabJenkins commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552066381 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113488/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552066380 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552066387 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
SparkQA removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552066170 **[Test build #113488 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113488/testReport)** for PR 26444 at commit [`52171db`](https://github.com/apache/spark/commit/52171dbb7bae102581fb560e19619b89d77ef60a).
[GitHub] [spark] SparkQA commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
SparkQA commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552066376 **[Test build #113488 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113488/testReport)** for PR 26444 at commit [`52171db`](https://github.com/apache/spark/commit/52171dbb7bae102581fb560e19619b89d77ef60a). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
SparkQA commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552066382 **[Test build #113489 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113489/testReport)** for PR 26439 at commit [`8c0be4f`](https://github.com/apache/spark/commit/8c0be4f3d7d0019e412d7a6bbc070a1cd8d03d9d). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552066390 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113489/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552066380 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552066387 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552066314 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18380/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552066313 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552066314 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18380/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552066313 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552066242 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552066242 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552066251 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18379/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552066251 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18379/ Test PASSed.
[GitHub] [spark] zhengruifeng commented on a change in pull request #26439: [SPARK-29801][ML] ML models unify toString method
zhengruifeng commented on a change in pull request #26439: [SPARK-29801][ML] ML models unify toString method
URL: https://github.com/apache/spark/pull/26439#discussion_r344429040

## File path: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala ##

@@ -215,6 +215,13 @@ final class Bucketizer @Since("1.4.0") (@Since("1.4.0") override val uid: String
   override def copy(extra: ParamMap): Bucketizer = {
     defaultCopy[Bucketizer](extra).setParent(parent)
   }
+
+  @Since("3.0.0")
+  override def toString: String = {
+    s"Bucketizer: uid=$uid" +
+      get(inputCols).map(c => s", numInputCols=${c.length}").getOrElse("") +
+      get(outputCols).map(c => s", numOutputCols=${c.length}").getOrElse("")
+  }
 }

Review comment:
I tested this in the REPL:
```
scala> val binarizer: Binarizer = new Binarizer()
binarizer: org.apache.spark.ml.feature.Binarizer = Binarizer: uid=binarizer_3eab2c2d88f6

scala> binarizer.setInputCols(Array("a", "b"))
res3: binarizer.type = Binarizer: uid=binarizer_3eab2c2d88f6, numInputCols=2

scala> binarizer.setOutputCols(Array("c", "d"))
res4: binarizer.type = Binarizer: uid=binarizer_3eab2c2d88f6, numInputCols=2, numOutputCols=2
```
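For readers following the thread without a Spark build at hand, the `toString` pattern in the diff can be sketched in plain Scala. This is only an illustration: `StageSketch` is a hypothetical stand-in, not Spark's `Bucketizer` or `Binarizer`, and the `Option` fields are an assumed substitute for what `get(inputCols)` and `get(outputCols)` return in the patch.

```scala
// Hypothetical stand-in for an ML stage; NOT Spark's Bucketizer/Binarizer.
// Option fields play the role of get(param) in the patch (an assumption).
final case class StageSketch(
    uid: String,
    inputCols: Option[Array[String]] = None,
    outputCols: Option[Array[String]] = None) {

  // Append a ", key=value" fragment only for parameters that are set,
  // so unset parameters leave no trace in the string.
  override def toString: String =
    s"StageSketch: uid=$uid" +
      inputCols.map(c => s", numInputCols=${c.length}").getOrElse("") +
      outputCols.map(c => s", numOutputCols=${c.length}").getOrElse("")
}
```

Each optional fragment is built with `map(...).getOrElse("")`, which is why the REPL output grows by one field each time a setter is called.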
[GitHub] [spark] SparkQA commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
SparkQA commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552066170 **[Test build #113488 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113488/testReport)** for PR 26444 at commit [`52171db`](https://github.com/apache/spark/commit/52171dbb7bae102581fb560e19619b89d77ef60a).
[GitHub] [spark] SparkQA commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
SparkQA commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552066168 **[Test build #113489 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113489/testReport)** for PR 26439 at commit [`8c0be4f`](https://github.com/apache/spark/commit/8c0be4f3d7d0019e412d7a6bbc070a1cd8d03d9d).
[GitHub] [spark] AmplabJenkins commented on issue #26443: [SPARK-29805] [Core] Enable nested schema pruning and nested pruning on expressions by default
AmplabJenkins commented on issue #26443: [SPARK-29805] [Core] Enable nested schema pruning and nested pruning on expressions by default URL: https://github.com/apache/spark/pull/26443#issuecomment-552065947 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26443: [SPARK-29805] [Core] Enable nested schema pruning and nested pruning on expressions by default
AmplabJenkins removed a comment on issue #26443: [SPARK-29805] [Core] Enable nested schema pruning and nested pruning on expressions by default URL: https://github.com/apache/spark/pull/26443#issuecomment-552065949 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113483/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26443: [SPARK-29805] [Core] Enable nested schema pruning and nested pruning on expressions by default
AmplabJenkins removed a comment on issue #26443: [SPARK-29805] [Core] Enable nested schema pruning and nested pruning on expressions by default URL: https://github.com/apache/spark/pull/26443#issuecomment-552065947 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26443: [SPARK-29805] [Core] Enable nested schema pruning and nested pruning on expressions by default
AmplabJenkins commented on issue #26443: [SPARK-29805] [Core] Enable nested schema pruning and nested pruning on expressions by default URL: https://github.com/apache/spark/pull/26443#issuecomment-552065949 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113483/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26443: [SPARK-29805] [Core] Enable nested schema pruning and nested pruning on expressions by default
SparkQA commented on issue #26443: [SPARK-29805] [Core] Enable nested schema pruning and nested pruning on expressions by default URL: https://github.com/apache/spark/pull/26443#issuecomment-552065851 **[Test build #113483 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113483/testReport)** for PR 26443 at commit [`8d4e7d1`](https://github.com/apache/spark/commit/8d4e7d166afb7c11bb94882912e0c54ac7e9ab71). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #26443: [SPARK-29805] [Core] Enable nested schema pruning and nested pruning on expressions by default
SparkQA removed a comment on issue #26443: [SPARK-29805] [Core] Enable nested schema pruning and nested pruning on expressions by default URL: https://github.com/apache/spark/pull/26443#issuecomment-552047992 **[Test build #113483 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113483/testReport)** for PR 26443 at commit [`8d4e7d1`](https://github.com/apache/spark/commit/8d4e7d166afb7c11bb94882912e0c54ac7e9ab71).
[GitHub] [spark] AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552065101 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113487/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
SparkQA removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552064902 **[Test build #113487 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113487/testReport)** for PR 26444 at commit [`660e9f9`](https://github.com/apache/spark/commit/660e9f94cf5ccc7a0f5013904a9661705f0cda69).
[GitHub] [spark] AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552065098 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
SparkQA commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552065097 **[Test build #113487 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113487/testReport)** for PR 26444 at commit [`660e9f9`](https://github.com/apache/spark/commit/660e9f94cf5ccc7a0f5013904a9661705f0cda69). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552065098 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552065101 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113487/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552064968 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552064970 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18378/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins removed a comment on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552064970 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18378/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
AmplabJenkins commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552064968 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
SparkQA commented on issue #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#issuecomment-552064902 **[Test build #113487 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113487/testReport)** for PR 26444 at commit [`660e9f9`](https://github.com/apache/spark/commit/660e9f94cf5ccc7a0f5013904a9661705f0cda69).
[GitHub] [spark] xuanyuanking opened a new pull request #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
xuanyuanking opened a new pull request #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
URL: https://github.com/apache/spark/pull/26444

### What changes were proposed in this pull request?
Rename the config "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled".

### Why are the changes needed?
The relation between "spark.sql.ansi.enabled" and "spark.sql.dialect" is confusing, because the "PostgreSQL" dialect should already contain the features of "spark.sql.ansi.enabled". To make this clearer, we rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled", so that the option applies only to the Spark dialect. For casting and arithmetic operations, runtime exceptions should be thrown if "spark.sql.dialect" is "spark" and "spark.sql.dialect.spark.ansi.enabled" is true, or if "spark.sql.dialect" is "PostgreSQL".

### Does this PR introduce any user-facing change?
Yes, the config name changed.

### How was this patch tested?
Existing UT.
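
The rule the PR description gives for casting and arithmetic operations can be sketched as a small predicate. This is only an illustration of the stated condition: the function `throws_runtime_exception`, its parameter names, and the case-insensitive comparison are assumptions made for this sketch and are not Spark APIs. Only the two config names and the dialect values "spark" and "PostgreSQL" come from the description.

```python
def throws_runtime_exception(dialect: str, spark_ansi_enabled: bool) -> bool:
    """Return True when casting/arithmetic errors should raise at runtime.

    Hypothetical helper mirroring the PR description:
      - dialect stands for the value of "spark.sql.dialect"
      - spark_ansi_enabled stands for "spark.sql.dialect.spark.ansi.enabled"
    """
    name = dialect.strip().lower()  # case-insensitive match is an assumption
    if name == "postgresql":
        # Per the description, the PostgreSQL dialect already includes
        # the ANSI behaviour, so the flag is not consulted.
        return True
    # The renamed flag only matters under the Spark dialect.
    return name == "spark" and spark_ansi_enabled
```

Under this reading, `throws_runtime_exception("PostgreSQL", False)` is still true, which is the point of the rename: the flag is scoped to the Spark dialect and has no effect on the PostgreSQL one.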
[GitHub] [spark] cloud-fan commented on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
cloud-fan commented on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#issuecomment-552063945 We can add an extra `DDLUtils.isHiveProvider` check to make it work.
[GitHub] [spark] kevinyu98 commented on a change in pull request #25573: [SPARK-28833][DOCS][SQL] Document ALTER VIEW command
kevinyu98 commented on a change in pull request #25573: [SPARK-28833][DOCS][SQL] Document ALTER VIEW command
URL: https://github.com/apache/spark/pull/25573#discussion_r344427485

## File path: docs/sql-ref-syntax-ddl-alter-view.md ##
@@ -19,4 +19,78 @@ license: |
 limitations under the License.
 ---
-**This page is under construction**
+### Description
+The `ALTER VIEW` statement changes various auxiliary properties of a view.
+
+
+ Rename view

Review comment: @huaxingao @srowen
[GitHub] [spark] cloud-fan commented on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan
cloud-fan commented on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan URL: https://github.com/apache/spark/pull/26167#issuecomment-552062728 thanks, merging to master!
[GitHub] [spark] cloud-fan closed pull request #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan
cloud-fan closed pull request #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan URL: https://github.com/apache/spark/pull/26167
[GitHub] [spark] AmplabJenkins removed a comment on issue #26434: [SPARK-29544] [SQL] optimize skewed partition based on data size
AmplabJenkins removed a comment on issue #26434: [SPARK-29544] [SQL] optimize skewed partition based on data size URL: https://github.com/apache/spark/pull/26434#issuecomment-552062628 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113484/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26434: [SPARK-29544] [SQL] optimize skewed partition based on data size
AmplabJenkins removed a comment on issue #26434: [SPARK-29544] [SQL] optimize skewed partition based on data size URL: https://github.com/apache/spark/pull/26434#issuecomment-552062627 Merged build finished. Test PASSed.