[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user BenFradet commented on a diff in the pull request:

https://github.com/apache/spark/pull/10219#discussion_r47068305

--- Diff: examples/src/main/scala/org/apache/spark/examples/ml/PolynomialExpansionExample.scala ---
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+// scalastyle:off println
+package org.apache.spark.examples.ml
+
+// $example on$
+import org.apache.spark.ml.feature.PolynomialExpansion
+import org.apache.spark.mllib.linalg.Vectors
+// $example off$
+import org.apache.spark.sql.SQLContext
+import org.apache.spark.{SparkConf, SparkContext}
+
+object PolynomialExpansionExample {
+  def main(args: Array[String]): Unit = {
+    val conf = new SparkConf().setAppName("PolynomialExpansionExample")
+    val sc = new SparkContext(conf)
+    val sqlContext = new SQLContext(sc)
+
+    // $example on$
+    val data = Array(
+      Vectors.dense(-2.0, 2.3),
+      Vectors.dense(0.0, 0.0),
+      Vectors.dense(0.6, -1.1)
+    )
+    val df = sqlContext.createDataFrame(data.map(Tuple1.apply)).toDF("features")
+    val polynomialExpansion = new PolynomialExpansion()
+      .setInputCol("features")
+      .setOutputCol("polyFeatures")
+      .setDegree(3)
+    val polyDF = polynomialExpansion.transform(df)
+    polyDF.select("polyFeatures").take(3).foreach(println)
+    // $example off$
+    sc.stop()
+  }
+}
+// scalastyle:on println
+
+
--- End diff --

Trailing lines

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user BenFradet commented on a diff in the pull request:

https://github.com/apache/spark/pull/10219#discussion_r47068389

--- Diff: examples/src/main/scala/org/apache/spark/examples/ml/PCAExample.scala ---
@@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+// scalastyle:off println
+package org.apache.spark.examples.ml
+
+// $example on$
+import org.apache.spark.ml.feature.PCA
+import org.apache.spark.mllib.linalg.Vectors
+// $example off$
+import org.apache.spark.sql.SQLContext
+import org.apache.spark.{SparkConf, SparkContext}
+
+object PCAExample {
+  def main(args: Array[String]): Unit = {
+    val conf = new SparkConf().setAppName("PCAExample")
+    val sc = new SparkContext(conf)
+    val sqlContext = new SQLContext(sc)
+
+    // $example on$
+    val data = Array(
+      Vectors.sparse(5, Seq((1, 1.0), (3, 7.0))),
+      Vectors.dense(2.0, 0.0, 3.0, 4.0, 5.0),
+      Vectors.dense(4.0, 0.0, 0.0, 6.0, 7.0)
+    )
+    val df = sqlContext.createDataFrame(data.map(Tuple1.apply)).toDF("features")
+    val pca = new PCA()
+      .setInputCol("features")
+      .setOutputCol("pcaFeatures")
+      .setK(3)
+      .fit(df)
+    val pcaDF = pca.transform(df)
+    val result = pcaDF.select("pcaFeatures")
+    result.show()
+    // $example off$
+    sc.stop()
+  }
+}
+// scalastyle:on println
+
--- End diff --

Trailing line
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user BenFradet commented on a diff in the pull request:

https://github.com/apache/spark/pull/10219#discussion_r47068461

--- Diff: examples/src/main/scala/org/apache/spark/examples/ml/OneHotEncoderExample.scala ---
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+// scalastyle:off println
+package org.apache.spark.examples.ml
+
+// $example on$
+import org.apache.spark.ml.feature.{OneHotEncoder, StringIndexer}
+// $example off$
+import org.apache.spark.sql.SQLContext
+import org.apache.spark.{SparkConf, SparkContext}
+
+object OneHotEncoderExample {
+  def main(args: Array[String]): Unit = {
+    val conf = new SparkConf().setAppName("OneHotEncoderExample")
+    val sc = new SparkContext(conf)
+    val sqlContext = new SQLContext(sc)
+
+    // $example on$
+    val df = sqlContext.createDataFrame(Seq(
+      (0, "a"),
+      (1, "b"),
+      (2, "c"),
+      (3, "a"),
+      (4, "a"),
+      (5, "c")
+    )).toDF("id", "category")
+
+    val indexer = new StringIndexer()
+      .setInputCol("category")
+      .setOutputCol("categoryIndex")
+      .fit(df)
+    val indexed = indexer.transform(df)
+
+    val encoder = new OneHotEncoder()
+      .setInputCol("categoryIndex")
+      .setOutputCol("categoryVec")
+    val encoded = encoder.transform(indexed)
+    encoded.select("id", "categoryVec").show()
+    // $example off$
+    sc.stop()
+  }
+}
+// scalastyle:on println
+
--- End diff --

Trailing line
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user BenFradet commented on a diff in the pull request:

https://github.com/apache/spark/pull/10219#discussion_r47068203

--- Diff: examples/src/main/scala/org/apache/spark/examples/ml/StringIndexerExample.scala ---
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+// scalastyle:off println
+package org.apache.spark.examples.ml
+
+// $example on$
+import org.apache.spark.ml.feature.StringIndexer
+// $example off$
+import org.apache.spark.sql.SQLContext
+import org.apache.spark.{SparkConf, SparkContext}
+
+object StringIndexerExample {
+  def main(args: Array[String]): Unit = {
+    val conf = new SparkConf().setAppName("StringIndexerExample")
+    val sc = new SparkContext(conf)
+    val sqlContext = new SQLContext(sc)
+
+    // $example on$
+    val df = sqlContext.createDataFrame(
+      Seq((0, "a"), (1, "b"), (2, "c"), (3, "a"), (4, "a"), (5, "c"))
+    ).toDF("id", "category")
+
+    val indexer = new StringIndexer()
+      .setInputCol("category")
+      .setOutputCol("categoryIndex")
+
+    val indexed = indexer.fit(df).transform(df)
+    indexed.show()
+    // $example off$
+    sc.stop()
+  }
+}
+// scalastyle:on println
+
--- End diff --

Trailing line
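For readers without a Spark installation, the indexing performed by the StringIndexer example quoted above can be sketched in plain Python. This is only an illustration, not Spark's implementation: it relies on the documented behaviour that indices are ordered by label frequency, with the most frequent label getting index 0. The helper name `index_labels` and the alphabetical tie-break are assumptions of this sketch.

```python
from collections import Counter


def index_labels(labels):
    """Map each distinct label to a float index, most frequent label first.

    Hypothetical helper for illustration; ties are broken alphabetically,
    which is an assumption of this sketch rather than documented behaviour.
    """
    counts = Counter(labels)
    ordered = sorted(counts, key=lambda label: (-counts[label], label))
    return {label: float(i) for i, label in enumerate(ordered)}


# Same (id, category) rows as the Scala example in the diff.
rows = [(0, "a"), (1, "b"), (2, "c"), (3, "a"), (4, "a"), (5, "c")]

mapping = index_labels(category for _, category in rows)
indexed = [(row_id, category, mapping[category]) for row_id, category in rows]

for row in indexed:
    print(row)
```

Here "a" occurs three times, "c" twice and "b" once, so the sketch assigns them 0.0, 1.0 and 2.0 respectively.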
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user BenFradet commented on the pull request: https://github.com/apache/spark/pull/10219#issuecomment-163165116 I noticed some formatting quirks, especially in the Scala examples; otherwise it looks good. However, shouldn't we take advantage of this PR to standardize the output of the examples? For example, I think every example should end with a `show()` or `println` so the user can just copy and paste the example and see what it does.
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user yinxusen commented on the pull request: https://github.com/apache/spark/pull/10219#issuecomment-163173174 @BenFradet It's reasonable. I'll modify them now. Thanks for the review.
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user BenFradet commented on a diff in the pull request:

https://github.com/apache/spark/pull/10219#discussion_r47068916

--- Diff: examples/src/main/scala/org/apache/spark/examples/ml/ElementWiseProductExample.scala ---
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+// scalastyle:off println
+package org.apache.spark.examples.ml
+
+// $example on$
+import org.apache.spark.ml.feature.ElementwiseProduct
+import org.apache.spark.mllib.linalg.Vectors
+// $example off$
+import org.apache.spark.sql.SQLContext
+import org.apache.spark.{SparkConf, SparkContext}
+
+object ElementwiseProductExample {
+  def main(args: Array[String]): Unit = {
+    val conf = new SparkConf().setAppName("ElementwiseProductExample")
+    val sc = new SparkContext(conf)
+    val sqlContext = new SQLContext(sc)
+
+    // $example on$
+    // Create some vector data; also works for sparse vectors
+    val dataFrame = sqlContext.createDataFrame(Seq(
+      ("a", Vectors.dense(1.0, 2.0, 3.0)),
+      ("b", Vectors.dense(4.0, 5.0, 6.0)))).toDF("id", "vector")
+
+    val transformingVector = Vectors.dense(0.0, 1.0, 2.0)
+    val transformer = new ElementwiseProduct()
+      .setScalingVec(transformingVector)
+      .setInputCol("vector")
+      .setOutputCol("transformedVector")
+
+    // Batch transform the vectors to create new column:
+    transformer.transform(dataFrame).show()
+    // $example off$
+    sc.stop()
+  }
+}
+// scalastyle:on println
+
--- End diff --

Trailing line
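The arithmetic behind the ElementwiseProduct example quoted above is a component-wise (Hadamard) multiplication of each row vector by the scaling vector. A minimal plain-Python sketch of that arithmetic, using the same data as the Scala diff, might look like the following; the function name `elementwise_product` is hypothetical and nothing here calls Spark.

```python
def elementwise_product(vector, scaling_vec):
    """Multiply two equal-length vectors component by component.

    Hypothetical helper for illustration; it mirrors the arithmetic only.
    """
    if len(vector) != len(scaling_vec):
        raise ValueError("vector and scaling_vec must have the same length")
    return [v * w for v, w in zip(vector, scaling_vec)]


# Same rows and scaling vector as the Scala example in the diff.
rows = [("a", [1.0, 2.0, 3.0]), ("b", [4.0, 5.0, 6.0])]
scaling_vec = [0.0, 1.0, 2.0]

transformed = [
    (row_id, vector, elementwise_product(vector, scaling_vec))
    for row_id, vector in rows
]

for row in transformed:
    print(row)
```

With the scaling vector (0.0, 1.0, 2.0), the first component of every row is zeroed, the second is kept, and the third is doubled.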
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user BenFradet commented on the pull request: https://github.com/apache/spark/pull/10219#issuecomment-163213528 @yinxusen I'll have a look later today
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user yinxusen commented on the pull request: https://github.com/apache/spark/pull/10219#issuecomment-163203243 @BenFradet Does the code look good to you?
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10219#issuecomment-163203400 **[Test build #47427 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47427/consoleFull)** for PR 10219 at commit [`771d015`](https://github.com/apache/spark/commit/771d015000114828ab32e38301acbb50df150f9d).
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10219#issuecomment-163211490

**[Test build #47427 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47427/consoleFull)** for PR 10219 at commit [`771d015`](https://github.com/apache/spark/commit/771d015000114828ab32e38301acbb50df150f9d).

 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `public class JavaBinarizerExample`
   * `public class JavaBucketizerExample`
   * `public class JavaDCTExample`
   * `public class JavaElementwiseProductExample`
   * `public class JavaMinMaxScalerExample`
   * `public class JavaNGramExample`
   * `public class JavaNormalizerExample`
   * `public class JavaOneHotEncoderExample`
   * `public class JavaPCAExample`
   * `public class JavaPolynomialExpansionExample`
   * `public class JavaRFormulaExample`
   * `public class JavaStandardScalerExample`
   * `public class JavaStopWordsRemoverExample`
   * `public class JavaStringIndexerExample`
   * `public class JavaTokenizerExample`
   * `public class JavaVectorAssemblerExample`
   * `public class JavaVectorIndexerExample`
   * `public class JavaVectorSlicerExample`
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10219#issuecomment-163211572 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47427/ Test PASSed.
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10219#issuecomment-163211570 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user BenFradet commented on a diff in the pull request:

https://github.com/apache/spark/pull/10219#discussion_r47150356

--- Diff: examples/src/main/python/ml/polynomial_expansion_example.py ---
@@ -0,0 +1,43 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import print_function
+
+from pyspark import SparkContext
+from pyspark.sql import SQLContext
+# $example on$
+from pyspark.ml.feature import PolynomialExpansion
+from pyspark.mllib.linalg import Vectors
+# $example off$
+
+if __name__ == "__main__":
+    sc = SparkContext(appName="PolynomialExpansionExample")
+    sqlContext = SQLContext(sc)
+
+    # $example on$
+    df = sqlContext\
+        .createDataFrame([(Vectors.dense([-2.0, 2.3]), ),
--- End diff --

nit: space
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user BenFradet commented on a diff in the pull request:

https://github.com/apache/spark/pull/10219#discussion_r47147519

--- Diff: examples/src/main/java/org/apache/spark/examples/ml/JavaBinarizerExample.java ---
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.examples.ml;
+
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+import org.apache.spark.sql.SQLContext;
+
+// $example on$
+import java.util.Arrays;
+
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.ml.feature.Binarizer;
+import org.apache.spark.sql.DataFrame;
+import org.apache.spark.sql.Row;
+import org.apache.spark.sql.RowFactory;
+import org.apache.spark.sql.types.DataTypes;
+import org.apache.spark.sql.types.Metadata;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+// $example off$
+
+public class JavaBinarizerExample {
+  public static void main(String[] args) {
+    SparkConf conf = new SparkConf().setAppName("JavaBinarizerExample");
+    JavaSparkContext jsc = new JavaSparkContext(conf);
+    SQLContext jsql = new SQLContext(jsc);
+
+    // $example on$
+    JavaRDD<Row> jrdd = jsc.parallelize(Arrays.asList(
+      RowFactory.create(0, 0.1),
+      RowFactory.create(1, 0.8),
+      RowFactory.create(2, 0.2)
+    ));
+    StructType schema = new StructType(new StructField[]{
+      new StructField("label", DataTypes.DoubleType, false, Metadata.empty()),
+      new StructField("feature", DataTypes.DoubleType, false, Metadata.empty())
+    });
+    DataFrame continuousDataFrame = jsql.createDataFrame(jrdd, schema);
+    Binarizer binarizer = new Binarizer()
+      .setInputCol("feature")
+      .setOutputCol("binarized_feature")
+      .setThreshold(0.5);
+    DataFrame binarizedDataFrame = binarizer.transform(continuousDataFrame);
+    DataFrame binarizedFeatures = binarizedDataFrame.select("binarized_feature");
+    for (Row r : binarizedFeatures.collect()) {
+      Double binarized_value = r.getDouble(0);
--- End diff --

indent
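The thresholding that the Binarizer example quoted above applies can be sketched in plain Python with the same three rows and the same threshold of 0.5. The sketch assumes the documented rule that values strictly greater than the threshold become 1.0 and all other values become 0.0; the function name `binarize` is hypothetical and no Spark API is involved.

```python
def binarize(value, threshold):
    """Return 1.0 if value is strictly greater than threshold, else 0.0.

    Hypothetical helper for illustration of the thresholding rule only.
    """
    return 1.0 if value > threshold else 0.0


# Same (label, feature) rows and threshold as the Java example in the diff.
rows = [(0, 0.1), (1, 0.8), (2, 0.2)]
threshold = 0.5

binarized = [(label, feature, binarize(feature, threshold)) for label, feature in rows]

for row in binarized:
    print(row)
```

Only the feature 0.8 exceeds the threshold, so it is the only row mapped to 1.0; a value exactly equal to 0.5 would map to 0.0 under this rule.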
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user BenFradet commented on the pull request: https://github.com/apache/spark/pull/10219#issuecomment-163389541 LGTM, except for two minor comments.
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user yinxusen commented on the pull request: https://github.com/apache/spark/pull/10219#issuecomment-163442658 @BenFradet I'll change it in the follow-up PR https://github.com/apache/spark/pull/10193
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/10219#issuecomment-163373814 Merged into master and branch-1.6. Thanks!
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/10219#discussion_r47143681

--- Diff: docs/ml-features.md ---
@@ -794,39 +411,7 @@ dctDf.select("featuresDCT").show(3)
 Refer to the [DCT Java docs](api/java/org/apache/spark/ml/feature/DCT.html) for more details on the API.
-{% highlight java %}
-import java.util.Arrays;
-
-import org.apache.spark.api.java.JavaRDD;
-import org.apache.spark.api.java.JavaSparkContext;
-import org.apache.spark.ml.feature.DCT;
-import org.apache.spark.mllib.linalg.Vector;
-import org.apache.spark.mllib.linalg.VectorUDT;
-import org.apache.spark.mllib.linalg.Vectors;
-import org.apache.spark.sql.DataFrame;
-import org.apache.spark.sql.Row;
-import org.apache.spark.sql.RowFactory;
-import org.apache.spark.sql.SQLContext;
-import org.apache.spark.sql.types.Metadata;
-import org.apache.spark.sql.types.StructField;
-import org.apache.spark.sql.types.StructType;
-
-JavaRDD<Row> data = jsc.parallelize(Arrays.asList(
-  RowFactory.create(Vectors.dense(0.0, 1.0, -2.0, 3.0)),
-  RowFactory.create(Vectors.dense(-1.0, 2.0, 4.0, -7.0)),
-  RowFactory.create(Vectors.dense(14.0, -2.0, -5.0, 1.0))
-));
-StructType schema = new StructType(new StructField[] {
-  new StructField("features", new VectorUDT(), false, Metadata.empty()),
-});
-DataFrame df = jsql.createDataFrame(data, schema);
-DCT dct = new DCT()
-  .setInputCol("features")
-  .setOutputCol("featuresDCT")
-  .setInverse(false);
-DataFrame dctDf = dct.transform(df);
-dctDf.select("featuresDCT").show(3);
-{% endhighlight %}
+{% include_example java/org/apache/spark/examples/ml/JavaDCTExample.java %}}
--- End diff --

Please remove the extra `}` at the end.
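The example files in this pull request wrap the lines intended for the documentation in `$example on$` and `$example off$` comments, and the `include_example` tag in the diff above points the docs at such a file. The following plain-Python sketch shows one way those marked regions could be pulled out of a source file. It assumes that only the lines between the markers are kept; the function name `extract_example` is hypothetical, and this is not the code of the actual docs plugin, whose behaviour is not shown in this thread.

```python
def extract_example(source):
    """Return the lines found between '$example on$' and '$example off$' markers.

    Hypothetical helper; marker lines themselves are dropped, and several
    marked regions in one file are concatenated in order.
    """
    kept = []
    inside = False
    for line in source.splitlines():
        if "$example on$" in line:
            inside = True
        elif "$example off$" in line:
            inside = False
        elif inside:
            kept.append(line)
    return "\n".join(kept)


# A shortened stand-in for an example file, not the real JavaDCTExample.java.
sample = """\
import org.apache.spark.SparkConf;
// $example on$
import org.apache.spark.ml.feature.DCT;
// $example off$
SparkConf conf = new SparkConf();
// $example on$
DCT dct = new DCT();
// $example off$
jsc.stop();
"""

snippet = extract_example(sample)
print(snippet)
```

In this sketch the setup and teardown lines outside the markers are left out of the snippet, which matches how the examples in this PR keep `SparkConf` creation and `sc.stop()` outside the marked regions.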
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10219
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10219#issuecomment-163135226 **[Test build #47413 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47413/consoleFull)** for PR 10219 at commit [`8748a88`](https://github.com/apache/spark/commit/8748a888df8d17bccc03f6c178641e04242ec157).
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10219#issuecomment-163137492

**[Test build #47413 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47413/consoleFull)** for PR 10219 at commit [`8748a88`](https://github.com/apache/spark/commit/8748a888df8d17bccc03f6c178641e04242ec157).

 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `public class JavaBinarizerExample`
   * `public class JavaBucketizerExample`
   * `public class JavaDCTExample`
   * `public class JavaElementwiseProductExample`
   * `public class JavaMinMaxScalerExample`
   * `public class JavaNGramExample`
   * `public class JavaNormalizerExample`
   * `public class JavaOneHotEncoderExample`
   * `public class JavaPCAExample`
   * `public class JavaPolynomialExpansionExample`
   * `public class JavaRFormulaExample`
   * `public class JavaStandardScalerExample`
   * `public class JavaStopWordsRemoverExample`
   * `public class JavaStringIndexerExample`
   * `public class JavaTokenizerExample`
   * `public class JavaVectorAssemblerExample`
   * `public class JavaVectorIndexerExample`
   * `public class JavaVectorSlicerExample`
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10219#issuecomment-163137574 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47413/ Test PASSed.
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user yinxusen commented on the pull request: https://github.com/apache/spark/pull/10219#issuecomment-163138018 Ping @mengxr, this is for SPARK-11551. Please sign it off if it looks good to you.
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10219#issuecomment-163137570 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
GitHub user yinxusen opened a pull request: https://github.com/apache/spark/pull/10219

[SPARK-11551][DOC] Replace example code in ml-features.md using include_example

PR on behalf of @somideshmukh, thanks!

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yinxusen/spark SPARK-11551

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10219.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #10219

commit d14f55d8e842519b81423348e6656803b4c130fe
Author: somideshmukh
Date: 2015-11-26T09:13:58Z

    [SPARK-11551][DOC][Example]Replace example code in ml-features.md using include_example

commit 12b1cf33a1846250458f3093b7bf7b7826f5
Author: somideshmukh
Date: 2015-11-26T10:21:05Z

    [SPARK-11551][DOC][Example]Replace example code in ml-features.md using include_example

commit 87e673eff13799027abb6f9835223c2e3791644e
Author: Xusen Yin
Date: 2015-11-27T05:06:52Z

    fix java code issues

commit 0e19113bb4882c48bd0344cd480270ef054c9708
Author: Xusen Yin
Date: 2015-11-27T05:52:13Z

    fix scala issues

commit f6a975eaf1b6584325a1c94d99fc25bffdf1bad9
Author: Xusen Yin
Date: 2015-11-27T06:11:09Z

    add java vectorindexer, standardscaler, normalizer

commit dd1d2c12d5d7e65332c955bd63127a8b59f74502
Author: Xusen Yin
Date: 2015-11-27T06:16:29Z

    add jsc stop

commit 3d1efc3661719de9a253f862473cf9a7ede60139
Author: Xusen Yin
Date: 2015-11-27T06:26:22Z

    fix scala issues

commit c23bab4beb47fd604c153ce5c94c563eaf36361c
Author: Xusen Yin
Date: 2015-11-27T08:02:57Z

    add python examples

commit c143d4b2e35275f72ef7b6e5f73ef8cfcceddc4a
Author: somideshmukh
Date: 2015-11-28T11:59:59Z

    Merge pull request #1 from yinxusen/SomilBranch1.33 review result

commit b688b4d4055bee4e52bcfe1adf4991a60b6e55de
Author: somideshmukh
Date: 2015-12-01T09:50:53Z

    [SPARK-11551][DOC][Example]Replace example code in ml-features.md using include_example

commit 8a0d88332f39e44365c7cbe3fdb9fac251251d85
Author: Xusen Yin
Date: 2015-12-01T14:53:15Z

    fix minor issues

commit bed2192d58c1bce968f3aa4f191e739972dad7e6
Author: Xusen Yin
Date: 2015-12-01T15:08:46Z

    merge with master

commit e31fb4a9434fa9e5e4ce19900c2a98b24626032d
Author: Xusen Yin
Date: 2015-12-08T11:05:55Z

    fix python style

commit 8748a888df8d17bccc03f6c178641e04242ec157
Author: Xusen Yin
Date: 2015-12-09T06:46:51Z

    merge with master
[GitHub] spark pull request: [SPARK-11551][DOC] Replace example code in ml-...
Github user yinxusen commented on the pull request: https://github.com/apache/spark/pull/10219#issuecomment-163133018 ok to test