[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-08 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-163135345
  
@yinxusen Well, I'm probably too ignorant to review #10219 :) I'd ping 
@mengxr to sign it off.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-08 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-163133717
  
@somideshmukh Never mind, I add a new one on behalf of you: 
https://github.com/apache/spark/pull/10219 @liancheng if it looks good to you, 
please help me merge it after the Jenkins test. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-08 Thread somideshmukh
Github user somideshmukh commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-163128740
  
Please let me know what needs to be done,I have not worked on python code.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-08 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-163127889
  
Sure, I will do it


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-08 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-163126528
  
@yinxusen Maybe you can fork this PR branch and get it merged. You may add 
a note in your PR description to remind committers to attribute your PR to 
@somideshmukh. We can specify primary author while merging a PR using our merge 
script.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-08 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-163123743
  
@somideshmukh Do you still have time work on this? Pleas let me know ASAP 
thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-08 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-162854335
  
Hit some network issue, just reverted this PR from master and branch-1.6 in 
#10200.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-08 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-162852362
  
@somideshmukh I give you another PR to fix the Python style issue: 
https://github.com/somideshmukh/spark/pull/3. After merging it, we can call 
jenkins testing once here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-08 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-162848273
  
Sorry for that, I will help fixing them.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-08 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-162842512
  
Reverting this one to bring back Spark builds.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-08 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-162841634
  
Looks like this broke the Python style tests:

```
PEP8 checks failed.
./examples/src/main/python/ml/binarizer_example.py:41:4: E114 indentation 
is not a multiple of four (comment)
./examples/src/main/python/ml/onehot_encoder_example.py:39:1: W293 blank 
line contains whitespace
./examples/src/main/python/ml/pca_example.py:33:9: E128 continuation line 
under-indented for visual indent
./examples/src/main/python/ml/pca_example.py:35:41: E231 missing whitespace 
after ','
./examples/src/main/python/ml/polynomial_expansion_example.py:34:9: E128 
continuation line under-indented for visual indent
./examples/src/main/python/ml/polynomial_expansion_example.py:35:9: E128 
continuation line under-indented for visual indent
```

I guess that Jenkins never tested this PR?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/10002


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-07 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-162799717
  
Merged into master and branch-1.6. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-07 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-162799563
  
@mengxr Yes, I'll submit a follow-up pr to refine it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-07 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/10002#discussion_r46921864
  
--- Diff: docs/ml-features.md ---
@@ -794,39 +411,7 @@ dctDf.select("featuresDCT").show(3)
 Refer to the [DCT Java docs](api/java/org/apache/spark/ml/feature/DCT.html)
 for more details on the API.
 
-{% highlight java %}
-import java.util.Arrays;
-
-import org.apache.spark.api.java.JavaRDD;
-import org.apache.spark.api.java.JavaSparkContext;
-import org.apache.spark.ml.feature.DCT;
-import org.apache.spark.mllib.linalg.Vector;
-import org.apache.spark.mllib.linalg.VectorUDT;
-import org.apache.spark.mllib.linalg.Vectors;
-import org.apache.spark.sql.DataFrame;
-import org.apache.spark.sql.Row;
-import org.apache.spark.sql.RowFactory;
-import org.apache.spark.sql.SQLContext;
-import org.apache.spark.sql.types.Metadata;
-import org.apache.spark.sql.types.StructField;
-import org.apache.spark.sql.types.StructType;
-
-JavaRDD data = jsc.parallelize(Arrays.asList(
-  RowFactory.create(Vectors.dense(0.0, 1.0, -2.0, 3.0)),
-  RowFactory.create(Vectors.dense(-1.0, 2.0, 4.0, -7.0)),
-  RowFactory.create(Vectors.dense(14.0, -2.0, -5.0, 1.0))
-));
-StructType schema = new StructType(new StructField[] {
-  new StructField("features", new VectorUDT(), false, Metadata.empty()),
-});
-DataFrame df = jsql.createDataFrame(data, schema);
-DCT dct = new DCT()
-  .setInputCol("features")
-  .setOutputCol("featuresDCT")
-  .setInverse(false);
-DataFrame dctDf = dct.transform(df);
-dctDf.select("featuresDCT").show(3);
-{% endhighlight %}
+{% include_example java/org/apache/spark/examples/ml/JavaDCTExample.java 
%}}
--- End diff --

minor: remove extra `{` at the end. You can submit a follow-up PR to fix it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-07 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-162746850
  
@mengxr Shall we merge this PR? I find there are also some PRs modify 
ml-features.md, so I think it's better to merge it before new conflicts 
emerging.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-01 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-161209723
  
@mengxr LGTM 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-01 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-160997037
  
@somideshmukh I have already fix all issues and merged with master. You can 
accept my pull request here https://github.com/somideshmukh/spark/pull/2. Even 
though it looks very large, it can be small when you merging it. I believe we 
can merge the PR after that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-01 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-160924644
  
@somideshmukh I'll help you change the python issues, then I will send 
another PR to you soon.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-01 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-160920253
  
@somideshmukh the changes are here as you can see, but your branch still 
needs a rebase.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-01 Thread somideshmukh
Github user somideshmukh commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-160920099
  
Pls check it and let me know whether  you have got the changes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-12-01 Thread somideshmukh
Github user somideshmukh commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-160919838
  
I have made the changes you have mentioned in the given Java and Scala 
files and push the changes in Branch "SomilBranch1.33". I haven't modified 
Python files since I haven't work on that files and I donot have python 
knowledge.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-30 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-160832506
  
Ping @somideshmukh, hope we can get this finished in v1.6. Could you help 
me fixing those bugs I pointed?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-160310120
  
Hi @somideshmukh I have pointed all errors that I found. After you fixing 
these errors, we can handle the conflict thing. Then we can get this PR merged.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10002#discussion_r46080753
  
--- Diff: examples/src/main/python/ml/normalizer_example.py ---
@@ -0,0 +1,42 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import print_function
+
+from pyspark import SparkContext
+from pyspark.sql import SQLContext
+# $example on$
+from pyspark.ml.feature import Normalizer
+# $example off$
+
+if __name__ == "__main__":
+sc = SparkContext(appName="NormalizerExample")
+sqlContext = SQLContext(sc)
+
+# $example on$
+dataFrame = sqlContext.read.format("libsvm")
+.load("data/mllib/sample_libsvm_data.txt")
--- End diff --

the indention is not right. merge this line with the above one, i.e. 
`dataFrame = 
sqlContext.read.format("libsvm").load("data/mllib/sample_libsvm_data.txt")`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10002#discussion_r46080743
  
--- Diff: examples/src/main/python/ml/onehot_encoder_example.py ---
@@ -0,0 +1,47 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import print_function
+
+from pyspark import SparkContext
+from pyspark.sql import SQLContext
+# $example on$
+from pyspark.ml.feature import OneHotEncoder, StringIndexer
+# $example off$
+
+if __name__ == "__main__":
+sc = SparkContext(appName="OneHotEncoderExample")
+sqlContext = SQLContext(sc)
+
+# $example on$
+df = sqlContext.createDataFrame([
+(0, "a"),
+(1, "b"),
+(2, "c"),
+(3, "a"),
+(4, "a"),
+(5, "c")
+], ["id", "category"])
+
+stringIndexer = StringIndexer(inputCol="category", 
outputCol="categoryIndex")
+model = stringIndexer.fit(df)
+indexed = model.transform(df)
+encoder = OneHotEncoder(includeFirst=False, inputCol="categoryIndex", 
outputCol="categoryVec")
--- End diff --

change the `includeFirst=False` to `dropLast=False`, because the former has 
already been removed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10002#discussion_r46080696
  
--- Diff: 
examples/src/main/scala/org/apache/spark/examples/ml/VectorSlicerExample.scala 
---
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+// scalastyle:off println
+package org.apache.spark.examples.ml
+
+// $example on$
+import org.apache.spark.ml.attribute.{Attribute, AttributeGroup, 
NumericAttribute}
+import org.apache.spark.ml.feature.VectorSlicer
--- End diff --

add `import org.apache.spark.mllib.linalg.Vectors` here


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10002#discussion_r46080689
  
--- Diff: 
examples/src/main/scala/org/apache/spark/examples/ml/VectorSlicerExample.scala 
---
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+// scalastyle:off println
+package org.apache.spark.examples.ml
+
+// $example on$
+import org.apache.spark.ml.attribute.{Attribute, AttributeGroup, 
NumericAttribute}
+import org.apache.spark.ml.feature.VectorSlicer
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.types.StructType
+// $example off$
+import org.apache.spark.sql.SQLContext
+import org.apache.spark.{SparkConf, SparkContext}
+
+object VectorSlicerExample {
+  def main(args: Array[String]): Unit = {
+val conf = new SparkConf().setAppName("VectorSlicerExample")
+val sc = new SparkContext(conf)
+val sqlContext = new SQLContext(sc)
+
+// $example on$
+val data = Array(Row(-2.0, 2.3, 0.0))
--- End diff --

change this line into `val data = Array(Row(Vectors.dense(-2.0, 2.3, 0.0)))`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10002#discussion_r46080515
  
--- Diff: 
examples/src/main/java/org/apache/spark/examples/ml/JavaTokenizerExample.java 
---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.examples.ml;
+
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+import org.apache.spark.sql.SQLContext;
+
+// $example on$
+import java.util.Arrays;
+
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.ml.feature.RegexTokenizer;
+import org.apache.spark.ml.feature.Tokenizer;
+import org.apache.spark.sql.DataFrame;
+import org.apache.spark.sql.Row;
+import org.apache.spark.sql.RowFactory;
+import org.apache.spark.sql.types.DataTypes;
+import org.apache.spark.sql.types.Metadata;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+// $example off$
+
+public class JavaTokenizerExample {
+  public static void main(String[] args) {
+SparkConf conf = new SparkConf().setAppName("JavaTokenizerExample");
+JavaSparkContext jsc = new JavaSparkContext(conf);
+SQLContext sqlContext = new SQLContext(jsc);
+
+// $example on$
+JavaRDD jrdd = jsc.parallelize(Arrays.asList(
+  RowFactory.create(0, "Hi I heard about Spark"),
+  RowFactory.create(1, "I wish Java could use case classes"),
+  RowFactory.create(2, "Logistic,regression,models,are,neat")
+));
+
+StructType schema = new StructType(new StructField[]{
+  new StructField("label", DataTypes.DoubleType, false, 
Metadata.empty()),
--- End diff --

Change the `DataTypes.DoubleType` into `DataTypes.IntegerType`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10002#discussion_r46080493
  
--- Diff: 
examples/src/main/java/org/apache/spark/examples/ml/JavaStringIndexerExample.java
 ---
@@ -0,0 +1,66 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.examples.ml;
+
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+import org.apache.spark.sql.SQLContext;
+
+// $example on$
+import java.util.Arrays;
+
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.ml.feature.StringIndexer;
+import org.apache.spark.sql.DataFrame;
+import org.apache.spark.sql.Row;
+import org.apache.spark.sql.RowFactory;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+
+import static org.apache.spark.sql.types.DataTypes.*;
+// $example off$
+
+public class JavaStringIndexerExample {
+  public static void main(String[] args) {
+SparkConf conf = new 
SparkConf().setAppName("JavaStringIndexerExample");
+JavaSparkContext jsc = new JavaSparkContext(conf);
+SQLContext sqlContext = new SQLContext(jsc);
+
+// $example on$
+JavaRDD jrdd = jsc.parallelize(Arrays.asList(
+  RowFactory.create(0, "a"),
+  RowFactory.create(1, "b"),
+  RowFactory.create(2, "c"),
+  RowFactory.create(3, "a"),
+  RowFactory.create(4, "a"),
+  RowFactory.create(5, "c")
+));
+StructType schema = new StructType(new StructField[]{
+  createStructField("id", DoubleType, false),
--- End diff --

change the `DoubleType` into `IntegerType`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10002#discussion_r46080461
  
--- Diff: 
examples/src/main/java/org/apache/spark/examples/ml/JavaElementwiseProductExample.java
 ---
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.examples.ml;
+
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+import org.apache.spark.sql.SQLContext;
+
+// $example on$
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.ml.feature.ElementwiseProduct;
+import org.apache.spark.mllib.linalg.Vector;
--- End diff --

add `import org.apache.spark.mllib.linalg.VectorUDT;` here


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10002#discussion_r46080449
  
--- Diff: 
examples/src/main/java/org/apache/spark/examples/ml/JavaElementwiseProductExample.java
 ---
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.examples.ml;
+
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
--- End diff --

add `import org.apache.spark.mllib.linalg.VectorUDT;` here


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10002#discussion_r46080443
  
--- Diff: 
examples/src/main/java/org/apache/spark/examples/ml/JavaElementwiseProductExample.java
 ---
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.examples.ml;
+
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+import org.apache.spark.sql.SQLContext;
+
+// $example on$
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.ml.feature.ElementwiseProduct;
+import org.apache.spark.mllib.linalg.Vector;
+import org.apache.spark.mllib.linalg.Vectors;
+import org.apache.spark.sql.DataFrame;
+import org.apache.spark.sql.Row;
+import org.apache.spark.sql.RowFactory;
+import org.apache.spark.sql.types.DataTypes;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+// $example off$
+
+public class JavaElementwiseProductExample {
+  public static void main(String[] args) {
+SparkConf conf = new 
SparkConf().setAppName("JavaElementwiseProductExample");
+JavaSparkContext jsc = new JavaSparkContext(conf);
+SQLContext sqlContext = new SQLContext(jsc);
+
+// $example on$
+// Create some vector data; also works for sparse vectors
+JavaRDD jrdd = jsc.parallelize(Arrays.asList(
+  RowFactory.create("a", Vectors.dense(1.0, 2.0, 3.0)),
+  RowFactory.create("b", Vectors.dense(4.0, 5.0, 6.0))
+));
+
+List fields = new ArrayList(2);
+fields.add(DataTypes.createStructField("id", DataTypes.StringType, 
false));
+fields.add(DataTypes.createStructField("vector", DataTypes.StringType, 
false));
--- End diff --

change the `DataTypes.StringType` into `new VectorUDT()`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10002#discussion_r46080247
  
--- Diff: docs/ml-features.md ---
@@ -1508,25 +737,7 @@ This example below demonstrates how to transform 
vectors using a transforming ve
 Refer to the [ElementwiseProduct Scala 
docs](api/scala/index.html#org.apache.spark.ml.feature.ElementwiseProduct)
 for more details on the API.
 
-{% highlight scala %}
-import org.apache.spark.ml.feature.ElementwiseProduct
-import org.apache.spark.mllib.linalg.Vectors
-
-// Create some vector data; also works for sparse vectors
-val dataFrame = sqlContext.createDataFrame(Seq(
-  ("a", Vectors.dense(1.0, 2.0, 3.0)),
-  ("b", Vectors.dense(4.0, 5.0, 6.0.toDF("id", "vector")
-
-val transformingVector = Vectors.dense(0.0, 1.0, 2.0)
-val transformer = new ElementwiseProduct()
-  .setScalingVec(transformingVector)
-  .setInputCol("vector")
-  .setOutputCol("transformedVector")
-
-// Batch transform the vectors to create new column:
-transformer.transform(dataFrame).show()
-
-{% endhighlight %}
+{% include_example 
scala/org/apache/spark/examples/ml/ElementWiseProductExample.scala %}
--- End diff --

Change the file name into `ElementwiseProductExample.scala`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10002#discussion_r46080243
  
--- Diff: 
examples/src/main/scala/org/apache/spark/examples/ml/ElementWiseProductExample.scala
 ---
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+// scalastyle:off println
+package org.apache.spark.examples.ml
+
+// $example on$
+import org.apache.spark.ml.feature.ElementwiseProduct
+import org.apache.spark.mllib.linalg.Vectors
+// $example off$
+import org.apache.spark.sql.SQLContext
+import org.apache.spark.{SparkConf, SparkContext}
+
+object ElementwiseProductExample {
--- End diff --

Change the file name into `ElementwiseProductExample.scala`, i.e. lower 
case the `w`, making it consistency with the class name. Change the file name 
in `ml-features.md` accordingly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10002#discussion_r46080162
  
--- Diff: 
examples/src/main/java/org/apache/spark/examples/ml/JavaVectorIndexerExample.java
 ---
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.examples.ml;
+
+// $example on$
+import java.util.Map;
+
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+import org.apache.spark.ml.feature.VectorIndexer;
+import org.apache.spark.ml.feature.VectorIndexerModel;
+import org.apache.spark.sql.DataFrame;
+import org.apache.spark.sql.SQLContext;
+// $example off$
+
+public class JavaVectorIndexerExample {
+  public static void main(String[] args) {
+SparkConf conf = new 
SparkConf().setAppName("JavaVectorIndexerExample");
+JavaSparkContext jsc = new JavaSparkContext(conf);
+SQLContext jsql = new SQLContext(jsc);
+
+// $example on$
+DataFrame data = 
jsql.read().format("libsvm").load("data/mllib/sample_libsvm_data.txt");
+
+VectorIndexer indexer = new VectorIndexer()
+  .setInputCol("features")
+  .setOutputCol("indexed")
+  .setMaxCategories(10);
+
+VectorIndexerModel indexerModel = indexer.fit(data);
+
+Map> categoryMaps = 
indexerModel.javaCategoryMaps();
+System.out.print("Chose " + categoryMaps.size() + "categorical 
features:");
--- End diff --

add a white space here  `" categorical features:"`, otherwise the output is 
like this `Chose 351categorical features: ...`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-160292101
  
I think the conflict is due to some changes in `ml-features.md`, but we can 
leave it for now. After we fix all errors we can fix the conflict.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-160292049
  
@somideshmukh This is really a bunch of code. I check some errors as 
described above for my carelessness and sorry for that. And after fixing those 
errors, I can get it run in my computer. I'll also check the example codes one 
by one to make sure every code example is runnable without errors.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10002#discussion_r46079612
  
--- Diff: 
examples/src/main/java/org/apache/spark/examples/ml/JavaOneHotEncoderExample.java
 ---
@@ -0,0 +1,77 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.examples.ml;
+
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+import org.apache.spark.sql.SQLContext;
+
+// $example on$
+import java.util.Arrays;
+
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.ml.feature.OneHotEncoder;
+import org.apache.spark.ml.feature.StringIndexer;
+import org.apache.spark.ml.feature.StringIndexerModel;
+import org.apache.spark.sql.DataFrame;
+import org.apache.spark.sql.Row;
+import org.apache.spark.sql.RowFactory;
+import org.apache.spark.sql.types.DataTypes;
+import org.apache.spark.sql.types.Metadata;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+// $example off
--- End diff --

change the line into `// $example off$`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10002#discussion_r46079600
  
--- Diff: 
examples/src/main/java/org/apache/spark/examples/ml/JavaPCAExample.java ---
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.examples.ml;
+
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+import org.apache.spark.sql.SQLContext;
+
+// $example on$
+import java.util.Arrays;
+
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.ml.feature.PCA;
+import org.apache.spark.ml.feature.PCAModel;
+import org.apache.spark.mllib.linalg.VectorUDT;
+import org.apache.spark.mllib.linalg.Vectors;
+import org.apache.spark.sql.DataFrame;
+import org.apache.spark.sql.Row;
+import org.apache.spark.sql.RowFactory;
+import org.apache.spark.sql.types.Metadata;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+// $example off
--- End diff --

change it to `// $example off$`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10002#discussion_r46079594
  
--- Diff: 
examples/src/main/java/org/apache/spark/examples/ml/JavaStopWordsRemover.java 
---
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.examples.ml;
+
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+import org.apache.spark.sql.SQLContext;
+
+// $example on$
+import java.util.Arrays;
+
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.ml.feature.StopWordsRemover;
+import org.apache.spark.sql.DataFrame;
+import org.apache.spark.sql.Row;
+import org.apache.spark.sql.RowFactory;
+import org.apache.spark.sql.types.DataTypes;
+import org.apache.spark.sql.types.Metadata;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+// $example off$
+
+public class JavaStopWordsRemover {
+
+  public static void main(String[] args) {
+SparkConf conf = new SparkConf().setAppName("JavaStopWordsRemover");
--- End diff --

change the app name into `JavaStopWordsRemoverExample`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10002#discussion_r46079591
  
--- Diff: 
examples/src/main/java/org/apache/spark/examples/ml/JavaStopWordsRemover.java 
---
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.examples.ml;
+
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+import org.apache.spark.sql.SQLContext;
+
+// $example on$
+import java.util.Arrays;
+
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.ml.feature.StopWordsRemover;
+import org.apache.spark.sql.DataFrame;
+import org.apache.spark.sql.Row;
+import org.apache.spark.sql.RowFactory;
+import org.apache.spark.sql.types.DataTypes;
+import org.apache.spark.sql.types.Metadata;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+// $example off$
+
+public class JavaStopWordsRemover {
--- End diff --

change the class name to `JavaStopWordsRemoverExample`, also change the 
file name into `JavaStopWordsRemoverExample.java`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10002#discussion_r46079574
  
--- Diff: 
examples/src/main/java/org/apache/spark/examples/ml/JavaTokenizerExample.java 
---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.examples.ml;
+
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+import org.apache.spark.sql.SQLContext;
+
+// $example on$
+import java.util.Arrays;
+
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.ml.feature.RegexTokenizer;
+import org.apache.spark.ml.feature.Tokenizer;
+import org.apache.spark.sql.DataFrame;
+import org.apache.spark.sql.Row;
+import org.apache.spark.sql.RowFactory;
+import org.apache.spark.sql.types.DataTypes;
+import org.apache.spark.sql.types.Metadata;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+// $example off$
+
+public class JavaTokenizerExample {
+  public static void main(String[] args) {
+SparkConf conf = new SparkConf().setAppName("JavaTokenizerExample");
+JavaSparkContext jsc = new JavaSparkContext(conf);
+SQLContext sqlContext = new SQLContext(jsc);
+
+// $example on$
+JavaRDD jrdd = jsc.parallelize(Arrays.asList(
+  RowFactory.create(0, "Hi I heard about Spark"),
+  RowFactory.create(1, "I wish Java could use case classes"),
+  RowFactory.create(2, "Logistic,regression,models,are,neat")
+));
+
+StructType schema = new StructType(new StructField[]{
+  new StructField("label", DataTypes.DoubleType, false, 
Metadata.empty()),
+  new StructField("sentence", DataTypes.StringType, false, 
Metadata.empty())
+});
+
+DataFrame sentenceDataFrame = sqlContext.createDataFrame(jrdd, schema);
+
+Tokenizer tokenizer = new 
Tokenizer().setInputCol("sentence").setOutputCol("words");
+
+DataFrame wordsDataFrame = tokenizer.transform(sentenceDataFrame);
+for (Row r : wordsDataFrame.select("words", "label"). take(3)) {
+  java.util.List words = r.getList(0);
+  for (String word : words) System.out.print(word + " ");
+  System.out.println();
+}
+
+RegexTokenizer regexTokenizer = new RegexTokenizer()
+  .setInputCol("sentence")
+  .setOutputCol("words")
+  .setPattern("\\W");  // alternatively 
.setPattern("\\w+").setGaps(false);
+// example off
--- End diff --

change the line into `// $example off$`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10002#discussion_r46079548
  
--- Diff: docs/ml-features.md ---
@@ -1800,40 +887,7 @@ println(output.select("userFeatures", 
"features").first())
 Refer to the [VectorSlicer Java 
docs](api/java/org/apache/spark/ml/feature/VectorSlicer.html)
 for more details on the API.
 
-{% highlight java %}
-import java.util.Arrays;
-
-import org.apache.spark.api.java.JavaRDD;
-import org.apache.spark.mllib.linalg.Vectors;
-import org.apache.spark.sql.DataFrame;
-import org.apache.spark.sql.Row;
-import org.apache.spark.sql.RowFactory;
-import org.apache.spark.sql.types.*;
-import static org.apache.spark.sql.types.DataTypes.*;
-
-Attribute[] attrs = new Attribute[]{
-  NumericAttribute.defaultAttr().withName("f1"),
-  NumericAttribute.defaultAttr().withName("f2"),
-  NumericAttribute.defaultAttr().withName("f3")
-};
-AttributeGroup group = new AttributeGroup("userFeatures", attrs);
-
-JavaRDD jrdd = jsc.parallelize(Lists.newArrayList(
-  RowFactory.create(Vectors.sparse(3, new int[]{0, 1}, new double[]{-2.0, 
2.3})),
-  RowFactory.create(Vectors.dense(-2.0, 2.3, 0.0))
-));
-
-DataFrame dataset = jsql.createDataFrame(jrdd, (new 
StructType()).add(group.toStructField()));
-
-VectorSlicer vectorSlicer = new VectorSlicer()
-  .setInputCol("userFeatures").setOutputCol("features");
-
-vectorSlicer.setIndices(new int[]{1}).setNames(new String[]{"f3"});
-// or slicer.setIndices(new int[]{1, 2}), or slicer.setNames(new 
String[]{"f2", "f3"})
-
-DataFrame output = vectorSlicer.transform(dataset);
-
-System.out.println(output.select("userFeatures", "features").first());
+{% include_example 
java/org/apache/spark/examples/ml/JavaVectorSlicerExample.java %}
 {% endhighlight %}
--- End diff --

remove this line


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10002#discussion_r46079541
  
--- Diff: docs/ml-features.md ---
@@ -408,37 +278,7 @@ 
ngramDataFrame.take(3).map(_.getAs[Stream[String]]("ngrams").toList).foreach(pri
 Refer to the [NGram Java 
docs](api/java/org/apache/spark/ml/feature/NGram.html)
 for more details on the API.
 
-{% highlight java %}
-import java.util.Arrays;
-
-import org.apache.spark.api.java.JavaRDD;
-import org.apache.spark.ml.feature.NGram;
-import org.apache.spark.mllib.linalg.Vector;
-import org.apache.spark.sql.DataFrame;
-import org.apache.spark.sql.Row;
-import org.apache.spark.sql.RowFactory;
-import org.apache.spark.sql.types.DataTypes;
-import org.apache.spark.sql.types.Metadata;
-import org.apache.spark.sql.types.StructField;
-import org.apache.spark.sql.types.StructType;
-
-JavaRDD jrdd = jsc.parallelize(Arrays.asList(
-  RowFactory.create(0.0, Arrays.asList("Hi", "I", "heard", "about", 
"Spark")),
-  RowFactory.create(1.0, Arrays.asList("I", "wish", "Java", "could", 
"use", "case", "classes")),
-  RowFactory.create(2.0, Arrays.asList("Logistic", "regression", "models", 
"are", "neat"))
-));
-StructType schema = new StructType(new StructField[]{
-  new StructField("label", DataTypes.DoubleType, false, Metadata.empty()),
-  new StructField("words", 
DataTypes.createArrayType(DataTypes.StringType), false, Metadata.empty())
-});
-DataFrame wordDataFrame = sqlContext.createDataFrame(jrdd, schema);
-NGram ngramTransformer = new 
NGram().setInputCol("words").setOutputCol("ngrams");
-DataFrame ngramDataFrame = ngramTransformer.transform(wordDataFrame);
-for (Row r : ngramDataFrame.select("ngrams", "label").take(3)) {
-  java.util.List ngrams = r.getList(0);
-  for (String ngram : ngrams) System.out.print(ngram + " --- ");
-  System.out.println();
-}
+{% include_example java/org/apache/spark/examples/ml/JavaNGramExample.java 
%}
 {% endhighlight %}
--- End diff --

Pls remove this line


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-28 Thread somideshmukh
Github user somideshmukh closed the pull request at:

https://github.com/apache/spark/pull/9735


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-27 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/9735#issuecomment-160243894
  
@somideshmukh I can't close it because I am not the owner. You need close 
it yourself. :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-27 Thread somideshmukh
Github user somideshmukh commented on the pull request:

https://github.com/apache/spark/pull/9735#issuecomment-160101745
  
Yes ,you can close this PR


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-27 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/9735#issuecomment-160068433
  
@somideshmukh Could we close this PR since there is a new one 
https://github.com/apache/spark/pull/10002 of SPARK-11551


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-27 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-160068215
  
Besides, there is no need to create a new pull request every time you 
changing your code. You can modify your previous code and push the changes in 
current branch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-27 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-160067831
  
You can see the files changed in that pull request. I make changes in your 
original code. The main issues there are:

* The mismatch between file name and app name
* The indention in code is not very correct
* You forget to add `// $example on$` and `// $example off$` pairs in the 
code
* The reorganize of imports

I also add Python files that you missed and 3 Java codes plus with 1 Scala 
code.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-27 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-160066853
  
Hi @somideshmukh, since it's lots of files in this pull request, I send a 
pull request to your repo and you can see it here: 
https://github.com/somideshmukh/spark/pull/1

All of my modifications are there, you can check the code and merge it if 
it looks good to you. Then we can check the correctness of those newly added 
files.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10002#issuecomment-159898150
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-26 Thread somideshmukh
GitHub user somideshmukh opened a pull request:

https://github.com/apache/spark/pull/10002

[SPARK-11551][DOC][Example]Replace example code in ml-features.md using 
include_example

Made new patch contaning only markdown examples moved to exmaple/folder.
Ony three  java code were not shfted since they were contaning compliation 
error ,these classes are
1)StandardScale 2)NormalizerExample 3)VectorIndexer

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/somideshmukh/spark SomilBranch1.33

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/10002.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #10002


commit d14f55d8e842519b81423348e6656803b4c130fe
Author: somideshmukh 
Date:   2015-11-26T09:13:58Z

[SPARK-11551][DOC][Example]Replace example code in ml-features.md using 
include_example

commit 12b1cf33a1846250458f3093b7bf7b7826f5
Author: somideshmukh 
Date:   2015-11-26T10:21:05Z

[SPARK-11551][DOC][Example]Replace example code in ml-features.md using 
include_example




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-24 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/9735#issuecomment-159358451
  
@somideshmukh It seems that there is some miscommunication about the task. 
It is to move example code embedded in markdown to `examples/` folder, and then 
use `include_example` in markdown to reference them. But this PR adds some 
example code back from `examples/` to markdown, which is what we want to avoid. 
Could you update your PR? It might be good to change only one example first and 
confirm with @yinxusen about the correctness, and then work on others.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-23 Thread somideshmukh
Github user somideshmukh commented on the pull request:

https://github.com/apache/spark/pull/9735#issuecomment-158920107
  
Hi,The Patch that I have submitted contains code of Java and Scala ,that I 
have replaced.About Python code I donot have much experience.So haven't touch 
that part.Iif there are any other Java or Scala classes that need to replaced 
please let me know


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-21 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/9735#issuecomment-158640370
  
@somideshmukh Do you still have time on this? I can help if you are busy. 
Pls let me know. :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-16 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/9735#issuecomment-157041394
  
Hi @somideshmukh, thanks for working on this. You can use the PR 
https://github.com/apache/spark/pull/9713 as an example. What we need to do is 
replacing all raw code snippts in `ml-features.md` with `include_example`, and 
create code files in the `examples` dir of Spark accordingly. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-16 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/9735#discussion_r44926012
  
--- Diff: docs/ml-features.md ---
@@ -53,7 +53,24 @@ Refer to the [HashingTF Java 
docs](api/java/org/apache/spark/ml/feature/HashingT
 Refer to the [HashingTF Python 
docs](api/python/pyspark.ml.html#pyspark.ml.feature.HashingTF) and
 the [IDF Python docs](api/python/pyspark.ml.html#pyspark.ml.feature.IDF) 
for more details on the API.
 
-{% include_example python/ml/tf_idf_example.py %}
--- End diff --

Here do not change the example imported by `include_example` to raw code 
back. The JIRA issue aims to change all raw code snippts with `include_example`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-16 Thread yinxusen
Github user yinxusen commented on a diff in the pull request:

https://github.com/apache/spark/pull/9735#discussion_r44925725
  
--- Diff: docs/ml-features.md ---
@@ -37,7 +37,7 @@ In the following code segment, we start with a set of 
sentences.  We split each
 Refer to the [HashingTF Scala 
docs](api/scala/index.html#org.apache.spark.ml.feature.HashingTF) and
 the [IDF Scala docs](api/scala/index.html#org.apache.spark.ml.feature.IDF) 
for more details on the API.
 
-{% include_example scala/org/apache/spark/examples/ml/TfIdfExample.scala %}
--- End diff --

There is no need to change `TfIdfExample` to `HashingTF`. You can treat 
examples in the section [`Feature 
Extractors`](http://spark.apache.org/docs/latest/ml-features.html#feature-extractors)
 as examples, since `TF-IDF`, `Word2Vec`, and `CountVectorizer` have already 
been written with `include_example`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9735#issuecomment-156986326
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-16 Thread somideshmukh
GitHub user somideshmukh opened a pull request:

https://github.com/apache/spark/pull/9735

[SPARK-11551][DOC][Example]Replace example code in ml-features.md using 
include_example

Made changes in code according to spark coding style.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/somideshmukh/spark SomilBranch1.2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9735.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9735


commit 3cc669504ac6149e1d4b742451564e6a4e65469b
Author: somideshmukh 
Date:   2015-11-16T10:33:40Z

[SPARK-11551][DOC][Example]Replace example code in ml-features.md using 
include_example




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-16 Thread somideshmukh
Github user somideshmukh closed the pull request at:

https://github.com/apache/spark/pull/9537


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-15 Thread somideshmukh
Github user somideshmukh commented on the pull request:

https://github.com/apache/spark/pull/9537#issuecomment-156945905
  
I am working on this,today it will be completed


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-13 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/9537#issuecomment-156469579
  
@somideshmukh Do you still have time for this? I can help if you don't have 
enough time.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-09 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/9537#issuecomment-155026161
  
@somideshmukh Pls follow [Spark Style 
Guide](https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide).
 E.g. All Java code should follow 2-indention style. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-08 Thread yinxusen
Github user yinxusen commented on the pull request:

https://github.com/apache/spark/pull/9537#issuecomment-154937549
  
@somideshmukh Thanks for working on this!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9537#issuecomment-154665282
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11551][DOC][Example]Replace example cod...

2015-11-07 Thread somideshmukh
GitHub user somideshmukh opened a pull request:

https://github.com/apache/spark/pull/9537

[SPARK-11551][DOC][Example]Replace example code in ml-features.md using 
include_example

Replaced Java and Scala code

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/somideshmukh/spark NewBranch1.5.1-2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9537.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9537


commit be53d3006865182e68c2c69eb432188c7009c4fe
Author: somideshmukh 
Date:   2015-11-04T11:08:39Z

[SPARK-10946][SQL]JDBC - Use Statement.executeUpdate instead of 
PreparedStatement.executeUpdate for DDLs

commit bfd43ab10d987e5dc23d008370a8cf7c4a782bd6
Author: somideshmukh 
Date:   2015-11-07T08:37:22Z

[SPARK-11551][Example][Doc]Replace example code in ml-features.md using 
include_example




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org