[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate

2016-11-19 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13065
  
Merging in master.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15677: [SPARK-17963][SQL][Documentation] Add examples (extend) ...

2016-11-19 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/15677
  
True. That is why I do not know which is the best way for us to show the 
argument/parameter names.





[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate

2016-11-19 Thread davies
Github user davies commented on the issue:

https://github.com/apache/spark/pull/13065
  
LGTM





[GitHub] spark issue #15677: [SPARK-17963][SQL][Documentation] Add examples (extend) ...

2016-11-19 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/15677
  
These function docs are printed onto the console.






[GitHub] spark pull request #15940: [SPARK-18447][BUILD][DOCS]Fix `Note:`/`NOTE:`/`No...

2016-11-19 Thread aditya1702
Github user aditya1702 closed the pull request at:

https://github.com/apache/spark/pull/15940





[GitHub] spark issue #15940: [SPARK-18447][BUILD][DOCS]Fix `Note:`/`NOTE:`/`Note that...

2016-11-19 Thread aditya1702
Github user aditya1702 commented on the issue:

https://github.com/apache/spark/pull/15940
  
@HyukjinKwon @rxin Yes, I think I got confused, as Hyukjin pointed out
above 😅. @HyukjinKwon, you can take this one over and cc me. I will work on
another issue. Sorry for the trouble.





[GitHub] spark issue #15677: [SPARK-17963][SQL][Documentation] Add examples (extend) ...

2016-11-19 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/15677
  
Most RDBMS docs use _italic fonts_ for parameter/argument names. Linux man
pages use `underscore` to highlight argument names in descriptions. I have
also seen another way:
```
``expr1''
```

I do not know which way is better for our function descriptions.





[GitHub] spark issue #15717: [SPARK-17910][SQL] Allow users to update the comment of ...

2016-11-19 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/15717
  
Let us first resolve the parser/syntax-related issues.





[GitHub] spark pull request #15717: [SPARK-17910][SQL] Allow users to update the comm...

2016-11-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15717#discussion_r88794337
  
--- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ---
@@ -93,6 +93,8 @@ statement
         SET TBLPROPERTIES tablePropertyList #setTableProperties
     | ALTER (TABLE | VIEW) tableIdentifier
         UNSET TBLPROPERTIES (IF EXISTS)? tablePropertyList #unsetTableProperties
+    | ALTER TABLE tableIdentifier
--- End diff --

Please add `(partitionSpec)?`. If we do not support it, we can issue an 
appropriate error message in `SparkSqlParser.scala`.  





[GitHub] spark issue #15935: [SPARK-18188] add checksum for blocks of broadcast

2016-11-19 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/15935
  
cc @joshrosen and @zsxwing too





[GitHub] spark issue #15677: [SPARK-17963][SQL][Documentation] Add examples (extend) ...

2016-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/15677
  
Some of them already had backticks in the description and others did not.
Matching them up with backticks was initially suggested in
https://github.com/apache/spark/pull/15513#discussion_r84820066. I could not
find a concrete reason not to follow it. I can remove all of them if you confirm.





[GitHub] spark issue #15913: [SPARK-18481][ML] ML 2.1 QA: Remove deprecated methods f...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15913
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15913: [SPARK-18481][ML] ML 2.1 QA: Remove deprecated methods f...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15913
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68900/
Test PASSed.





[GitHub] spark issue #15913: [SPARK-18481][ML] ML 2.1 QA: Remove deprecated methods f...

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15913
  
**[Test build #68900 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68900/consoleFull)**
for PR 15913 at commit [`8d3c47a`](https://github.com/apache/spark/commit/8d3c47a2cb3f2ffcf72b6798c224ac0f29c9e6b2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #15717: [SPARK-17910][SQL] Allow users to update the comm...

2016-11-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15717#discussion_r88794139
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLCommandSuite.scala ---
@@ -694,19 +694,6 @@ class DDLCommandSuite extends PlanTest {
     assertUnsupported("ALTER TABLE table_name SKEWED BY (key) ON (1,5,6) STORED AS DIRECTORIES")
   }
 
-  test("alter table: change column name/type/position/comment (not allowed)") {
--- End diff --

The test suite `DDLCommandSuite` is used for testing the parser. Could you
add unit test cases for the new parser rules?





[GitHub] spark issue #15717: [SPARK-17910][SQL] Allow users to update the comment of ...

2016-11-19 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/15717
  
How about the partition columns of partitioned Hive serde tables? 





[GitHub] spark issue #15677: [SPARK-17963][SQL][Documentation] Add examples (extend) ...

2016-11-19 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/15677
  
@HyukjinKwon why did we add backticks to surround all parameters? It looks 
pretty weird actually.






[GitHub] spark issue #15937: [SPARK-18508][SQL] Fix documentation error for DateDiff

2016-11-19 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/15937
  
Thanks I will fix that too.

Merging this in master/branch-2.1.






[GitHub] spark issue #15907: [SPARK-18458][CORE] Fix signed integer overflow problem ...

2016-11-19 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/15907
  
Merging in master/branch-2.1





[GitHub] spark pull request #15907: [SPARK-18458][CORE] Fix signed integer overflow p...

2016-11-19 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/15907#discussion_r88794022
  
--- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/RadixSort.java ---
@@ -40,28 +42,28 @@
    * of always copying the data back to position zero for efficiency.
    */
   public static int sort(
-      LongArray array, int numRecords, int startByteIndex, int endByteIndex,
+      LongArray array, long numRecords, int startByteIndex, int endByteIndex,
       boolean desc, boolean signed) {
     assert startByteIndex >= 0 : "startByteIndex (" + startByteIndex + ") should >= 0";
     assert endByteIndex <= 7 : "endByteIndex (" + endByteIndex + ") should <= 7";
     assert endByteIndex > startByteIndex;
     assert numRecords * 2 <= array.size();
-    int inIndex = 0;
-    int outIndex = numRecords;
+    long inIndex = 0;
+    long outIndex = numRecords;
     if (numRecords > 0) {
       long[][] counts = getCounts(array, numRecords, startByteIndex, endByteIndex);
       for (int i = startByteIndex; i <= endByteIndex; i++) {
         if (counts[i] != null) {
           sortAtByte(
             array, numRecords, counts[i], i, inIndex, outIndex,
             desc, signed && i == endByteIndex);
-          int tmp = inIndex;
+          long tmp = inIndex;
           inIndex = outIndex;
           outIndex = tmp;
         }
       }
     }
-    return inIndex;
+    return Ints.checkedCast(inIndex);
--- End diff --

It might make sense to add a very simple if check in the beginning of this 
function. I will do it when I merge this.
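
For readers following the diff above, here is a minimal Java sketch of the
overflow it guards against. It is not the Spark `RadixSort` code: the class
`OverflowSketch` and its method names are hypothetical, and the JDK's
`Math.toIntExact` stands in for Guava's `Ints.checkedCast` so that the
example has no dependencies.

```java
// Hypothetical illustration only; not the Spark RadixSort implementation.
final class OverflowSketch {
  // With int arithmetic, numRecords * 2 silently wraps around once
  // numRecords exceeds Integer.MAX_VALUE / 2.
  static int doubledAsInt(int numRecords) {
    return numRecords * 2;
  }

  // Widening to long before multiplying keeps the true value.
  static long doubledAsLong(long numRecords) {
    return numRecords * 2;
  }

  // Narrow back to int only when the value fits; Math.toIntExact throws
  // ArithmeticException instead of truncating.
  static int checkedNarrow(long index) {
    return Math.toIntExact(index);
  }

  public static void main(String[] args) {
    int n = 1_500_000_000;
    System.out.println(doubledAsInt(n));    // wrapped, negative value
    System.out.println(doubledAsLong(n));   // true value, needs a long
    System.out.println(checkedNarrow(42L)); // fits in an int
  }
}
```

The narrowing step fails loudly on an out-of-range value, which is the
behavior a checked cast on the return value relies on.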






[GitHub] spark pull request #15943: [SPARK-18511] Added an api for join operations ju...

2016-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/15943#discussion_r88794018
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -705,12 +701,10 @@ class Dataset[T] private[sql](
*
* @param right Right side of the join operation.
* @param usingColumn Name of the column to join on. This column must exist on both sides.
-   *
-   * @note If you perform a self-join using this function without aliasing the input
-   * [[DataFrame]]s, you will NOT be able to reference any columns after the join, since
-   * there is no way to disambiguate which side of the join you would like to reference.
-   *
-   * @group untypedrel
+* @note If you perform a self-join using this function without aliasing the input
+   *[[DataFrame]]s, you will NOT be able to reference any columns after the join, since
+   *there is no way to disambiguate which side of the join you would like to reference.
+* @group untypedrel
--- End diff --

Can we exclude unrelated changes here? Also, `@note` uses consistent
indentation across the documentation. I guess changing only some of them,
inconsistently and without a concrete reason, is discouraged.





[GitHub] spark pull request #15877: [SPARK-18429] [SQL] implement a new Aggregate for...

2016-11-19 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/15877#discussion_r88793956
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CountMinSketchAgg.scala ---
@@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions.aggregate
+
+import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess}
+import org.apache.spark.sql.catalyst.expressions.{Expression, ExpressionDescription}
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.UTF8String
+import org.apache.spark.util.sketch.CountMinSketch
+
+/**
+ * This function returns a count-min sketch of a column with the given esp, confidence and seed.
+ * A count-min sketch is a probabilistic data structure used for summarizing streams of data in
+ * sub-linear space, which is useful for equality predicates and join size estimation.
+ * The result returned by the function is an array of bytes, which should be deserialized to a
+ * `CountMinSketch` before usage.
+ *
+ * @param child child expression that can produce column value with `child.eval(inputRow)`
+ * @param epsExpression relative error, must be positive
+ * @param confidenceExpression confidence, must be positive and less than 1.0
+ * @param seedExpression random seed
+ */
+@ExpressionDescription(
+  usage = """
+    _FUNC_(col, eps, confidence, seed) - Returns a count-min sketch of a column with the given esp,
+      confidence and seed. The result is an array of bytes, which should be deserialized to a
+      `CountMinSketch` before usage. `CountMinSketch` is useful for equality predicates and join
+      size estimation.
+  """)
+case class CountMinSketchAgg(
+    child: Expression,
+    epsExpression: Expression,
+    confidenceExpression: Expression,
+    seedExpression: Expression,
+    override val mutableAggBufferOffset: Int,
+    override val inputAggBufferOffset: Int) extends TypedImperativeAggregate[CountMinSketch] {
+
+  def this(
+      child: Expression,
+      epsExpression: Expression,
+      confidenceExpression: Expression,
+      seedExpression: Expression) = {
+    this(child, epsExpression, confidenceExpression, seedExpression, 0, 0)
+  }
+
+  // Mark as lazy so that they are not evaluated during tree transformation.
+  private lazy val eps: Double = epsExpression.eval().asInstanceOf[Double]
+  private lazy val confidence: Double = confidenceExpression.eval().asInstanceOf[Double]
+  private lazy val seed: Int = seedExpression.eval().asInstanceOf[Int]
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+    val defaultCheck = super.checkInputDataTypes()
+    if (defaultCheck.isFailure) {
+      defaultCheck
+    } else if (!epsExpression.foldable || !confidenceExpression.foldable ||
+      !seedExpression.foldable) {
+      TypeCheckFailure(
+        "The eps, confidence or seed provided must be a literal or constant foldable")
+    } else if (epsExpression.eval() == null || confidenceExpression.eval() == null ||
+      seedExpression.eval() == null) {
+      TypeCheckFailure("The eps, confidence or seed provided should not be null")
+    } else if (eps <= 0D) {
+      TypeCheckFailure(s"Relative error must be positive (current value = $eps)")
+    } else if (confidence <= 0D || confidence >= 1D) {
+      TypeCheckFailure(s"Confidence must be within range (0.0, 1.0) (current value = $confidence)")
+    } else {
+      TypeCheckSuccess
+    }
+  }
+
+  override def createAggregationBuffer(): CountMinSketch = {
+    CountMinSketch.create(eps, confidence, seed)
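
As background for the doc comment in the diff above, here is a minimal
textbook count-min sketch in Java. It is only a sketch under stated
assumptions and is not Spark's `org.apache.spark.util.sketch.CountMinSketch`:
the class `CountMinSketchDemo`, its hash mixing, and the sizing formulas
(width from `eps`, depth from `confidence`) are illustrative choices. The
parameter checks mirror the ranges named in the diff.

```java
import java.util.Random;

// Hypothetical illustration only; not Spark's CountMinSketch.
final class CountMinSketchDemo {
  private final long[][] table;  // depth rows of width counters
  private final int[] rowSeeds;  // one hash seed per row
  private final int width;
  private final int depth;

  CountMinSketchDemo(double eps, double confidence, int seed) {
    if (eps <= 0.0) {
      throw new IllegalArgumentException("Relative error must be positive");
    }
    if (confidence <= 0.0 || confidence >= 1.0) {
      throw new IllegalArgumentException("Confidence must be within (0.0, 1.0)");
    }
    // Assumed sizing: smaller eps gives wider rows, higher confidence more rows.
    this.width = (int) Math.ceil(2.0 / eps);
    this.depth = (int) Math.ceil(-Math.log(1.0 - confidence) / Math.log(2.0));
    this.table = new long[depth][width];
    this.rowSeeds = new int[depth];
    Random random = new Random(seed);
    for (int row = 0; row < depth; row++) {
      rowSeeds[row] = random.nextInt();
    }
  }

  // Maps an item to a counter index in the given row.
  private int bucket(String item, int row) {
    int h = item.hashCode() * 31 + rowSeeds[row];
    h ^= (h >>> 16);
    return Math.floorMod(h, width);
  }

  // Increments one counter per row.
  void add(String item) {
    for (int row = 0; row < depth; row++) {
      table[row][bucket(item, row)]++;
    }
  }

  // The minimum over rows never underestimates the true count.
  long estimate(String item) {
    long min = Long.MAX_VALUE;
    for (int row = 0; row < depth; row++) {
      min = Math.min(min, table[row][bucket(item, row)]);
    }
    return min;
  }
}
```

Collisions can only inflate a counter, so the estimate is an upper bound on
the true frequency, which is why the structure suits equality-predicate and
join-size estimation.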
 

[GitHub] spark issue #15942: [SPARK-18407] Inferred partition columns cause assertion...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15942
  
Merged build finished. Test FAILed.





[GitHub] spark issue #15942: [SPARK-18407] Inferred partition columns cause assertion...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15942
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68899/
Test FAILed.





[GitHub] spark issue #15942: [SPARK-18407] Inferred partition columns cause assertion...

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15942
  
**[Test build #68899 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68899/consoleFull)**
for PR 15942 at commit [`b4efee9`](https://github.com/apache/spark/commit/b4efee97ab5b89526ec515735de32a3fde969c72).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15940: [SPARK-18447][BUILD][DOCS]Fix `Note:`/`NOTE:`/`Note that...

2016-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/15940
  
Ah, @aditya1702, I assume from the title that you meant to do this for the
Python documentation. Actually, the PR you pointed out deals with all of the
Scala/Java documentation, unless I have missed some. Maybe `.. note:` in the
Python code should be handled for the Python API documentation, including
some images from manually built API documentation in the PR. If you are not
sure what to fix, I can take this over and cc you in the PR. I guess you
could close this PR in the meantime.





[GitHub] spark issue #15943: [SPARK-18511] Added an api for join operations just on c...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15943
  
Can one of the admins verify this patch?





[GitHub] spark pull request #15943: [SPARK-18511] Added an api for join operations ju...

2016-11-19 Thread shiv4nsh
GitHub user shiv4nsh opened a pull request:

https://github.com/apache/spark/pull/15943

[SPARK-18511] Added an api for join operations just on column name and the 
join type

Added an API for join operations that takes just the column name and the
join type, so that we do not have to provide a column sequence.




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shiv4nsh/spark SPARK-18511

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15943.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15943


commit 44905bdba7dbbf2b59ceb90a5ff2561802ae2704
Author: Shivansh 
Date:   2016-11-20T03:48:23Z

Added an api for join operations just on column name and the join type







[GitHub] spark issue #15913: [SPARK-18481][ML] ML 2.1 QA: Remove deprecated methods f...

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15913
  
**[Test build #68900 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68900/consoleFull)**
for PR 15913 at commit [`8d3c47a`](https://github.com/apache/spark/commit/8d3c47a2cb3f2ffcf72b6798c224ac0f29c9e6b2).





[GitHub] spark issue #15913: [SPARK-18481][ML] ML 2.1 QA: Remove deprecated methods f...

2016-11-19 Thread yanboliang
Github user yanboliang commented on the issue:

https://github.com/apache/spark/pull/15913
  
Jenkins, test this please.





[GitHub] spark issue #15942: [SPARK-18407] Inferred partition columns cause assertion...

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15942
  
**[Test build #68899 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68899/consoleFull)**
for PR 15942 at commit [`b4efee9`](https://github.com/apache/spark/commit/b4efee97ab5b89526ec515735de32a3fde969c72).





[GitHub] spark pull request #15942: [SPARK-18407] Inferred partition columns cause as...

2016-11-19 Thread brkyvz
GitHub user brkyvz opened a pull request:

https://github.com/apache/spark/pull/15942

[SPARK-18407] Inferred partition columns cause assertion error in 
StructuredStreaming

## What changes were proposed in this pull request?

It turns out we are a bit overeager in returning partition columns to users 
when they read data, even if they didn't specify those columns in their schema. 
This causes an assertion error in Streaming jobs, because the `Attribute`s of a 
given trigger don't match the `Attribute`s returned by the DataSource. The 
DataSource always returns the additional partition columns.

While this is weird behavior for batch as well IMHO, because someone asked 
for a specific schema but we returned them something else, apparently this 
behavior has existed since Spark 1.6. I didn't try older versions. Anyway, I tried 
fixing this by not enforcing a strict size check, but by picking out the 
columns that we want from the batch DataSource.
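
A minimal sketch of that idea in plain Scala, without any Spark dependency. `Field` and `pickColumns` are hypothetical names used only for illustration and are not the code in this patch: rather than asserting that the source returns exactly as many columns as were requested, the requested columns are looked up by name and any extras (such as inferred partition columns) are ignored.

```scala
// Hypothetical stand-in for a resolved column; this is not Spark's Attribute class.
final case class Field(name: String, dataType: String)

object SelectRequestedColumns {
  /** Returns the fields named in `requested`, in the requested order.
    * Extra fields in `available` are ignored; a missing field is an error. */
  def pickColumns(requested: Seq[String], available: Seq[Field]): Seq[Field] = {
    val byName = available.map(f => f.name -> f).toMap
    requested.map { name =>
      byName.getOrElse(name, throw new IllegalArgumentException(
        s"Column '$name' not found; available: ${available.map(_.name).mkString(", ")}"))
    }
  }

  def main(args: Array[String]): Unit = {
    // The source appends a partition column ("date") that the user never asked for.
    val fromSource = Seq(
      Field("id", "long"), Field("value", "string"), Field("date", "string"))
    val picked = pickColumns(Seq("id", "value"), fromSource)
    println(picked.map(_.name).mkString(", "))
  }
}
```

Looking columns up by name also gives a natural place for a descriptive error message when a requested column is genuinely missing.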

## How was this patch tested?

Regression test

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/brkyvz/spark filesource-part-bug

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15942.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15942


commit ca0bf68e7269bc74da923a2f228bdf43b1bc868c
Author: Burak Yavuz 
Date:   2016-11-18T19:55:42Z

save

try fix

fix

commit 6578cc34cd9f6938a98361047bee61d1ab4e08fb
Author: Burak Yavuz 
Date:   2016-11-19T01:43:13Z

fixed

commit ed2c3f92d45d5075a475d83c79e45672b3aad794
Author: Burak Yavuz 
Date:   2016-11-19T03:06:25Z

better debug message

commit 8465aca7dfce72f4141e4bec241bc833a2e4a83c
Author: Burak Yavuz 
Date:   2016-11-20T03:23:54Z

ready for review

commit c2c2cd5890a38ac3848d724fbe10b24a7cd44ad6
Author: Burak Yavuz 
Date:   2016-11-20T03:25:22Z

make test a bit more complex

commit 879c6e1449074badeb6da73fb10fdd6efcb5838c
Author: Burak Yavuz 
Date:   2016-11-20T03:29:55Z

make test a bit more complex







[GitHub] spark issue #15893: [SPARK-18456][ML][FOLLOWUP] Use matrix abstraction for c...

2016-11-19 Thread dbtsai
Github user dbtsai commented on the issue:

https://github.com/apache/spark/pull/15893
  
LGTM. Since this has no impact on performance and makes the codebase 
cleaner, I merged this PR into master and branch 2.1. Thanks.





[GitHub] spark issue #15652: [SPARK-16987] [None] Add spark-default.conf property to ...

2016-11-19 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/15652
  
Aside from me not liking the API change you're introducing, you need to add 
a unit test to make sure the feature is working. I suggest taking the scenario 
I described in a previous comment 
(https://github.com/apache/spark/pull/15652#discussion_r88137680) and turning 
it into a unit test.





[GitHub] spark issue #15939: [SPARK-3359][BUILD][DOCS] Print examples and disable gro...

2016-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/15939
  
Oh, no. It seems there are still a lot of errors. I just meant to show what 
I tried to test with. :)
Thanks for approving!





[GitHub] spark pull request #15652: [SPARK-16987] [None] Add spark-default.conf prope...

2016-11-19 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/15652#discussion_r88790872
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2142,9 +2142,10 @@ private[spark] object Utils extends Logging {
*/
   def startServiceOnPort[T](
   startPort: Int,
-  startService: Int => (T, Int),
+  startService: (Int, Int) => (T, Int),
   conf: SparkConf,
-  serviceName: String = ""): (T, Int) = {
+  serviceName: String = "",
+  securePort: Int = 0): (T, Int) = {
--- End diff --

I don't like what you did here. As you yourself explained, not all 
consumers of this API have the concept of separate secure and non-secure ports. 
And as I explained in my previous comments, it's possible to fix the problem 
you introduced without changing this API.
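
As a rough illustration of how caller-specific state can stay out of a shared retry helper, here is a sketch in plain Scala. It is not necessarily the fix referred to above: `startOnPort`, `FakeServer`, and `startFakeServer` are hypothetical, simplified stand-ins, and the retry policy is an assumption. The point is only that a caller needing a second port can capture it in the closure it passes, so the callback keeps the `Int => (T, Int)` shape.

```scala
import scala.util.{Failure, Success, Try}

object PortRetrySketch {
  // Simplified, hypothetical helper: tries startPort, startPort + 1, ... and
  // keeps the single-port callback shape Int => (T, Int).
  def startOnPort[T](
      startPort: Int,
      startService: Int => (T, Int),
      maxRetries: Int = 3): (T, Int) = {
    @annotation.tailrec
    def attempt(offset: Int): (T, Int) = {
      Try(startService(startPort + offset)) match {
        case Success(result) => result
        case Failure(e) if offset >= maxRetries => throw e
        case Failure(_) => attempt(offset + 1)
      }
    }
    attempt(0)
  }

  // Hypothetical service that has both a plain and a secure port.
  final case class FakeServer(port: Int, securePort: Int)

  // Pretends that ports below 8082 are already in use.
  def startFakeServer(port: Int, securePort: Int): (FakeServer, Int) = {
    if (port < 8082) throw new java.io.IOException(s"port $port in use")
    (FakeServer(port, securePort), port)
  }

  def main(args: Array[String]): Unit = {
    val securePort = 8443
    // The secure port is captured by the closure; the helper never sees it.
    val (server, bound) =
      startOnPort[FakeServer](8080, port => startFakeServer(port, securePort))
    println(s"bound=$bound secure=${server.securePort}")
  }
}
```

Consumers with no notion of a secure port pass an ordinary one-argument function and are unaffected.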





[GitHub] spark issue #15941: [SQL][DOC] Fix incorrect `code` tag in docs/sql-programm...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15941
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15941: [SQL][DOC] Fix incorrect `code` tag in docs/sql-programm...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15941
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68898/
Test PASSed.





[GitHub] spark issue #15941: [SQL][DOC] Fix incorrect `code` tag in docs/sql-programm...

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15941
  
**[Test build #68898 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68898/consoleFull)**
 for PR 15941 at commit 
[`8108d41`](https://github.com/apache/spark/commit/8108d410533bcf185789cdb60b791933e80d5e24).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15940: [SPARK-18447][BUILD][DOCS]Fix `Note:`/`NOTE:`/`Note that...

2016-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/15940
  
Yes, you should check if they are exposed in the API documentation. I 
intentionally updated only those ones before.





[GitHub] spark issue #14650: [SPARK-17062][MESOS] add conf option to mesos dispatcher

2016-11-19 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/14650
  
LGTM. There are a few style nits that I'll fix during the merge to avoid 
another round.

Merging to master.





[GitHub] spark issue #15941: [SQL][DOC] Fix incorrect `code` tag in docs/sql-programm...

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15941
  
**[Test build #68898 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68898/consoleFull)**
 for PR 15941 at commit 
[`8108d41`](https://github.com/apache/spark/commit/8108d410533bcf185789cdb60b791933e80d5e24).





[GitHub] spark pull request #15941: [SQL][DOC] Fix incorrect `code` tag in docs/sql-p...

2016-11-19 Thread weiqingy
GitHub user weiqingy opened a pull request:

https://github.com/apache/spark/pull/15941

[SQL][DOC] Fix incorrect `code` tag in docs/sql-programming-guide.md

## What changes were proposed in this pull request?
This PR is to fix incorrect `code` tag in `sql-programming-guide.md`

## How was this patch tested?
Manually.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/weiqingy/spark fixtag

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15941.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15941


commit 8108d410533bcf185789cdb60b791933e80d5e24
Author: Weiqing Yang 
Date:   2016-11-19T23:44:04Z

Fix incorrect `code` tag in docs/sql-programming-guide.md







[GitHub] spark issue #15848: [SPARK-9487] Use the same num. worker threads in Java/Sc...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15848
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68897/
Test FAILed.





[GitHub] spark issue #15848: [SPARK-9487] Use the same num. worker threads in Java/Sc...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15848
  
Merged build finished. Test FAILed.





[GitHub] spark issue #15848: [SPARK-9487] Use the same num. worker threads in Java/Sc...

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15848
  
**[Test build #68897 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68897/consoleFull)**
 for PR 15848 at commit 
[`ecba8e5`](https://github.com/apache/spark/commit/ecba8e563feb6d5b75fe55a4bbc103899c28c5af).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15940: [SPARK-18447][BUILD][DOCS]Fix `Note:`/`NOTE:`/`Note that...

2016-11-19 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/15940
  
I don't think any of these actually matter, because these functions are not 
part of the public documentation.






[GitHub] spark pull request #15932: [SPARK-18448][CORE] SparkSession should implement...

2016-11-19 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/15932#discussion_r88788935
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala ---
@@ -648,6 +649,13 @@ class SparkSession private(
   }
 
   /**
+   * Synonym for `stop()`.
+   *
+   * @since 2.2.0
--- End diff --

Yeah fixed it in a follow up.





[GitHub] spark pull request #15932: [SPARK-18448][CORE] SparkSession should implement...

2016-11-19 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/15932#discussion_r88788917
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala ---
@@ -648,6 +649,13 @@ class SparkSession private(
   }
 
   /**
+   * Synonym for `stop()`.
+   *
+   * @since 2.2.0
--- End diff --

need to update this





[GitHub] spark issue #15940: [SPARK-18447][BUILD][DOCS]Fix `Note:`/`NOTE:`/`Note that...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15940
  
Can one of the admins verify this patch?





[GitHub] spark issue #15940: [SPARK-18447][BUILD][DOCS]Fix `Note:`/`NOTE:`/`Note that...

2016-11-19 Thread aditya1702
Github user aditya1702 commented on the issue:

https://github.com/apache/spark/pull/15940
  
@HyukjinKwon Could you review?





[GitHub] spark pull request #15940: [SPARK-18447][BUILD][DOCS]Fix `Note:`/`NOTE:`/`No...

2016-11-19 Thread aditya1702
GitHub user aditya1702 opened a pull request:

https://github.com/apache/spark/pull/15940

[SPARK-18447][BUILD][DOCS]Fix `Note:`/`NOTE:`/`Note that` across Pyth…

## What changes were proposed in this pull request?

It seems that, just like in Scala/Java, the Python API is inconsistent in 
some places. The following variants all appear:

 * `Note:`
 * `NOTE:`
 * `Note that`
 * `'''Note:'''`
 * `@note`

This PR proposes to change those to `@note` to be consistent.

## How was this patch tested?

This was tested by searching in the editor for variants of "NOTE" in the 
Python APIs and then fixing them to be consistent.

This PR is a sub-extension of #15889 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aditya1702/spark aditya1702-python-api

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15940.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15940


commit 688a6cca8695fd3d74547b72ea02d1dc7a0df87d
Author: aditya1702 
Date:   2016-11-19T21:26:34Z

[SPARK-18447][BUILD][DOCS]Fix `Note:`/`NOTE:`/`Note that` across Python API 
documentation







[GitHub] spark issue #15848: [SPARK-9487] Use the same num. worker threads in Java/Sc...

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15848
  
**[Test build #68897 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68897/consoleFull)**
 for PR 15848 at commit 
[`ecba8e5`](https://github.com/apache/spark/commit/ecba8e563feb6d5b75fe55a4bbc103899c28c5af).





[GitHub] spark issue #15939: [SPARK-3359][BUILD][DOCS] Print examples and disable gro...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15939
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68896/
Test PASSed.





[GitHub] spark issue #15939: [SPARK-3359][BUILD][DOCS] Print examples and disable gro...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15939
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15939: [SPARK-3359][BUILD][DOCS] Print examples and disable gro...

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15939
  
**[Test build #68896 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68896/consoleFull)**
 for PR 15939 at commit 
[`8514ec4`](https://github.com/apache/spark/commit/8514ec43b2c81ca8ef44daf3bcf15b381a64c457).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15939: [SPARK-3359][BUILD][DOCS] Print examples and disable gro...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15939
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68895/
Test PASSed.





[GitHub] spark issue #15939: [SPARK-3359][BUILD][DOCS] Print examples and disable gro...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15939
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15939: [SPARK-3359][BUILD][DOCS] Print examples and disable gro...

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15939
  
**[Test build #68895 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68895/consoleFull)**
 for PR 15939 at commit 
[`03f88d4`](https://github.com/apache/spark/commit/03f88d49482c3f7fadb39886f35c52b478cb2343).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15861: [SPARK-18294][CORE] Implement commit protocol to support...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15861
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15861: [SPARK-18294][CORE] Implement commit protocol to support...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15861
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68894/
Test PASSed.





[GitHub] spark issue #15861: [SPARK-18294][CORE] Implement commit protocol to support...

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15861
  
**[Test build #68894 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68894/consoleFull)**
 for PR 15861 at commit 
[`bedcd10`](https://github.com/apache/spark/commit/bedcd10fe74192fd71088c8739c786d750639af6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15939: [SPARK-3359][BUILD][DOCS] Print examples and disable gro...

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15939
  
**[Test build #68896 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68896/consoleFull)**
 for PR 15939 at commit 
[`8514ec4`](https://github.com/apache/spark/commit/8514ec43b2c81ca8ef44daf3bcf15b381a64c457).





[GitHub] spark issue #15913: [SPARK-18481][ML] ML 2.1 QA: Remove deprecated methods f...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15913
  
Merged build finished. Test FAILed.





[GitHub] spark issue #15913: [SPARK-18481][ML] ML 2.1 QA: Remove deprecated methods f...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15913
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68893/
Test FAILed.





[GitHub] spark issue #15913: [SPARK-18481][ML] ML 2.1 QA: Remove deprecated methods f...

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15913
  
**[Test build #68893 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68893/consoleFull)**
 for PR 15913 at commit 
[`8d3c47a`](https://github.com/apache/spark/commit/8d3c47a2cb3f2ffcf72b6798c224ac0f29c9e6b2).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15939: [SPARK-3359][BUILD][DOCS] Print examples and disable gro...

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15939
  
**[Test build #68895 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68895/consoleFull)**
 for PR 15939 at commit 
[`03f88d4`](https://github.com/apache/spark/commit/03f88d49482c3f7fadb39886f35c52b478cb2343).





[GitHub] spark pull request #15939: [SPARK-3359][BUILD][DOCS] Print examples in javad...

2016-11-19 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/15939

[SPARK-3359][BUILD][DOCS] Print examples in javadoc and disable group and 
tparam tags in javadoc

## What changes were proposed in this pull request?

This PR proposes/fixes two things.

- Remove many of the errors raised when generating javadoc with Java 8, 
caused by the unrecognisable tags `@tparam` and `@group`.
  
  ```
  [error] .../spark/mllib/target/java/org/apache/spark/ml/classification/Classifier.java:18: error: unknown tag: group
  [error]   /** @group setParam */
  [error]   ^
  [error] .../spark/mllib/target/java/org/apache/spark/ml/classification/Classifier.java:8: error: unknown tag: tparam
  [error]  * @tparam FeaturesType  Type of input features.  E.g., Vector
  [error]^
  ...
  ```

  This does not fully resolve the problem, but it removes many of the errors. 
It seems javadoc recognises neither `@group` nor `@tparam`. It also seems we 
can't render them nicely in javadoc the way we do for `@example` here, because 
they appear differently (both can be seen in 
http://spark.apache.org/docs/2.0.2/api/scala/index.html#org.apache.spark.ml.classification.Classifier).

- Print `@example` in javadoc.
  Currently, there are a few `@example` tags in several places.
  
  ```
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * @example This operation might be used to evaluate a graph
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * @example We might use this operation to change the vertex values
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * @example This function might be used to initialize edge
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * @example This function might be used to initialize edge
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * @example This function might be used to initialize edge
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * @example We can use this function to compute the in-degree of each
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * @example This function is used to update the vertices with new values based on external data.
  ./graphx/src/main/scala/org/apache/spark/graphx/GraphLoader.scala:   * @example Loads a file in the following format:
  ./graphx/src/main/scala/org/apache/spark/graphx/GraphOps.scala:   * @example This function is used to update the vertices with new
  ./graphx/src/main/scala/org/apache/spark/graphx/GraphOps.scala:   * @example This function can be used to filter the graph based on some property, without
  ./graphx/src/main/scala/org/apache/spark/graphx/Pregel.scala: * @example We can use the Pregel abstraction to implement PageRank:
  ./graphx/src/main/scala/org/apache/spark/graphx/VertexRDD.scala: * @example Construct a `VertexRDD` from a plain RDD:
  ./repl/scala-2.10/src/main/scala/org/apache/spark/repl/SparkCommandLine.scala: * @example new SparkCommandLine(Nil).settings
  ./repl/scala-2.10/src/main/scala/org/apache/spark/repl/SparkIMain.scala:   * @example addImports("org.apache.spark.SparkContext")
  ./sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralGenerator.scala: * @example {{{
  ```
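
As a rough illustration of the first item, the `unknown tag` errors quoted above can be tallied per tag from the javadoc output. This is only a sketch and not part of the PR: the function name `count_unknown_tags` is hypothetical, and the sample log uses shortened, illustrative file names rather than the real paths.

```python
import re
from collections import Counter

# Matches the "error: unknown tag: <name>" fragment seen in the javadoc log.
UNKNOWN_TAG = re.compile(r"error: unknown tag: (\w+)")


def count_unknown_tags(log_text):
    """Return a Counter mapping each unknown javadoc tag to its error count."""
    return Counter(UNKNOWN_TAG.findall(log_text))


# Illustrative log lines only; the file names are shortened on purpose.
sample_log = """\
[error] Classifier.java:18: error: unknown tag: group
[error]   /** @group setParam */
[error] Classifier.java:8: error: unknown tag: tparam
[error]  * @tparam FeaturesType  Type of input features.
"""

counts = count_unknown_tags(sample_log)
for tag, n in sorted(counts.items()):
    print(f"@{tag}: {n}")
```

Running the same function on the output of a build before and after a change like this one would give a quick view of how many such errors remain.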

**Before**

  https://cloud.githubusercontent.com/assets/6477701/20457285/26f07e1c-aecb-11e6-9ae9-d9dee66845f4.png

**After**

  https://cloud.githubusercontent.com/assets/6477701/20457240/409124e4-aeca-11e6-9a91-0ba514148b52.png

## How was this patch tested?

Manually tested by `jekyll build` with Java 7 and 8.

```
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
```

```
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
```


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark SPARK-3359-javadoc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15939.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15939


commit 03f88d49482c3f7fadb39886f35c52b478cb2343
Author: hyukjinkwon 
Date:   2016-11-19T16:29:26Z

Print examples in javadoc and disable group and tparam in javadoc





[GitHub] spark pull request #15889: [SPARK-18445][BUILD][DOCS] Fix the markdown for `...

2016-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/15889#discussion_r88783349
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopMapReduceWriter.scala
 ---
@@ -119,7 +119,7 @@ object SparkHadoopMapReduceWriter extends Logging {
 }
   }
 
-  /** Write a RDD partition out in a single Spark task. */
+  /** Write an RDD partition out in a single Spark task. */
--- End diff --

I see. Sure, let me just keep it; I will leave comments in the related PRs 
in the future if possible.





[GitHub] spark pull request #15889: [SPARK-18445][BUILD][DOCS] Fix the markdown for `...

2016-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/15889#discussion_r88783252
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopMapReduceWriter.scala
 ---
@@ -119,7 +119,7 @@ object SparkHadoopMapReduceWriter extends Logging {
 }
   }
 
-  /** Write a RDD partition out in a single Spark task. */
+  /** Write an RDD partition out in a single Spark task. */
--- End diff --

Ah, I will include 

```
./R/pkg/inst/tests/testthat/test_sparkSQL.R:test_that("jsonRDD() on a RDD 
with json string", {
./R/pkg/R/RDD.R:#' @param x A RDD.
```

It seems I should have used `-name "*.R"` instead.
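
For illustration, the kind of search discussed here can be sketched in Python. This is a rough sketch under assumptions, not the command actually used: the function name `find_a_rdd` and the default `(".R",)` extension filter are hypothetical.

```python
import os
import re

# "a RDD" or "A RDD" as a phrase; the leading word boundary keeps
# words that merely end in "a" (for example "data RDD") from matching.
PATTERN = re.compile(r"\b[aA] RDD\b")


def find_a_rdd(root, extensions=(".R",)):
    """Yield (path, line_number, line) for lines under root containing "a RDD"."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            # Only look at files with one of the requested extensions.
            if not name.endswith(tuple(extensions)):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if PATTERN.search(line):
                        yield path, number, line.rstrip("\n")
```

Calling `find_a_rdd(".")` from the repository root and printing each tuple gives grep-like `path:line:text` output; passing `extensions=(".scala", ".R")` widens the search.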





[GitHub] spark pull request #15889: [SPARK-18445][BUILD][DOCS] Fix the markdown for `...

2016-11-19 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/15889#discussion_r88783222
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopMapReduceWriter.scala
 ---
@@ -119,7 +119,7 @@ object SparkHadoopMapReduceWriter extends Logging {
 }
   }
 
-  /** Write a RDD partition out in a single Spark task. */
+  /** Write an RDD partition out in a single Spark task. */
--- End diff --

Yes, "an RDD" is correct because "a" precedes consonant _sounds_ and "an" 
precedes vowel sounds. So it's "an _arr-dee-dee_".

We don't need to fix up every last instance of this. It was a good idea to 
fix it all while fixing some that overlapped with this PR though.





[GitHub] spark pull request #15889: [SPARK-18445][BUILD][DOCS] Fix the markdown for `...

2016-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/15889#discussion_r88783173
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopMapReduceWriter.scala
 ---
@@ -119,7 +119,7 @@ object SparkHadoopMapReduceWriter extends Logging {
 }
   }
 
-  /** Write a RDD partition out in a single Spark task. */
+  /** Write an RDD partition out in a single Spark task. */
--- End diff --

Are you saying the one below

```
./R/pkg/inst/tests/testthat/test_sparkSQL.R:test_that("jsonRDD() on a RDD 
with json string", {
```

?

If so, I intentionally excluded it because this change is for 
documentation and that line is not part of the documentation. Maybe I was 
being too cautious. I will include them in a minor PR I am going to submit.





[GitHub] spark pull request #15889: [SPARK-18445][BUILD][DOCS] Fix the markdown for `...

2016-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/15889#discussion_r88782911
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopMapReduceWriter.scala
 ---
@@ -119,7 +119,7 @@ object SparkHadoopMapReduceWriter extends Logging {
 }
   }
 
-  /** Write a RDD partition out in a single Spark task. */
+  /** Write an RDD partition out in a single Spark task. */
--- End diff --

Could you point out where `a RDD` is? I will try to include them in a 
similar way to the above.





[GitHub] spark pull request #15889: [SPARK-18445][BUILD][DOCS] Fix the markdown for `...

2016-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/15889#discussion_r88782865
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopMapReduceWriter.scala
 ---
@@ -119,7 +119,7 @@ object SparkHadoopMapReduceWriter extends Logging {
 }
   }
 
-  /** Write a RDD partition out in a single Spark task. */
+  /** Write an RDD partition out in a single Spark task. */
--- End diff --

BTW, I also made this mistake and found, after googling, that `an RDD` is 
actually correct :).





[GitHub] spark pull request #15889: [SPARK-18445][BUILD][DOCS] Fix the markdown for `...

2016-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/15889#discussion_r88782853
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopMapReduceWriter.scala
 ---
@@ -119,7 +119,7 @@ object SparkHadoopMapReduceWriter extends Logging {
 }
   }
 
-  /** Write a RDD partition out in a single Spark task. */
+  /** Write an RDD partition out in a single Spark task. */
--- End diff --

It depends on how the acronym is pronounced. For example, 

An MP3 (/ɛm pi θri/)
An RPG (/ɑːr pi ʤi/)
An FBI agent (/ɛf biː aɪ/)
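
The rule can be sketched as a small Python helper. This is only an illustration under assumptions: the name `article_for_initialism` is hypothetical, the input is assumed to be a non-empty term that is spelled out letter by letter, and the letter set reflects one common pronunciation (for instance, "H" read as "aitch").

```python
# Letters whose spoken names start with a vowel sound in English
# ("ay", "ee", "ef", "aitch", "eye", "el", "em", "en", "oh", "ar", "ess", "ex").
# "U" is left out because its name is pronounced "you".
VOWEL_SOUND_LETTERS = set("AEFHILMNORSX")


def article_for_initialism(term):
    """Pick "a" or "an" for a term that is read out letter by letter."""
    first = term.strip()[0].upper()
    return "an" if first in VOWEL_SOUND_LETTERS else "a"


for term in ["RDD", "MP3", "RPG", "FBI", "URL"]:
    print(article_for_initialism(term), term)
```

Acronyms that are pronounced as words rather than spelled out do not follow this table, so the helper is only a first approximation.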






[GitHub] spark issue #15861: [SPARK-18294][CORE] Implement commit protocol to support...

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15861
  
**[Test build #68894 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68894/consoleFull)**
 for PR 15861 at commit 
[`bedcd10`](https://github.com/apache/spark/commit/bedcd10fe74192fd71088c8739c786d750639af6).





[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13065
  
**[Test build #68891 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68891/consoleFull)**
 for PR 13065 at commit 
[`ffd5ef8`](https://github.com/apache/spark/commit/ffd5ef83fe5d5f85582faa275a5a58816da0c712).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15435
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15435
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68890/
Test PASSed.





[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15435
  
**[Test build #68890 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68890/consoleFull)**
 for PR 15435 at commit 
[`a62af1e`](https://github.com/apache/spark/commit/a62af1eb60210611c87a96cc8fdf74fe970d7d73).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15913: [SPARK-18481][ML] ML 2.1 QA: Remove deprecated methods f...

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15913
  
**[Test build #68893 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68893/consoleFull)**
 for PR 15913 at commit 
[`8d3c47a`](https://github.com/apache/spark/commit/8d3c47a2cb3f2ffcf72b6798c224ac0f29c9e6b2).





[GitHub] spark issue #15820: [SPARK-18373][SS][Kafka]Make failOnDataLoss=false work w...

2016-11-19 Thread koeninger
Github user koeninger commented on the issue:

https://github.com/apache/spark/pull/15820
  
Because the comment made by me and +1'ed by marmbrus is hidden at this 
point, I just want to re-iterate that this patch should not skip the rest of 
the partition in the case that a timeout happens.





[GitHub] spark issue #15913: [SPARK-18481][ML] ML 2.1 QA: Remove deprecated methods f...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15913
  
Merged build finished. Test FAILed.





[GitHub] spark issue #15913: [SPARK-18481][ML] ML 2.1 QA: Remove deprecated methods f...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15913
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68892/
Test FAILed.





[GitHub] spark issue #15913: [SPARK-18481][ML] ML 2.1 QA: Remove deprecated methods f...

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15913
  
**[Test build #68892 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68892/consoleFull)**
 for PR 15913 at commit 
[`36c2e24`](https://github.com/apache/spark/commit/36c2e248c8c1436b126352ee50ded5f01c03c4c8).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #15889: [SPARK-18445][BUILD][DOCS] Fix the markdown for `...

2016-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/15889#discussion_r88781534
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/io/SparkHadoopMapReduceWriter.scala
 ---
@@ -119,7 +119,7 @@ object SparkHadoopMapReduceWriter extends Logging {
 }
   }
 
-  /** Write a RDD partition out in a single Spark task. */
+  /** Write an RDD partition out in a single Spark task. */
--- End diff --

Because `a RDD` was incorrect in English.





[GitHub] spark issue #15913: [SPARK-18481][ML] ML 2.1 QA: Remove deprecated methods f...

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15913
  
**[Test build #68892 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68892/consoleFull)**
 for PR 15913 at commit 
[`36c2e24`](https://github.com/apache/spark/commit/36c2e248c8c1436b126352ee50ded5f01c03c4c8).





[GitHub] spark issue #15868: [SPARK-18413][SQL] Add `maxConnections` JDBCOption

2016-11-19 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/15868
  
Hi, @srowen .
Could you review this PR again?





[GitHub] spark issue #15868: [SPARK-18413][SQL] Add `maxConnections` JDBCOption

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15868
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/6/
Test PASSed.





[GitHub] spark issue #15868: [SPARK-18413][SQL] Add `maxConnections` JDBCOption

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15868
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15868: [SPARK-18413][SQL] Add `maxConnections` JDBCOption

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15868
  
**[Test build #6 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/6/consoleFull)**
 for PR 15868 at commit 
[`ca75a4e`](https://github.com/apache/spark/commit/ca75a4e383ecd3e143c594a35279135022097ec3).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14627: [SPARK-16975][SQL][FOLLOWUP] Do not duplicately check fi...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14627
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68887/
Test PASSed.





[GitHub] spark issue #14627: [SPARK-16975][SQL][FOLLOWUP] Do not duplicately check fi...

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14627
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14627: [SPARK-16975][SQL][FOLLOWUP] Do not duplicately check fi...

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14627
  
**[Test build #68887 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68887/consoleFull)**
 for PR 14627 at commit 
[`b68ce0c`](https://github.com/apache/spark/commit/b68ce0c40b911176609839dfbf9a46ca07c323f0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15868: [SPARK-18413][SQL] Add `maxConnections` JDBCOption

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15868
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68886/
Test PASSed.





[GitHub] spark issue #15868: [SPARK-18413][SQL] Add `maxConnections` JDBCOption

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15868
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15868: [SPARK-18413][SQL] Add `maxConnections` JDBCOption

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15868
  
**[Test build #68886 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68886/consoleFull)**
 for PR 15868 at commit 
[`0391dbf`](https://github.com/apache/spark/commit/0391dbf9df23b90dcd690886dc2c08b7cba7620b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15730: [SPARK-18218][ML][MLLib] Optimize BlockMatrix multiplica...

2016-11-19 Thread WeichenXu123
Github user WeichenXu123 commented on the issue:

https://github.com/apache/spark/pull/15730
  
@brkyvz

Good question about the shuffled data in the sparse case. Here is a simple 
analysis (perhaps not fully rigorous):
As discussed above, the shuffled data comes from step 1 and step 2.
If we keep the parallelism the same, increasing `midDimSplitNum` means 
decreasing `ShuffleRowPartitions` and `ShuffleColPartitions`. In that case, 
the step-1 shuffled data always decreases and the step-2 shuffled data 
always increases.
Now consider the sparse case. The step-2 shuffled data increases because, 
when each pair of left-matrix row-sets and right-matrix col-sets is multiplied, 
the multiplication is split into `midDimSplitNum` parts. Each part is 
multiplied separately and then all parts are aggregated, so the more parts 
there are to aggregate, the more data is shuffled. BUT, in the sparse case, 
these parts (after multiplication) are empty with high probability, and an 
empty part shuffles nothing. So in the sparse case we can expect the step-2 
shuffled data not to increase much compared with the `midDimSplitNum = 1` 
case, because most split parts shuffle nothing.
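
To make the argument above concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions, not the code in this PR: the function name `count_step2_partials`, the block-position sets, and the rule that assigns a middle-dimension index to a part are all hypothetical. It counts how many non-empty partial products would need to be aggregated in step 2 for a given `midDimSplitNum`.

```python
from collections import defaultdict


def count_step2_partials(a_blocks, b_blocks, mid_blocks, mid_dim_split_num):
    """Count non-empty partial products aggregated in step 2 (toy model).

    a_blocks: set of (i, k) positions of non-empty blocks in the left matrix
    b_blocks: set of (k, j) positions of non-empty blocks in the right matrix
    mid_blocks: number of blocks along the shared (middle) dimension
    mid_dim_split_num: number of parts the middle dimension is split into
    """
    # Index the right matrix by its middle-dimension block index k.
    b_by_k = defaultdict(list)
    for k, j in b_blocks:
        b_by_k[k].append(j)

    partials = set()
    for i, k in a_blocks:
        # Assumed assignment of middle index k to one of the parts.
        part = k * mid_dim_split_num // mid_blocks
        for j in b_by_k.get(k, []):
            # Output block (i, j) receives a non-empty partial from this part.
            partials.add((i, j, part))
    return len(partials)


n = 4
dense = {(r, c) for r in range(n) for c in range(n)}
diagonal = {(r, r) for r in range(n)}  # a very sparse block pattern

for name, blocks in [("dense", dense), ("diagonal", diagonal)]:
    for split in (1, n):
        print(name, split, count_step2_partials(blocks, blocks, n, split))
```

In this toy model the dense 4x4 block pattern goes from 16 partials at `midDimSplitNum = 1` to 64 at `midDimSplitNum = 4`, while the diagonal pattern stays at 4 in both cases, which matches the intuition that empty parts shuffle nothing.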

I am now considering improving the API to make it easier to use.
I would like to let the user specify the parallelism and have the algorithm 
automatically calculate the optimal `midDimSplitNum`. What do you think?

And thanks for the careful review!
 





[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate

2016-11-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13065
  
**[Test build #68891 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68891/consoleFull)**
 for PR 13065 at commit 
[`ffd5ef8`](https://github.com/apache/spark/commit/ffd5ef83fe5d5f85582faa275a5a58816da0c712).





[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate

2016-11-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13065
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68885/
Test FAILed.




