spark git commit: [SPARK-20963][SQL] Support column aliases for join relations in FROM clause

2017-08-05 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/master 41568e9a0 -> 990efad1c


[SPARK-20963][SQL] Support column aliases for join relations in FROM clause

## What changes were proposed in this pull request?
This PR adds parsing rules to support column aliases for join relations in the FROM clause.
This PR is a sub-task of #18079.
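
For example, the following now parses and resolves (a minimal Scala sketch built around the example in the diff below, runnable in `spark-shell` where `spark` is the predefined SparkSession; the `src1`/`src2` views are created here only for illustration):

```
// Two small temporary views to join; only the joined output matters for the example.
spark.sql("CREATE TEMPORARY VIEW src1 AS SELECT * FROM VALUES (1, 'a') AS t(id, v1)")
spark.sql("CREATE TEMPORARY VIEW src2 AS SELECT * FROM VALUES (1, 'b') AS t(id, v2)")

// New in this PR: column aliases `dst(a, b, c, d)` on a parenthesized join relation
// rename the join's output columns to a, b, c, d.
spark.sql(
  """
    |SELECT a, b, c, d
    |FROM (src1 s1 INNER JOIN src2 s2 ON s1.id = s2.id) dst(a, b, c, d)
  """.stripMargin).show()
```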

## How was this patch tested?
Added tests in `AnalysisSuite`, `PlanParserSuite`, and `SQLQueryTestSuite`.

Author: Takeshi Yamamuro 

Closes #18772 from maropu/SPARK-20963-2.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/990efad1
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/990efad1
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/990efad1

Branch: refs/heads/master
Commit: 990efad1c62dec8f80debb6a1b11bdd030142768
Parents: 41568e9
Author: Takeshi Yamamuro 
Authored: Sat Aug 5 20:35:54 2017 -0700
Committer: gatorsmile 
Committed: Sat Aug 5 20:35:54 2017 -0700

--
 .../apache/spark/sql/catalyst/parser/SqlBase.g4 | 10 ++---
 .../spark/sql/catalyst/parser/AstBuilder.scala  | 44 +---
 .../sql/catalyst/analysis/AnalysisSuite.scala   | 24 ++-
 .../sql/catalyst/parser/PlanParserSuite.scala   | 13 ++
 .../sql-tests/inputs/table-aliases.sql  |  7 
 .../sql-tests/results/table-aliases.sql.out | 28 -
 6 files changed, 104 insertions(+), 22 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/990efad1/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
--
diff --git 
a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 
b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
index 4534b7d..954955b 100644
--- 
a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
+++ 
b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
@@ -473,11 +473,11 @@ identifierComment
 ;
 
 relationPrimary
-: tableIdentifier sample? tableAlias   #tableName
-| '(' queryNoWith ')' sample? tableAlias   #aliasedQuery
-| '(' relation ')' sample? (AS? strictIdentifier)? #aliasedRelation
-| inlineTable  #inlineTableDefault2
-| functionTable#tableValuedFunction
+: tableIdentifier sample? tableAlias  #tableName
+| '(' queryNoWith ')' sample? tableAlias  #aliasedQuery
+| '(' relation ')' sample? tableAlias #aliasedRelation
+| inlineTable #inlineTableDefault2
+| functionTable   #tableValuedFunction
 ;
 
 inlineTable

http://git-wip-us.apache.org/repos/asf/spark/blob/990efad1/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
--
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
index 5935017..532d6ee 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
@@ -739,12 +739,14 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   /**
* Create an alias (SubqueryAlias) for a join relation. This is practically the same as
* visitAliasedQuery and visitNamedExpression, ANTLR4 however requires us to use 3 different
-   * hooks.
+   * hooks. We could add alias names for output columns, for example:
+   * {{{
+   *   SELECT a, b, c, d FROM (src1 s1 INNER JOIN src2 s2 ON s1.id = s2.id) dst(a, b, c, d)
+   * }}}
*/
   override def visitAliasedRelation(ctx: AliasedRelationContext): LogicalPlan = withOrigin(ctx) {
-plan(ctx.relation)
-  .optionalMap(ctx.sample)(withSample)
-  .optionalMap(ctx.strictIdentifier)(aliasPlan)
+val relation = plan(ctx.relation).optionalMap(ctx.sample)(withSample)
+mayApplyAliasPlan(ctx.tableAlias, relation)
   }
 
   /**
@@ -756,32 +758,44 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
* }}}
*/
   override def visitAliasedQuery(ctx: AliasedQueryContext): LogicalPlan = withOrigin(ctx) {
-val alias = if (ctx.tableAlias.strictIdentifier == null) {
+val relation = plan(ctx.queryNoWith).optionalMap(ctx.sample)(withSample)
+if (ctx.tableAlias.strictIdentifier == null) {
   // For un-aliased subqueries, use a default alias name that is not likely to conflict with
   // 

spark git commit: [SPARK-21637][SPARK-21451][SQL] get `spark.hadoop.*` properties from sysProps to hiveconf

2017-08-05 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/master dcac1d57f -> 41568e9a0


[SPARK-21637][SPARK-21451][SQL] get `spark.hadoop.*` properties from sysProps 
to hiveconf

## What changes were proposed in this pull request?
When we run the `bin/spark-sql` command with `--conf spark.hadoop.foo=bar`, the
`SparkSQLCLIDriver` initializes an instance of HiveConf but does not add `foo->bar` to it.
This PR copies the `spark.hadoop.*` properties from the system properties into that HiveConf.
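
As a rough sketch of the new helper this adds to `SparkHadoopUtil` (see the diff below; whether `SparkHadoopUtil.get` is reachable from user code depends on its visibility in your Spark version):

```
import org.apache.hadoop.conf.Configuration
import org.apache.spark.SparkConf
import org.apache.spark.deploy.SparkHadoopUtil

val sparkConf = new SparkConf().set("spark.hadoop.foo", "bar")
val hadoopConf = new Configuration()

// Copies every "spark.hadoop.foo=bar" entry into hadoopConf as "foo=bar".
SparkHadoopUtil.get.appendSparkHadoopConfigs(sparkConf, hadoopConf)
assert(hadoopConf.get("foo") == "bar")
```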

## How was this patch tested?
Unit tests (`CliSuite` and `HiveUtilsSuite`).

Author: hzyaoqin 
Author: Kent Yao 

Closes #18668 from yaooqinn/SPARK-21451.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/41568e9a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/41568e9a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/41568e9a

Branch: refs/heads/master
Commit: 41568e9a0fc4f1373171c6f8dc33c87d9affde70
Parents: dcac1d5
Author: hzyaoqin 
Authored: Sat Aug 5 17:30:47 2017 -0700
Committer: gatorsmile 
Committed: Sat Aug 5 17:30:47 2017 -0700

--
 .../apache/spark/deploy/SparkHadoopUtil.scala   | 33 +++
 docs/configuration.md   | 34 +++-
 .../hive/thriftserver/SparkSQLCLIDriver.scala   | 19 +++
 .../spark/sql/hive/thriftserver/CliSuite.scala  | 13 
 .../org/apache/spark/sql/hive/HiveUtils.scala   |  8 +
 .../apache/spark/sql/hive/HiveUtilsSuite.scala  |  9 ++
 6 files changed, 102 insertions(+), 14 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/41568e9a/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala
--
diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala 
b/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala
index ce916b4..eeb6d10 100644
--- a/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala
@@ -22,8 +22,10 @@ import java.security.PrivilegedExceptionAction
 import java.text.DateFormat
 import java.util.{Arrays, Comparator, Date, Locale}
 
+import scala.collection.immutable.Map
 import scala.collection.JavaConverters._
 import scala.collection.mutable
+import scala.collection.mutable.HashMap
 import scala.util.control.NonFatal
 
 import com.google.common.primitives.Longs
@@ -74,7 +76,6 @@ class SparkHadoopUtil extends Logging {
 }
   }
 
-
   /**
* Appends S3-specific, spark.hadoop.*, and spark.buffer.size configurations 
to a Hadoop
* configuration.
@@ -99,18 +100,36 @@ class SparkHadoopUtil extends Logging {
   hadoopConf.set("fs.s3a.session.token", sessionToken)
 }
   }
-  // Copy any "spark.hadoop.foo=bar" system properties into conf as "foo=bar"
-  conf.getAll.foreach { case (key, value) =>
-if (key.startsWith("spark.hadoop.")) {
-  hadoopConf.set(key.substring("spark.hadoop.".length), value)
-}
-  }
+  appendSparkHadoopConfigs(conf, hadoopConf)
   val bufferSize = conf.get("spark.buffer.size", "65536")
   hadoopConf.set("io.file.buffer.size", bufferSize)
 }
   }
 
   /**
+   * Appends spark.hadoop.* configurations from a [[SparkConf]] to a Hadoop
+   * configuration without the spark.hadoop. prefix.
+   */
+  def appendSparkHadoopConfigs(conf: SparkConf, hadoopConf: Configuration): Unit = {
+// Copy any "spark.hadoop.foo=bar" spark properties into conf as "foo=bar"
+for ((key, value) <- conf.getAll if key.startsWith("spark.hadoop.")) {
+  hadoopConf.set(key.substring("spark.hadoop.".length), value)
+}
+  }
+
+  /**
+   * Appends spark.hadoop.* configurations from a Map to another without the spark.hadoop. prefix.
+   */
+  def appendSparkHadoopConfigs(
+  srcMap: Map[String, String],
+  destMap: HashMap[String, String]): Unit = {
+// Copy any "spark.hadoop.foo=bar" system properties into destMap as "foo=bar"
+for ((key, value) <- srcMap if key.startsWith("spark.hadoop.")) {
+  destMap.put(key.substring("spark.hadoop.".length), value)
+}
+  }
+
+  /**
* Return an appropriate (subclass) of Configuration. Creating config can 
initializes some Hadoop
* subsystems.
*/

http://git-wip-us.apache.org/repos/asf/spark/blob/41568e9a/docs/configuration.md
--
diff --git a/docs/configuration.md b/docs/configuration.md
index 011d583..e7c0306 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -2357,5 +2357,37 @@ The location of these configuration files varies across 
Hadoop versions, but
 a common location is inside of `/etc/hadoop/conf`. Some tools 

spark git commit: [SPARK-21640] Add errorifexists as a valid string for ErrorIfExists save mode

2017-08-05 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/master ba327ee54 -> dcac1d57f


[SPARK-21640] Add errorifexists as a valid string for ErrorIfExists save mode

## What changes were proposed in this pull request?

This PR makes the string "errorifexists" another accepted alias for the ErrorIfExists save mode.
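
For example, both of the following are now accepted by `DataFrameWriter.mode(String)` (a minimal sketch; the output paths are illustrative):

```
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("save-mode-demo").getOrCreate()
val df = spark.range(10).toDF("id")

// "error" already mapped to SaveMode.ErrorIfExists; "errorifexists" now does too.
df.write.mode("error").parquet("/tmp/save-mode-demo-1")
df.write.mode("errorifexists").parquet("/tmp/save-mode-demo-2")
```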

## How was this patch tested?

Unit tests and manual tests

Author: arodriguez 

Closes #18844 from ardlema/SPARK-21640.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/dcac1d57
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/dcac1d57
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/dcac1d57

Branch: refs/heads/master
Commit: dcac1d57f0fd05605edf596c303546d83062a352
Parents: ba327ee
Author: arodriguez 
Authored: Sat Aug 5 11:21:51 2017 -0700
Committer: gatorsmile 
Committed: Sat Aug 5 11:21:51 2017 -0700

--
 sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/dcac1d57/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
--
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
index 0fcda46..079f699 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
@@ -71,7 +71,7 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
   case "overwrite" => SaveMode.Overwrite
   case "append" => SaveMode.Append
   case "ignore" => SaveMode.Ignore
-  case "error" | "default" => SaveMode.ErrorIfExists
+  case "error" | "errorifexists" | "default" => SaveMode.ErrorIfExists
   case _ => throw new IllegalArgumentException(s"Unknown save mode: $saveMode. " +
 "Accepted save modes are 'overwrite', 'append', 'ignore', 'error'.")
 }





[2/2] spark git commit: [SPARK-21485][FOLLOWUP][SQL][DOCS] Describes examples and arguments separately, and note/since in SQL built-in function documentation

2017-08-05 Thread lixiao
[SPARK-21485][FOLLOWUP][SQL][DOCS] Describes examples and arguments separately, 
and note/since in SQL built-in function documentation

## What changes were proposed in this pull request?

This PR proposes to separate `extended` into `examples` and `arguments` internally so that
both can be documented separately, and to add `since` and `note` for additional information.

For `since`, it looks like users sometimes get confused by missing version information
(to my knowledge). For example, see
https://www.mail-archive.com/userspark.apache.org/msg64798.html

For a few good examples of the built documentation, please see both:
`from_json` - https://spark-test.github.io/sparksqldoc/#from_json
`like` - https://spark-test.github.io/sparksqldoc/#like

For `DESCRIBE FUNCTION`, `note` and `since` are added as below:

```
> DESCRIBE FUNCTION EXTENDED rlike;
...
Extended Usage:
Arguments:
  ...

Examples:
  ...

Note:
  Use LIKE to match with simple string pattern
```

```
> DESCRIBE FUNCTION EXTENDED to_json;
...
Examples:
  ...

Since: 2.2.0
```
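
The same sections can be checked from Scala as well (a minimal sketch using `spark.sql`, assuming a plain local session):

```
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("describe-function").getOrCreate()

// The note/since sections added by this PR show up in the extended description.
spark.sql("DESCRIBE FUNCTION EXTENDED rlike").show(truncate = false)
spark.sql("DESCRIBE FUNCTION EXTENDED to_json").show(truncate = false)
```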

For the complete documentation, see https://spark-test.github.io/sparksqldoc/

## How was this patch tested?

Manual tests and existing tests. Please see 
https://spark-test.github.io/sparksqldoc

Jenkins tests are needed to double-check.

Author: hyukjinkwon 

Closes #18749 from HyukjinKwon/followup-sql-doc-gen.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ba327ee5
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ba327ee5
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ba327ee5

Branch: refs/heads/master
Commit: ba327ee54c32b11107793604895bd38559804858
Parents: 3a45c7f
Author: hyukjinkwon 
Authored: Sat Aug 5 10:10:56 2017 -0700
Committer: gatorsmile 
Committed: Sat Aug 5 10:10:56 2017 -0700

--
 .../expressions/ExpressionDescription.java  |  42 +-
 .../catalyst/expressions/ExpressionInfo.java|  65 -
 .../catalyst/analysis/FunctionRegistry.scala|  20 ++-
 .../expressions/CallMethodViaReflection.scala   |   2 +-
 .../spark/sql/catalyst/expressions/Cast.scala   |   2 +-
 .../aggregate/ApproximatePercentile.scala   |   2 +-
 .../sql/catalyst/expressions/arithmetic.scala   |  20 +--
 .../expressions/bitwiseExpressions.scala|   8 +-
 .../expressions/collectionOperations.scala  |  10 +-
 .../expressions/complexTypeCreator.scala|  11 +-
 .../expressions/conditionalExpressions.scala|   2 +-
 .../expressions/datetimeExpressions.scala   |  52 +++
 .../sql/catalyst/expressions/generators.scala   |   8 +-
 .../spark/sql/catalyst/expressions/hash.scala   |  10 +-
 .../catalyst/expressions/jsonExpressions.scala  |  14 +-
 .../catalyst/expressions/mathExpressions.scala  |  80 +--
 .../spark/sql/catalyst/expressions/misc.scala   |   6 +-
 .../catalyst/expressions/nullExpressions.scala  |  18 +--
 .../expressions/randomExpressions.scala |   4 +-
 .../expressions/regexpExpressions.scala |  69 -
 .../expressions/stringExpressions.scala |  87 ++--
 .../sql/catalyst/expressions/xml/xpath.scala|  16 +--
 .../resources/sql-tests/results/cast.sql.out|   4 +-
 .../sql-tests/results/json-functions.sql.out|   4 +
 sql/gen-sql-markdown.py | 142 ---
 25 files changed, 461 insertions(+), 237 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/ba327ee5/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/ExpressionDescription.java
--
diff --git 
a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/ExpressionDescription.java
 
b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/ExpressionDescription.java
index 62a2ce4..ea6fffa 100644
--- 
a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/ExpressionDescription.java
+++ 
b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/ExpressionDescription.java
@@ -24,20 +24,50 @@ import java.lang.annotation.RetentionPolicy;
 
 /**
  * ::DeveloperApi::
-
+ *
  * A function description type which can be recognized by FunctionRegistry, 
and will be used to
  * show the usage of the function in human language.
  *
  * `usage()` will be used for the function usage in brief way.
- * `extended()` will be used for the function usage in verbose way, suppose
- *  an example will be provided.
  *
- *  And we can refer the function name by `_FUNC_`, in `usage` and `extended`, 
as it's
+ * These below are concatenated and used for the function usage in verbose 
way, suppose arguments,
+ * examples, note and since 

[1/2] spark git commit: [SPARK-21485][FOLLOWUP][SQL][DOCS] Describes examples and arguments separately, and note/since in SQL built-in function documentation

2017-08-05 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/master 3a45c7fee -> ba327ee54


http://git-wip-us.apache.org/repos/asf/spark/blob/ba327ee5/sql/gen-sql-markdown.py
--
diff --git a/sql/gen-sql-markdown.py b/sql/gen-sql-markdown.py
index 8132af2..fa8124b 100644
--- a/sql/gen-sql-markdown.py
+++ b/sql/gen-sql-markdown.py
@@ -19,7 +19,8 @@ import sys
 import os
 from collections import namedtuple
 
-ExpressionInfo = namedtuple("ExpressionInfo", "className usage name extended")
+ExpressionInfo = namedtuple(
+"ExpressionInfo", "className name usage arguments examples note since")
 
 
 def _list_function_infos(jvm):
@@ -34,20 +35,21 @@ def _list_function_infos(jvm):
 name = jinfo.getName()
 usage = jinfo.getUsage()
 usage = usage.replace("_FUNC_", name) if usage is not None else usage
-extended = jinfo.getExtended()
-extended = extended.replace("_FUNC_", name) if extended is not None 
else extended
 infos.append(ExpressionInfo(
 className=jinfo.getClassName(),
-usage=usage,
 name=name,
-extended=extended))
+usage=usage,
+arguments=jinfo.getArguments().replace("_FUNC_", name),
+examples=jinfo.getExamples().replace("_FUNC_", name),
+note=jinfo.getNote(),
+since=jinfo.getSince()))
 return sorted(infos, key=lambda i: i.name)
 
 
 def _make_pretty_usage(usage):
 """
-Makes the usage description pretty and returns a formatted string.
-Otherwise, returns None.
+Makes the usage description pretty and returns a formatted string if 
`usage`
+is not an empty string. Otherwise, returns None.
 """
 
 if usage is not None and usage.strip() != "":
@@ -55,32 +57,136 @@ def _make_pretty_usage(usage):
 return "%s\n\n" % usage
 
 
-def _make_pretty_extended(extended):
+def _make_pretty_arguments(arguments):
+"""
+Makes the arguments description pretty and returns a formatted string if 
`arguments`
+starts with the argument prefix. Otherwise, returns None.
+
+Expected input:
+
+Arguments:
+  * arg0 - ...
+  ...
+  * arg0 - ...
+  ...
+
+Expected output:
+**Arguments:**
+
+* arg0 - ...
+...
+* arg0 - ...
+...
+
+"""
+
+if arguments.startswith("\nArguments:"):
+arguments = "\n".join(map(lambda u: u[6:], 
arguments.strip().split("\n")[1:]))
+return "**Arguments:**\n\n%s\n\n" % arguments
+
+
+def _make_pretty_examples(examples):
 """
-Makes the extended description pretty and returns a formatted string.
-Otherwise, returns None.
+Makes the examples description pretty and returns a formatted string if 
`examples`
+starts with the example prefix. Otherwise, returns None.
+
+Expected input:
+
+Examples:
+  > SELECT ...;
+   ...
+  > SELECT ...;
+   ...
+
+Expected output:
+**Examples:**
+
+```
+> SELECT ...;
+ ...
+> SELECT ...;
+ ...
+```
+
 """
 
-if extended is not None and extended.strip() != "":
-extended = "\n".join(map(lambda u: u.strip(), extended.split("\n")))
-return "```%s```\n\n" % extended
+if examples.startswith("\nExamples:"):
+examples = "\n".join(map(lambda u: u[6:], 
examples.strip().split("\n")[1:]))
+return "**Examples:**\n\n```\n%s\n```\n\n" % examples
+
+
+def _make_pretty_note(note):
+"""
+Makes the note description pretty and returns a formatted string if `note` 
is not
+an empty string. Otherwise, returns None.
+
+Expected input:
+
+...
+
+Expected output:
+**Note:**
+
+...
+
+"""
+
+if note != "":
+note = "\n".join(map(lambda n: n[4:], note.split("\n")))
+return "**Note:**\n%s\n" % note
 
 
 def generate_sql_markdown(jvm, path):
 """
 Generates a markdown file after listing the function information. The 
output file
 is created in `path`.
+
+Expected output:
+### NAME
+
+USAGE
+
+**Arguments:**
+
+ARGUMENTS
+
+**Examples:**
+
+```
+EXAMPLES
+```
+
+**Note:**
+
+NOTE
+
+**Since:** SINCE
+
+
+
 """
 
 with open(path, 'w') as mdfile:
 for info in _list_function_infos(jvm):
-mdfile.write("### %s\n\n" % info.name)
+name = info.name
 usage = _make_pretty_usage(info.usage)
-extended = _make_pretty_extended(info.extended)
+arguments = _make_pretty_arguments(info.arguments)
+examples = _make_pretty_examples(info.examples)
+note = _make_pretty_note(info.note)
+since = info.since
+
+mdfile.write("### %s\n\n" % name)
 if usage is not None:
-mdfile.write(usage)
-if extended is not None:
-

spark git commit: [INFRA] Close stale PRs

2017-08-05 Thread gurwls223
Repository: spark
Updated Branches:
  refs/heads/master 894d5a453 -> 3a45c7fee


[INFRA] Close stale PRs

## What changes were proposed in this pull request?

This PR proposes to close stale PRs, mostly the same instances with #18017

Closes #14085 - [SPARK-16408][SQL] SparkSQL Added file get Exception: is a 
directory …
Closes #14239 - [SPARK-16593] [CORE] [WIP] Provide a pre-fetch mechanism to 
accelerate shuffle stage.
Closes #14567 - [SPARK-16992][PYSPARK] Python Pep8 formatting and import 
reorganisation
Closes #14579 - [SPARK-16921][PYSPARK] RDD/DataFrame persist()/cache() should 
return Python context managers
Closes #14601 - [SPARK-13979][Core] Killed executor is re spawned without AWS 
key…
Closes #14830 - [SPARK-16992][PYSPARK][DOCS] import sort and autopep8 on 
Pyspark examples
Closes #14963 - [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in 
lint-python
Closes #15227 - [SPARK-17655][SQL]Remove unused variables declarations and 
definations in a WholeStageCodeGened stage
Closes #15240 - [SPARK-17556] [CORE] [SQL] Executor side broadcast for 
broadcast joins
Closes #15405 - [SPARK-15917][CORE] Added support for number of executors in 
Standalone [WIP]
Closes #16099 - [SPARK-18665][SQL] set statement state to "ERROR" after user 
cancel job
Closes #16445 - [SPARK-19043][SQL]Make SparkSQLSessionManager more configurable
Closes #16618 - [SPARK-14409][ML][WIP] Add RankingEvaluator
Closes #16766 - [SPARK-19426][SQL] Custom coalesce for Dataset
Closes #16832 - [SPARK-19490][SQL] ignore case sensitivity when filtering hive 
partition columns
Closes #17052 - [SPARK-19690][SS] Join a streaming DataFrame with a batch 
DataFrame which has an aggregation may not work
Closes #17267 - [SPARK-19926][PYSPARK] Make pyspark exception more user-friendly
Closes #17371 - [SPARK-19903][PYSPARK][SS] window operator miss the `watermark` 
metadata of time column
Closes #17401 - [SPARK-18364][YARN] Expose metrics for YarnShuffleService
Closes #17519 - [SPARK-15352][Doc] follow-up: add configuration docs for 
topology-aware block replication
Closes #17530 - [SPARK-5158] Access kerberized HDFS from Spark standalone
Closes #17854 - [SPARK-20564][Deploy] Reduce massive executor failures when 
executor count is large (>2000)
Closes #17979 - [SPARK-19320][MESOS][WIP]allow specifying a hard limit on 
number of gpus required in each spark executor when running on mesos
Closes #18127 - [SPARK-6628][SQL][Branch-2.1] Fix ClassCastException when 
executing sql statement 'insert into' on hbase table
Closes #18236 - [SPARK-21015] Check field name is not null and empty in 
GenericRowWit…
Closes #18269 - [SPARK-21056][SQL] Use at most one spark job to list files in 
InMemoryFileIndex
Closes #18328 - [SPARK-21121][SQL] Support changing storage level via the 
spark.sql.inMemoryColumnarStorage.level variable
Closes #18354 - [SPARK-18016][SQL][CATALYST][BRANCH-2.1] Code Generation: 
Constant Pool Limit - Class Splitting
Closes #18383 - [SPARK-21167][SS] Set kafka clientId while fetch messages
Closes #18414 - [SPARK-21169] [core] Make sure to update application status to 
RUNNING if executors are accepted and RUNNING after recovery
Closes #18432 - resolve com.esotericsoftware.kryo.KryoException
Closes #18490 - [SPARK-21269][Core][WIP] Fix FetchFailedException when enable 
maxReqSizeShuffleToMem and KryoSerializer
Closes #18585 - SPARK-21359
Closes #18609 - Spark SQL merge small files to big files Update 
InsertIntoHiveTable.scala

Added:
Closes #18308 - [SPARK-21099][Spark Core] INFO Log Message Using Incorrect 
Executor I…
Closes #18599 - [SPARK-21372] spark writes one log file even I set the number 
of spark_rotate_log to 0
Closes #18619 - [SPARK-21397][BUILD]Maven shade plugin adding 
dependency-reduced-pom.xml to …
Closes #18667 - Fix the simpleString used in error messages
Closes #18782 - Branch 2.1

Added:
Closes #17694 - [SPARK-12717][PYSPARK] Resolving race condition with pyspark 
broadcasts when using multiple threads

Added:
Closes #16456 - [SPARK-18994] clean up the local directories for application in 
future by annother thread
Closes #18683 - [SPARK-21474][CORE] Make number of parallel fetches from a 
reducer configurable
Closes #18690 - [SPARK-21334][CORE] Add metrics reporting service to External 
Shuffle Server

Added:
Closes #18827 - Merge pull request 1 from apache/master

## How was this patch tested?

N/A

Author: hyukjinkwon 

Closes #18780 from HyukjinKwon/close-prs.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/3a45c7fe
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/3a45c7fe
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/3a45c7fe

Branch: refs/heads/master
Commit: 3a45c7fee6190270505d32409184b6ed1ed7b52b
Parents: 894d5a4
Author: hyukjinkwon 
Authored: Sat Aug 5 21:58:38 2017 +0900
Committer: hyukjinkwon 
Committed: Sat Aug 5 21:58:38 2017 +0900