spark git commit: [MINOR][DOCS] Minor doc fixes related with doc build and uses script dir in SQL doc gen script

2017-08-25 Thread gurwls223
Repository: spark
Updated Branches:
  refs/heads/master 522e1f80d -> 3b66b1c44


[MINOR][DOCS] Minor doc fixes related with doc build and uses script dir in SQL 
doc gen script

## What changes were proposed in this pull request?

This PR proposes two changes:

- Add information about Javadoc, SQL docs, and a few other details to
`docs/README.md`, and a Javadoc-related comment to `docs/_plugins/copy_api_dirs.rb`.

- Add commands so that the SQL docs build always runs under the `./sql`
directory (so `./sql/create-docs.sh` can be run directly from the root directory).

## How was this patch tested?

Manual tests with `jekyll build` and `./sql/create-docs.sh` in the root 
directory.

Author: hyukjinkwon 

Closes #19019 from HyukjinKwon/minor-doc-build.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/3b66b1c4
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/3b66b1c4
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/3b66b1c4

Branch: refs/heads/master
Commit: 3b66b1c44060fb0ebf292830b08f71e990779800
Parents: 522e1f8
Author: hyukjinkwon 
Authored: Sat Aug 26 13:56:24 2017 +0900
Committer: hyukjinkwon 
Committed: Sat Aug 26 13:56:24 2017 +0900

--
 docs/README.md | 70 +
 docs/_plugins/copy_api_dirs.rb |  2 +-
 sql/create-docs.sh |  4 +++
 3 files changed, 45 insertions(+), 31 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/3b66b1c4/docs/README.md
--
diff --git a/docs/README.md b/docs/README.md
index 866364f..225bb1b 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -9,19 +9,22 @@ documentation yourself. Why build it yourself? So that you 
have the docs that correspond to
 whichever version of Spark you currently have checked out of revision control.
 
 ## Prerequisites
-The Spark documentation build uses a number of tools to build HTML docs and 
API docs in Scala,
-Python and R.
+
+The Spark documentation build uses a number of tools to build HTML docs and 
API docs in Scala, Java,
+Python, R and SQL.
 
 You need to have 
[Ruby](https://www.ruby-lang.org/en/documentation/installation/) and
 
[Python](https://docs.python.org/2/using/unix.html#getting-and-installing-the-latest-version-of-python)
 installed. Also install the following libraries:
+
 ```sh
-$ sudo gem install jekyll jekyll-redirect-from pygments.rb
-$ sudo pip install Pygments
-# Following is needed only for generating API docs
-$ sudo pip install sphinx pypandoc mkdocs
-$ sudo Rscript -e 'install.packages(c("knitr", "devtools", "roxygen2", 
"testthat", "rmarkdown"), repos="http://cran.stat.ucla.edu/;)'
+$ sudo gem install jekyll jekyll-redirect-from pygments.rb
+$ sudo pip install Pygments
+# Following is needed only for generating API docs
+$ sudo pip install sphinx pypandoc mkdocs
+$ sudo Rscript -e 'install.packages(c("knitr", "devtools", "roxygen2", 
"testthat", "rmarkdown"), repos="http://cran.stat.ucla.edu/;)'
 ```
+
 (Note: If you are on a system with both Ruby 1.9 and Ruby 2.0 you may need to 
replace gem with gem2.0)
 
 ## Generating the Documentation HTML
@@ -32,42 +35,49 @@ the source code and be captured by revision control 
(currently git). This way th
 includes the version of the documentation that is relevant regardless of which 
version or release
 you have checked out or downloaded.
 
-In this directory you will find textfiles formatted using Markdown, with an 
".md" suffix. You can
-read those text files directly if you want. Start with index.md.
+In this directory you will find text files formatted using Markdown, with an 
".md" suffix. You can
+read those text files directly if you want. Start with `index.md`.
 
 Execute `jekyll build` from the `docs/` directory to compile the site. 
Compiling the site with
-Jekyll will create a directory called `_site` containing index.html as well as 
the rest of the
+Jekyll will create a directory called `_site` containing `index.html` as well 
as the rest of the
 compiled files.
 
-$ cd docs
-$ jekyll build
+```sh
+$ cd docs
+$ jekyll build
+```
 
 You can modify the default Jekyll build as follows:
+
 ```sh
-# Skip generating API docs (which takes a while)
-$ SKIP_API=1 jekyll build
-
-# Serve content locally on port 4000
-$ jekyll serve --watch
-
-# Build the site with extra features used on the live page
-$ PRODUCTION=1 jekyll build
+# Skip generating API docs (which takes a while)
+$ SKIP_API=1 jekyll build
+
+# Serve content locally on port 4000
+$ jekyll serve --watch
+
+# Build the site with extra features used on the live page
+$ PRODUCTION=1 jekyll build
 ```
 
-## API Docs (Scaladoc, Sphinx, 

spark git commit: [SPARK-21831][TEST] Remove `spark.sql.hive.convertMetastoreOrc` config in HiveCompatibilitySuite

2017-08-25 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/master 1a598d717 -> 522e1f80d


[SPARK-21831][TEST] Remove `spark.sql.hive.convertMetastoreOrc` config in 
HiveCompatibilitySuite

## What changes were proposed in this pull request?

[SPARK-19025](https://github.com/apache/spark/pull/16869) removes SQLBuilder, 
so we don't need the following in HiveCompatibilitySuite.

```scala
// Ensures that the plans generation use metastore relation and not OrcRelation
// Was done because SqlBuilder does not work with plans having logical relation
TestHive.setConf(HiveUtils.CONVERT_METASTORE_ORC, false)
```

## How was this patch tested?

Pass the existing Jenkins tests.

Author: Dongjoon Hyun 

Closes #19043 from dongjoon-hyun/SPARK-21831.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/522e1f80
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/522e1f80
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/522e1f80

Branch: refs/heads/master
Commit: 522e1f80d6e1f7647ecbb6392831832f3dad0f86
Parents: 1a598d7
Author: Dongjoon Hyun 
Authored: Fri Aug 25 19:51:13 2017 -0700
Committer: gatorsmile 
Committed: Fri Aug 25 19:51:13 2017 -0700

--
 .../spark/sql/hive/execution/HiveCompatibilitySuite.scala   | 5 -
 1 file changed, 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/522e1f80/sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveCompatibilitySuite.scala
--
diff --git 
a/sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveCompatibilitySuite.scala
 
b/sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveCompatibilitySuite.scala
index 0a53aac..45791c6 100644
--- 
a/sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveCompatibilitySuite.scala
+++ 
b/sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveCompatibilitySuite.scala
@@ -39,7 +39,6 @@ class HiveCompatibilitySuite extends HiveQueryFileTest with 
BeforeAndAfter {
   private val originalLocale = Locale.getDefault
   private val originalColumnBatchSize = TestHive.conf.columnBatchSize
   private val originalInMemoryPartitionPruning = 
TestHive.conf.inMemoryPartitionPruning
-  private val originalConvertMetastoreOrc = 
TestHive.conf.getConf(HiveUtils.CONVERT_METASTORE_ORC)
   private val originalCrossJoinEnabled = TestHive.conf.crossJoinEnabled
   private val originalSessionLocalTimeZone = TestHive.conf.sessionLocalTimeZone
 
@@ -58,9 +57,6 @@ class HiveCompatibilitySuite extends HiveQueryFileTest with 
BeforeAndAfter {
 TestHive.setConf(SQLConf.COLUMN_BATCH_SIZE, 5)
 // Enable in-memory partition pruning for testing purposes
 TestHive.setConf(SQLConf.IN_MEMORY_PARTITION_PRUNING, true)
-// Ensures that the plans generation use metastore relation and not 
OrcRelation
-// Was done because SqlBuilder does not work with plans having logical 
relation
-TestHive.setConf(HiveUtils.CONVERT_METASTORE_ORC, false)
 // Ensures that cross joins are enabled so that we can test them
 TestHive.setConf(SQLConf.CROSS_JOINS_ENABLED, true)
 // Fix session local timezone to America/Los_Angeles for those timezone 
sensitive tests
@@ -76,7 +72,6 @@ class HiveCompatibilitySuite extends HiveQueryFileTest with 
BeforeAndAfter {
   Locale.setDefault(originalLocale)
   TestHive.setConf(SQLConf.COLUMN_BATCH_SIZE, originalColumnBatchSize)
   TestHive.setConf(SQLConf.IN_MEMORY_PARTITION_PRUNING, 
originalInMemoryPartitionPruning)
-  TestHive.setConf(HiveUtils.CONVERT_METASTORE_ORC, 
originalConvertMetastoreOrc)
   TestHive.setConf(SQLConf.CROSS_JOINS_ENABLED, originalCrossJoinEnabled)
   TestHive.setConf(SQLConf.SESSION_LOCAL_TIMEZONE, 
originalSessionLocalTimeZone)
 





spark git commit: [SPARK-21837][SQL][TESTS] UserDefinedTypeSuite Local UDTs not actually testing what it intends

2017-08-25 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/master 51620e288 -> 1a598d717


[SPARK-21837][SQL][TESTS] UserDefinedTypeSuite Local UDTs not actually testing 
what it intends

## What changes were proposed in this pull request?

Adjust the Local UDTs test to assert on the actual results, and fix the index of
the vector column. See the JIRA for details.

## How was this patch tested?

Existing tests.

Author: Sean Owen 

Closes #19053 from srowen/SPARK-21837.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1a598d71
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1a598d71
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1a598d71

Branch: refs/heads/master
Commit: 1a598d717ca9cdebae70999c4ae9350e802f6863
Parents: 51620e2
Author: Sean Owen 
Authored: Fri Aug 25 13:29:40 2017 -0700
Committer: gatorsmile 
Committed: Fri Aug 25 13:29:40 2017 -0700

--
 .../org/apache/spark/sql/UserDefinedTypeSuite.scala | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/1a598d71/sql/core/src/test/scala/org/apache/spark/sql/UserDefinedTypeSuite.scala
--
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/UserDefinedTypeSuite.scala 
b/sql/core/src/test/scala/org/apache/spark/sql/UserDefinedTypeSuite.scala
index b096a6d..a08433b 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/UserDefinedTypeSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/UserDefinedTypeSuite.scala
@@ -203,12 +203,12 @@ class UserDefinedTypeSuite extends QueryTest with 
SharedSQLContext with ParquetT
 
   // Tests to make sure that all operators correctly convert types on the way 
out.
   test("Local UDTs") {
-    val df = Seq((1, new UDT.MyDenseVector(Array(0.1, 1.0)))).toDF("int", "vec")
-    df.collect()(0).getAs[UDT.MyDenseVector](1)
-    df.take(1)(0).getAs[UDT.MyDenseVector](1)
-    df.limit(1).groupBy('int).agg(first('vec)).collect()(0).getAs[UDT.MyDenseVector](0)
-    df.orderBy('int).limit(1).groupBy('int).agg(first('vec)).collect()(0)
-      .getAs[UDT.MyDenseVector](0)
+    val vec = new UDT.MyDenseVector(Array(0.1, 1.0))
+    val df = Seq((1, vec)).toDF("int", "vec")
+    assert(vec === df.collect()(0).getAs[UDT.MyDenseVector](1))
+    assert(vec === df.take(1)(0).getAs[UDT.MyDenseVector](1))
+    checkAnswer(df.limit(1).groupBy('int).agg(first('vec)), Row(1, vec))
+    checkAnswer(df.orderBy('int).limit(1).groupBy('int).agg(first('vec)), Row(1, vec))
   }
 
   test("UDTs with JSON") {





spark git commit: [SPARK-21756][SQL] Add JSON option to allow unquoted control characters

2017-08-25 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/master 628bdeabd -> 51620e288


[SPARK-21756][SQL] Add JSON option to allow unquoted control characters

## What changes were proposed in this pull request?

This patch adds the `allowUnquotedControlChars` option to the JSON data source to
allow JSON strings to contain unquoted control characters (ASCII characters with a
value less than 32, including tab and line-feed characters).
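
As a rough illustration of how the new option is used from Scala (the input path
and data below are hypothetical), it is set through the usual reader option API:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("unquoted-control-chars").getOrCreate()

// Hypothetical input file containing JSON strings with raw tab / line-feed
// characters, which the default (strict) parser rejects.
val df = spark.read
  .option("allowUnquotedControlChars", "true")  // option added by this patch
  .json("/tmp/events-with-control-chars.json")

df.show()
```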

## How was this patch tested?
Add new test cases

Author: vinodkc 

Closes #19008 from vinodkc/br_fix_SPARK-21756.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/51620e28
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/51620e28
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/51620e28

Branch: refs/heads/master
Commit: 51620e288b5e0a7fffc3899c9deadabace28e6d7
Parents: 628bdea
Author: vinodkc 
Authored: Fri Aug 25 10:18:03 2017 -0700
Committer: gatorsmile 
Committed: Fri Aug 25 10:18:03 2017 -0700

--
 python/pyspark/sql/readwriter.py |  8 ++--
 python/pyspark/sql/streaming.py  |  8 ++--
 .../apache/spark/sql/catalyst/json/JSONOptions.scala |  3 +++
 .../scala/org/apache/spark/sql/DataFrameReader.scala |  3 +++
 .../spark/sql/streaming/DataStreamReader.scala   |  3 +++
 .../datasources/json/JsonParsingOptionsSuite.scala   | 15 +++
 6 files changed, 36 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/51620e28/python/pyspark/sql/readwriter.py
--
diff --git a/python/pyspark/sql/readwriter.py b/python/pyspark/sql/readwriter.py
index 7279173..01da0dc 100644
--- a/python/pyspark/sql/readwriter.py
+++ b/python/pyspark/sql/readwriter.py
@@ -176,7 +176,7 @@ class DataFrameReader(OptionUtils):
  allowComments=None, allowUnquotedFieldNames=None, 
allowSingleQuotes=None,
  allowNumericLeadingZero=None, 
allowBackslashEscapingAnyCharacter=None,
  mode=None, columnNameOfCorruptRecord=None, dateFormat=None, 
timestampFormat=None,
- multiLine=None):
+ multiLine=None, allowUnquotedControlChars=None):
 """
 Loads JSON files and returns the results as a :class:`DataFrame`.
 
@@ -234,6 +234,9 @@ class DataFrameReader(OptionUtils):
 default value, 
``-MM-dd'T'HH:mm:ss.SSSXXX``.
 :param multiLine: parse one record, which may span multiple lines, per 
file. If None is
   set, it uses the default value, ``false``.
+:param allowUnquotedControlChars: allows JSON Strings to contain 
unquoted control
+  characters (ASCII characters with 
value less than 32,
+  including tab and line feed 
characters) or not.
 
 >>> df1 = spark.read.json('python/test_support/sql/people.json')
 >>> df1.dtypes
@@ -250,7 +253,8 @@ class DataFrameReader(OptionUtils):
 allowSingleQuotes=allowSingleQuotes, 
allowNumericLeadingZero=allowNumericLeadingZero,
 
allowBackslashEscapingAnyCharacter=allowBackslashEscapingAnyCharacter,
 mode=mode, columnNameOfCorruptRecord=columnNameOfCorruptRecord, 
dateFormat=dateFormat,
-timestampFormat=timestampFormat, multiLine=multiLine)
+timestampFormat=timestampFormat, multiLine=multiLine,
+allowUnquotedControlChars=allowUnquotedControlChars)
 if isinstance(path, basestring):
 path = [path]
 if type(path) == list:

http://git-wip-us.apache.org/repos/asf/spark/blob/51620e28/python/pyspark/sql/streaming.py
--
diff --git a/python/pyspark/sql/streaming.py b/python/pyspark/sql/streaming.py
index 5bbd70c..0cf7021 100644
--- a/python/pyspark/sql/streaming.py
+++ b/python/pyspark/sql/streaming.py
@@ -407,7 +407,7 @@ class DataStreamReader(OptionUtils):
  allowComments=None, allowUnquotedFieldNames=None, 
allowSingleQuotes=None,
  allowNumericLeadingZero=None, 
allowBackslashEscapingAnyCharacter=None,
  mode=None, columnNameOfCorruptRecord=None, dateFormat=None, 
timestampFormat=None,
- multiLine=None):
+ multiLine=None,  allowUnquotedControlChars=None):
 """
 Loads a JSON file stream and returns the results as a 
:class:`DataFrame`.
 
@@ -467,6 +467,9 @@ class DataStreamReader(OptionUtils):
 default value, 
``-MM-dd'T'HH:mm:ss.SSSXXX``.
 :param multiLine: parse one record, which may span multiple lines, per 
file. If None is

spark git commit: [SPARK-17742][CORE] Fail launcher app handle if child process exits with error.

2017-08-25 Thread vanzin
Repository: spark
Updated Branches:
  refs/heads/master 1813c4a8d -> 628bdeabd


[SPARK-17742][CORE] Fail launcher app handle if child process exits with error.

This is a follow-up to cba826d0; that commit set the app handle state to "LOST"
when the child process exited, but that can be ambiguous. This change sets the
state to "FAILED" if the exit code was non-zero and the handle state wasn't
already a failure state, or to "LOST" if the exit status was zero.
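
A condensed sketch of the resulting decision logic (a simplified model for
illustration only; the real code is in `ChildProcAppHandle.monitorChild()` in the
diff below, and the real state type is `SparkAppHandle.State`):

```scala
// Minimal stand-in for the handle states relevant to this change.
sealed abstract class HandleState(val isFinal: Boolean)
case object Running  extends HandleState(false)
case object Finished extends HandleState(true)
case object Failed   extends HandleState(true)
case object Lost     extends HandleState(true)

// Decide what to report once the child process exits with `exitCode`;
// None means the already-recorded state is left alone.
def stateAfterExit(current: HandleState, exitCode: Int): Option[HandleState] =
  if (exitCode != 0) {
    // Non-zero exit: report FAILED unless a non-success final state is already set.
    if (!current.isFinal || current == Finished) Some(Failed) else None
  } else if (!current.isFinal) {
    // Clean exit, but no final state was ever reported: the app is LOST.
    Some(Lost)
  } else {
    None
  }
```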

Author: Marcelo Vanzin 

Closes #19012 from vanzin/SPARK-17742.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/628bdeab
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/628bdeab
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/628bdeab

Branch: refs/heads/master
Commit: 628bdeabda3347d0903c9ac8748d37d7b379d1e6
Parents: 1813c4a
Author: Marcelo Vanzin 
Authored: Fri Aug 25 10:04:21 2017 -0700
Committer: Marcelo Vanzin 
Committed: Fri Aug 25 10:04:21 2017 -0700

--
 .../spark/launcher/ChildProcAppHandle.java  | 27 +++-
 .../spark/launcher/ChildProcAppHandleSuite.java | 21 ++-
 2 files changed, 41 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/628bdeab/launcher/src/main/java/org/apache/spark/launcher/ChildProcAppHandle.java
--
diff --git 
a/launcher/src/main/java/org/apache/spark/launcher/ChildProcAppHandle.java 
b/launcher/src/main/java/org/apache/spark/launcher/ChildProcAppHandle.java
index bf91640..5391d4a 100644
--- a/launcher/src/main/java/org/apache/spark/launcher/ChildProcAppHandle.java
+++ b/launcher/src/main/java/org/apache/spark/launcher/ChildProcAppHandle.java
@@ -156,9 +156,15 @@ class ChildProcAppHandle implements SparkAppHandle {
* the exit code.
*/
   void monitorChild() {
-while (childProc.isAlive()) {
+Process proc = childProc;
+if (proc == null) {
+  // Process may have already been disposed of, e.g. by calling kill().
+  return;
+}
+
+while (proc.isAlive()) {
   try {
-childProc.waitFor();
+proc.waitFor();
   } catch (Exception e) {
 LOG.log(Level.WARNING, "Exception waiting for child process to exit.", 
e);
   }
@@ -173,15 +179,24 @@ class ChildProcAppHandle implements SparkAppHandle {
 
   int ec;
   try {
-ec = childProc.exitValue();
+ec = proc.exitValue();
   } catch (Exception e) {
 LOG.log(Level.WARNING, "Exception getting child process exit code, 
assuming failure.", e);
 ec = 1;
   }
 
-  // Only override the success state; leave other fail states alone.
-  if (!state.isFinal() || (ec != 0 && state == State.FINISHED)) {
-state = State.LOST;
+  State newState = null;
+  if (ec != 0) {
+// Override state with failure if the current state is not final, or 
is success.
+if (!state.isFinal() || state == State.FINISHED) {
+  newState = State.FAILED;
+}
+  } else if (!state.isFinal()) {
+newState = State.LOST;
+  }
+
+  if (newState != null) {
+state = newState;
 fireEvent(false);
   }
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/628bdeab/launcher/src/test/java/org/apache/spark/launcher/ChildProcAppHandleSuite.java
--
diff --git 
a/launcher/src/test/java/org/apache/spark/launcher/ChildProcAppHandleSuite.java 
b/launcher/src/test/java/org/apache/spark/launcher/ChildProcAppHandleSuite.java
index 602f55a..3b4d1b0 100644
--- 
a/launcher/src/test/java/org/apache/spark/launcher/ChildProcAppHandleSuite.java
+++ 
b/launcher/src/test/java/org/apache/spark/launcher/ChildProcAppHandleSuite.java
@@ -46,7 +46,9 @@ public class ChildProcAppHandleSuite extends BaseSuite {
  private static final List<String> TEST_SCRIPT = Arrays.asList(
 "#!/bin/sh",
 "echo \"output\"",
-"echo \"error\" 1>&2");
+"echo \"error\" 1>&2",
+"while [ -n \"$1\" ]; do EC=$1; shift; done",
+"exit $EC");
 
   private static File TEST_SCRIPT_PATH;
 
@@ -176,6 +178,7 @@ public class ChildProcAppHandleSuite extends BaseSuite {
 
   @Test
   public void testProcMonitorWithOutputRedirection() throws Exception {
+assumeFalse(isWindows());
 File err = Files.createTempFile("out", "txt").toFile();
 SparkAppHandle handle = new TestSparkLauncher()
   .redirectError()
@@ -187,6 +190,7 @@ public class ChildProcAppHandleSuite extends BaseSuite {
 
   @Test
   public void testProcMonitorWithLogRedirection() throws Exception {
+assumeFalse(isWindows());
 SparkAppHandle handle = new TestSparkLauncher()
   .redirectToLog(getClass().getName())
   

spark git commit: [SPARK-21714][CORE][YARN] Avoiding re-uploading remote resources in yarn client mode

2017-08-25 Thread vanzin
Repository: spark
Updated Branches:
  refs/heads/master 1f24ceee6 -> 1813c4a8d


[SPARK-21714][CORE][YARN] Avoiding re-uploading remote resources in yarn client 
mode

## What changes were proposed in this pull request?

With SPARK-10643, Spark supports downloading remote resources in client deploy
mode. But the implementation overrides the variables representing the added
resources (like `args.jars`, `args.pyFiles`) with local paths, and the YARN
client then uses these local paths to re-upload the resources to the distributed
cache. This is unnecessary and breaks the semantics of putting resources in a
shared FS, so this PR proposes to fix it.

## How was this patch tested?

This was manually verified with jars and pyFiles in local and remote storage, in
both client and cluster mode.

Author: jerryshao 

Closes #18962 from jerryshao/SPARK-21714.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1813c4a8
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1813c4a8
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1813c4a8

Branch: refs/heads/master
Commit: 1813c4a8dd4388fe76a4ec772c9be151be0f60a1
Parents: 1f24cee
Author: jerryshao 
Authored: Fri Aug 25 09:57:53 2017 -0700
Committer: Marcelo Vanzin 
Committed: Fri Aug 25 09:57:53 2017 -0700

--
 .../org/apache/spark/deploy/SparkSubmit.scala   | 64 +++---
 .../apache/spark/internal/config/package.scala  |  2 +-
 .../scala/org/apache/spark/util/Utils.scala | 25 ---
 .../apache/spark/deploy/SparkSubmitSuite.scala  | 70 
 .../main/scala/org/apache/spark/repl/Main.scala |  2 +-
 5 files changed, 114 insertions(+), 49 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/1813c4a8/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
--
diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala 
b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
index e569251..548149a 100644
--- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
@@ -212,14 +212,20 @@ object SparkSubmit extends CommandLineUtils {
 
   /**
* Prepare the environment for submitting an application.
-   * This returns a 4-tuple:
-   *   (1) the arguments for the child process,
-   *   (2) a list of classpath entries for the child,
-   *   (3) a map of system properties, and
-   *   (4) the main class for the child
+   *
+   * @param args the parsed SparkSubmitArguments used for environment 
preparation.
+   * @param conf the Hadoop Configuration, this argument will only be set in 
unit test.
+   * @return a 4-tuple:
+   *(1) the arguments for the child process,
+   *(2) a list of classpath entries for the child,
+   *(3) a map of system properties, and
+   *(4) the main class for the child
+   *
* Exposed for testing.
*/
-  private[deploy] def prepareSubmitEnvironment(args: SparkSubmitArguments)
+  private[deploy] def prepareSubmitEnvironment(
+  args: SparkSubmitArguments,
+  conf: Option[HadoopConfiguration] = None)
   : (Seq[String], Seq[String], Map[String, String], String) = {
 // Return values
 val childArgs = new ArrayBuffer[String]()
@@ -322,7 +328,7 @@ object SparkSubmit extends CommandLineUtils {
   }
 }
 
-val hadoopConf = new HadoopConfiguration()
+val hadoopConf = conf.getOrElse(new HadoopConfiguration())
 val targetDir = DependencyUtils.createTempDir()
 
 // Resolve glob path for different resources.
@@ -332,19 +338,21 @@ object SparkSubmit extends CommandLineUtils {
 args.archives = Option(args.archives).map(resolveGlobPaths(_, 
hadoopConf)).orNull
 
 // In client mode, download remote files.
+var localPrimaryResource: String = null
+var localJars: String = null
+var localPyFiles: String = null
 if (deployMode == CLIENT) {
-  args.primaryResource = Option(args.primaryResource).map {
+  localPrimaryResource = Option(args.primaryResource).map {
 downloadFile(_, targetDir, args.sparkProperties, hadoopConf)
   }.orNull
-  args.jars = Option(args.jars).map {
+  localJars = Option(args.jars).map {
 downloadFileList(_, targetDir, args.sparkProperties, hadoopConf)
   }.orNull
-  args.pyFiles = Option(args.pyFiles).map {
+  localPyFiles = Option(args.pyFiles).map {
 downloadFileList(_, targetDir, args.sparkProperties, hadoopConf)
   }.orNull
 }
 
-
 // If we're running a python app, set the main class to our specific 
python runner
 if (args.isPython && deployMode == CLIENT) {
   if (args.primaryResource == 

spark git commit: [SPARK-21832][TEST] Merge SQLBuilderTest into ExpressionSQLBuilderSuite

2017-08-25 Thread lixiao
Repository: spark
Updated Branches:
  refs/heads/master de7af295c -> 1f24ceee6


[SPARK-21832][TEST] Merge SQLBuilderTest into ExpressionSQLBuilderSuite

## What changes were proposed in this pull request?

After [SPARK-19025](https://github.com/apache/spark/pull/16869), there is no
need to keep SQLBuilderTest; ExpressionSQLBuilderSuite is the only place that
uses it. This PR removes SQLBuilderTest.

## How was this patch tested?

Pass the updated `ExpressionSQLBuilderSuite`.

Author: Dongjoon Hyun 

Closes #19044 from dongjoon-hyun/SPARK-21832.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1f24ceee
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1f24ceee
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1f24ceee

Branch: refs/heads/master
Commit: 1f24ceee606f17c4f3ca969fa4b5631256fa09e8
Parents: de7af29
Author: Dongjoon Hyun 
Authored: Fri Aug 25 08:59:48 2017 -0700
Committer: gatorsmile 
Committed: Fri Aug 25 08:59:48 2017 -0700

--
 .../catalyst/ExpressionSQLBuilderSuite.scala| 23 --
 .../spark/sql/catalyst/SQLBuilderTest.scala | 44 
 2 files changed, 20 insertions(+), 47 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/1f24ceee/sql/hive/src/test/scala/org/apache/spark/sql/catalyst/ExpressionSQLBuilderSuite.scala
--
diff --git 
a/sql/hive/src/test/scala/org/apache/spark/sql/catalyst/ExpressionSQLBuilderSuite.scala
 
b/sql/hive/src/test/scala/org/apache/spark/sql/catalyst/ExpressionSQLBuilderSuite.scala
index 90f9059..d9cf1f3 100644
--- 
a/sql/hive/src/test/scala/org/apache/spark/sql/catalyst/ExpressionSQLBuilderSuite.scala
+++ 
b/sql/hive/src/test/scala/org/apache/spark/sql/catalyst/ExpressionSQLBuilderSuite.scala
@@ -19,12 +19,29 @@ package org.apache.spark.sql.catalyst
 
 import java.sql.Timestamp
 
+import org.apache.spark.sql.QueryTest
 import org.apache.spark.sql.catalyst.dsl.expressions._
-import org.apache.spark.sql.catalyst.expressions.{If, Literal, 
SpecifiedWindowFrame, TimeAdd,
-  TimeSub, WindowSpecDefinition}
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.hive.test.TestHiveSingleton
 import org.apache.spark.unsafe.types.CalendarInterval
 
-class ExpressionSQLBuilderSuite extends SQLBuilderTest {
+class ExpressionSQLBuilderSuite extends QueryTest with TestHiveSingleton {
+  protected def checkSQL(e: Expression, expectedSQL: String): Unit = {
+val actualSQL = e.sql
+try {
+  assert(actualSQL == expectedSQL)
+} catch {
+  case cause: Throwable =>
+fail(
+  s"""Wrong SQL generated for the following expression:
+ |
+ |${e.prettyName}
+ |
+ |$cause
+   """.stripMargin)
+}
+  }
+
   test("literal") {
 checkSQL(Literal("foo"), "'foo'")
 checkSQL(Literal("\"foo\""), "'\"foo\"'")

http://git-wip-us.apache.org/repos/asf/spark/blob/1f24ceee/sql/hive/src/test/scala/org/apache/spark/sql/catalyst/SQLBuilderTest.scala
--
diff --git 
a/sql/hive/src/test/scala/org/apache/spark/sql/catalyst/SQLBuilderTest.scala 
b/sql/hive/src/test/scala/org/apache/spark/sql/catalyst/SQLBuilderTest.scala
deleted file mode 100644
index 157783a..000
--- a/sql/hive/src/test/scala/org/apache/spark/sql/catalyst/SQLBuilderTest.scala
+++ /dev/null
@@ -1,44 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.sql.catalyst
-
-import scala.util.control.NonFatal
-
-import org.apache.spark.sql.{DataFrame, Dataset, QueryTest}
-import org.apache.spark.sql.catalyst.expressions.Expression
-import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
-import org.apache.spark.sql.hive.test.TestHiveSingleton
-
-
-abstract class SQLBuilderTest extends QueryTest with TestHiveSingleton {
-  protected def checkSQL(e: Expression, expectedSQL: 

spark git commit: [MINOR][BUILD] Fix build warnings and Java lint errors

2017-08-25 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 574ef6c98 -> de7af295c


[MINOR][BUILD] Fix build warnings and Java lint errors

## What changes were proposed in this pull request?

Fix build warnings and Java lint errors. This just helps a bit in evaluating 
(new) warnings in another PR I have open.

## How was this patch tested?

Existing tests

Author: Sean Owen 

Closes #19051 from srowen/JavaWarnings.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/de7af295
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/de7af295
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/de7af295

Branch: refs/heads/master
Commit: de7af295c2047f1b508cb02e735e0e743395f181
Parents: 574ef6c
Author: Sean Owen 
Authored: Fri Aug 25 16:07:13 2017 +0100
Committer: Sean Owen 
Committed: Fri Aug 25 16:07:13 2017 +0100

--
 .../java/org/apache/spark/util/kvstore/InMemoryStore.java | 2 +-
 .../java/org/apache/spark/util/kvstore/KVStoreIterator.java   | 3 ++-
 .../apache/spark/network/TransportRequestHandlerSuite.java| 7 +--
 .../java/org/apache/spark/launcher/SparkLauncherSuite.java| 1 -
 .../org/apache/spark/launcher/ChildProcAppHandleSuite.java| 1 -
 .../org/apache/spark/ml/tuning/CrossValidatorSuite.scala  | 7 +++
 .../apache/spark/ml/tuning/TrainValidationSplitSuite.scala| 7 +++
 pom.xml   | 2 +-
 .../execution/datasources/parquet/VectorizedColumnReader.java | 3 ++-
 .../spark/sql/execution/vectorized/AggregateHashMap.java  | 1 -
 .../spark/sql/execution/vectorized/ArrowColumnVector.java | 1 -
 11 files changed, 17 insertions(+), 18 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/de7af295/common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java
--
diff --git 
a/common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java 
b/common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java
index 9cae5da..5ca4371 100644
--- 
a/common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java
+++ 
b/common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java
@@ -171,7 +171,7 @@ public class InMemoryStore implements KVStore {
 public <T> InMemoryView<T> view(Class<T> type) {
   Preconditions.checkArgument(ti.type().equals(type), "Unexpected type: %s", type);
   Collection<T> all = (Collection<T>) data.values();
-  return new InMemoryView(type, all, ti);
+  return new InMemoryView<>(type, all, ti);
 }
 
   }

http://git-wip-us.apache.org/repos/asf/spark/blob/de7af295/common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVStoreIterator.java
--
diff --git 
a/common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVStoreIterator.java
 
b/common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVStoreIterator.java
index 28a432b..e6254a9 100644
--- 
a/common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVStoreIterator.java
+++ 
b/common/kvstore/src/main/java/org/apache/spark/util/kvstore/KVStoreIterator.java
@@ -17,6 +17,7 @@
 
 package org.apache.spark.util.kvstore;
 
+import java.io.Closeable;
 import java.util.Iterator;
 import java.util.List;
 
@@ -31,7 +32,7 @@ import org.apache.spark.annotation.Private;
  * 
  */
 @Private
-public interface KVStoreIterator<T> extends Iterator<T>, AutoCloseable {
+public interface KVStoreIterator<T> extends Iterator<T>, Closeable {
 
   /**
* Retrieve multiple elements from the store.

http://git-wip-us.apache.org/repos/asf/spark/blob/de7af295/common/network-common/src/test/java/org/apache/spark/network/TransportRequestHandlerSuite.java
--
diff --git 
a/common/network-common/src/test/java/org/apache/spark/network/TransportRequestHandlerSuite.java
 
b/common/network-common/src/test/java/org/apache/spark/network/TransportRequestHandlerSuite.java
index 1ed5711..2656cbe 100644
--- 
a/common/network-common/src/test/java/org/apache/spark/network/TransportRequestHandlerSuite.java
+++ 
b/common/network-common/src/test/java/org/apache/spark/network/TransportRequestHandlerSuite.java
@@ -102,7 +102,7 @@ public class TransportRequestHandlerSuite {
 
   private class ExtendedChannelPromise extends DefaultChannelPromise {
 
-private List listeners = new ArrayList<>();
+private List> listeners = new 
ArrayList<>();
 private boolean success;
 
 ExtendedChannelPromise(Channel channel) {
@@ -113,7 +113,10 @@ public class TransportRequestHandlerSuite {
 @Override
 public 

spark git commit: [SPARK-21527][CORE] Use buffer limit in order to use JAVA NIO Util's buffercache

2017-08-25 Thread wenchen
Repository: spark
Updated Branches:
  refs/heads/master 7d16776d2 -> 574ef6c98


[SPARK-21527][CORE] Use buffer limit in order to use JAVA NIO Util's buffercache

## What changes were proposed in this pull request?

Right now, `ChunkedByteBuffer#writeFully` does not slice the bytes first. We
observe the following code in java.nio's `Util#getTemporaryDirectBuffer`:

```java
BufferCache cache = bufferCache.get();
ByteBuffer buf = cache.get(size);
if (buf != null) {
    return buf;
} else {
    // No suitable buffer in the cache so we need to allocate a new
    // one. To avoid the cache growing then we remove the first
    // buffer from the cache and free it.
    if (!cache.isEmpty()) {
        buf = cache.removeFirst();
        free(buf);
    }
    return ByteBuffer.allocateDirect(size);
}
```

If we slice first with a fixed size, we can use the buffer cache and only need to
allocate at the first write call. Since we otherwise allocate a new buffer, we
cannot control when it is freed; this once caused a memory issue in our
production cluster. In this patch, I add a new API which slices with a fixed size
for buffer writing.
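
The core idea can be sketched as follows (a minimal illustration of slicing
before writing, not the exact API added by this patch):

```scala
import java.nio.ByteBuffer
import java.nio.channels.WritableByteChannel

// Cap each channel.write() at `chunkSize` bytes so the JDK's temporary
// direct-buffer cache can serve every call with the same fixed-size buffer,
// instead of allocating a direct buffer as large as the whole remaining payload.
def writeInChunks(buf: ByteBuffer, channel: WritableByteChannel, chunkSize: Int): Unit = {
  while (buf.hasRemaining) {
    val originalLimit = buf.limit()
    // Temporarily shrink the limit so at most `chunkSize` bytes are visible.
    buf.limit(math.min(buf.position() + chunkSize, originalLimit))
    while (buf.hasRemaining) {
      channel.write(buf)
    }
    // Restore the full limit before handling the next slice.
    buf.limit(originalLimit)
  }
}
```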

## How was this patch tested?

Unit test and test in production.

Author: zhoukang 
Author: zhoukang 

Closes #18730 from caneGuy/zhoukang/improve-chunkwrite.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/574ef6c9
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/574ef6c9
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/574ef6c9

Branch: refs/heads/master
Commit: 574ef6c987c636210828e96d2f797d8f10aff05e
Parents: 7d16776
Author: zhoukang 
Authored: Fri Aug 25 22:59:31 2017 +0800
Committer: Wenchen Fan 
Committed: Fri Aug 25 22:59:31 2017 +0800

--
 .../scala/org/apache/spark/internal/config/package.scala |  9 +
 .../org/apache/spark/util/io/ChunkedByteBuffer.scala | 11 ++-
 2 files changed, 19 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/574ef6c9/core/src/main/scala/org/apache/spark/internal/config/package.scala
--
diff --git a/core/src/main/scala/org/apache/spark/internal/config/package.scala 
b/core/src/main/scala/org/apache/spark/internal/config/package.scala
index 9495cd2..0457a66 100644
--- a/core/src/main/scala/org/apache/spark/internal/config/package.scala
+++ b/core/src/main/scala/org/apache/spark/internal/config/package.scala
@@ -293,6 +293,15 @@ package object config {
   .booleanConf
   .createWithDefault(false)
 
+  private[spark] val BUFFER_WRITE_CHUNK_SIZE =
+ConfigBuilder("spark.buffer.write.chunkSize")
+  .internal()
+  .doc("The chunk size during writing out the bytes of ChunkedByteBuffer.")
+  .bytesConf(ByteUnit.BYTE)
+  .checkValue(_ <= Int.MaxValue, "The chunk size during writing out the 
bytes of" +
+" ChunkedByteBuffer should not larger than Int.MaxValue.")
+  .createWithDefault(64 * 1024 * 1024)
+
   private[spark] val CHECKPOINT_COMPRESS =
 ConfigBuilder("spark.checkpoint.compress")
   .doc("Whether to compress RDD checkpoints. Generally a good idea. 
Compression will use " +

http://git-wip-us.apache.org/repos/asf/spark/blob/574ef6c9/core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala 
b/core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala
index f48bfd5..c28570f 100644
--- a/core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala
+++ b/core/src/main/scala/org/apache/spark/util/io/ChunkedByteBuffer.scala
@@ -24,6 +24,8 @@ import java.nio.channels.WritableByteChannel
 import com.google.common.primitives.UnsignedBytes
 import io.netty.buffer.{ByteBuf, Unpooled}
 
+import org.apache.spark.SparkEnv
+import org.apache.spark.internal.config
 import org.apache.spark.network.util.ByteArrayWritableChannel
 import org.apache.spark.storage.StorageUtils
 
@@ -40,6 +42,11 @@ private[spark] class ChunkedByteBuffer(var chunks: 
Array[ByteBuffer]) {
   require(chunks != null, "chunks must not be null")
   require(chunks.forall(_.position() == 0), "chunks' positions must be 0")
 
+  // Chunk size in bytes
+  private val bufferWriteChunkSize =
+Option(SparkEnv.get).map(_.conf.get(config.BUFFER_WRITE_CHUNK_SIZE))
+  .getOrElse(config.BUFFER_WRITE_CHUNK_SIZE.defaultValue.get).toInt
+
   private[this] var disposed: Boolean = false
 
   /**
@@ -56,7 +63,9 @@ private[spark] class ChunkedByteBuffer(var chunks: 

spark git commit: [SPARK-21255][SQL][WIP] Fixed NPE when creating encoder for enum

2017-08-25 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master f3676d639 -> 7d16776d2


[SPARK-21255][SQL][WIP] Fixed NPE when creating encoder for enum

## What changes were proposed in this pull request?

Fixed NPE when creating encoder for enum.

When you try to create an encoder for an Enum type (or a bean with an enum
property) via Encoders.bean(...), it fails with a NullPointerException at
TypeToken:495. I did a little research and it turns out that in
JavaTypeInference the following code
```scala
  def getJavaBeanReadableProperties(beanClass: Class[_]): Array[PropertyDescriptor] = {
    val beanInfo = Introspector.getBeanInfo(beanClass)
    beanInfo.getPropertyDescriptors.filterNot(_.getName == "class")
      .filter(_.getReadMethod != null)
  }
```
filters out properties named "class", because we wouldn't want to serialize 
that. But enum types have another property of type Class named 
"declaringClass", which we are trying to inspect recursively. Eventually we try 
to inspect ClassLoader class, which has property "defaultAssertionStatus" with 
no read method, which leads to NPE at TypeToken:495.

I added property name "declaringClass" to filtering to resolve this.
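
With the filter in place, a sketch of the call that used to fail (using an
arbitrary JDK enum purely for illustration):

```scala
import java.util.concurrent.TimeUnit
import org.apache.spark.sql.{Encoder, Encoders}

// TimeUnit is just a convenient JDK enum; any Java enum (or a bean with an
// enum-typed property) goes through the same JavaTypeInference code path.
// Before the fix, inspecting the enum's "declaringClass" property recursed
// into ClassLoader and hit the NPE at TypeToken:495; now the encoder is
// created and the enum value is serialized as its name() string.
val enumEncoder: Encoder[TimeUnit] = Encoders.bean(classOf[TimeUnit])
```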

## How was this patch tested?
A unit test in JavaDatasetSuite which creates an encoder for an enum.

Author: mike 
Author: Mikhail Sveshnikov 

Closes #18488 from mike0sv/enum-support.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7d16776d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7d16776d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7d16776d

Branch: refs/heads/master
Commit: 7d16776d28da5bcf656f0d8556b15ed3a5edca44
Parents: f3676d6
Author: mike 
Authored: Fri Aug 25 07:22:34 2017 +0100
Committer: Sean Owen 
Committed: Fri Aug 25 07:22:34 2017 +0100

--
 .../spark/sql/catalyst/JavaTypeInference.scala  | 40 ++
 .../catalyst/encoders/ExpressionEncoder.scala   | 14 +++-
 .../catalyst/expressions/objects/objects.scala  |  4 +-
 .../org/apache/spark/sql/JavaDatasetSuite.java  | 77 
 4 files changed, 131 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/7d16776d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala
--
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala
index 21363d3..33f6ce0 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala
@@ -32,6 +32,7 @@ import org.apache.spark.sql.catalyst.expressions.objects._
 import org.apache.spark.sql.catalyst.util.{ArrayBasedMapData, DateTimeUtils, 
GenericArrayData}
 import org.apache.spark.sql.types._
 import org.apache.spark.unsafe.types.UTF8String
+import org.apache.spark.util.Utils
 
 /**
  * Type-inference utilities for POJOs and Java collections.
@@ -118,6 +119,10 @@ object JavaTypeInference {
 val (valueDataType, nullable) = inferDataType(valueType, seenTypeSet)
 (MapType(keyDataType, valueDataType, nullable), true)
 
+  case other if other.isEnum =>
+(StructType(Seq(StructField(typeToken.getRawType.getSimpleName,
+  StringType, nullable = false))), true)
+
   case other =>
 if (seenTypeSet.contains(other)) {
   throw new UnsupportedOperationException(
@@ -140,6 +145,7 @@ object JavaTypeInference {
   def getJavaBeanReadableProperties(beanClass: Class[_]): 
Array[PropertyDescriptor] = {
 val beanInfo = Introspector.getBeanInfo(beanClass)
 beanInfo.getPropertyDescriptors.filterNot(_.getName == "class")
+  .filterNot(_.getName == "declaringClass")
   .filter(_.getReadMethod != null)
   }
 
@@ -303,6 +309,11 @@ object JavaTypeInference {
   keyData :: valueData :: Nil,
   returnNullable = false)
 
+  case other if other.isEnum =>
+StaticInvoke(JavaTypeInference.getClass, ObjectType(other), 
"deserializeEnumName",
+  expressions.Literal.create(other.getEnumConstants.apply(0), 
ObjectType(other))
+:: getPath :: Nil)
+
   case other =>
 val properties = getJavaBeanReadableAndWritableProperties(other)
 val setters = properties.map { p =>
@@ -345,6 +356,30 @@ object JavaTypeInference {
 }
   }
 
+  /** Returns a mapping from enum value to int for given enum type */
+  def enumSerializer[T <: Enum[T]](enum: Class[T]): T => UTF8String = {
+assert(enum.isEnum)
+inputObject: T =>
+  UTF8String.fromString(inputObject.name())
+  }
+
+  /** Returns value