[GitHub] spark pull request #19291: Branch 2.1

2017-09-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19291


---




[GitHub] spark pull request #19291: Branch 2.1

2017-09-20 Thread zhizu2018
GitHub user zhizu2018 opened a pull request:

https://github.com/apache/spark/pull/19291

Branch 2.1

## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration 
tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, 
remove this)

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/spark branch-2.1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19291.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19291


commit a3d5300a030fb5f1c275e671603e0745b6466735
Author: Stan Zhai 
Date:   2017-02-09T20:01:25Z

[SPARK-19509][SQL] Grouping Sets do not respect nullable grouping columns

## What changes were proposed in this pull request?
The analyzer currently does not check if a column used in grouping sets is 
actually nullable itself. This can cause the nullability of the column to be 
incorrect, which can cause null pointer exceptions down the line. This PR fixes 
that by also considering the nullability of the column.

This is only a problem for Spark 2.1 and below. The latest master uses a 
different approach.

Closes https://github.com/apache/spark/pull/16874

## How was this patch tested?
Added a regression test to `SQLQueryTestSuite.grouping_set`.

Author: Herman van Hovell 

Closes #16873 from hvanhovell/SPARK-19509.
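
For context, a minimal sketch (hypothetical table and column names, not from the patch itself) of the kind of query affected. Grouping columns that are non-nullable in the input become null in the grouping sets that omit them, so the analyzer must report them as nullable:

```
import org.apache.spark.sql.SparkSession

object GroupingSetsNullability {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local").getOrCreate()
    import spark.implicits._

    // Int columns 'a' and 'b' are non-nullable in the input schema.
    Seq((1, 10, 100L), (1, 20, 200L), (2, 10, 300L))
      .toDF("a", "b", "cnt")
      .createOrReplaceTempView("t")

    // In the grouping set ((a)), column 'b' is null in the output, so the
    // analyzer must report 'b' as nullable; otherwise downstream operators
    // that trust the schema can hit null pointer exceptions.
    spark.sql(
      """SELECT a, b, sum(cnt)
        |FROM t
        |GROUP BY a, b GROUPING SETS ((a, b), (a))""".stripMargin)
      .printSchema()
  }
}
```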

commit ff5818b8cee7c718ef5bdef125c8d6971d64acde
Author: Bogdan Raducanu 
Date:   2017-02-10T09:50:07Z

[SPARK-19512][BACKPORT-2.1][SQL] codegen for compare structs fails #16852

## What changes were proposed in this pull request?

Set `currentVars` to null in `GenerateOrdering.genComparisons` before `genCode`
is called. `genCode` ignores `INPUT_ROW` if `currentVars` is not null, but in
`genComparisons` we want it to use `INPUT_ROW`.

## How was this patch tested?

Added a test with two queries in `WholeStageCodegenSuite`.

Author: Bogdan Raducanu 

Closes #16875 from bogdanrdc/SPARK-19512-2.1.
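
For context, a minimal sketch (hypothetical data) of a query that exercises the generated struct comparisons in `GenerateOrdering`:

```
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.struct

object StructCompareSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local").getOrCreate()
    import spark.implicits._

    // Sorting by a struct column compares its fields in generated code.
    val df = Seq((2, "b"), (1, "c"), (1, "a"))
      .toDF("x", "y")
      .select(struct($"x", $"y").as("s"))

    df.sort($"s").show()
  }
}
```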

commit 7b5ea000e246f7052e7324fd7f2e99f32aaece17
Author: Burak Yavuz 
Date:   2017-02-10T11:55:06Z

[SPARK-19543] from_json fails when the input row is empty

## What changes were proposed in this pull request?

Using `from_json` on a column containing an empty string results in
`java.util.NoSuchElementException: head of empty list`.

This is because `parser.parse(input)` may return `Nil` when
`input.trim.isEmpty`.

## How was this patch tested?

Regression test in `JsonExpressionsSuite`

Author: Burak Yavuz 

Closes #16881 from brkyvz/json-fix.

(cherry picked from commit d5593f7f5794bd0343e783ac4957864fed9d1b38)
Signed-off-by: Herman van Hovell 
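
For context, a minimal sketch (hypothetical column name) of the failing case; after the fix, the empty-string row yields null instead of throwing:

```
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

object FromJsonEmptyInput {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local").getOrCreate()
    import spark.implicits._

    val schema = StructType(Seq(StructField("a", IntegerType)))

    // One well-formed JSON row and one empty-string row.
    val df = Seq("""{"a": 1}""", "").toDF("json")

    // Before the fix this threw NoSuchElementException on the empty row;
    // with the fix the parsed value for that row is simply null.
    df.select(from_json($"json", schema).as("parsed")).show()
  }
}
```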

commit e580bb035236dd92ade126af6bb98288d88179c4
Author: Andrew Ray 
Date:   2016-12-13T07:49:22Z

[SPARK-18717][SQL] Make code generation for Scala Map work with 
immutable.Map also

## What changes were proposed in this pull request?

Fixes compile errors in generated code when a user has a case class with a
`scala.collection.immutable.Map` instead of a `scala.collection.Map`. Since
`ArrayBasedMapData.toScalaMap` returns the immutable version, we can make it
work with both.

## How was this patch tested?

Additional unit tests.

Author: Andrew Ray 

Closes #16161 from aray/fix-map-codegen.

(cherry picked from commit 46d30ac4846b3ec94426cc482c42cff72ebd6d92)
Signed-off-by: Cheng Lian 
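
For context, a minimal sketch (hypothetical case class) of the pattern that previously produced non-compiling generated code:

```
import org.apache.spark.sql.SparkSession

// A field typed explicitly as scala.collection.immutable.Map previously
// produced generated deserializer code that failed to compile.
case class Record(id: Int, attrs: scala.collection.immutable.Map[String, Int])

object ImmutableMapCodegenSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local").getOrCreate()
    import spark.implicits._

    val ds = Seq(Record(1, Map("a" -> 1)), Record(2, Map("b" -> 2))).toDS()

    // Mapping back to objects exercises the generated code path.
    ds.map(_.attrs.size).show()
  }
}
```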

commit 173c2387a38b260b46d7646b332e404f6ebe1a17
Author: titicaca 
Date:   2017-02-12T18:42:15Z

[SPARK-19342][SPARKR] bug fixed in collect method for collecting timestamp 
column

## What changes were proposed in this pull request?

Fix a bug in the `collect` method for collecting a timestamp column. The bug
can be reproduced with the following code and output:

```
library(SparkR)
sparkR.session(master = "local")
df <- data.frame(col1 = c(0, 1, 2),
                 col2 = c(as.POSIXct("2017-01-01 00:00:01"), NA,
                          as.POSIXct("2017-01-01 12:00:01")))

sdf1 <- createDataFrame(df)
print(dtypes(sdf1))
df1 <- collect(sdf1)
print(lapply(df1, class))

sdf2 <- filter(sdf1, "col1 > 0")
print(dtypes(sdf2))
df2 <- collect(sdf2)