[GitHub] spark pull request #22126: [SPARK-23938][SQL][FOLLOW-UP][TEST] Nullabilities...

2018-08-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22126


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22130: [SPARK-25137][Spark Shell] NumberFormatException` when s...

2018-08-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22130
  
**[Test build #94877 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94877/testReport)**
 for PR 22130 at commit 
[`8ab5f87`](https://github.com/apache/spark/commit/8ab5f879843e74bec43ceada1027d1d5818e40da).


---




[GitHub] spark issue #22121: [SPARK-25133][SQL][Doc]AVRO data source guide

2018-08-16 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22121
  
@gengliangwang Could you also post the screenshot in your PR description?



---




[GitHub] spark issue #22130: [SPARK-25137][Spark Shell] NumberFormatException` when s...

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22130
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2262/
Test PASSed.


---




[GitHub] spark issue #22130: [SPARK-25137][Spark Shell] NumberFormatException` when s...

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22130
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22130: [SPARK-25137][Spark Shell] NumberFormatException` when s...

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22130
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22130: [SPARK-25137][Spark Shell] NumberFormatException` when s...

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22130
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2261/
Test PASSed.


---




[GitHub] spark issue #22130: [SPARK-25137][Spark Shell] NumberFormatException` when s...

2018-08-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22130
  
**[Test build #94876 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94876/testReport)**
 for PR 22130 at commit 
[`d00929f`](https://github.com/apache/spark/commit/d00929f28b2523869252d67fefc04297aadc5af6).
 * This patch **fails build dependency tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22130: [SPARK-25137][Spark Shell] NumberFormatException` when s...

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22130
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94876/
Test FAILed.


---




[GitHub] spark issue #22130: [SPARK-25137][Spark Shell] NumberFormatException` when s...

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22130
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22126: [SPARK-23938][SQL][FOLLOW-UP][TEST] Nullabilities of val...

2018-08-16 Thread ueshin
Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/22126
  
Thanks! merging to master.


---




[GitHub] spark issue #22130: [SPARK-25137][Spark Shell] NumberFormatException` when s...

2018-08-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22130
  
**[Test build #94876 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94876/testReport)**
 for PR 22130 at commit 
[`d00929f`](https://github.com/apache/spark/commit/d00929f28b2523869252d67fefc04297aadc5af6).


---




[GitHub] spark pull request #22130: [SPARK-25137][Spark Shell] NumberFormatException`...

2018-08-16 Thread vinodkc
GitHub user vinodkc opened a pull request:

https://github.com/apache/spark/pull/22130

[SPARK-25137][Spark Shell] NumberFormatException` when starting spark-shell 
from Mac terminal

## What changes were proposed in this pull request?

When starting spark-shell from the Mac terminal, the following exception 
is thrown:
[ERROR] Failed to construct terminal; falling back to unsupported
java.lang.NumberFormatException: For input string: "0x100"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.valueOf(Integer.java:766)
at jline.internal.InfoCmp.parseInfoCmp(InfoCmp.java:59)
at jline.UnixTerminal.parseInfoCmp(UnixTerminal.java:242)
at jline.UnixTerminal.<init>(UnixTerminal.java:65)
at jline.UnixTerminal.<init>(UnixTerminal.java:50)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at java.lang.Class.newInstance(Class.java:442)
at jline.TerminalFactory.getFlavor(TerminalFactory.java:211)

This issue is due to a jline defect 
(https://github.com/jline/jline2/issues/281), which is fixed in JLine 
2.14.4; bumping the JLine version in Spark to 2.14.4 or later fixes the issue.
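A minimal standalone reproduction of the parsing failure described above (the class name is invented; the `Integer.decode` behavior mirrors the prefix-aware parsing described in the linked jline2 issue, as an assumption rather than a quote of the actual patch):

```java
// Some macOS terminfo entries carry hex-prefixed capability values such
// as "0x100". Integer.parseInt rejects the "0x" prefix, which is what
// the jline stack trace above shows; Integer.decode understands it.
public class HexCapability {
    public static void main(String[] args) {
        try {
            Integer.parseInt("0x100");              // the buggy code path
        } catch (NumberFormatException e) {
            System.out.println("parseInt failed: " + e.getMessage());
        }
        System.out.println(Integer.decode("0x100")); // prints 256
    }
}
```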

## How was this patch tested?
No new UT/automation tests were added. After upgrading to the latest JLine 
version (2.14.6), the spark-shell features were tested manually.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vinodkc/spark br_UpgradeJLineVersion

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22130.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22130


commit d00929f28b2523869252d67fefc04297aadc5af6
Author: Vinod KC 
Date:   2018-08-17T04:10:18Z

Upgrade JLine to 2.14.6




---




[GitHub] spark issue #22121: [SPARK-25133][SQL][Doc]AVRO data source guide

2018-08-16 Thread gengliangwang
Github user gengliangwang commented on the issue:

https://github.com/apache/spark/pull/22121
  
@cloud-fan @gatorsmile 


---




[GitHub] spark issue #22079: [SPARK-23207][SPARK-22905][SQL][BACKPORT-2.2] Shuffle+Re...

2018-08-16 Thread bersprockets
Github user bersprockets commented on the issue:

https://github.com/apache/spark/pull/22079
  
@jiangxb1987 gentle ping.


---




[GitHub] spark issue #22126: [SPARK-23938][SQL][FOLLOW-UP][TEST] Nullabilities of val...

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22126
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22126: [SPARK-23938][SQL][FOLLOW-UP][TEST] Nullabilities of val...

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22126
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94871/
Test PASSed.


---




[GitHub] spark issue #22126: [SPARK-23938][SQL][FOLLOW-UP][TEST] Nullabilities of val...

2018-08-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22126
  
**[Test build #94871 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94871/testReport)**
 for PR 22126 at commit 
[`45d044c`](https://github.com/apache/spark/commit/45d044c42fd8b785c734a920f4b557ca469a5212).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22120: [SPARK-25131]Event logs missing applicationAttemptId for...

2018-08-16 Thread ajithme
Github user ajithme commented on the issue:

https://github.com/apache/spark/pull/22120
  
@vanzin I agree it's a trivial change. I just wanted the output to be 
consistent with yarn cluster mode. This is not only for event logs: for a 
custom SparkListener, it may be confusing that in onApplicationStart the 
appId is empty in the client case but an actual number in the cluster 
case; this is where its effect can be seen.


---




[GitHub] spark issue #22123: [SPARK-25134][SQL] Csv column pruning with checking of h...

2018-08-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22123
  
**[Test build #94875 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94875/testReport)**
 for PR 22123 at commit 
[`09c986c`](https://github.com/apache/spark/commit/09c986c7e9586346255ba7631db83f2f88fe1625).


---




[GitHub] spark issue #22123: [SPARK-25134][SQL] Csv column pruning with checking of h...

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22123
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21794: [SPARK-24834][CORE] use java comparison for float and do...

2018-08-16 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/21794
  
I think we'd have to close this due to the behavior change, but would merge 
an optimization of the existing behavior.


---




[GitHub] spark issue #22123: [SPARK-25134][SQL] Csv column pruning with checking of h...

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22123
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2260/
Test PASSed.


---




[GitHub] spark pull request #22123: [SPARK-25134][SQL] Csv column pruning with checki...

2018-08-16 Thread koertkuipers
Github user koertkuipers commented on a diff in the pull request:

https://github.com/apache/spark/pull/22123#discussion_r210801081
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
 ---
@@ -1603,6 +1603,44 @@ class CSVSuite extends QueryTest with 
SharedSQLContext with SQLTestUtils with Te
   .exists(msg => msg.getRenderedMessage.contains("CSV header does not 
conform to the schema")))
   }
 
+  test("SPARK-25134: check header on parsing of dataset with projection and column pruning") {
+    withSQLConf(SQLConf.CSV_PARSER_COLUMN_PRUNING.key -> "true") {
+      withTempPath { path =>
+        val dir = path.getAbsolutePath
+        Seq(("a", "b")).toDF("columnA", "columnB").write
+          .format("csv")
+          .option("header", true)
+          .save(dir)
+        checkAnswer(spark.read
+          .format("csv")
+          .option("header", true)
+          .option("enforceSchema", false)
+          .load(dir)
+          .select("columnA"),
+          Row("a"))
+      }
+    }
+  }
+
+  test("SPARK-25134: check header on parsing of dataset with projection and no column pruning") {
+    withSQLConf(SQLConf.CSV_PARSER_COLUMN_PRUNING.key -> "false") {
--- End diff --

ok will remove


---




[GitHub] spark pull request #21961: Spark 20597

2018-08-16 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/21961#discussion_r210800762
  
--- Diff: 
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala
 ---
@@ -231,7 +231,13 @@ private[kafka010] class KafkaSourceProvider extends 
DataSourceRegister
   parameters: Map[String, String],
   partitionColumns: Seq[String],
   outputMode: OutputMode): Sink = {
-val defaultTopic = parameters.get(TOPIC_OPTION_KEY).map(_.trim)
+// Picks the defaulttopicname from "path" key, an entry in 
"parameters" Map,
+// if no topic key is present in the "parameters" Map and is provided 
with key "path".
+val defaultTopic = parameters.get(TOPIC_OPTION_KEY) match {
--- End diff --

Isn't this simpler as something like

```
val defaultTopic = parameters.get(TOPIC_OPTION_KEY)
  .orElse(parameters.get(PATH_OPTION_KEY))
  .map(_.trim)
```


---




[GitHub] spark issue #21860: [SPARK-24901][SQL]Merge the codegen of RegularHashMap an...

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21860
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94872/
Test PASSed.


---




[GitHub] spark issue #21860: [SPARK-24901][SQL]Merge the codegen of RegularHashMap an...

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21860
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-08-16 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21537
  
@kiszk The initial prototype or proof of concept can be in any personal 
branch. When we merge it to the master branch, we still need to separate it 
from the current codegen and make it configurable. After the release, users 
can choose which one to use. When the new IR is stable, we can then 
consider deprecating the current one. This is mainly for product stability; 
we need to follow a similar principle for any big project. 

@viirya @mgaido91 Let us first focus on the new IR design and prototype. 


---




[GitHub] spark issue #21860: [SPARK-24901][SQL]Merge the codegen of RegularHashMap an...

2018-08-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21860
  
**[Test build #94872 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94872/testReport)**
 for PR 21860 at commit 
[`3aa4e6d`](https://github.com/apache/spark/commit/3aa4e6d2c4ebd330898feb75af7b7fb36f512ea7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21868: [SPARK-24906][SQL] Adaptively enlarge split / par...

2018-08-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21868#discussion_r210799950
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala 
---
@@ -425,12 +426,44 @@ case class FileSourceScanExec(
   fsRelation: HadoopFsRelation): RDD[InternalRow] = {
 val defaultMaxSplitBytes =
   fsRelation.sparkSession.sessionState.conf.filesMaxPartitionBytes
-val openCostInBytes = 
fsRelation.sparkSession.sessionState.conf.filesOpenCostInBytes
+var openCostInBytes = 
fsRelation.sparkSession.sessionState.conf.filesOpenCostInBytes
 val defaultParallelism = 
fsRelation.sparkSession.sparkContext.defaultParallelism
 val totalBytes = selectedPartitions.flatMap(_.files.map(_.getLen + 
openCostInBytes)).sum
 val bytesPerCore = totalBytes / defaultParallelism
 
-val maxSplitBytes = Math.min(defaultMaxSplitBytes, 
Math.max(openCostInBytes, bytesPerCore))
+var maxSplitBytes = Math.min(defaultMaxSplitBytes, 
Math.max(openCostInBytes, bytesPerCore))
+
+
if(fsRelation.sparkSession.sessionState.conf.isParquetSizeAdaptiveEnabled &&
+  (fsRelation.fileFormat.isInstanceOf[ParquetSource] ||
+fsRelation.fileFormat.isInstanceOf[OrcFileFormat])) {
+  if (relation.dataSchema.map(_.dataType).forall(dataType =>
+dataType.isInstanceOf[CalendarIntervalType] || 
dataType.isInstanceOf[StructType]
+  || dataType.isInstanceOf[MapType] || 
dataType.isInstanceOf[NullType]
+  || dataType.isInstanceOf[AtomicType] || 
dataType.isInstanceOf[ArrayType])) {
+
+def getTypeLength(dataType: DataType): Int = {
+  if (dataType.isInstanceOf[StructType]) {
+
fsRelation.sparkSession.sessionState.conf.parquetStructTypeLength
+  } else if (dataType.isInstanceOf[ArrayType]) {
+
fsRelation.sparkSession.sessionState.conf.parquetArrayTypeLength
+  } else if (dataType.isInstanceOf[MapType]) {
+fsRelation.sparkSession.sessionState.conf.parquetMapTypeLength
+  } else {
+dataType.defaultSize
+  }
+}
+
+val selectedColumnSize = 
requiredSchema.map(_.dataType).map(getTypeLength(_))
+  .reduceOption(_ + _).getOrElse(StringType.defaultSize)
+val totalColumnSize = 
relation.dataSchema.map(_.dataType).map(getTypeLength(_))
+  .reduceOption(_ + _).getOrElse(StringType.defaultSize)
--- End diff --

I think his point is that the estimation is super rough, which I agree 
with. I am less sure whether we should go ahead or not, partly for this reason as well.


---




[GitHub] spark pull request #21868: [SPARK-24906][SQL] Adaptively enlarge split / par...

2018-08-16 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21868#discussion_r210799970
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -459,6 +460,29 @@ object SQLConf {
 .intConf
 .createWithDefault(4096)
 
+  val IS_PARQUET_PARTITION_ADAPTIVE_ENABLED = 
buildConf("spark.sql.parquet.adaptiveFileSplit")
+.doc("For columnar file format (e.g., Parquet), it's possible that 
only few (not all) " +
+  "columns are needed. So, it's better to make sure that the total 
size of the selected " +
+  "columns is about 128 MB "
+)
+.booleanConf
+.createWithDefault(false)
+
+  val PARQUET_STRUCT_LENGTH = buildConf("spark.sql.parquet.struct.length")
+.doc("Set the default size of struct column")
+.intConf
+.createWithDefault(StringType.defaultSize)
+
+  val PARQUET_MAP_LENGTH = buildConf("spark.sql.parquet.map.length")
--- End diff --

Yeah, I was thinking that.


---




[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-08-16 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/21537
  
If we continue improving the current codegen framework, I think it is 
good to have a design doc reviewed by the community.

If we decide to go with the IR design and get rid of this string-based 
framework, do we still need a design doc for the current codegen 
improvement? Or can we focus on the IR design doc?


---




[GitHub] spark pull request #21868: [SPARK-24906][SQL] Adaptively enlarge split / par...

2018-08-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21868#discussion_r210799891
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -459,6 +460,29 @@ object SQLConf {
 .intConf
 .createWithDefault(4096)
 
+  val IS_PARQUET_PARTITION_ADAPTIVE_ENABLED = 
buildConf("spark.sql.parquet.adaptiveFileSplit")
+.doc("For columnar file format (e.g., Parquet), it's possible that 
only few (not all) " +
+  "columns are needed. So, it's better to make sure that the total 
size of the selected " +
+  "columns is about 128 MB "
+)
+.booleanConf
+.createWithDefault(false)
+
+  val PARQUET_STRUCT_LENGTH = buildConf("spark.sql.parquet.struct.length")
+.doc("Set the default size of struct column")
+.intConf
+.createWithDefault(StringType.defaultSize)
+
+  val PARQUET_MAP_LENGTH = buildConf("spark.sql.parquet.map.length")
--- End diff --

I wouldn't do this. It makes things more complicated, and I would just 
set a bigger number for `maxPartitionBytes`.


---




[GitHub] spark pull request #21868: [SPARK-24906][SQL] Adaptively enlarge split / par...

2018-08-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21868#discussion_r210799770
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -459,6 +460,29 @@ object SQLConf {
 .intConf
 .createWithDefault(4096)
 
+  val IS_PARQUET_PARTITION_ADAPTIVE_ENABLED = 
buildConf("spark.sql.parquet.adaptiveFileSplit")
+.doc("For columnar file format (e.g., Parquet), it's possible that 
only few (not all) " +
+  "columns are needed. So, it's better to make sure that the total 
size of the selected " +
+  "columns is about 128 MB "
--- End diff --

This does not seem to describe what the configuration actually does.


---




[GitHub] spark pull request #21868: [SPARK-24906][SQL] Adaptively enlarge split / par...

2018-08-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21868#discussion_r210799731
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -459,6 +460,29 @@ object SQLConf {
 .intConf
 .createWithDefault(4096)
 
+  val IS_PARQUET_PARTITION_ADAPTIVE_ENABLED = 
buildConf("spark.sql.parquet.adaptiveFileSplit")
+.doc("For columnar file format (e.g., Parquet), it's possible that 
only few (not all) " +
--- End diff --

`it's`: I would avoid the contraction here.


---




[GitHub] spark pull request #22110: [SPARK-25122][SQL] Deduplication of supports equa...

2018-08-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22110


---




[GitHub] spark pull request #21868: [SPARK-24906][SQL] Adaptively enlarge split / par...

2018-08-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21868#discussion_r210799600
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -459,6 +460,29 @@ object SQLConf {
 .intConf
 .createWithDefault(4096)
 
+  val IS_PARQUET_PARTITION_ADAPTIVE_ENABLED = 
buildConf("spark.sql.parquet.adaptiveFileSplit")
--- End diff --

This configuration doesn't look specific to parquet anymore.


---




[GitHub] spark issue #21868: [SPARK-24906][SQL] Adaptively enlarge split / partition ...

2018-08-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21868
  
@habren, BTW, just for clarification: you can set a bigger value for 
`spark.sql.files.maxPartitionBytes` explicitly, and that resolves your issue. 
This one is to handle it dynamically, right?
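For reference, the static workaround mentioned here is a single configuration setting. A minimal sketch (the key is Spark's documented `spark.sql.files.maxPartitionBytes`, whose default is 128 MB; the 512 MB value, class name, and session setup are illustrative):

```java
import org.apache.spark.sql.SparkSession;

public class WiderSplits {
    public static void main(String[] args) {
        // Raise the per-partition split ceiling once, up front, instead of
        // deriving it from the selected columns as this PR proposes.
        SparkSession spark = SparkSession.builder()
            .master("local[*]")
            .config("spark.sql.files.maxPartitionBytes", 512L * 1024 * 1024)
            .getOrCreate();
        // File scans planned from here on may pack up to ~512 MB of input
        // (plus open cost) into a single partition.
        System.out.println(spark.conf().get("spark.sql.files.maxPartitionBytes"));
        spark.stop();
    }
}
```

The trade-off the thread discusses is that this value is global and static, while the proposed change scales the split size by the fraction of columns actually read.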


---




[GitHub] spark pull request #22124: [SPARK-25135][SQL] Insert datasource table may al...

2018-08-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22124#discussion_r210799343
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala 
---
@@ -490,7 +490,8 @@ object DDLPreprocessingUtils {
   case (expected, actual) =>
 if (expected.dataType.sameType(actual.dataType) &&
   expected.name == actual.name &&
-  expected.metadata == actual.metadata) {
+  expected.metadata == actual.metadata &&
+  expected.exprId.id == actual.exprId.id) {
--- End diff --

why does this fix the problem?


---




[GitHub] spark issue #22110: [SPARK-25122][SQL] Deduplication of supports equals code

2018-08-16 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22110
  
thanks, merging to master!


---




[GitHub] spark issue #22116: [DOCS]Update configuration.md

2018-08-16 Thread KraFusion
Github user KraFusion commented on the issue:

https://github.com/apache/spark/pull/22116
  
@srowen Thanks!
yes, my bad. next time I will bundle (better yet I will look for the same 
issue elsewhere in the docs), and I'll use a better title.


---




[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-08-16 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/21537
  
It's a good point that we should have a design doc for this codegen 
infrastructure improvement, since it's very critical to Spark. And we should 
have it reviewed by the community.

There were some discussions on the PRs and JIRAs, but they didn't happen on 
the dev list. This is something we should do next.

At this stage, I think it's too late to revert anything related to the 
codegen improvement. So many codegen templates have been touched that I 
think reverting is riskier.

But we should hold it now until the design doc is reviewed by the community 
in dev list.


---




[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-16 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22112
  
Actually we can extend the solution later and I've mentioned it in my PR 
description.

Basically there are 3 kinds of closures:
1. totally random
2. always output same data set in a random order
3. always output same data sequence (same order)

Spark is able to handle closure 1. The cost is that, whenever a fetch failure 
happens and a map task gets retried, Spark needs to roll back all the succeeding 
stages and retry them, because their input has changed. `zip` falls into this 
category, but due to time constraints, I think it's ok to document it and fix 
it later.

For closure 2, Spark can treat it as closure 3 if the shuffle partitioner 
is order-insensitive, like a range or hash partitioner. This means that when a 
map task gets retried, it will produce the same data for the reducers, so we 
don't need to roll back all the succeeding stages. However, if the shuffle 
partitioner is order-sensitive, like round-robin, Spark has to treat it like 
closure 1 and roll back all the succeeding stages if a map task gets retried.

Closure 3 is already handled well by the current Spark.

In this PR, I assume all the RDDs' computing functions are closure 3, so 
that we don't have performance regression. The only exception is shuffled RDD, 
which outputs data in a random order because of the remote block fetching.

In the future, we can extend `RDD#isIdempotent` to an enum to indicate the 
3 closure types, and change the `FetchFailed` handling logic accordingly.
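The partitioner distinction above can be made concrete with a toy sketch (plain Java collections, not Spark's scheduler code; all names are invented). It shows why an order-insensitive partitioner makes a "same set, random order" closure safe to retry, while a position-based one does not:

```java
import java.util.*;
import java.util.stream.Collectors;

public class ClosureTypes {
    // Hash partitioning is order-insensitive: a record's partition depends
    // only on its value, so two runs over the same set agree per partition.
    static Map<Integer, Set<Integer>> hashPartition(List<Integer> data, int n) {
        return data.stream().collect(Collectors.groupingBy(
            x -> Math.floorMod(x.hashCode(), n), Collectors.toSet()));
    }

    // Round-robin partitioning is order-sensitive: a record's partition
    // depends on its position, so a reordered retry feeds the reducers
    // different per-partition data.
    static Map<Integer, Set<Integer>> roundRobin(List<Integer> data, int n) {
        Map<Integer, Set<Integer>> parts = new HashMap<>();
        for (int i = 0; i < data.size(); i++) {
            parts.computeIfAbsent(i % n, k -> new HashSet<>()).add(data.get(i));
        }
        return parts;
    }

    public static void main(String[] args) {
        // A "closure 2" map task: same data set, two different orders, as a
        // retried read of a shuffled RDD might produce.
        List<Integer> run1 = List.of(1, 2, 3, 4);
        List<Integer> run2 = List.of(2, 1, 3, 4);
        System.out.println(hashPartition(run1, 2).equals(hashPartition(run2, 2))); // true
        System.out.println(roundRobin(run1, 2).equals(roundRobin(run2, 2)));       // false
    }
}
```

With hash partitioning a retried map task hands every reducer the same set, so only the failed task reruns; with round-robin the reducers' inputs change, forcing the succeeding-stage rollback described above.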


---




[GitHub] spark issue #21950: [SPARK-24914][SQL][WIP] Add configuration to avoid OOM d...

2018-08-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21950
  
**[Test build #94874 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94874/testReport)**
 for PR 21950 at commit 
[`3a65edf`](https://github.com/apache/spark/commit/3a65edf0e07f3beb6d6dd4dcb16e76ea7210c5e9).


---




[GitHub] spark issue #21950: [SPARK-24914][SQL][WIP] Add configuration to avoid OOM d...

2018-08-16 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/21950
  
retest this please


---




[GitHub] spark pull request #22045: [SPARK-23940][SQL] Add transform_values SQL funct...

2018-08-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22045


---




[GitHub] spark pull request #21868: [SPARK-24906][SQL] Adaptively enlarge split / par...

2018-08-16 Thread habren
Github user habren commented on a diff in the pull request:

https://github.com/apache/spark/pull/21868#discussion_r210793717
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala ---
@@ -425,12 +426,44 @@ case class FileSourceScanExec(
       fsRelation: HadoopFsRelation): RDD[InternalRow] = {
     val defaultMaxSplitBytes =
       fsRelation.sparkSession.sessionState.conf.filesMaxPartitionBytes
-    val openCostInBytes = fsRelation.sparkSession.sessionState.conf.filesOpenCostInBytes
+    var openCostInBytes = fsRelation.sparkSession.sessionState.conf.filesOpenCostInBytes
     val defaultParallelism = fsRelation.sparkSession.sparkContext.defaultParallelism
     val totalBytes = selectedPartitions.flatMap(_.files.map(_.getLen + openCostInBytes)).sum
     val bytesPerCore = totalBytes / defaultParallelism
 
-    val maxSplitBytes = Math.min(defaultMaxSplitBytes, Math.max(openCostInBytes, bytesPerCore))
+    var maxSplitBytes = Math.min(defaultMaxSplitBytes, Math.max(openCostInBytes, bytesPerCore))
+
+    if (fsRelation.sparkSession.sessionState.conf.isParquetSizeAdaptiveEnabled &&
+      (fsRelation.fileFormat.isInstanceOf[ParquetSource] ||
+        fsRelation.fileFormat.isInstanceOf[OrcFileFormat])) {
+      if (relation.dataSchema.map(_.dataType).forall(dataType =>
+        dataType.isInstanceOf[CalendarIntervalType] || dataType.isInstanceOf[StructType]
+          || dataType.isInstanceOf[MapType] || dataType.isInstanceOf[NullType]
+          || dataType.isInstanceOf[AtomicType] || dataType.isInstanceOf[ArrayType])) {
+
+        def getTypeLength(dataType: DataType): Int = {
+          if (dataType.isInstanceOf[StructType]) {
+            fsRelation.sparkSession.sessionState.conf.parquetStructTypeLength
+          } else if (dataType.isInstanceOf[ArrayType]) {
+            fsRelation.sparkSession.sessionState.conf.parquetArrayTypeLength
+          } else if (dataType.isInstanceOf[MapType]) {
+            fsRelation.sparkSession.sessionState.conf.parquetMapTypeLength
+          } else {
+            dataType.defaultSize
+          }
+        }
+
+        val selectedColumnSize = requiredSchema.map(_.dataType).map(getTypeLength(_))
+          .reduceOption(_ + _).getOrElse(StringType.defaultSize)
+        val totalColumnSize = relation.dataSchema.map(_.dataType).map(getTypeLength(_))
+          .reduceOption(_ + _).getOrElse(StringType.defaultSize)
--- End diff --

@gatorsmile  The goal of this change is not to make it easier for users to set 
the partition size. Instead, when a user sets the partition size, this change tries 
its best to make sure the actual read size is close to the value set by the user. 
Without this change, when a user sets the partition size to 128MB, the actual read 
size may be 1MB or even smaller because of column pruning.
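As a rough illustration of that intent (not the PR's actual code; `type_length`, the size table, and the function names below are made up for this sketch), the split size can be enlarged by the ratio of the full row width to the width of the selected columns, so the bytes actually read per partition stay near the configured size:

```python
# Illustrative per-type width estimates (bytes); hypothetical values.
DEFAULT_SIZES = {"int": 4, "long": 8, "string": 20, "struct": 20}

def type_length(dtype):
    return DEFAULT_SIZES[dtype]

def adaptive_max_split_bytes(max_split_bytes, all_columns, selected_columns):
    total = sum(type_length(t) for t in all_columns)
    selected = sum(type_length(t) for t in selected_columns)
    # Only `selected / total` of each row is actually read after column
    # pruning, so enlarge the split proportionally to keep the actual read
    # volume close to max_split_bytes.
    return int(max_split_bytes * (total / selected))

# Target 128 MB per partition, but only a 4-byte int is read out of a
# 52-byte row (4 + 8 + 20 + 20), so the split is enlarged 13x.
all_cols = ["int", "long", "string", "string"]
print(adaptive_max_split_bytes(128 * 1024 * 1024, all_cols, ["int"]))
```

Without the scaling step, the 128MB split would yield roughly 128MB * 4/52 of actual reads, which matches the "1MB or even smaller" observation for wider schemas.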


---




[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function

2018-08-16 Thread ueshin
Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/22045
  
Thanks! merging to master.


---




[GitHub] spark issue #21990: [SPARK-25003][PYSPARK] Use SessionExtensions in Pyspark

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21990
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94873/
Test PASSed.


---




[GitHub] spark issue #21990: [SPARK-25003][PYSPARK] Use SessionExtensions in Pyspark

2018-08-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21990
  
**[Test build #94873 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94873/testReport)**
 for PR 21990 at commit 
[`0eea205`](https://github.com/apache/spark/commit/0eea205ca0591c68975412873b34393f6bf19437).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class SparkExtensionsTest(unittest.TestCase, SQLTestUtils):`


---




[GitHub] spark issue #21990: [SPARK-25003][PYSPARK] Use SessionExtensions in Pyspark

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21990
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22122: [SPARK-24665][PySpark][FollowUp] Use SQLConf in PySpark ...

2018-08-16 Thread xuanyuanking
Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/22122
  
Thanks.


---




[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-16 Thread mridulm
Github user mridulm commented on the issue:

https://github.com/apache/spark/pull/22112
  
@tgravescs To understand better, are you suggesting that we do not support 
any API and/or user closure that depends on input order?
If yes, that would break not just repartition + shuffle, but also other 
publicly exposed APIs in Spark core and (my guess) non-trivial parts of 
MLlib.

Or is it that we support repartition and possibly a few other high-priority 
cases (sampling in MLlib, for example?) and not support the rest?

My (unproven) contention is that a solution for repartition + shuffle would 
be a general solution (or very close to it): it would then work for all 
other cases with suitable modifications as required.
By "expand solution to cover all later.", I was referring to leveraging 
whatever we build for repartition in other use cases, for example by setting 
appropriate parameters, etc., in the interest of time.



---




[GitHub] spark pull request #21990: [SPARK-25003][PYSPARK] Use SessionExtensions in P...

2018-08-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21990#discussion_r210790581
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -3563,6 +3563,51 @@ def test_query_execution_listener_on_collect_with_arrow(self):
             "The callback from the query execution listener should be called after 'toPandas'")
 
 
+class SparkExtensionsTest(unittest.TestCase, SQLTestUtils):
+    # These tests are separate because it uses 'spark.sql.extensions' which is
+    # static and immutable. This can't be set or unset, for example, via `spark.conf`.
+
+    @classmethod
+    def setUpClass(cls):
+        import glob
+        from pyspark.find_spark_home import _find_spark_home
+
+        SPARK_HOME = _find_spark_home()
+        filename_pattern = (
+            "sql/core/target/scala-*/test-classes/org/apache/spark/sql/"
+            "SparkSessionExtensionSuite.class")
+        if not glob.glob(os.path.join(SPARK_HOME, filename_pattern)):
+            raise unittest.SkipTest(
+                "'org.apache.spark.sql.SparkSessionExtensionSuite.' is not "
+                "available. Will skip the related tests.")
+
+        # Note that 'spark.sql.extensions' is a static immutable configuration.
+        cls.spark = SparkSession.builder \
+            .master("local[4]") \
+            .appName(cls.__name__) \
+            .config(
+                "spark.sql.extensions",
+                "org.apache.spark.sql.MyExtensions") \
+            .getOrCreate()
+
+    @classmethod
+    def tearDownClass(cls):
+        cls.spark.stop()
+
+    def tearDown(self):
+        self.spark._jvm.OnSuccessCall.clear()
--- End diff --

This wouldn't be needed, since I added this for testing whether the callback 
is called in the PR pointed out.


---




[GitHub] spark pull request #22122: [SPARK-24665][PySpark][FollowUp] Use SQLConf in P...

2018-08-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22122


---




[GitHub] spark pull request #21990: [SPARK-25003][PYSPARK] Use SessionExtensions in P...

2018-08-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21990#discussion_r210790531
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -3563,6 +3563,51 @@ def test_query_execution_listener_on_collect_with_arrow(self):
             "The callback from the query execution listener should be called after 'toPandas'")
 
 
+class SparkExtensionsTest(unittest.TestCase, SQLTestUtils):
+    # These tests are separate because it uses 'spark.sql.extensions' which is
+    # static and immutable. This can't be set or unset, for example, via `spark.conf`.
+
+    @classmethod
+    def setUpClass(cls):
+        import glob
+        from pyspark.find_spark_home import _find_spark_home
+
+        SPARK_HOME = _find_spark_home()
+        filename_pattern = (
+            "sql/core/target/scala-*/test-classes/org/apache/spark/sql/"
+            "SparkSessionExtensionSuite.class")
+        if not glob.glob(os.path.join(SPARK_HOME, filename_pattern)):
+            raise unittest.SkipTest(
+                "'org.apache.spark.sql.SparkSessionExtensionSuite.' is not "
+                "available. Will skip the related tests.")
+
+        # Note that 'spark.sql.extensions' is a static immutable configuration.
+        cls.spark = SparkSession.builder \
+            .master("local[4]") \
+            .appName(cls.__name__) \
+            .config(
+                "spark.sql.extensions",
+                "org.apache.spark.sql.MyExtensions") \
--- End diff --

@RussellSpitzer, I think you should push `MyExtensions` scala side code too.


---




[GitHub] spark issue #22114: [SPARK-24938][Core] Prevent Netty from using onheap memo...

2018-08-16 Thread NiharS
Github user NiharS commented on the issue:

https://github.com/apache/spark/pull/22114
  
They pass on my machine :(


---




[GitHub] spark issue #22122: [SPARK-24665][PySpark][FollowUp] Use SQLConf in PySpark ...

2018-08-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22122
  
Merged to master.


---




[GitHub] spark issue #22122: [SPARK-24665][PySpark][FollowUp] Use SQLConf in PySpark ...

2018-08-16 Thread xuanyuanking
Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/22122
  
```
Are they all instances to fix?
```
@HyukjinKwon Yep, I grepped all `conf.get("spark.sql.xxx")` calls to make sure of 
this. The only remaining hard-coded config is the StaticSQLConf 
`spark.sql.catalogImplementation` in session.py, which can't be managed by SQLConf.


---




[GitHub] spark issue #21990: [SPARK-25003][PYSPARK] Use SessionExtensions in Pyspark

2018-08-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21990
  
**[Test build #94873 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94873/testReport)**
 for PR 21990 at commit 
[`0eea205`](https://github.com/apache/spark/commit/0eea205ca0591c68975412873b34393f6bf19437).


---




[GitHub] spark pull request #22123: [SPARK-25134][SQL] Csv column pruning with checki...

2018-08-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22123#discussion_r210788916
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
@@ -1603,6 +1603,44 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils with Te
       .exists(msg => msg.getRenderedMessage.contains("CSV header does not conform to the schema")))
   }
 
+  test("SPARK-25134: check header on parsing of dataset with projection and column pruning") {
+    withSQLConf(SQLConf.CSV_PARSER_COLUMN_PRUNING.key -> "true") {
+      withTempPath { path =>
+        val dir = path.getAbsolutePath
+        Seq(("a", "b")).toDF("columnA", "columnB").write
+          .format("csv")
+          .option("header", true)
+          .save(dir)
+        checkAnswer(spark.read
+          .format("csv")
+          .option("header", true)
+          .option("enforceSchema", false)
+          .load(dir)
+          .select("columnA"),
+          Row("a"))
+      }
+    }
+  }
+
+  test("SPARK-25134: check header on parsing of dataset with projection and no column pruning") {
+    withSQLConf(SQLConf.CSV_PARSER_COLUMN_PRUNING.key -> "false") {
--- End diff --

I think `false` case test can be removed.


---




[GitHub] spark issue #21950: [SPARK-24914][SQL][WIP] Add configuration to avoid OOM d...

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21950
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21950: [SPARK-24914][SQL][WIP] Add configuration to avoid OOM d...

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21950
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94869/
Test FAILed.


---




[GitHub] spark issue #21950: [SPARK-24914][SQL][WIP] Add configuration to avoid OOM d...

2018-08-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21950
  
**[Test build #94869 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94869/testReport)**
 for PR 21950 at commit 
[`3a65edf`](https://github.com/apache/spark/commit/3a65edf0e07f3beb6d6dd4dcb16e76ea7210c5e9).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22098: [SPARK-24886][INFRA] Fix the testing script to increase ...

2018-08-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22098
  
haha, it's more than 4 years ago... if we are unsure about the env, let me 
just push this in.


---




[GitHub] spark pull request #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in...

2018-08-16 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22104#discussion_r210786895
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -3367,6 +3367,33 @@ def test_ignore_column_of_all_nulls(self):
         finally:
             shutil.rmtree(path)
 
+    # SPARK-24721
+    def test_datasource_with_udf_filter_lit_input(self):
+        from pyspark.sql.functions import udf, lit, col
+
+        path = tempfile.mkdtemp()
+        shutil.rmtree(path)
+        try:
+            self.spark.range(1).write.mode("overwrite").format('csv').save(path)
+            filesource_df = self.spark.read.csv(path)
+            datasource_df = self.spark.read \
+                .format("org.apache.spark.sql.sources.SimpleScanSource") \
+                .option('from', 0).option('to', 1).load()
+            datasource_v2_df = self.spark.read \
+                .format("org.apache.spark.sql.sources.v2.SimpleDataSourceV2") \
--- End diff --

This wouldn't work if the test classes are not compiled. I think we should 
make a separate test suite that skips these tests when the test classes are 
not present.
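A minimal sketch of that suggestion (the class name and class-file path below are hypothetical): put the data-source tests in their own suite and raise `unittest.SkipTest` from `setUpClass` when the compiled Scala test classes are absent, so the whole suite is reported as skipped instead of failing.

```python
import glob
import os
import unittest

class DataSourceWithUDFTest(unittest.TestCase):
    # Hypothetical standalone suite: skipped entirely when the compiled Scala
    # test classes (e.g. SimpleScanSource) are not present on disk.
    @classmethod
    def setUpClass(cls):
        pattern = os.path.join(
            "sql", "core", "target", "scala-*", "test-classes",
            "org", "apache", "spark", "sql", "sources", "SimpleScanSource.class")
        if not glob.glob(pattern):
            raise unittest.SkipTest("test classes are not compiled; skipping")

    def test_datasource_with_udf(self):
        pass  # the real test body would go here

# Running the suite where the class files are absent marks it as skipped
# rather than failed:
result = unittest.TestResult()
unittest.TestLoader().loadTestsFromTestCase(DataSourceWithUDFTest).run(result)
print(len(result.skipped), len(result.failures))
```

`SkipTest` raised in `setUpClass` skips every test in the class, which matches how the `SparkExtensionsTest` suite in this thread handles a missing `SparkSessionExtensionSuite.class`.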


---




[GitHub] spark issue #22114: [SPARK-24938][Core] Prevent Netty from using onheap memo...

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22114
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22114: [SPARK-24938][Core] Prevent Netty from using onheap memo...

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22114
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94867/
Test FAILed.


---




[GitHub] spark issue #22114: [SPARK-24938][Core] Prevent Netty from using onheap memo...

2018-08-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22114
  
**[Test build #94867 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94867/testReport)**
 for PR 22114 at commit 
[`c2f9ed1`](https://github.com/apache/spark/commit/c2f9ed10776842ffe0746fcc89b157675fa6c455).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21868: [SPARK-24906][SQL] Adaptively enlarge split / par...

2018-08-16 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21868#discussion_r210785081
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -459,6 +458,29 @@ object SQLConf {
     .intConf
     .createWithDefault(4096)
 
+  val IS_PARQUET_PARTITION_ADAPTIVE_ENABLED = buildConf("spark.sql.parquet.adaptiveFileSplit")
+    .doc("For columnar file format (e.g., Parquet), it's possible that only few (not all) " +
+      "columns are needed. So, it's better to make sure that the total size of the selected " +
+      "columns is about 128 MB "
+    )
+    .booleanConf
+    .createWithDefault(false)
+
+  val PARQUET_STRUCT_LENGTH = buildConf("spark.sql.parquet.struct.length")
+    .doc("Set the default size of struct column")
+    .intConf
+    .createWithDefault(StringType.defaultSize)
--- End diff --

And these configs assume that different storage formats use the same size?


---




[GitHub] spark pull request #21868: [SPARK-24906][SQL] Adaptively enlarge split / par...

2018-08-16 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21868#discussion_r210779310
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -459,6 +458,29 @@ object SQLConf {
     .intConf
     .createWithDefault(4096)
 
+  val IS_PARQUET_PARTITION_ADAPTIVE_ENABLED = buildConf("spark.sql.parquet.adaptiveFileSplit")
+    .doc("For columnar file format (e.g., Parquet), it's possible that only few (not all) " +
+      "columns are needed. So, it's better to make sure that the total size of the selected " +
+      "columns is about 128 MB "
+    )
+    .booleanConf
+    .createWithDefault(false)
+
+  val PARQUET_STRUCT_LENGTH = buildConf("spark.sql.parquet.struct.length")
+    .doc("Set the default size of struct column")
+    .intConf
+    .createWithDefault(StringType.defaultSize)
+
+  val PARQUET_MAP_LENGTH = buildConf("spark.sql.parquet.map.length")
+    .doc("Set the default size of map column")
+    .intConf
+    .createWithDefault(StringType.defaultSize)
+
+  val PARQUET_ARRAY_LENGTH = buildConf("spark.sql.parquet.array.length")
+    .doc("Set the default size of array column")
+    .intConf
+    .createWithDefault(StringType.defaultSize)
--- End diff --

This feature introduces quite a few configs; my concern is that it will be 
hard for end users to set them all.


---




[GitHub] spark pull request #21868: [SPARK-24906][SQL] Adaptively enlarge split / par...

2018-08-16 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21868#discussion_r210765335
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -25,17 +25,16 @@ import java.util.zip.Deflater
 import scala.collection.JavaConverters._
 import scala.collection.immutable
 import scala.util.matching.Regex
-
--- End diff --

Please don't remove these blank lines. Can you revert it back?


---




[GitHub] spark issue #21819: [SPARK-24863][SS] Report Kafka offset lag as a custom me...

2018-08-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21819
  
Let me leave this in few days in case someone has more comments on this.


---




[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21320
  
I said either way works fine; it doesn't matter which way we go. We'd better 
close one of them if the approach is the same and both PRs are active.


---




[GitHub] spark issue #22125: [DOCS] Fix cloud-integration.md Typo

2018-08-16 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/22125
  
@KraFusion Sorry, I overlooked another PR.


---




[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-08-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21537
  
I don't see a notable risk here. It's just avoiding string interpolation, 
which makes the code less error-prone; this has already been discussed and the 
code change is small.

I hope we can move other discussions to other threads like JIRA or the 
mailing list so that people can see them. It's actually quite difficult for me 
to find such discussions.


---




[GitHub] spark pull request #22129: just testing pyarrow changes

2018-08-16 Thread shaneknapp
GitHub user shaneknapp opened a pull request:

https://github.com/apache/spark/pull/22129

just testing pyarrow changes

## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration 
tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, 
remove this)

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shaneknapp/spark pyarrow-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22129.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22129


commit df646e860f97c09f7fea9a80a058025bd3edac57
Author: shane knapp 
Date:   2018-08-17T00:48:47Z

just testing pyarrow changes




---




[GitHub] spark pull request #22129: just testing pyarrow changes

2018-08-16 Thread shaneknapp
Github user shaneknapp closed the pull request at:

https://github.com/apache/spark/pull/22129


---




[GitHub] spark issue #21221: [SPARK-23429][CORE] Add executor memory metrics to heart...

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21221
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94865/
Test PASSed.


---




[GitHub] spark issue #21221: [SPARK-23429][CORE] Add executor memory metrics to heart...

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21221
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21221: [SPARK-23429][CORE] Add executor memory metrics to heart...

2018-08-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21221
  
**[Test build #94865 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94865/testReport)**
 for PR 21221 at commit 
[`2897281`](https://github.com/apache/spark/commit/2897281a384d25556609a17be21f926cb5d68dd6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21860: [SPARK-24901][SQL]Merge the codegen of RegularHashMap an...

2018-08-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21860
  
**[Test build #94872 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94872/testReport)**
 for PR 21860 at commit 
[`3aa4e6d`](https://github.com/apache/spark/commit/3aa4e6d2c4ebd330898feb75af7b7fb36f512ea7).


---




[GitHub] spark issue #22048: [SPARK-25108][SQL] Fix the show method to display the wi...

2018-08-16 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/22048
  
Thank you for creating a JIRA entry and for posting the result. The test 
case is not available yet.


---




[GitHub] spark issue #22126: [SPARK-23938][SQL][FOLLOW-UP][TEST] Nullabilities of val...

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22126
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22126: [SPARK-23938][SQL][FOLLOW-UP][TEST] Nullabilities of val...

2018-08-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22126
  
**[Test build #94871 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94871/testReport)**
 for PR 22126 at commit 
[`45d044c`](https://github.com/apache/spark/commit/45d044c42fd8b785c734a920f4b557ca469a5212).


---




[GitHub] spark issue #22126: [SPARK-23938][SQL][FOLLOW-UP][TEST] Nullabilities of val...

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22126
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2259/
Test PASSed.


---




[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22045
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94864/
Test PASSed.


---




[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22045
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22045: [SPARK-23940][SQL] Add transform_values SQL function

2018-08-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22045
  
**[Test build #94864 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94864/testReport)**
 for PR 22045 at commit 
[`3382e1a`](https://github.com/apache/spark/commit/3382e1a5396c8e5a94802d92a7106eacf627617c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21990: [SPARK-25003][PYSPARK] Use SessionExtensions in Pyspark

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21990
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21990: [SPARK-25003][PYSPARK] Use SessionExtensions in Pyspark

2018-08-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21990
  
**[Test build #94870 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94870/testReport)** for PR 21990 at commit [`d5c37b7`](https://github.com/apache/spark/commit/d5c37b732f1948d8240bd8de33a080ac5db03571).
 * This patch **fails Python style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class SparkExtensionsTest(unittest.TestCase, SQLTestUtils):`


---




[GitHub] spark issue #21990: [SPARK-25003][PYSPARK] Use SessionExtensions in Pyspark

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21990
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94870/
Test FAILed.


---




[GitHub] spark issue #21990: [SPARK-25003][PYSPARK] Use SessionExtensions in Pyspark

2018-08-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21990
  
**[Test build #94870 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94870/testReport)** for PR 21990 at commit [`d5c37b7`](https://github.com/apache/spark/commit/d5c37b732f1948d8240bd8de33a080ac5db03571).


---




[GitHub] spark pull request #22116: [DOCS]Update configuration.md

2018-08-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22116


---




[GitHub] spark pull request #22125: [DOCS] Fix cloud-integration.md Typo

2018-08-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22125


---




[GitHub] spark issue #22127: [SPARK-25032][SQL] fix drop database issue

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22127
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94868/
Test FAILed.


---




[GitHub] spark issue #22127: [SPARK-25032][SQL] fix drop database issue

2018-08-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22127
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22127: [SPARK-25032][SQL] fix drop database issue

2018-08-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22127
  
**[Test build #94868 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94868/testReport)** for PR 22127 at commit [`8255336`](https://github.com/apache/spark/commit/825533682c98598409e537fa866dcdab915e3948).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22125: [DOCS] Fix cloud-integration.md Typo

2018-08-16 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/22125
  
Merged to master


---




[GitHub] spark issue #22116: [DOCS]Update configuration.md

2018-08-16 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/22116
  
Merged to master. For the future, a better title and bundling these in one PR would be preferable.


---



