Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14279
@srowen I think it is okay as a separate PR.
I believe adding the things I said as a follow-up will mess up the change
and make this hard to review. Nevertheless, I don't mind
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/14627
[SPARK-16975][SQL] Do not check file paths twice in data sources
implementing FileFormat and prevent attempting to list twice in ORC
## What changes were proposed in this pull request?
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14627
@dongjoon-hyun Thanks for taking a look! Actually, I intended to test your
fix for all data sources just to make sure the issue in `SPARK-16975` is
resolved for all.
As it is a clean
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14627
I am happy to add some more tests but I am a bit confused about what I
should test. Maybe a test for `HadoopFsRelation.listLeafFiles` to make sure
directories are not included in the return value
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14627
FYI, in most data sources, it would try to read files twice or fail to
infer the schema (haven't tested this yet) if directories are allowed in
`FileFormat.inferSchema`.
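The failure mode above can be illustrated outside Spark: if a recursive listing hands directory paths to schema inference, opening them fails (or files get visited twice). A minimal pure-Python sketch of keeping only regular files before inference — illustrative only, not Spark's actual `listLeafFiles` implementation:

```python
import json
import os
import tempfile

def list_leaf_files(root):
    """Collect regular files only; directory paths never reach the caller."""
    leaf_files = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            leaf_files.append(os.path.join(dirpath, name))
    return sorted(leaf_files)

def infer_schema(paths):
    """Toy schema inference: union of JSON keys across leaf files."""
    fields = set()
    for path in paths:
        with open(path) as f:
            fields.update(json.load(f).keys())
    return sorted(fields)

# Build a tiny tree: root/a.json, root/sub/b.json, plus an empty directory.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "sub"))
os.makedirs(os.path.join(root, "empty"))  # must not appear in the listing
with open(os.path.join(root, "a.json"), "w") as f:
    f.write('{"id": 1}')
with open(os.path.join(root, "sub", "b.json"), "w") as f:
    f.write('{"name": "x"}')

files = list_leaf_files(root)
# Passing a directory here would make open() fail (IsADirectoryError on
# POSIX), mirroring the "fail to infer the schema" symptom described above.
schema = infer_schema(files)
```

A test asserting that no directory appears in `files` is essentially the `listLeafFiles` test proposed above.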
---
If your project
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14627
Yes, it is not. I am sorry for the confusion. It seems your fix is
perfectly fine and correct, but this is just a clean-up to get rid of
duplicated logic.
BTW, I am also okay
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14627
Could you take a look please @rxin and @liancheng ?
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14627
BTW, I am waiting for the Jenkins tests before cc'ing anyone. However, as
you are already here, I would appreciate it if you could take a look.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14627
Ah, yes thank you!
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14593
Also, just for reviewers: the inconsistencies I listed in the PR
description appear randomly across the documentation, so this fixes them to be
consistent with the style guidelines
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14102
@cloud-fan Thanks! I think it is ready to be reviewed.
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/14593
[MINOR][DOCS] Fix style in examples and inconsistent indentation across
documentation
## What changes were proposed in this pull request?
This PR fixes the documentation as below
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14588
Please let me cc @srowen to make sure, because I believe this is about the
build.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14593
BTW, this does not fix the wrong examples and inconsistently indented
code in `structured-streaming-programming-guide.md` because
https://github.com/apache/spark/pull/14564 handles them
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14102#discussion_r74374585
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala
---
@@ -35,184 +34,289 @@ import
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14102#discussion_r74372103
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala
---
@@ -35,184 +34,289 @@ import
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14102#discussion_r74372641
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala
---
@@ -35,184 +34,289 @@ import
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14102#discussion_r74375624
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala
---
@@ -35,184 +34,289 @@ import
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14102#discussion_r74375976
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala
---
@@ -35,184 +34,289 @@ import
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14102#discussion_r74382183
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala
---
@@ -35,184 +34,296 @@ import
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14102#discussion_r74382623
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala
---
@@ -35,184 +34,296 @@ import
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14593
@srowen Thank you for the review! I can revert the non-visible ones but will
just leave the visible ones.
I am okay with reverting this (anyway, now we know there are nits of spacing
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14102#discussion_r74214974
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala
---
@@ -35,184 +34,330 @@ import
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14102#discussion_r74220035
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala
---
@@ -35,184 +34,330 @@ import
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/14172
[SPARK-16516][SQL] Support for pushing down filters for decimal and
timestamp types in ORC
## What changes were proposed in this pull request?
It seems ORC supports all the types
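The idea behind the PR title can be sketched as a simple eligibility check: predicates are pushed down only for column types the reader can evaluate, and decimal and timestamp were missing from that set. A pure-Python sketch — the type names and set are illustrative, not ORC's actual `SearchArgument` builder API:

```python
# Column types for which a filter may be pushed down to the reader.
# Before a change like this, "decimal" and "timestamp" would be absent.
PUSHABLE_TYPES = {
    "boolean", "byte", "short", "int", "long",
    "float", "double", "string", "decimal", "timestamp",
}

def can_push_down(column_type):
    """Return True if a predicate on this column type may be pushed down."""
    return column_type in PUSHABLE_TYPES

# Filters that cannot be pushed are simply evaluated after the scan.
filters = [("price", "decimal"), ("ts", "timestamp"), ("payload", "binary")]
pushed = [name for name, typ in filters if can_push_down(typ)]
```

Filters that fail the check are still applied by Spark after reading, so the change is a performance fix, not a correctness one.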
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14172
@liancheng I thought both were omitted in the original code because
`BigDecimal` and `Timestamp` were not supported, but it seems they are and work
fine. Could you please take a look
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14181
FYI, if replacement is disabled, it fails when the ratio is more than
1.0.
```
scala> spark.range(10).sample(false, 1.1).withColumn("mid",
monotonically_increa
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14181
FYI, it seems this still happens even if the ratio is less than 1.0 because
it is sampling with replacement.
```
scala> spark.range(10).sample(true, 0.5).withColumn(&
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14181#discussion_r70765806
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -1536,21 +1536,26 @@ class Dataset[T] private[sql](
* Returns a new
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14181
Yea, sampling with replacement means the results can be duplicated. IMHO,
this fix should always be enabled when `replace` is `true`.
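The duplication is inherent to with-replacement sampling: each input row is emitted k times where k is drawn per row, so k ≥ 2 is always possible even for a fraction below 1.0. A stdlib-only Python sketch using Knuth's Poisson sampler — illustrative of the behavior, not Spark's actual sampler:

```python
import math
import random

def poisson(rng, lam):
    """Draw from Poisson(lam) using Knuth's multiplication algorithm."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def sample(rows, fraction, with_replacement, seed=0):
    rng = random.Random(seed)
    if not with_replacement:
        if fraction > 1.0:
            # Matches the failure above: fraction > 1.0 is only
            # meaningful when rows may repeat.
            raise ValueError("fraction must be <= 1.0 without replacement")
        return [r for r in rows if rng.random() < fraction]
    out = []
    for r in rows:
        # A row may appear 0, 1, or more times in the output.
        out.extend([r] * poisson(rng, fraction))
    return out

sampled = sample(range(100), 0.5, with_replacement=True)
# Rows typically repeat, so a "unique id" column computed before
# sampling would carry duplicated values afterwards.
```

This is why computing `monotonically_increasing_id` after sampling with replacement, as discussed above, cannot guarantee distinct values per output row unless the fix is applied.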
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14181#discussion_r70766221
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -154,8 +154,12 @@ class SimpleTestOptimizer
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14181#discussion_r70771559
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -1536,21 +1536,26 @@ class Dataset[T] private[sql](
* Returns a new
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14181#discussion_r70770371
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -154,8 +154,12 @@ class SimpleTestOptimizer
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14294
@liancheng and @davies Would this change be appropriate? Could you please
take a look?
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14117
Could you please take a look @rxin ?
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14035
hm.. I can close this if it looks inappropriate or seems to cause a lot of
conflicts across PRs. Could you give some feedback please, @mengxr and
@yanboliang?
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/14294
[SPARK-16646][SQL] LEAST and GREATEST don't accept numeric arguments with
different data types
## What changes were proposed in this pull request?
This PR makes `LEAST
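The fix can be illustrated with a toy version of the coercion SQL applies: promote mixed numeric arguments to a common wider type before comparing, so `LEAST(1, 2.5)` is well-typed. A hedged Python sketch — the promotion rule here (int widens to float) is a deliberate simplification of Spark's type-coercion lattice:

```python
def _promote(values):
    """Promote a mix of int and float to a single common type."""
    if any(isinstance(v, float) for v in values):
        return [float(v) for v in values]
    return list(values)

def least(*cols):
    """Toy SQL LEAST: skip NULLs (None), coerce, then compare."""
    present = [v for v in cols if v is not None]
    if not present:
        return None
    return min(_promote(present))

def greatest(*cols):
    """Toy SQL GREATEST: skip NULLs (None), coerce, then compare."""
    present = [v for v in cols if v is not None]
    if not present:
        return None
    return max(_promote(present))
```

Without the promotion step, an expression mixing, say, an integer column and a double column would be rejected at analysis time, which is the symptom the JIRA title describes.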
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14294
retest this please
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14102#discussion_r71099098
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JSONOptions.scala
---
@@ -51,7 +53,8 @@ private[sql] class
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14102#discussion_r71099616
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala
---
@@ -35,184 +34,306 @@ import
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14102
@yhuai Thank you for your review! I will try to address all your comments
first.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13988#discussion_r71097989
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityGenerator.scala
---
@@ -0,0 +1,83 @@
+/*
+ * Licensed
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14102#discussion_r71099344
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonParser.scala
---
@@ -35,184 +34,306 @@ import
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14217#discussion_r71076055
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala
---
@@ -169,6 +169,19 @@ class ParquetIOSuite
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14236
cc @rxin
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14236
FYI, there is one more usage in
[ColumnExpressionSuite.scala#L517](https://github.com/apache/spark/blob/46395db80e3304e3f3a1ebdc8aadb8f2819b48b4/sql/core/src/test/scala/org/apache/spark/sql
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/14236
[SPARK-16588][SQL] Missed API fix for a function name mismatched between
FunctionRegistry and functions.scala
## What changes were proposed in this pull request?
It seems the function
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/12944#discussion_r71254024
--- Diff:
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ShuffleIndexRecord.java
---
@@ -0,0 +1,39 @@
+/*
+ * Licensed
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14102#discussion_r71261575
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/InferSchema.scala
---
@@ -60,13 +60,13 @@ private[sql] object
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13912#discussion_r71276368
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None,
encoding=None, quote=None, escape=Non
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14102
@yhuai the commits I pushed include the changes below:
- Reverts the changes in `JSONOptions` about `columnNameOfCorruptRecord`
https://github.com/apache/spark/pull/14102
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/13912
I am closing this first and will open a new one, only for the default
format, later today. I will cc you all as soon as I open it. Thanks!
Github user HyukjinKwon closed the pull request at:
https://github.com/apache/spark/pull/13912
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14279#discussion_r71474425
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala
---
@@ -62,6 +64,10 @@ object DateTimeUtils
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14242#discussion_r71501991
--- Diff:
examples/src/main/scala/org/apache/spark/examples/SparkKMeans.scala ---
@@ -33,7 +33,6 @@ object SparkKMeans {
def parseVector(line
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/14279
[SPARK-16216][SQL] Write Timestamp and Date in ISO 8601 formatted string by
default for CSV and JSON
## What changes were proposed in this pull request?
Currently, CSV datasource
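The direction of the change can be shown without Spark: render timestamp and date values as ISO 8601 strings when writing JSON/CSV rows, instead of epoch numbers or a locale-dependent format. A Python sketch — the exact default pattern Spark adopted may differ from plain `isoformat()`:

```python
import json
from datetime import date, datetime, timezone

def encode_value(v):
    """Render temporal values as ISO 8601 text; pass others through."""
    if isinstance(v, datetime):  # check datetime before date: it is a subclass
        return v.astimezone(timezone.utc).isoformat()
    if isinstance(v, date):
        return v.isoformat()
    return v

row = {
    "id": 7,
    "ts": datetime(2016, 8, 15, 12, 30, 45, tzinfo=timezone.utc),
    "d": date(2016, 8, 15),
}
encoded = {k: encode_value(v) for k, v in row.items()}
print(json.dumps(encoded, sort_keys=True))
```

An ISO 8601 default makes written files self-describing and round-trippable across tools, which is the main argument of the PR.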
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14279#discussion_r71474579
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
---
@@ -477,15 +478,15 @@ class CSVSuite extends
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14279
cc @rxin, @srowen and @deanchen
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14217
Closing this due to HIVE-14294. This causes SPARK-16632 for the normal
Parquet reader as well.
Github user HyukjinKwon closed the pull request at:
https://github.com/apache/spark/pull/14217
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14279
It seems due to https://issues.apache.org/jira/browse/LANG-982. I will look
into this more deeply
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14124
Could you take a look please @marmbrus ?
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14117
Gentle ping @davies @liancheng
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14102
Sorry for pinging here and there.. @yhuai @liancheng
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14035
Gentle ping @mengxr and @yanboliang
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14028
@yhuai Could you take another look?
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13912#discussion_r71120468
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala
---
@@ -195,18 +202,40 @@ private[sql] class
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14236
@rxin Sure, thank you very much.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13912#discussion_r71130049
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None,
encoding=None, quote=None, escape=Non
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13912#discussion_r71131209
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None,
encoding=None, quote=None, escape=Non
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13912#discussion_r71124928
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala
---
@@ -195,18 +202,40 @@ private[sql] class
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13912#discussion_r71124645
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala
---
@@ -195,18 +202,40 @@ private[sql] class
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13912#discussion_r71127773
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None,
encoding=None, quote=None, escape=Non
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13912#discussion_r71124186
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None,
encoding=None, quote=None, escape=Non
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13912#discussion_r71124234
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -328,6 +328,10 @@ def csv(self, path, schema=None, sep=None,
encoding=None, quote=None, escape=Non
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/14215
[SPARK-16544][SQL][WIP] Support for conversion from compatible schema for
Parquet data source when data types are not matched
## What changes were proposed in this pull request
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/14217
[SPARK-16562][SQL] Do not allow downcast in INT32-based types for the normal
Parquet reader
## What changes were proposed in this pull request?
Currently, INT32-based types (`ShortType
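In Parquet, narrow integer logical types such as byte and short are physically stored as INT32, so a reader that blindly downcasts can silently truncate values written by another system. A pure-Python sketch of the range check the PR title implies — the names are illustrative, not Spark's reader code:

```python
# Logical types backed by a physical INT32 column, with their value ranges.
INT32_BACKED_RANGES = {
    "byte": (-2**7, 2**7 - 1),
    "short": (-2**15, 2**15 - 1),
    "int": (-2**31, 2**31 - 1),
}

def read_int32_as(value, logical_type):
    """Refuse a downcast that would silently truncate the stored value."""
    lo, hi = INT32_BACKED_RANGES[logical_type]
    if not lo <= value <= hi:
        raise OverflowError(
            f"INT32 value {value} does not fit logical type {logical_type}"
        )
    return value
```

Failing loudly is preferable here: a truncating downcast would return a plausible-looking but wrong number.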
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14035
ping @mengxr and @yanboliang
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14215
I see, yes, I will think of a better way to fix the message. Yea, it is
still happening across other data sources and this implementation is very
specific to Parquet.
I just wonder we
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14215
For handling messages, I will open a separate PR soon!
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14215
Hi @gatorsmile @dongjoon-hyun @liancheng, currently this deals only with
`NumericType` except `DecimalType`, and upcasts only for the non-vectorized
reader. Before proceeding further, I
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14217
cc @liancheng
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/12972
Hi @liancheng ! Could you take a quick look please?
Github user HyukjinKwon closed the pull request at:
https://github.com/apache/spark/pull/13942
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/13942
Fixed in
https://github.com/apache/spark/commit/1aad8c6e59c1e8b18a3eaa8ded93ff6ad05d83df
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13517#discussion_r68871740
--- Diff:
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ---
@@ -45,11 +45,11 @@ statement
| ALTER DATABASE
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13517#discussion_r68869270
--- Diff:
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ---
@@ -45,11 +45,11 @@ statement
| ALTER DATABASE
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/13963
[TRIVIAL][PYSPARK] Clean up orc compression option as well
## What changes were proposed in this pull request?
This PR corrects the ORC compression option for PySpark as well. I think
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/13963
cc @davies
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/12972
No problem! thank you!
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14067
Hi, @rxin @liancheng, I hope this is not missed for 2.0.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14067
Oh, @liancheng I just corrected some more. Please take another look.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/14067#discussion_r69707869
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala
---
@@ -188,7 +188,7 @@ private[sql] object
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14067
Hm... I thought Spark does not support filter push-down for nested fields.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14067
cc @viirya as well.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14067
Yes, it is. I just found more; I will correct them all. Thank you!
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/14067
[SPARK-16371][SQL] Do not push down filters incorrectly when inner name and
outer name are the same in Parquet
## What changes were proposed in this pull request?
Currently
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14028
Actually, I modeled this on your code here, @liancheng. Would this change be
sensible, maybe?
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/14063
[MINOR][PySpark][DOC] Fix SparkSession and Builder API documentation
## What changes were proposed in this pull request?
This PR fixes wrongly formatted examples in PySpark documentation
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/14063
Oh, I will fix them up soon.