Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22160
@dbtsai Have you tried to run it in Scala 2.12? We can still do the upgrade
after the Apache Spark 2.4 RC.
---
-
To unsubscribe, e
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21320
@mallman Thanks!
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22160
Thank you! @dbtsai Let us see whether this can pass all the tests?
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21320
Try this when `spark.sql.nestedSchemaPruning.enabled` is on?
```SQL
withTable("t1") {
spark.sql(
"""
|Create table t
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/22123#discussion_r211287978
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
---
@@ -1603,6 +1603,39 @@ class CSVSuite extends
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/22123#discussion_r211283824
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala
---
@@ -227,10 +210,9 @@ object
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21909
LGTM.
Thanks for patiently addressing all the comments! Merged to master.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22123
cc @MaxGekk
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/20226
@maropu Could you take this over?
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/21909#discussion_r211045699
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JsonDataSource.scala
---
@@ -223,7 +224,8 @@ object
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/21909#discussion_r211045061
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1492,6 +1492,15 @@ object SQLConf {
"This us
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22121
We also need to document the extra enhancements that are added in this
release, compared with the databricks/spark-avro package
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22121
We should do the same thing for the other native sources.
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/22121#discussion_r210970616
--- Diff: docs/avro-data-source-guide.md ---
@@ -0,0 +1,267 @@
+---
+layout: global
+title: Avro Data Source Guide
+---
+
+Since
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22134
Do we need it in the current stage? Regarding UX, it looks complex to end
users. I am unable to remember the names. It is very easy to provide a wrong
class name
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/22133#discussion_r210959550
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
---
@@ -637,6 +638,17 @@ object DataSource extends
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21537
@kiszk Please create a JIRA and we can post more ideas there. Thanks!
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22121
@gengliangwang Could you also post the screen shot in your PR description?
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21537
@kiszk The initial prototype or proof of concept can be in any personal
branch. When we merge it to the master branch, we still need to separate it
from the current codegen and make
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/21909#discussion_r210767018
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
---
@@ -2223,21 +2223,31 @@ class JsonSuite extends
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/21909#discussion_r210765672
--- Diff: docs/sql-programming-guide.md ---
@@ -1894,6 +1894,7 @@ working with timestamps in `pandas_udf`s to get the
best performance, see
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22108
Thanks! Merged to master.
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/21909#discussion_r210693829
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
---
@@ -2223,21 +2223,31 @@ class JsonSuite extends
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/21909#discussion_r210666117
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1492,6 +1492,15 @@ object SQLConf {
"This us
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21537
@HyukjinKwon I am worried about the design of a mixture of representation
s"" and code""? When the design is not good, it is hard to maintain it and add
new code based on
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/21868#discussion_r210494497
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -425,12 +426,44 @@ case class FileSourceScanExec
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22119
For details, see the discussion in the JIRA
https://issues.apache.org/jira/browse/SPARK-24924
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22111
Let us hold these codegen PRs until we see the design doc for building IR
for the codegen?
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21537
We are fully swamped by the hotfixes and regressions of the 2.3 release and the
new features that are targeting 2.4. We should have posted some comments in this PR
earlier.
Designing an IR
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21537
To Spark users, introducing AnalysisBarrier is a disaster. However, to the
developers of Spark internals, this is just a bug. If you served the customers
who are heavily using Spark, you
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21537
I am fine to not revert it since it is too late. So many related PRs have
been merged, but we need to seriously consider writing and reviewing the design
docs before changing the code generation
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21537
AnalysisBarrier does not introduce a behavior change. However, this
requires that our analyzer rules be idempotent. The most recent correctness bug
also shows another big potential hole
https
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21537
This is not related to the credit or not. I think we need to introduce an
IR like what the compiler is doing instead of continuous improvement on the
existing one, which is already very hacky
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21537
This is bad. The design needs to be carefully reviewed before implementing
it. Basically, we break the basic principles of software engineering. It is
very strange to write the design doc after
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21537
@kiszk You are a JVM expert with a very strong IR background. Could you
lead the efforts and drive the IR design
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21537
Another example is the AnalysisBarrier, which became a disaster for the Spark
2.3 release. Many blockers, correctness bugs, and performance regressions are
caused by that. Thus, I think we should revert
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21537
Based on this PR, so many changes will be made in the codegen. The codegen
is very fundamental to Spark SQL. I do not think we should merge this PR at
this stage. To be more disciplined, we need
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22007
The bump is fine but this is not for making it congruent with Stocator,
which is just an external connector
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/17185
Let us discuss it in the JIRA
https://issues.apache.org/jira/browse/SPARK-25121
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/17185
@skambha @dilipbiswal How about the hint resolution after supporting multi
part names?
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/17400
@maropu We should fix it. How about doing it after the code freeze? So far,
everyone is swamped with different tasks
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22107
cc @felixcheung
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/22102#discussion_r210053264
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -1704,6 +1704,7 @@ class Analyzer
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21123
@HyukjinKwon @rdblue V2 data source APIs should not break FileFormat V1
compatibility, IMO. Based on my experience, it is not the right thing to force
users to change their Spark applications
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22104
@icexelloss Do we face the same issue for DataSourceStrategy?
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/22104#discussion_r210044089
--- Diff: python/pyspark/sql/tests.py ---
@@ -3367,6 +3367,24 @@ def test_ignore_column_of_all_nulls(self):
finally
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/22101#discussion_r210043412
--- Diff:
sql/catalyst/src/main/java/org/apache/spark/sql/execution/RecordBinaryComparator.java
---
@@ -27,7 +27,6 @@
public int compare
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22102
@mgaido91 The PR is merged. Could you close it?
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/22102#discussion_r210036342
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
---
@@ -2300,4 +2300,10 @@ class DataFrameSuite extends QueryTest
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22102
Let me confirm it and then will merge it to 2.3
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22102
This might not be the last one. Let us backport the fix of @maryannxue ?
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22082
@dbtsai Nope. I did not hit any issue. :)
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21939
@BryanCutler @shaneknapp Thanks for your work!
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21439
LGTM
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/21439#discussion_r209461516
--- Diff: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql
---
@@ -39,3 +39,8 @@ select from_json('{"a":1, "b"
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/21439#discussion_r209461334
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala
---
@@ -101,6 +102,21 @@ class JacksonParser
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22082
cc @dbtsai @srowen
GitHub user gatorsmile opened a pull request:
https://github.com/apache/spark/pull/22082
[SPARK-24420][Build][FOLLOW-UP] Upgrade ASM6 APIs
## What changes were proposed in this pull request?
Use ASM 6 APIs after upgrading to ASM 6.
## How was this patch tested
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22079
The original fix
https://github.com/apache/spark/pull/22079/commits/efccc028bce64bf4754ce81ee16533c19b4384b2
has been merged to Spark 2.3. After 5+ months, we have not received any
correctness
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22079
cc @jiangxb1987
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22072
ok to test
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22072
retest this please
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22077
ok to test
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22077
test this please
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21977
Why not use `resource.RLIMIT_RSS`?
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21889
retest this please
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21320
retest this please
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/21087#discussion_r209077023
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -580,6 +580,11 @@ object SQLConf {
.booleanConf
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/21087#discussion_r209076282
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -580,6 +580,11 @@ object SQLConf {
.booleanConf
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22060
`AliasViewChild` is the only rule? Can we whitelist it first? It sounds
like many tests are skipped.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21889
retest this please
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21320
retest this please
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21608
Thanks! Merged to master.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21889
retest this please
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21320
retest this please
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22049
Thanks! Merged to master.
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/21994#discussion_r20881
--- Diff: pom.xml ---
@@ -2609,6 +2609,28 @@
+
+com.github.spotbugs
+spotbugs
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21939
Really thank you for your help! @shaneknapp
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21889
I hit the following error in my local environment.
```
sbt.ForkMain$ForkError: org.apache.spark.SparkException: Job aborted due to
stage failure: Task 0 in stage 220.0 failed 1 times
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21939
@shaneknapp That is great! I think putting Jenkins in quiet mode looks
fine, as long as we send out a note to the dev list
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21939
https://lists.apache.org/thread.html/5b0836e44f9386fae2f99deed0a01441c699040c991d833faf520357@%3Cdev.arrow.apache.org%3E
Arrow 0.10.0 release is officially announced. @shaneknapp Could
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21939
We need to upgrade it to 0.10.0 in this release, if possible. It resolves
some bugs, e.g., https://issues.apache.org/jira/browse/ARROW-1973
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/20611#discussion_r208412882
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -1976,6 +1976,49 @@ private[spark] object Utils extends Logging
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21596
@jerryshao This is for 3.0
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r208408858
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala
---
@@ -49,4 +51,11 @@ object DataSourceUtils
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22028
Thanks! Merged to master.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21977
cc @jiangxb1987 @cloud-fan @jerryshao @vanzin
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21977
@rdblue Is this for YARN only?
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21977
test cases?
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22006
Thanks! Merged to master.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22028
ok to test
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/22030#discussion_r208326382
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala ---
@@ -403,20 +415,29 @@ class RelationalGroupedDataset protected
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/21608#discussion_r208269963
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala
---
@@ -49,4 +51,11 @@ object DataSourceUtils
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21970
Thanks! Merged to master.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22012
Thanks! Merged to master.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21939
It sounds like the vote can pass soon.
https://lists.apache.org/thread.html/9900da1540be5aafce27691fd40395bb53f465302db29979c154d99a@%3Cdev.arrow.apache.org%3E
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21939
To get this in, we might need to delay the code freeze. Can you reply to the
dev list email
http://apache-spark-developers-list.1001551.n3.nabble.com/code-freeze-and-branch-cut-for-Apache-Spark-2
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21939
After the code freeze, the dependency changes are not allowed. Hopefully,
we can make it before that.
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/21909#discussion_r207850329
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
---
@@ -2225,19 +2225,21 @@ class JsonSuite extends
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21970
LGTM pending Jenkins