Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15539#discussion_r84118387
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/TableFileCatalog.scala
---
@@ -32,13 +32,21 @@ import
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15538#discussion_r84114928
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -56,6 +56,7 @@ class ParquetFileFormat
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/15539
@ericl I took this PR for a test drive with some large-ish tables.
Everything appeared to work as expected.
As far as performance goes, planning a simple select on a partitioned table
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15539#discussion_r83990858
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveTablePerfStatsSuite.scala
---
@@ -103,11 +92,84 @@ class HiveDataFrameSuite extends
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15539#discussion_r83988599
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ListingFileCatalog.scala
---
@@ -38,14 +38,16 @@ class ListingFileCatalog
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15538#discussion_r83988497
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -56,6 +56,7 @@ class ParquetFileFormat
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15539#discussion_r83987399
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ListingFileCatalog.scala
---
@@ -38,14 +38,16 @@ class ListingFileCatalog
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15539#discussion_r83987017
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningAwareFileCatalog.scala
---
@@ -276,15 +290,15 @@ object
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15538#discussion_r83984654
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -56,6 +56,7 @@ class ParquetFileFormat
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15539#discussion_r83980885
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -1486,10 +1485,10 @@ private[spark] object Utils extends Logging {
val
GitHub user mallman opened a pull request:
https://github.com/apache/spark/pull/15538
[SPARK-17993][SQL] Fix Parquet log output redirection
(Link to Jira issue: https://issues.apache.org/jira/browse/SPARK-17993)
## What changes were proposed in this pull request
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15518#discussion_r83741391
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetPartitionDiscoverySuite.scala
---
@@ -626,8 +626,9 @@ class
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15518#discussion_r83738938
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileCatalog.scala
---
@@ -0,0 +1,66 @@
+/*
+ * Licensed to the Apache
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83518291
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -626,6 +627,40 @@ private[spark] class HiveExternalCatalog(conf
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83517834
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ListingFileCatalog.scala
---
@@ -17,32 +17,26 @@
package
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
>> Hm, I haven't seen that with my test queries. Would adding your
workaround to SparkILoopInit work?
> It does not, unfortunately.
I believe this impacts people with parq
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
> Hm, I haven't seen that with my test queries. Would adding your
workaround to SparkILoopInit work?
It does not, unfortunately.
---
If your project is set up for it, you can re
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
> btw, what's the parquet log redirection issue? I don't see anything
unusual in spark shell.
Whenever I run a query on a Hive parquet table I get
```
spark-sql> sele
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I just pushed the rebase. It was really hairy, but I tried hard to ensure I
got essentially all three branches' changes in.
---
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I'm still working on the rebase. It's very complex; there are two other
commits involved.
>> 1. Do we need a workaround for ORC like we made for Parquet?
> 1) yes
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I will work on a rebase. Meanwhile, I've revisited the open issues in the
PR description. To summarize:
1. Do we need a workaround for ORC like we made for Parquet?
1. What's the impact
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83355030
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/parquetSuites.scala ---
@@ -34,7 +34,7 @@ import org.apache.spark.util.Utils
// The data
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83352979
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -626,6 +627,40 @@ private[spark] class HiveExternalCatalog(conf
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83349072
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/parquetSuites.scala ---
@@ -34,7 +34,7 @@ import org.apache.spark.util.Utils
// The data
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
> Oops there is a conflict now.
NP. I'm working on the rebase.
---
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83325529
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -225,13 +225,19 @@ case class FileSourceScanExec
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83325088
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PruneFileSourcePartitions.scala
---
@@ -0,0 +1,72 @@
+/*
+ * Licensed
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83318096
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -616,6 +617,44 @@ private[spark] class HiveExternalCatalog(conf
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
> Btw, I noticed that this suite was failing in jenkins only.
>
> [info] - partitioned pruned table reports only selected files *** FAILED
*** (610 milliseconds)
>
>
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
> This patch fails MiMa tests.
I've never seen this before. What does this mean?
---
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I've pushed an update to `ParquetMetastoreSuite` that illustrates the bug
(or "limitation") WRT support for mixed-case partition columns I discovered
yesterday. To
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83141382
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -616,6 +617,44 @@ private[spark] class HiveExternalCatalog(conf
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83131630
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
@@ -199,59 +197,30 @@ private[hive] class
HiveMetastoreCatalog
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83115827
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
@@ -199,59 +197,30 @@ private[hive] class
HiveMetastoreCatalog
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83086945
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PruneFileSourcePartitions.scala
---
@@ -0,0 +1,72 @@
+/*
+ * Licensed
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83085289
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -616,6 +617,44 @@ private[spark] class HiveExternalCatalog(conf
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83085625
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/TableFileCatalog.scala
---
@@ -0,0 +1,103 @@
+/*
+ * Licensed
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I determined the performance regression was introduced by a commit I hadn't
pushed to this PR. Sorry for the false alarm. Needless to say, I'm not
pushing that commit.
---
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
>> Btw I've noticed a significant performance difference between
ListingFileCatalog and TableFileCatalog's implementation of ListFiles. The
difference seems to be that ListingFileC
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
> Btw I've noticed a significant performance difference between
ListingFileCatalog and TableFileCatalog's implementation of ListFiles. The
difference seems to be that ListingFileCata
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I'm testing this patch internally on a couple of tables with on the order
of 10k partitions. Performance is much slower than it should be. I'm
investigating.
---
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83036315
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PruneFileSourcePartitions.scala
---
@@ -0,0 +1,72 @@
+/*
+ * Licensed
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I updated the description of this PR to reflect the workaround for the
Hive/Parquet case-sensitivity issue.
Do we need a similar workaround for ORC?
---
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r82902172
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -225,13 +225,16 @@ case class FileSourceScanExec
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r82900810
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -225,13 +225,16 @@ case class FileSourceScanExec
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
>> Finally, this would require us to read the schema files. That's
something I'm trying to avoid in this patch.
> Not sure what you mean here, but the parquet change should be
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I believe that using a method like `TableFileCatalog.filterPartitions` to
build a new file catalog restricted to some pruned partitions is a sound
approach, however I'm starting to reconsider
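The idea behind `filterPartitions` — building a new, smaller file catalog restricted to the partitions that survive pruning — can be sketched generically. The `Partition` and `FileCatalog` types below are illustrative stand-ins, not Spark's actual `TableFileCatalog` API:

```scala
// Hypothetical sketch of pruning a file catalog down to selected partitions.
// `Partition` and `FileCatalog` are stand-ins, not Spark's real classes.
object PruneSketch {
  case class Partition(spec: Map[String, String], files: Seq[String])

  case class FileCatalog(partitions: Seq[Partition]) {
    // Build a new catalog containing only partitions matching the predicate.
    def filterPartitions(pred: Map[String, String] => Boolean): FileCatalog =
      FileCatalog(partitions.filter(p => pred(p.spec)))
  }

  def main(args: Array[String]): Unit = {
    val catalog = FileCatalog(Seq(
      Partition(Map("ds" -> "2016-10-01"), Seq("/t/ds=2016-10-01/part-0")),
      Partition(Map("ds" -> "2016-10-02"), Seq("/t/ds=2016-10-02/part-0"))))
    val pruned = catalog.filterPartitions(spec => spec("ds") == "2016-10-02")
    // Only the selected partition's files remain visible to a scan.
    assert(pruned.partitions.flatMap(_.files) == Seq("/t/ds=2016-10-02/part-0"))
  }
}
```

The appeal of this shape is that downstream code keeps working against the same catalog interface, just over fewer partitions.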
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
Ah cripes. I committed something I didn't want to. I'm rebasing again in a
few...
---
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r82850487
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -225,13 +225,16 @@ case class FileSourceScanExec
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
BTW, I'm working on a rebase to fix merge conflicts and address reviewers'
feedback.
---
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I would be wary of amending our data sources to support case-insensitive
field resolution. For one thing, strictly speaking it can lead to ambiguity in
schema resolution. In the—potential
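The ambiguity warned about here can be illustrated in a few lines: under case-insensitive resolution, two fields that differ only in case both match the same lookup, so a resolver must either fail or guess. The field names below are hypothetical, and this is not Spark's actual resolution code:

```scala
// Why case-insensitive field resolution can be ambiguous: distinct fields
// may collapse to the same name once case is ignored. Hypothetical schema.
object CaseResolution {
  val schema = Seq("myCol", "MYCOL", "other")

  // Case-insensitive lookup: returns every field matching the name.
  def resolve(name: String): Seq[String] =
    schema.filter(_.equalsIgnoreCase(name))

  def main(args: Array[String]): Unit = {
    // Unambiguous: exactly one candidate.
    assert(resolve("other") == Seq("other"))
    // Ambiguous: two candidates; the resolver cannot pick one safely.
    assert(resolve("mycol") == Seq("myCol", "MYCOL"))
  }
}
```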
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r82713318
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -225,13 +225,16 @@ case class FileSourceScanExec
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r82713068
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -225,13 +225,16 @@ case class FileSourceScanExec
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r82712965
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/InMemoryCatalog.scala
---
@@ -477,6 +478,15 @@ class InMemoryCatalog
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I've rebased this PR and refactored it somewhat. The main change is to move
the partition pruning logic from `FileSourceStrategy` into a Catalyst optimizer
rule called `PruneFileSourcePartitions
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
> I think we can just remove the table schema reconciliation when
converting hive parquet table to data source table, and fix the failed tests.
@cloud-fan Sorry, I don't understand y
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
@cloud-fan Thanks for your careful analysis. I'm just getting back to work
today from two weeks off and will reply as soon as I have some time. It may be
a few days. Cheers.
---
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
FYI I'll be mostly away the rest of this week and off the grid entirely
next week.
I've continued to work on this patch on my side. Like I wrote earlier, I've
been awaiting the outcome
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14750
@clockfly I can offer one answer to your question. One of the main benefits
of this change is to allow us to remove the costly schema reconciliation
between hive metastore schema and on-disk
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14388
@viirya Any progress on this?
---
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
Thanks @ericl and @davies for your latest feedback. I'd like to take the
opportunity to rebase this PR off of #14750 after it's merged to master before
pushing another commit, however I understand
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r77426981
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2531,6 +2531,8 @@ class Dataset[T] private[sql](
*/
def
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r77426759
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/fileSourceInterfaces.scala
---
@@ -346,11 +340,30 @@ trait FileCatalog
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r77426633
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/TableFileCatalog.scala
---
@@ -0,0 +1,102 @@
+/*
+ * Licensed
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r77425878
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/TableFileCatalog.scala
---
@@ -0,0 +1,102 @@
+/*
+ * Licensed
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r77425776
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala
---
@@ -79,8 +79,16 @@ object FileSourceStrategy
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r77423958
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -184,7 +184,7 @@ case class FileSourceScanExec
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I've rebased and pushed a new commit which prunes a `HadoopFsRelation`'s
`TableFileCatalog` to a `ListingFileCatalog` for a given set of partition
pruning expressions. Another upside
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
Thanks, that looks great.
---
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14388
@viirya I sent you an email with a link to a test file to your public
github e-mail address.
---
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
@cloud-fan Now that #14155 is merged, have you started on a follow up PR to
address the column name case-insensitivity issue? I've rebased and done some
more work on this PR, but I'd like
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14388
@viirya I'll see what I can do. If nothing else, I may be able to share a
private data file over S3 if you promise not to share it with anyone else.
---
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14388
@viirya If I do a simple `select` on an array field it works, but if I add
an `order by` clause which orders by the array column I get exceptions like
```
16/08/29 21:47:01 ERROR
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14811
Done
---
Github user mallman closed the pull request at:
https://github.com/apache/spark/pull/14811
---
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14798
@zsxwing PR #14811 is a backport of this PR to `branch-2.0`.
---
GitHub user mallman opened a pull request:
https://github.com/apache/spark/pull/14811
[SPARK-17231][CORE] Avoid building debug or trace log messages unless
This is simply a backport of #14798 to `branch-2.0`. This backport omits
the change to `ExternalShuffleBlockHandler.java
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14798
Will do
---
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14798
I focused mainly on trace and debug logging. I didn't do much with errors
or warnings, especially where exceptions are logged. I'm assuming these are
less frequent, and the cost of building those
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14617
@jerryshao The UI changes look great. I have not had a chance to scrutinize
the source changes. Hopefully we can get someone else to help review.
---
GitHub user mallman opened a pull request:
https://github.com/apache/spark/pull/14798
[SPARK-17231][CORE] Avoid building debug or trace log messages unless the
respective log level is enabled
(This PR addresses https://issues.apache.org/jira/browse/SPARK-17231)
## What
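The technique named in this PR title — not building a debug or trace message unless that log level is enabled — is naturally expressed in Scala with a by-name parameter, so the message expression is only evaluated inside the level check. This is a minimal sketch, not Spark's actual `Logging` trait:

```scala
// Sketch of lazy log-message construction: `msg` is a by-name parameter,
// so the (possibly expensive) string is built only when debug is enabled.
object LazyLog {
  var debugEnabled = false
  var built = 0 // counts how often a message was actually constructed

  def logDebug(msg: => String): Unit =
    if (debugEnabled) println(msg)

  def main(args: Array[String]): Unit = {
    logDebug { built += 1; "expensive state dump" } // level off: never built
    assert(built == 0)
    debugEnabled = true
    logDebug { built += 1; "expensive state dump" } // level on: built once
    assert(built == 1)
  }
}
```

The same effect can be had with an explicit `if (log.isDebugEnabled)` guard at each call site; the by-name form simply centralizes the guard.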
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14537#discussion_r75904621
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
@@ -237,21 +237,26 @@ private[hive] class
HiveMetastoreCatalog
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14617
Hi @jerryshao. I think this is a great idea and fills in an important gap
in the app's UI. Going by the screenshot you posted, instead of putting both on
and off heap memory in a single column, how
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14537#discussion_r75884267
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
@@ -237,21 +237,26 @@ private[hive] class
HiveMetastoreCatalog
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14537
@rajeshbalamohan So for Orc 2.x files, would schema inference be
unnecessary?
---
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
That looks great!
---
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
@cloud-fan O... how exciting! Is there a PR?
---
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
It may be, but we at least need to get the unit tests working. And they use
mixed case column names. :)
---
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
@ericl That would be ideal, however the Hive metastore does not faithfully
record column names with upper case characters. So if you save a parquet file
with a column named `myCol` and define
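The mismatch described here can be shown generically: if the metastore records column names lowercased while the data file preserves case, an exact-match lookup between the two fails and some reconciliation step is required. A hypothetical sketch with made-up column names, not `HiveExternalCatalog`'s actual code:

```scala
// Why a lowercasing metastore breaks exact-match schema lookups against
// case-preserving file schemas. Column names here are hypothetical.
object MetastoreCase {
  val fileSchema = Seq("myCol", "value")          // as written to the file
  val metastoreSchema = fileSchema.map(_.toLowerCase) // as the store keeps it

  def main(args: Array[String]): Unit = {
    assert(metastoreSchema == Seq("mycol", "value"))
    // Exact match of the metastore name against the file schema fails...
    assert(!fileSchema.contains("mycol"))
    // ...so a case-insensitive reconciliation step is needed to bridge them.
    assert(fileSchema.exists(_.equalsIgnoreCase("mycol")))
  }
}
```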
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I've been thinking some more about the metastore/file schema
reconciliation. As I mentioned in the PR description, this patch omits this
reconciliation. This causes failures when the parquet files
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
@cloud-fan Actually, all hive table partition metadata are retrieved for
every query analysis. This is then used to compare the new metadata with the
cached metadata. If they're the same, then we
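The validation pattern described in this comment — refetch the metadata, compare it with the cached copy, and only rebuild expensive derived state when something changed — can be sketched generically. The types and the one-table fetch below are hypothetical, not Spark's actual cache:

```scala
// Generic sketch of metadata cache validation: refetch, compare, and only
// rebuild derived state when the metadata actually differs from the cache.
object MetadataCache {
  case class Partition(name: String, location: String)

  var cachedMeta: Seq[Partition] = Nil
  var rebuilds = 0 // counts expensive rebuilds of derived state

  // Stand-in for a metastore call returning current partition metadata.
  def fetchMeta(): Seq[Partition] =
    Seq(Partition("ds=2016-10-01", "/warehouse/t/ds=2016-10-01"))

  def relationFor(): Unit = {
    val fresh = fetchMeta()
    if (fresh != cachedMeta) { // changed (or first use): rebuild and cache
      cachedMeta = fresh
      rebuilds += 1
    } // otherwise reuse the cached state untouched
  }

  def main(args: Array[String]): Unit = {
    relationFor()
    relationFor()
    assert(rebuilds == 1) // the second analysis reused the cached state
  }
}
```

Note the cost profile the comment hints at: even on a cache hit, all partition metadata is still fetched in order to perform the comparison.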
GitHub user mallman opened a pull request:
https://github.com/apache/spark/pull/14690
[SPARK-16980][SQL] Load only catalog table partition metadata required to
answer a query
(This PR addresses https://issues.apache.org/jira/browse/SPARK-16980.)
(N.B. I'm submitting
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14537
@rajeshbalamohan We'll need a committer to review your patch.
---
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14551#discussion_r74276138
--- Diff: core/src/test/scala/org/apache/spark/util/UtilsSuite.scala ---
@@ -874,4 +874,38 @@ class UtilsSuite extends SparkFunSuite
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14537#discussion_r74092780
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
@@ -287,14 +287,14 @@ private[hive] class
HiveMetastoreCatalog
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14537#discussion_r74092457
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
@@ -287,14 +287,14 @@ private[hive] class
HiveMetastoreCatalog
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14537
@rajeshbalamohan, the changes to `HiveMetastoreCatalog.scala` look
reasonable. This mirrors the behavior of this method before the `if
(fileType.equals("parquet"))` expression was
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14537#discussion_r74003788
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
@@ -294,7 +294,9 @@ private[hive] class HiveMetastoreCatalog
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/13818#discussion_r73892580
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala
---
@@ -298,6 +298,7 @@ case class InsertIntoHiveTable
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14064
@cloud-fan Muchas gracias!
---