Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/165#discussion_r10714229
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala ---
@@ -187,21 +189,39 @@ class ALS private
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16578#discussion_r100229300
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/SelectedField.scala
---
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16578#discussion_r100229358
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/GetStructField2.scala
---
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16578#discussion_r100360523
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/GetStructField2.scala
---
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16785#discussion_r100364260
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala
---
@@ -314,19 +322,29 @@ abstract class UnaryNode
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/16775
@viirya I believe this PR meshes with the refactoring and application to
pregel GraphX algorithms in #15125. Basically, it moves the periodic
checkpointing code from mllib into core and uses it in
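For context, the periodic checkpointing being moved here can be illustrated with the public RDD API: an iterative job persists each iteration's result and, every N iterations, checkpoints it to truncate the lineage. This is only a minimal sketch under assumed settings (local master, a hypothetical checkpoint directory, interval, and loop body), not the PeriodicCheckpointer class the PR relocates.
```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

object PeriodicCheckpointSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("periodic-checkpoint-sketch"))
    sc.setCheckpointDir("/tmp/checkpoints") // hypothetical checkpoint directory

    val checkpointInterval = 10 // hypothetical interval
    var data: RDD[Long] = sc.range(0, 1000000)

    for (i <- 1 to 50) {
      data = data.map(_ + 1).persist(StorageLevel.MEMORY_ONLY)
      if (i % checkpointInterval == 0) {
        // Checkpointing truncates the lineage so a long-running iterative job
        // does not accumulate an ever-deeper DAG.
        data.checkpoint()
      }
      data.count() // materialize this iteration before starting the next
    }

    sc.stop()
  }
}
```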
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/16797
>> Like you said, users can still create a hive table with
mixed-case-schema parquet/orc files, by hive or other systems like presto. This
table is readable for hive, and for Spark prior
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/16797
BTW @budde, given that this represents a regression in behavior from
previous versions of Spark, I think it is too generous of you to label the Jira
issue as an "improvement" instead of
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15125#discussion_r100608839
--- Diff: graphx/src/main/scala/org/apache/spark/graphx/Pregel.scala ---
@@ -123,16 +127,25 @@ object Pregel extends Logging {
s" bu
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15125#discussion_r100609529
--- Diff: graphx/src/main/scala/org/apache/spark/graphx/Pregel.scala ---
@@ -123,16 +127,25 @@ object Pregel extends Logging {
s" bu
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15125#discussion_r100612840
--- Diff: graphx/src/main/scala/org/apache/spark/graphx/Pregel.scala ---
@@ -123,16 +127,25 @@ object Pregel extends Logging {
s" bu
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15125#discussion_r100631975
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/impl/PeriodicGraphCheckpointerSuite.scala
---
@@ -21,6 +21,7 @@ import org.apache.hadoop.fs.Path
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15125#discussion_r100632148
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/impl/PeriodicRDDCheckpointerSuite.scala
---
@@ -23,7 +23,7 @@ import org.apache.spark.{SparkContext
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15125#discussion_r100638130
--- Diff:
graphx/src/main/scala/org/apache/spark/graphx/util/PeriodicGraphCheckpointer.scala
---
@@ -76,7 +77,7 @@ import
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15125#discussion_r100638292
--- Diff: graphx/src/main/scala/org/apache/spark/graphx/Pregel.scala ---
@@ -123,16 +127,25 @@ object Pregel extends Logging {
s" bu
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15125#discussion_r100640256
--- Diff: graphx/src/main/scala/org/apache/spark/graphx/Pregel.scala ---
@@ -123,16 +127,25 @@ object Pregel extends Logging {
s" bu
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15125#discussion_r100641170
--- Diff:
graphx/src/main/scala/org/apache/spark/graphx/util/PeriodicGraphCheckpointer.scala
---
@@ -87,10 +88,7 @@ private[mllib] class
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/15125
@viirya @dding3 I'm going to rerun our big connected components computation
with the changes I've suggested to validate that it still performs and
completes as expected. Given the time r
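For readers unfamiliar with the workload being described, a GraphX connected components job looks roughly like the sketch below; the graph data is hypothetical. Connected components runs on top of Pregel, which is the loop the checkpointing change in this PR targets.
```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

object ConnectedComponentsSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("cc-sketch"))

    // Tiny hypothetical graph with two components: {1, 2, 3} and {4, 5}.
    val edges = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(4L, 5L, 1)))
    val graph = Graph.fromEdges(edges, defaultValue = 0)

    // connectedComponents is implemented via Pregel, which is where the periodic
    // checkpointing discussed in this PR would apply for long-running jobs.
    val components = graph.connectedComponents().vertices.collect()
    components.foreach { case (vertexId, componentId) => println(s"$vertexId -> $componentId") }

    sc.stop()
  }
}
```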
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/15125
@dding3 These latest changes look great. I'll run our big connected
components job today and report back.
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/15125
Our connected components computation completed successfully, with
performance as expected. I've created a PR against @dding3's PR branch to
incorporate a couple simple things. Then I t
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/16578
@viirya I've added a commit to address some of your feedback. I will have
another commit to address the others, but I'm not sure when I'll have it in.
Hopefully by the end of next
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/16942
Force pushing your branch shouldn't close the PR. You didn't close it
manually?
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/16942
Weird. I think I've seen that behavior once before. But I think the only
time I force push on a PR is to rebase. Maybe that's the only kind of force
push allowed for Github PRs.
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16499#discussion_r101592331
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
---
@@ -1018,7 +1025,9 @@ private[spark] class BlockManager(
try
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16499#discussion_r101602014
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
---
@@ -813,7 +813,14 @@ private[spark] class BlockManager
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16499#discussion_r101602604
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
---
@@ -813,7 +813,14 @@ private[spark] class BlockManager
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/15125
> I think @mallman is saying he would merge changes to @dding3 branch
Yes, or I could do them in a follow up PR. Or @dding3 could do them without
my PR. I'm not hung up on gettin
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/15125
@dding3 I submitted a PR against your `cp2_pregel` branch. If you merge
that PR into your branch, it will be reflected in this PR. This is my PR:
https://github.com/dding3/spark/pull/1.
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/15125
LGTM. @felixcheung are we good to merge?
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16499#discussion_r101675576
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
---
@@ -813,7 +813,14 @@ private[spark] class BlockManager
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16499#discussion_r101675669
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
---
@@ -1018,7 +1025,9 @@ private[spark] class BlockManager(
try
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16499#discussion_r101809099
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
---
@@ -813,7 +813,14 @@ private[spark] class BlockManager
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16499#discussion_r101809872
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
---
@@ -1018,7 +1025,9 @@ private[spark] class BlockManager(
try
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15125#discussion_r101818789
--- Diff: graphx/src/main/scala/org/apache/spark/graphx/GraphOps.scala ---
@@ -362,12 +362,14 @@ class GraphOps[VD: ClassTag, ED: ClassTag](graph:
Graph[VD
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15125#discussion_r101819321
--- Diff:
graphx/src/main/scala/org/apache/spark/graphx/util/PeriodicGraphCheckpointer.scala
---
@@ -87,10 +87,10 @@ private[mllib] class
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15125#discussion_r102053462
--- Diff: graphx/src/main/scala/org/apache/spark/graphx/Pregel.scala ---
@@ -154,7 +169,9 @@ object Pregel extends Logging {
// count the
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15125#discussion_r102056438
--- Diff: docs/graphx-programming-guide.md ---
@@ -708,7 +708,9 @@ messages remaining.
> messaging function. These constraints allow additio
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15125#discussion_r102057156
--- Diff: docs/graphx-programming-guide.md ---
@@ -708,7 +708,9 @@ messages remaining.
> messaging function. These constraints allow additio
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/15125
@dding3, thank you for your continued patience and dedication to this PR,
despite the continued change requests. We are getting closer to a merge.
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16499#discussion_r102271763
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
---
@@ -1018,7 +1025,9 @@ private[spark] class BlockManager(
try
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16499#discussion_r102272981
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
---
@@ -813,7 +813,14 @@ private[spark] class BlockManager
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15125#discussion_r102290972
--- Diff: docs/graphx-programming-guide.md ---
@@ -708,7 +708,9 @@ messages remaining.
> messaging function. These constraints allow additio
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/15125#discussion_r102292537
--- Diff: graphx/src/main/scala/org/apache/spark/graphx/Pregel.scala ---
@@ -122,27 +125,39 @@ object Pregel extends Logging {
require
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16499#discussion_r102293219
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
---
@@ -843,7 +852,15 @@ private[spark] class BlockManager
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16499#discussion_r102293681
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
---
@@ -317,6 +317,9 @@ private[spark] class BlockManager
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/15480
Hi @lw-lin. Just FYI we use this patch at VideoAmp and would love to see it
merged in. I notice this PR has gone a little cold. I'm sorry I can't offer
much concrete help, but I wante
GitHub user mallman opened a pull request:
https://github.com/apache/spark/pull/16499
[SPARK-17204][CORE] Fix replicated off heap storage
(Jira: https://issues.apache.org/jira/browse/SPARK-17204)
## What changes were proposed in this pull request?
There are a
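The combination the PR title refers to, off-heap caching with a replication factor greater than one, can be requested through the public StorageLevel API. The sketch below is only illustrative; it assumes off-heap memory is enabled via spark.memory.offHeap.enabled and spark.memory.offHeap.size, and replication has no visible effect on a single local executor.
```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object OffHeapReplicationSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[*]")
      .setAppName("off-heap-replication-sketch")
      .set("spark.memory.offHeap.enabled", "true")
      .set("spark.memory.offHeap.size", "256m")
    val sc = new SparkContext(conf)

    // Off-heap, serialized storage with 2x replication: the replicated
    // off-heap case this PR is concerned with.
    val offHeap2x = StorageLevel(useDisk = false, useMemory = true, useOffHeap = true,
      deserialized = false, replication = 2)

    val rdd = sc.range(0, 100000).persist(offHeap2x)
    println(rdd.count())

    sc.stop()
  }
}
```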
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16499#discussion_r95066296
--- Diff:
core/src/test/scala/org/apache/spark/storage/BlockManagerReplicationSuite.scala
---
@@ -387,12 +388,23 @@ class BlockManagerReplicationSuite
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16499#discussion_r95066452
--- Diff:
core/src/test/scala/org/apache/spark/storage/BlockManagerReplicationSuite.scala
---
@@ -375,7 +375,8 @@ class BlockManagerReplicationSuite
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16500#discussion_r95206030
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala
---
@@ -392,7 +392,9 @@ case class InsertIntoHiveTable
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/16514
> A good suggestion. Will do the code changes tomorrow. Thanks!
I look forward to seeing this. Thanks for taking this on.
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16514#discussion_r95489960
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
@@ -119,7 +119,30 @@ private[hive] class HiveMetastoreCatalog
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16514#discussion_r95490619
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala
---
@@ -555,6 +557,61 @@ class HiveDDLSuite
GitHub user mallman opened a pull request:
https://github.com/apache/spark/pull/16578
[SPARK-4502][SQL] Parquet nested column pruning
(Link to Jira: https://issues.apache.org/jira/browse/SPARK-4502)
## What changes were proposed in this pull request?
One of the
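To illustrate what nested column pruning targets, the sketch below builds a small Parquet table with a struct column and then selects a single leaf field. The table path and schema are hypothetical; whether the scan actually reads only the `name.first` column chunk depends on this feature being in place.
```scala
import org.apache.spark.sql.SparkSession

// Hypothetical nested schema: a struct column with several leaf fields.
case class Name(first: String, last: String)
case class Contact(id: Long, name: Name)

object NestedPruningSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("nested-pruning").getOrCreate()
    import spark.implicits._

    Seq(Contact(1L, Name("Jane", "Doe")), Contact(2L, Name("John", "Roe")))
      .toDF()
      .write.mode("overwrite").parquet("/tmp/contacts") // hypothetical path

    // The query touches only the name.first leaf. With nested column pruning the
    // scan can read just that Parquet column instead of the whole `name` struct;
    // without it, the entire struct is read and the projection happens afterwards.
    spark.read.parquet("/tmp/contacts").select("name.first").explain()

    spark.stop()
  }
}
```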
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/16578
cc @rxin @ericl @cloud-fan
@marmbrus I would love to get your feedback on this if you have the time.
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/16499
@rxin, can you recommend someone I reach out to for help reviewing this PR?
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/16499
Josh, can you take a look at this when you have a chance?
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/16578
> Does this take over #14957? If so, we might need Closes #14957 in the PR
description for the merge script to close that one or let the author know this
takes over that.
I don
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I've rebased this PR and refactored it somewhat. The main change is to move
the partition pruning logic from `FileSourceStrategy` into a Catalyst optimizer
rule called `PruneFileSourceParti
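For readers unfamiliar with Catalyst rules, the general shape of an optimizer rule is a function from LogicalPlan to LogicalPlan applied through the plan's transform machinery. The stub below only shows that shape; it is not the rule added in this PR, and the matching and rebuilding logic is elided.
```scala
import org.apache.spark.sql.catalyst.plans.logical.{Filter, LogicalPlan}
import org.apache.spark.sql.catalyst.rules.Rule

// Skeleton of a partition-pruning optimizer rule. A real rule would match filters
// over file-source relations, split out the partition-key predicates, and rebuild
// the relation restricted to the matching partitions.
object PartitionPruningRuleSketch extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan = plan transformDown {
    case filter @ Filter(_, _) =>
      // Pruning logic would go here; this stub leaves the plan unchanged.
      filter
  }
}
```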
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r82712965
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/InMemoryCatalog.scala
---
@@ -477,6 +478,15 @@ class InMemoryCatalog
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r82713068
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -225,13 +225,16 @@ case class FileSourceScanExec
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r82713318
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -225,13 +225,16 @@ case class FileSourceScanExec
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I would be wary of amending our data sources to support case-insensitive
field resolution. For one thing, strictly speaking it can lead to ambiguity in
schema resolution. In the – potential but
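A concrete illustration of that ambiguity, with hypothetical field names, pasteable into spark-shell: if a file's schema contains two fields that differ only by case, a case-insensitive lookup cannot choose between them.
```scala
import org.apache.spark.sql.types.{LongType, StructField, StructType}

// Hypothetical file schema with two fields that differ only by case.
val fileSchema = StructType(Seq(
  StructField("ID", LongType),
  StructField("id", LongType)
))

// A case-insensitive lookup of "id" matches both fields, so there is no single
// unambiguous resolution.
val matches = fileSchema.fields.filter(_.name.equalsIgnoreCase("id"))
assert(matches.length == 2)
```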
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
BTW, I'm working on a rebase to fix merge conflicts and address reviewers'
feedback.
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r82850487
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -225,13 +225,16 @@ case class FileSourceScanExec
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
Ah cripes. I committed something I didn't want to. I'm rebasing again in a
few...
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I believe that using a method like `TableFileCatalog.filterPartitions` to
build a new file catalog restricted to some pruned partitions is a sound
approach, however I'm starting to reconside
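The idea can be sketched without the Spark internals: a table-level catalog that knows every partition and can hand back a copy restricted to the partitions that survive pruning. Everything below (the Partition type, the predicate signature) is a hypothetical stand-in, not the actual TableFileCatalog API.
```scala
// Hypothetical stand-in for a partition: its partition-column values and its files.
final case class Partition(values: Map[String, String], files: Seq[String])

final class TableFileCatalogSketch(allPartitions: Seq[Partition]) {
  def listFiles(): Seq[String] = allPartitions.flatMap(_.files)

  // Return a new catalog restricted to the partitions matching the predicate.
  def filterPartitions(pred: Partition => Boolean): TableFileCatalogSketch =
    new TableFileCatalogSketch(allPartitions.filter(pred))
}

object TableFileCatalogSketchDemo {
  def main(args: Array[String]): Unit = {
    val catalog = new TableFileCatalogSketch(Seq(
      Partition(Map("date" -> "2016-10-01"), Seq("part-0001.parquet")),
      Partition(Map("date" -> "2016-10-02"), Seq("part-0002.parquet"))
    ))
    // Prune to date = 2016-10-01, then list only the surviving partition's files.
    val pruned = catalog.filterPartitions(_.values.get("date").contains("2016-10-01"))
    println(pruned.listFiles()) // List(part-0001.parquet)
  }
}
```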
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
>> Finally, this would require us to read the schema files. That's
something I'm trying to avoid in this patch.
> Not sure what you mean here, but the parquet change
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r82900810
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -225,13 +225,16 @@ case class FileSourceScanExec
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r82902172
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -225,13 +225,16 @@ case class FileSourceScanExec
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I updated the description of this PR to reflect the workaround for the
Hive/Parquet case-sensitivity issue.
Do we need a similar workaround for ORC?
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83036315
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PruneFileSourcePartitions.scala
---
@@ -0,0 +1,72 @@
+/*
+ * Licensed
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I'm testing this patch on a couple of tables internally with on the order
of 10k partitions. Performance is much slower than it should be. I'm
investigating.
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
> Btw I've noticed a significant performance difference between
ListingFileCatalog and TableFileCatalog's implementation of ListFiles. The
difference seems to be that Listi
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
>> Btw I've noticed a significant performance difference between
ListingFileCatalog and TableFileCatalog's implementation of ListFiles. The
difference seems to be that Li
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I determined the performance regression was introduced by a commit I hadn't
pushed to this PR. Sorry for the false alarm. Needless to say, I'm not
pushing that commit.
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83085625
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/TableFileCatalog.scala
---
@@ -0,0 +1,103 @@
+/*
+ * Licensed to the
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83085289
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -616,6 +617,44 @@ private[spark] class HiveExternalCatalog(conf
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83086945
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PruneFileSourcePartitions.scala
---
@@ -0,0 +1,72 @@
+/*
+ * Licensed
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83115827
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
@@ -199,59 +197,30 @@ private[hive] class
HiveMetastoreCatalog
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83131630
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
@@ -199,59 +197,30 @@ private[hive] class
HiveMetastoreCatalog
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83141382
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -616,6 +617,44 @@ private[spark] class HiveExternalCatalog(conf
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I've pushed an update to `ParquetMetastoreSuite` that illustrates the bug
(or "limitation") WRT support for mixed-case partition columns I discovered
yesterday. To reiterate
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
> This patch fails MiMa tests.
I've never seen this before. What does this mean?
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
> Btw, I noticed that this suite was failing in jenkins only.
>
> [info] - partitioned pruned table reports only selected files *** FAILED
*** (610 milliseconds)
>
>
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83318096
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -616,6 +617,44 @@ private[spark] class HiveExternalCatalog(conf
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83325088
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PruneFileSourcePartitions.scala
---
@@ -0,0 +1,72 @@
+/*
+ * Licensed
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83325529
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
---
@@ -225,13 +225,19 @@ case class FileSourceScanExec
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
> Oops there is a conflict now.
NP. I'm working on the rebase.
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83349072
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/parquetSuites.scala ---
@@ -34,7 +34,7 @@ import org.apache.spark.util.Utils
// The data
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83352979
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -626,6 +627,40 @@ private[spark] class HiveExternalCatalog(conf
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83355030
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/parquetSuites.scala ---
@@ -34,7 +34,7 @@ import org.apache.spark.util.Utils
// The data
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I will work on a rebase. Meanwhile, I've revisited the open issues in the
PR description. To summarize:
1. Do we need a workaround for ORC like we made for Parquet?
1. What's
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I'm still working on the rebase. It's very complex – there are two other
commits involved.
>> 1. Do we need a workaround for ORC like we made for Parquet?
> 1)
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
I just pushed the rebase. It was really hairy, but I tried hard to ensure I
got essentially all three branches' changes in.
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
> btw, what's the parquet log redirection issue? I don't see anything
unusual in spark shell.
Whenever I run a query on a Hive parquet table I get
```
spark-sq
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
> Hm, I haven't seen that with my test queries. Would adding your
workaround to SparkILoopInit work?
It does not, unfortunately.
Github user mallman commented on the issue:
https://github.com/apache/spark/pull/14690
>> Hm, I haven't seen that with my test queries. Would adding your
workaround to SparkILoopInit work?
> It does not, unfortunately.
I believe this impacts people with
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83517834
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ListingFileCatalog.scala
---
@@ -17,32 +17,26 @@
package
Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/14690#discussion_r83518291
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -626,6 +627,40 @@ private[spark] class HiveExternalCatalog(conf