Github user rxin commented on the issue:
https://github.com/apache/spark/pull/19394
What's the other value?
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/19394
Not sure - maybe print the chi-value of the test and see if they make
sense. If they do, we can change the threshold
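The "chi-value" suggestion above refers to a chi-squared goodness-of-fit check on observed partition sizes. As a hedged illustration (plain Python, not the actual test code; the function name and sample numbers are hypothetical), the statistic being printed and compared against the threshold would look like:

```python
def chi_square_statistic(observed, expected):
    """Pearson's chi-squared statistic: sum((obs - exp)^2 / exp).

    A larger value means the observed partition sizes deviate more
    from the expected (uniform) distribution.
    """
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical example: 4 partitions, 100 rows expected in each.
observed = [95, 105, 98, 102]
expected = [100, 100, 100, 100]
stat = chi_square_statistic(observed, expected)
# Printing `stat` for a few runs shows whether the values "make sense"
# relative to the threshold chosen in the test.
```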
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18732
What's the difference between this one and the transform function you also
proposed? I'm trying to see if all the naming makes sense when considered
together
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/19393
LGTM but I wrote most of the code so perhaps we should find somebody else
to review.
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18732
Is this just a mapGroups function?
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19387#discussion_r141786663
--- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
@@ -108,11 +108,21 @@ class HashPartitioner(partitions: Int) extends Partitioner
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/19387
Merging in master.
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/19387
I put up a comment saying this test result should be deterministic, since
the sampling uses a fixed seed based on partition id
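The determinism claim above can be sketched in plain Python (this is a toy model, not Spark's RangePartitioner code; the function and parameter names are hypothetical): deriving the RNG seed from the partition id makes each partition draw the same sample on every run.

```python
import random

def sample_partition(rows, partition_id, fraction, base_seed=42):
    # Seeding from the partition id (plus a fixed base seed) makes the
    # sample reproducible: the same partition always draws the same rows.
    rng = random.Random(base_seed + partition_id)
    return [row for row in rows if rng.random() < fraction]

# Two runs over the same partition produce identical samples.
first = sample_partition(list(range(1000)), partition_id=3, fraction=0.1)
second = sample_partition(list(range(1000)), partition_id=3, fraction=0.1)
assert first == second
```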
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19387#discussion_r141764431
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala ---
@@ -0,0 +1,64 @@
+/*
+ * Licensed to the Apache Software
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19387#discussion_r141764415
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/ConfigBehaviorSuite.scala ---
@@ -0,0 +1,64 @@
+/*
+ * Licensed to the Apache Software
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19387#discussion_r141755874
--- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
@@ -108,9 +108,17 @@ class HashPartitioner(partitions: Int) extends Partitioner
GitHub user rxin opened a pull request:
https://github.com/apache/spark/pull/19387
[SPARK-22160][SQL] Allow changing sample points per partition in range
shuffle exchange
## What changes were proposed in this pull request?
Spark's RangePartitioner hard codes the number
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/19384
I reverted the 2nd commit. Should be good for merge now.
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/19384
hm the 2nd commit is not meant for this one.
GitHub user rxin opened a pull request:
https://github.com/apache/spark/pull/19384
[SPARK-22159][SQL] Make config names consistently end with "enabled".
## What changes were proposed in this pull request?
spark.sql.execution.ar
GitHub user rxin opened a pull request:
https://github.com/apache/spark/pull/19376
[SPARK-22153][SQL] Rename ShuffleExchange -> ShuffleExchangeExec
## What changes were proposed in this pull request?
For some reason when we added the Exec suffix to all physical operators,
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19362#discussion_r141403817
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -136,6 +134,8 @@ abstract class Optimizer(sessionCatalog
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r140379971
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataSourceV2Writer.java ---
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r139889741
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2Command.scala ---
@@ -0,0 +1,114
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r139889045
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/SupportsWriteUnsafeRow.java ---
@@ -0,0 +1,44 @@
+/*
+ * Licensed
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r13951
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataWriter.java ---
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18704
cc @michal-databricks any thoughts on this?
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/19261
What does this even mean?
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/19136
LGTM.
Still some feedback that can be addressed later. We should also document
all the APIs as Evolving
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19136#discussion_r138947707
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceRDD.scala ---
@@ -0,0 +1,71 @@
+/*
+ * Licensed
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19136#discussion_r138947426
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/Statistics.java ---
@@ -0,0 +1,29 @@
+/*
+ * Licensed to the Apache Software
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19136#discussion_r138947297
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/ReadSupportWithSchema.java ---
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19136#discussion_r138946124
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/ReadSupport.java ---
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19136#discussion_r138945691
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/DataSourceV2.java ---
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19136#discussion_r138709319
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/Statistics.java ---
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19136#discussion_r138665881
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/DataReader.java ---
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19136#discussion_r138624261
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/upward/Statistics.java ---
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19136#discussion_r138623586
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/downward/ColumnPruningSupport.java ---
@@ -0,0 +1,36 @@
+/*
+ * Licensed
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19136#discussion_r138622262
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/DataReader.java ---
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19136#discussion_r138622067
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/SchemaRequiredDataSourceV2.java ---
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19136#discussion_r138621970
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/SchemaRequiredDataSourceV2.java ---
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19136#discussion_r138621700
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/DataSourceV2Options.java ---
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19136#discussion_r138621506
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/DataSourceV2Options.java ---
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/16578
I tried this and this is definitely super useful! It's a big patch, and most of the people working in this area are either doing something else that's not Spark or working on a few high-priority SPIPs.
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19086#discussion_r136640563
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
@@ -495,15 +495,16 @@ private[hive] class HiveClientImpl
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19064#discussion_r135590939
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala ---
@@ -23,6 +23,12 @@ import
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19064#discussion_r135590830
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala ---
@@ -23,6 +23,12 @@ import
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18906
I understand why you are using Python. What I don't understand is why you'd
need to annotate nullability, because those are typically annotated for the
purpose of performance improvement, but Python
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/18999#discussion_r135132622
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -659,19 +659,77 @@ def distinct(self):
return DataFrame(self._jdf.distinct(), self.sql_ctx
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18906
@ptkool have you seen a real use case so far that you need this? I'm a bit
surprised since Python UDFs are already pretty slow, and you'd care about this.
Are there other cases you run
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/19008
Can you also update the various json functions in which we document the
options? The way it is right now there is no way for end-users to discover this
option.
---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well.
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/18999#discussion_r134123916
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -659,19 +659,77 @@ def distinct(self):
return DataFrame(self._jdf.distinct(), self.sql_ctx
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/18999#discussion_r134123358
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -659,19 +659,77 @@ def distinct(self):
return DataFrame(self._jdf.distinct(), self.sql_ctx
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18996
Merging in master. Thanks.
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18988
Thanks. Do you want to add the Python and R ones?
It is a little bit tricky because in Python we would need to detect whether
withReplacement is a boolean or a floating point value
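The trickiness mentioned above comes from Python's data model: `bool` is a subclass of `int`, so `isinstance(True, int)` is `True`, and the bool check must run first. A hedged sketch of the dispatch (hypothetical helper, not PySpark's actual code):

```python
def parse_sample_args(arg):
    """Interpret the first argument of a hypothetical sample(...) overload.

    Returns (with_replacement, fraction). Note that bool is a subclass
    of int in Python, so bool must be checked before the numeric case.
    """
    if isinstance(arg, bool):
        return arg, None          # caller passed withReplacement
    if isinstance(arg, (int, float)):
        return False, float(arg)  # caller passed just the fraction
    raise TypeError("expected bool or numeric fraction, got %r" % (arg,))

assert parse_sample_args(True) == (True, None)
assert parse_sample_args(0.1) == (False, 0.1)
```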
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/18979#discussion_r133887920
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/BasicWriteStatsTracker.scala ---
@@ -57,7 +60,14 @@ class
GitHub user rxin opened a pull request:
https://github.com/apache/spark/pull/18988
[SPARK-21778][SQL] Simpler Dataset.sample API in Scala / Java
## What changes were proposed in this pull request?
Dataset.sample requires a boolean flag withReplacement as the first
argument
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18640
lgtm
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/18956#discussion_r133360047
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -37,6 +37,12 @@ import org.apache.spark.sql.types
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/18640#discussion_r133131618
--- Diff: sql/core/pom.xml ---
@@ -87,6 +87,16 @@
+ org.apache.orc
+ orc-core
+ ${orc.classifier
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18923
Jenkins, test this please.
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18912
Merging in master.
GitHub user rxin opened a pull request:
https://github.com/apache/spark/pull/18912
[SQL] Remove unused getTableOption in ExternalCatalog
## What changes were proposed in this pull request?
This patch removes the unused SessionCatalog.getTableMetadataOption and
ExternalCatalog
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18900
We should put this in the catalog, shouldn't we?
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18884
Merging in master.
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18884
this looks good to me, but I didn't review super carefully.
cc @cloud-fan
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18752
Merging in master.
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18886
thx lgtm
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18786
I suspect it is ok for R ...
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/18884#discussion_r132022172
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSink.scala ---
@@ -128,6 +128,7 @@ class FileStreamSink
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/18884#discussion_r132006851
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/WriteStatsTracker.scala ---
@@ -0,0 +1,121 @@
+/*
+ * Licensed
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/18884#discussion_r132006674
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/BasicWriteStatsTracker.scala ---
@@ -0,0 +1,133 @@
+/*
+ * Licensed
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18607
We can just put this code in a 3rd party library, can't we? If there is an
issue with service/code discovery, we can come up with some sort of
registration process similar to the data source API
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18884
Jenkins, add to white list.
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18607
Unfortunately I think drill is not popular enough to warrant inclusion in
here yet. If this is not extensible, we should make it possible to include such
mappings outside Spark and then perhaps Drill
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18851
Looks like the strip global limit is used by at least some test cases.
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18844
Ok makes sense. LGTM.
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18851
cc @JoshRosen
GitHub user rxin opened a pull request:
https://github.com/apache/spark/pull/18851
[SPARK-21644][SQL] LocalLimit.maxRows is defined incorrectly
## What changes were proposed in this pull request?
The definition of `maxRows` in `LocalLimit` operator was simply wrong. This
patch
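The per-partition semantics behind the bug can be illustrated with a toy model (plain Python, not the Catalyst operator): a local limit truncates each partition independently, so the global row count can reach limit × numPartitions rather than the limit itself, which is why a global `maxRows` bound cannot simply be set to the limit value.

```python
def local_limit(partitions, limit):
    # A "local" limit truncates each partition independently.
    return [p[:limit] for p in partitions]

partitions = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
limited = local_limit(partitions, 2)
total = sum(len(p) for p in limited)
# total is 6 here: bounded by limit * num_partitions (2 * 3),
# not by the limit (2) alone.
```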
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18844
Actually why do we need this? Can't you just add Error?
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18640
I just checked the dependency size. They look pretty reasonable, roughly 2
MBs in total (although I do worry in the future whether ORC would bring in a
lot more jars).
cc @omalley any
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18640
Why don't we then create a separate orc module? Just copy a few of the
files over?
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18640
To the best of my knowledge almost everybody runs with Hive anyway and the
vast majority of users that run ORC are Hive users. In hindsight we probably
should have put most of the data source
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18749
@srowen that's not what I said. Almost always an explicit LGTM is
preferred. There are tiny changes that might not require them, and it is up to
the judgement of the committer. But those are more
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18640
Why are we adding this to core? Why not just the hive module?
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18844
Jenkins, test this please.
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18839
cc @gatorsmile
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18839
Some test on string form of the plan might fail. Let's see ...
GitHub user rxin opened a pull request:
https://github.com/apache/spark/pull/18839
[SPARK-21634][SQL] Change OneRowRelation from a case object to case class
## What changes were proposed in this pull request?
OneRowRelation is the only plan that is a case object, which causes
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18749
@HyukjinKwon you weren't a committer before :)
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18749
@srowen search for "RTC vs CTR (was: Concerning Sentry...)"
From Todd Lipcon:
```
I don't have incubator stats... nor do I have a good way to measure "most
```
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18749
Actually, Sean, I disagree. Spark has always been review-then-commit, from the days before it entered the ASF. In a huge debate last year within the ASF on RTC vs CTR, Spark was cited as a prominent example
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18749
Ah OK. That's what we are discussing here. In the past it has always been
an explicit "LGTM". That was defined before github had even the approval
feature. Now most committers are actual
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18828
Still looking into it, but the failure is related to reuse exchange and
caching.
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18749
What's your point? You should be able to merge PR without anybody reviewing?
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18749
Yes.
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18749
It is documented: http://spark.apache.org/contributing.html
It's been the convention forever and it's also good to use one way rather
than multiple, so I'd prefer us just using that ... until
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18749
I think @srowen did it here using the new github approval: srowen approved
these changes 20 hours ago
@srowen might be better if we stick with the LGTM one.
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18749
seems fine to me.
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18828
cc @adrian-ionescu @gatorsmile
GitHub user rxin opened a pull request:
https://github.com/apache/spark/pull/18828
[SPARK-21619][SQL] Fail the execution of canonicalized plans explicitly
## What changes were proposed in this pull request?
Canonicalized plans are not supposed to be executed. I ran into a case
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18805
@sitalkedia any way you can talk to the FB team that owns that one and relicense it, similar to RocksDB?
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18805
Our compression codec is actually completely decoupled from Hadoop's, but dependency management (and licensing) can be annoying to deal with.
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18805
How big is the dependency that's getting pulled in? If we are adding more
compression codecs maybe we should retire some old ones, or move them into a
separate package so downstream apps can
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/18805
Any benchmark data?