Github user RussellSpitzer commented on a diff in the pull request:
https://github.com/apache/spark/pull/22576#discussion_r226622273
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/SparkSessionExtensions.scala ---
@@ -168,4 +173,21 @@ class SparkSessionExtensions
Github user RussellSpitzer commented on a diff in the pull request:
https://github.com/apache/spark/pull/21990#discussion_r225992993
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
---
@@ -1136,4 +1121,27 @@ object SparkSession extends Logging
Github user RussellSpitzer commented on a diff in the pull request:
https://github.com/apache/spark/pull/21990#discussion_r225805019
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
---
@@ -1136,4 +1121,27 @@ object SparkSession extends Logging
Github user RussellSpitzer commented on the issue:
https://github.com/apache/spark/pull/21990
Addressed comments from @HyukjinKwon. I'm interested in @ueshin's
suggestions, but I can't figure out how to do that unless we bake it into the
Extensions constructor. If we place
Github user RussellSpitzer commented on a diff in the pull request:
https://github.com/apache/spark/pull/21990#discussion_r225613517
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
---
@@ -1136,4 +1121,27 @@ object SparkSession extends Logging
Github user RussellSpitzer commented on a diff in the pull request:
https://github.com/apache/spark/pull/21990#discussion_r225612270
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
---
@@ -1136,4 +1121,27 @@ object SparkSession extends Logging
Github user RussellSpitzer commented on a diff in the pull request:
https://github.com/apache/spark/pull/21990#discussion_r225610975
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
---
@@ -1136,4 +1121,27 @@ object SparkSession extends Logging
Github user RussellSpitzer commented on a diff in the pull request:
https://github.com/apache/spark/pull/21990#discussion_r225608605
--- Diff: python/pyspark/sql/session.py ---
@@ -219,6 +219,7 @@ def __init__(self, sparkContext, jsparkSession=None
Github user RussellSpitzer commented on the issue:
https://github.com/apache/spark/pull/22576
Cleaned up
Github user RussellSpitzer commented on the issue:
https://github.com/apache/spark/pull/21990
I'm fine with anything, really. I still think the ideal solution is probably
not to tie the creation of the py4j gateway to the SparkContext, but that's a
much bigger refactor
Github user RussellSpitzer commented on the issue:
https://github.com/apache/spark/pull/22576
Ah, I was registering functions with the built-in registry, which is not
reset. I've changed it to register only with a clone of the built-in registry.
This would allow multiple extensions
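(For context, a minimal sketch of what "cloning the built-in registry" presumably looks like, using Catalyst's internal FunctionRegistry API; this is an illustration, not the exact patch.)
```scala
import org.apache.spark.sql.catalyst.analysis.FunctionRegistry

// FunctionRegistry.builtin is global shared state: functions registered
// directly into it are never reset, so they leak across test suites.
// Cloning it first keeps any injected functions session-local.
val sessionLocalRegistry = FunctionRegistry.builtin.clone()
```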
Github user RussellSpitzer commented on the issue:
https://github.com/apache/spark/pull/22576
Looks like the session with extensions from the test suite is leaking to
other suites ... Investigating
Github user RussellSpitzer commented on the issue:
https://github.com/apache/spark/pull/22576
@hvanhovell Made a full PR for the change we discussed. Also updated the
signature to match the newly defined types for the registry and Identifier
GitHub user RussellSpitzer opened a pull request:
https://github.com/apache/spark/pull/22576
[SPARK-25560][SQL] Allow FunctionInjection in SparkExtensions
This allows an implementer of Spark Session Extensions to use an
"injectFunction" method which will add a ne
Github user RussellSpitzer commented on the issue:
https://github.com/apache/spark/pull/21990
Added a new method of injecting extensions; this way the "getOrCreate" code
from the Scala method is not needed. @H
Github user RussellSpitzer commented on the issue:
https://github.com/apache/spark/pull/21990
@HyukjinKwon so you want me to rewrite the code in Python? I will note
SparkR is doing this exact same thing
Github user RussellSpitzer commented on the issue:
https://github.com/apache/spark/pull/21990
What I wanted was to just call the Scala methods, instead of having half
the code in Scala and half in Python, but we create the JVM in the SparkContext
creation code so this ends up not being a good
Github user RussellSpitzer commented on the issue:
https://github.com/apache/spark/pull/21990
@HyukjinKwon So I've been staring at this for a while today, and I guess
the big issue is that we always need to make a Python SparkContext to get a
handle on the JavaGateway, so everything
Github user RussellSpitzer commented on a diff in the pull request:
https://github.com/apache/spark/pull/21990#discussion_r210941883
--- Diff: python/pyspark/sql/tests.py ---
@@ -3563,6 +3563,51 @@ def
test_query_execution_listener_on_collect_with_arrow(self
Github user RussellSpitzer commented on a diff in the pull request:
https://github.com/apache/spark/pull/21990#discussion_r210909196
--- Diff: python/pyspark/sql/tests.py ---
@@ -3563,6 +3563,51 @@ def
test_query_execution_listener_on_collect_with_arrow(self
Github user RussellSpitzer commented on a diff in the pull request:
https://github.com/apache/spark/pull/21990#discussion_r210909083
--- Diff: python/pyspark/sql/tests.py ---
@@ -3563,6 +3563,51 @@ def
test_query_execution_listener_on_collect_with_arrow(self
Github user RussellSpitzer commented on a diff in the pull request:
https://github.com/apache/spark/pull/21990#discussion_r210428804
--- Diff: python/pyspark/sql/session.py ---
@@ -218,7 +218,9 @@ def __init__(self, sparkContext, jsparkSession=None
Github user RussellSpitzer closed the pull request at:
https://github.com/apache/spark/pull/21988
Github user RussellSpitzer closed the pull request at:
https://github.com/apache/spark/pull/21989
Github user RussellSpitzer commented on the issue:
https://github.com/apache/spark/pull/21988
@felixcheung I just didn't know what version to target, so I made a PR for
each one. We can just close the ones that shouldn't be merged
Github user RussellSpitzer commented on the issue:
https://github.com/apache/spark/pull/21989
@kiszk sure, it all depends on which branch the merge target should be; I
wasn't sure which one was being used for changes of this nature. Technically
it's a bug fix, I believe
Github user RussellSpitzer commented on the issue:
https://github.com/apache/spark/pull/21988
Local PEP checks didn't seem to mind this code ... Fixed up the indentation,
so hopefully Jenkins will like it now
GitHub user RussellSpitzer opened a pull request:
https://github.com/apache/spark/pull/21990
[SPARK-25003][PYSPARK] Use SessionExtensions in Pyspark
(Master)
## What changes were proposed in this pull request?
Previously PySpark used the private constructor
GitHub user RussellSpitzer opened a pull request:
https://github.com/apache/spark/pull/21989
[SPARK-25003][PYSPARK] Use SessionExtensions in Pyspark
(Branch-2.3)
## What changes were proposed in this pull request?
Previously PySpark used the private constructor
GitHub user RussellSpitzer opened a pull request:
https://github.com/apache/spark/pull/21988
[SPARK-25003][PYSPARK] Use SessionExtensions in Pyspark
## What changes were proposed in this pull request?
Previously PySpark used the private constructor for SparkSession when
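(For reference, the Scala-side mechanism these PRs make PySpark honor is the spark.sql.extensions configuration, which getOrCreate applies while building the session; MyExtensions below is a hypothetical class name.)
```scala
import org.apache.spark.sql.{SparkSession, SparkSessionExtensions}

// A class named in spark.sql.extensions must have a zero-arg constructor
// and be a Function1[SparkSessionExtensions, Unit]; getOrCreate
// instantiates it and applies it to the session being built.
class MyExtensions extends (SparkSessionExtensions => Unit) {
  override def apply(extensions: SparkSessionExtensions): Unit = {
    // e.g. extensions.injectResolutionRule(...) or extensions.injectParser(...)
  }
}

val spark = SparkSession.builder()
  .master("local[1]")
  .config("spark.sql.extensions", classOf[MyExtensions].getName)
  .getOrCreate()
```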
GitHub user RussellSpitzer opened a pull request:
https://github.com/apache/spark/pull/21453
Test branch to see how Scala 2.11.12 performs
This may be useful when Java 8 is no longer supported, since
Scala 2.11.12 supports later versions of Java
## What changes were
Github user RussellSpitzer closed the pull request at:
https://github.com/apache/spark/pull/20190
Github user RussellSpitzer commented on the issue:
https://github.com/apache/spark/pull/20190
@jerryshao https://github.com/apache/spark/pull/20298
GitHub user RussellSpitzer opened a pull request:
https://github.com/apache/spark/pull/20298
[SPARK-22976][Core]: Cluster mode driver dir removed while running
## What changes were proposed in this pull request?
The clean up logic on the worker previously determined
Github user RussellSpitzer commented on the issue:
https://github.com/apache/spark/pull/20201
This looks very exciting to me
Github user RussellSpitzer commented on the issue:
https://github.com/apache/spark/pull/20190
@zsxwing I think you were the last to touch this code; could you please
review?
GitHub user RussellSpitzer opened a pull request:
https://github.com/apache/spark/pull/20190
[SPARK-22976][Core]: Cluster mode driver directories can be removed while running
## What changes were proposed in this pull request?
The clean up logic
Github user RussellSpitzer commented on a diff in the pull request:
https://github.com/apache/spark/pull/19136#discussion_r137531270
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/ReadTask.java ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache
Github user RussellSpitzer commented on a diff in the pull request:
https://github.com/apache/spark/pull/19136#discussion_r137333974
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/DataSourceV2Reader.java
---
@@ -0,0 +1,126 @@
+/*
+ * Licensed
Github user RussellSpitzer closed the pull request at:
https://github.com/apache/spark/pull/10655
Github user RussellSpitzer commented on the issue:
https://github.com/apache/spark/pull/10655
We fixed this in a different PR: https://github.com/apache/spark/pull/11317
Github user RussellSpitzer commented on a diff in the pull request:
https://github.com/apache/spark/pull/11796#discussion_r73781398
--- Diff: assembly/pom.xml ---
@@ -69,6 +68,17 @@
spark-repl_${scala.binary.version}
${project.version
Github user RussellSpitzer commented on the issue:
https://github.com/apache/spark/pull/11317
Updated
Github user RussellSpitzer commented on the issue:
https://github.com/apache/spark/pull/13652
I would love to be able to just specify days since epoch rather than using
java.sql.Date
Github user RussellSpitzer commented on the issue:
https://github.com/apache/spark/pull/13652
I think this is unfortunately the right thing to do. I wish we didn't have
to use java.sql.Date :(
Github user RussellSpitzer commented on the pull request:
https://github.com/apache/spark/pull/11317#issuecomment-218052162
I don't think this is because of me:
```
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error
Github user RussellSpitzer commented on the pull request:
https://github.com/apache/spark/pull/11317#issuecomment-218030539
@HyukjinKwon + @yhuai Sorry it took so long! Things have been busy :)
Github user RussellSpitzer commented on the pull request:
https://github.com/apache/spark/pull/10655#issuecomment-217483240
Sorry I forgot about this, I'll clean this up tomorrow and get it ready
GitHub user RussellSpitzer opened a pull request:
https://github.com/apache/spark/pull/11317
[SPARK-12639] [SQL] Mark Filters Fully Handled By Sources with *
## What changes were proposed in this pull request?
In order to make it clear which filters are fully handled
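(A minimal sketch of the source-side contract involved: BaseRelation.unhandledFilters, the SPARK-11661 hook, reports which pushed filters the source could not fully handle; anything it does not return can be treated as fully handled and starred. OnesRelation is hypothetical.)
```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.sources.{BaseRelation, Filter, IsNotNull, PrunedFilteredScan}
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

class OnesRelation(override val sqlContext: SQLContext)
    extends BaseRelation with PrunedFilteredScan {

  override def schema: StructType = StructType(StructField("a", IntegerType) :: Nil)

  // Claim IsNotNull is fully handled by the source; every other filter
  // must still be re-evaluated by Spark after the scan.
  override def unhandledFilters(filters: Array[Filter]): Array[Filter] =
    filters.filterNot(_.isInstanceOf[IsNotNull])

  override def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row] =
    sqlContext.sparkContext.parallelize(Seq(Row(1)))
}
```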
Github user RussellSpitzer closed the pull request at:
https://github.com/apache/spark/pull/10929
GitHub user RussellSpitzer opened a pull request:
https://github.com/apache/spark/pull/10929
SPARK-12639 SQL Mark Filters Fully Handled By Sources with *
In order to make it clear which filters are fully handled by the
underlying datasource, we will mark them
Github user RussellSpitzer commented on the pull request:
https://github.com/apache/spark/pull/10932#issuecomment-175294425
I'm +1 on this in 2.0 :)
Github user RussellSpitzer commented on the pull request:
https://github.com/apache/spark/pull/10655#issuecomment-174113049
Haven't forgotten; this will have a new PR soon :)
Github user RussellSpitzer commented on the pull request:
https://github.com/apache/spark/pull/10655#issuecomment-172145152
I personally think the ambiguous `PUSHED_FILTERS` is more confusing. When
we see a predicate there we have no idea whether or not it is a valid filter
Github user RussellSpitzer commented on the pull request:
https://github.com/apache/spark/pull/10655#issuecomment-172136871
@yhuai I removed the PushedFilters and added the other examples. We could
re-add the "PushedFilters" if you like. I wasn't sure if you still wanted
Github user RussellSpitzer commented on a diff in the pull request:
https://github.com/apache/spark/pull/10655#discussion_r49342098
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala ---
@@ -114,6 +114,7 @@ private[sql] object PhysicalRDD
Github user RussellSpitzer commented on the pull request:
https://github.com/apache/spark/pull/10655#issuecomment-170064440
@rxin Added. Basically I think the current "PushedFilters" list isn't very
valuable if everything is listed there. So instead we should just list thos
Github user RussellSpitzer commented on a diff in the pull request:
https://github.com/apache/spark/pull/10655#discussion_r49153042
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala ---
@@ -114,6 +114,7 @@ private[sql] object PhysicalRDD
Github user RussellSpitzer commented on a diff in the pull request:
https://github.com/apache/spark/pull/10655#discussion_r49152876
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
---
@@ -321,8 +321,8 @@ private[sql] object
GitHub user RussellSpitzer opened a pull request:
https://github.com/apache/spark/pull/10655
SPARK-12639 SQL Improve Explain for Datasources with Handled Predicates
SPARK-11661 Makes all predicates pushed down to underlying Datasources
regardless of whether the source can handle
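(Usage-wise, the effect shows up in the physical plan: assuming the OnesRelation sketch above were exposed through a hypothetical relation provider, a fully handled filter would appear starred in the PushedFilters list.)
```scala
// "org.example.ones" is a hypothetical RelationProvider wrapping the
// OnesRelation sketch above.
val df = spark.read.format("org.example.ones").load()

// IsNotNull is claimed as fully handled, so the plan is expected to list
// it starred, e.g. PushedFilters: [*IsNotNull(a)]
df.filter("a IS NOT NULL").explain()
```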
Github user RussellSpitzer commented on the pull request:
https://github.com/apache/spark/pull/9369#issuecomment-157886376
np
Github user RussellSpitzer closed the pull request at:
https://github.com/apache/spark/pull/9369