[jira] (SPARK-40341) Implement `Rolling.median`.
[ https://issues.apache.org/jira/browse/SPARK-40341 ]

Artsiom Yudovin deleted comment on SPARK-40341:
-----------------------------------------------

    was (Author: ayudovin):
I'm working on this

> Implement `Rolling.median`.
> ---------------------------
>
>                 Key: SPARK-40341
>                 URL: https://issues.apache.org/jira/browse/SPARK-40341
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Pandas API on Spark
>    Affects Versions: 3.4.0
>            Reporter: Haejoon Lee
>            Assignee: Yikun Jiang
>            Priority: Major
>
> We should implement `Rolling.median` for increasing pandas API coverage.
> pandas docs:
> https://pandas.pydata.org/docs/reference/api/pandas.core.window.rolling.Rolling.median.html

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-40341) Implement `Rolling.median`.
[ https://issues.apache.org/jira/browse/SPARK-40341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17607908#comment-17607908 ]

Artsiom Yudovin commented on SPARK-40341:
-----------------------------------------

I'm working on this

> Implement `Rolling.median`.
> ---------------------------
>
>                 Key: SPARK-40341
>                 URL: https://issues.apache.org/jira/browse/SPARK-40341
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Pandas API on Spark
>    Affects Versions: 3.4.0
>            Reporter: Haejoon Lee
>            Assignee: Yikun Jiang
>            Priority: Major
>
> We should implement `Rolling.median` for increasing pandas API coverage.
> pandas docs:
> https://pandas.pydata.org/docs/reference/api/pandas.core.window.rolling.Rolling.median.html
[jira] [Commented] (SPARK-40334) Implement `GroupBy.prod`.
[ https://issues.apache.org/jira/browse/SPARK-40334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17604841#comment-17604841 ]

Artsiom Yudovin commented on SPARK-40334:
-----------------------------------------

Got you, thank you so much!

> Implement `GroupBy.prod`.
> -------------------------
>
>                 Key: SPARK-40334
>                 URL: https://issues.apache.org/jira/browse/SPARK-40334
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Pandas API on Spark
>    Affects Versions: 3.4.0
>            Reporter: Haejoon Lee
>            Assignee: Haejoon Lee
>            Priority: Major
>
> We should implement `GroupBy.prod` for increasing pandas API coverage.
> pandas docs:
> https://pandas.pydata.org/docs/reference/api/pandas.core.groupby.GroupBy.prod.html
[jira] [Commented] (SPARK-40334) Implement `GroupBy.prod`.
[ https://issues.apache.org/jira/browse/SPARK-40334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603463#comment-17603463 ]

Artsiom Yudovin commented on SPARK-40334:
-----------------------------------------

[~itholic], Hi, I started working on this ticket 2 days ago. Does it make sense to continue, or should I choose another ticket?

> Implement `GroupBy.prod`.
> -------------------------
>
>                 Key: SPARK-40334
>                 URL: https://issues.apache.org/jira/browse/SPARK-40334
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Pandas API on Spark
>    Affects Versions: 3.4.0
>            Reporter: Haejoon Lee
>            Assignee: Haejoon Lee
>            Priority: Major
>
> We should implement `GroupBy.prod` for increasing pandas API coverage.
> pandas docs:
> https://pandas.pydata.org/docs/reference/api/pandas.core.groupby.GroupBy.prod.html
[jira] [Commented] (SPARK-30122) Allow setting serviceAccountName for executor pods
[ https://issues.apache.org/jira/browse/SPARK-30122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031084#comment-17031084 ]

Artsiom Yudovin commented on SPARK-30122:
-----------------------------------------

(y)

> Allow setting serviceAccountName for executor pods
> --------------------------------------------------
>
>                 Key: SPARK-30122
>                 URL: https://issues.apache.org/jira/browse/SPARK-30122
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 3.1.0
>            Reporter: Juho Mäkinen
>            Priority: Major
>             Fix For: 3.1.0
>
> Currently it doesn't seem to be possible to have the Spark driver set the
> serviceAccountName for the executor pods it launches.
> There is a "spark.kubernetes.authenticate.driver.serviceAccountName"
> property, so one would naturally expect a similar
> "spark.kubernetes.authenticate.executor.serviceAccountName" property, but
> no such property exists.
[jira] [Updated] (SPARK-29147) Spark doesn't use shuffleHashJoin as expected
[ https://issues.apache.org/jira/browse/SPARK-29147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Artsiom Yudovin updated SPARK-29147:
------------------------------------
    Summary: Spark doesn't use shuffleHashJoin as expected  (was: Spark sortMergeJoin is not changing to shuffleHashJoin)

> Spark doesn't use shuffleHashJoin as expected
> ---------------------------------------------
>
>                 Key: SPARK-29147
>                 URL: https://issues.apache.org/jira/browse/SPARK-29147
>             Project: Spark
>          Issue Type: Question
>          Components: Spark Core, SQL
>    Affects Versions: 2.4.3, 2.4.4
>            Reporter: Artsiom Yudovin
>            Priority: Critical
>
> I run the following code:
> {code:java}
> import org.apache.spark.sql.SparkSession
>
> val spark = SparkSession.builder()
>   .appName("ShuffleHashJoin")
>   .master("local[*]")
>   // Settings intended to steer the planner away from broadcast and sort-merge joins.
>   .config("spark.sql.autoBroadcastJoinThreshold", 0)
>   .config("spark.sql.join.preferSortMergeJoin", value = false)
>   .getOrCreate()
>
> import spark.implicits._
>
> val dataset = Seq(
>   ("1", "playing"),
>   ("2", "with"),
>   ("3", "ShuffledHashJoinExec")
> ).toDF("id", "token")
>
> val dataset1 = Seq(
>   ("1", "playing"),
>   ("2", "with"),
>   ("3", "ShuffledHashJoinExec")
> ).toDF("id1", "token")
>
> dataset.join(dataset1, $"id" === $"id1", "inner").foreach(t => println(t))
> {code}
> My expectation is that Spark will use 'shuffleHashJoin', but the Spark UI and
> explain() show that Spark uses 'sortMergeJoin'.
[jira] [Created] (SPARK-29147) Spark sortMergeJoin is not changing to shuffleHashJoin
Artsiom Yudovin created SPARK-29147:
---------------------------------------

             Summary: Spark sortMergeJoin is not changing to shuffleHashJoin
                 Key: SPARK-29147
                 URL: https://issues.apache.org/jira/browse/SPARK-29147
             Project: Spark
          Issue Type: Question
          Components: Spark Core, SQL
    Affects Versions: 2.4.4, 2.4.3
            Reporter: Artsiom Yudovin

I run the following code:
{code:java}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("ShuffleHashJoin")
  .master("local[*]")
  // Settings intended to steer the planner away from broadcast and sort-merge joins.
  .config("spark.sql.autoBroadcastJoinThreshold", 0)
  .config("spark.sql.join.preferSortMergeJoin", value = false)
  .getOrCreate()

import spark.implicits._

val dataset = Seq(
  ("1", "playing"),
  ("2", "with"),
  ("3", "ShuffledHashJoinExec")
).toDF("id", "token")

val dataset1 = Seq(
  ("1", "playing"),
  ("2", "with"),
  ("3", "ShuffledHashJoinExec")
).toDF("id1", "token")

dataset.join(dataset1, $"id" === $"id1", "inner").foreach(t => println(t))
{code}
My expectation is that Spark will use 'shuffleHashJoin', but the Spark UI and explain() show that Spark uses 'sortMergeJoin'.
[jira] [Commented] (SPARK-25713) Implement copy() for ColumnarArray
[ https://issues.apache.org/jira/browse/SPARK-25713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16751028#comment-16751028 ]

Artsiom Yudovin commented on SPARK-25713:
-----------------------------------------

[~cloud_fan]

> Implement copy() for ColumnarArray
> ----------------------------------
>
>                 Key: SPARK-25713
>                 URL: https://issues.apache.org/jira/browse/SPARK-25713
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.4.0
>            Reporter: Liwen Sun
>            Priority: Minor
>             Fix For: 3.0.0