[jira] (SPARK-40341) Implement `Rolling.median`.

2022-09-21 Thread Artsiom Yudovin (Jira)


[ https://issues.apache.org/jira/browse/SPARK-40341 ]


Artsiom Yudovin deleted comment on SPARK-40341:
-

was (Author: ayudovin):
I'm working on this

> Implement `Rolling.median`.
> ---
>
> Key: SPARK-40341
> URL: https://issues.apache.org/jira/browse/SPARK-40341
> Project: Spark
> Issue Type: Sub-task
> Components: Pandas API on Spark
> Affects Versions: 3.4.0
> Reporter: Haejoon Lee
> Assignee: Yikun Jiang
> Priority: Major
>
> We should implement `Rolling.median` to increase pandas API coverage.
> pandas docs: 
> https://pandas.pydata.org/docs/reference/api/pandas.core.window.rolling.Rolling.median.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-40341) Implement `Rolling.median`.

2022-09-21 Thread Artsiom Yudovin (Jira)


[ https://issues.apache.org/jira/browse/SPARK-40341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607908#comment-17607908 ]

Artsiom Yudovin commented on SPARK-40341:
-

I'm working on this







[jira] [Commented] (SPARK-40334) Implement `GroupBy.prod`.

2022-09-14 Thread Artsiom Yudovin (Jira)


[ https://issues.apache.org/jira/browse/SPARK-40334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17604841#comment-17604841 ]

Artsiom Yudovin commented on SPARK-40334:
-

Got you, thank you so much!

> Implement `GroupBy.prod`.
> -
>
> Key: SPARK-40334
> URL: https://issues.apache.org/jira/browse/SPARK-40334
> Project: Spark
> Issue Type: Sub-task
> Components: Pandas API on Spark
> Affects Versions: 3.4.0
> Reporter: Haejoon Lee
> Assignee: Haejoon Lee
> Priority: Major
>
> We should implement `GroupBy.prod` to increase pandas API coverage.
> pandas docs: 
> https://pandas.pydata.org/docs/reference/api/pandas.core.groupby.GroupBy.prod.html






[jira] [Commented] (SPARK-40334) Implement `GroupBy.prod`.

2022-09-13 Thread Artsiom Yudovin (Jira)


[ https://issues.apache.org/jira/browse/SPARK-40334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17603463#comment-17603463 ]

Artsiom Yudovin commented on SPARK-40334:
-

[~itholic], Hi, I started working on this ticket 2 days ago. Does it 
make sense for me to continue, or should I choose another ticket?







[jira] [Commented] (SPARK-30122) Allow setting serviceAccountName for executor pods

2020-02-05 Thread Artsiom Yudovin (Jira)


[ https://issues.apache.org/jira/browse/SPARK-30122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17031084#comment-17031084 ]

Artsiom Yudovin commented on SPARK-30122:
-

(y)

> Allow setting serviceAccountName for executor pods
> --
>
> Key: SPARK-30122
> URL: https://issues.apache.org/jira/browse/SPARK-30122
> Project: Spark
> Issue Type: Improvement
> Components: Kubernetes
> Affects Versions: 3.1.0
> Reporter: Juho Mäkinen
> Priority: Major
> Fix For: 3.1.0
>
>
> Currently it doesn't seem to be possible to have the Spark driver set the 
> serviceAccountName for the executor pods it launches.
> There is a "spark.kubernetes.authenticate.driver.serviceAccountName" 
> property, so one would naturally expect a similar 
> "spark.kubernetes.authenticate.executor.serviceAccountName" property, but 
> no such property exists.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Updated] (SPARK-29147) Spark doesn't use shuffleHashJoin as expected

2019-09-18 Thread Artsiom Yudovin (Jira)


[ https://issues.apache.org/jira/browse/SPARK-29147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Artsiom Yudovin updated SPARK-29147:

Summary: Spark doesn't use shuffleHashJoin as expected  (was: Spark 
sortMergeJoin is not changing to shuffleHashJoin)

> Spark doesn't use shuffleHashJoin as expected
> -
>
> Key: SPARK-29147
> URL: https://issues.apache.org/jira/browse/SPARK-29147
> Project: Spark
> Issue Type: Question
> Components: Spark Core, SQL
> Affects Versions: 2.4.3, 2.4.4
> Reporter: Artsiom Yudovin
> Priority: Critical
>
> I ran the following code:
> {code:scala}
> import org.apache.spark.sql.SparkSession
>
> val spark = SparkSession.builder()
>   .appName("ShuffleHashJoin")
>   .master("local[*]")
>   // Set the broadcast threshold to 0 so a broadcast join is not chosen
>   .config("spark.sql.autoBroadcastJoinThreshold", 0)
>   // Ask the planner not to prefer sort-merge join
>   .config("spark.sql.join.preferSortMergeJoin", value = false)
>   .getOrCreate()
> import spark.implicits._
>
> val dataset = Seq(
>   ("1", "playing"),
>   ("2", "with"),
>   ("3", "ShuffledHashJoinExec")
> ).toDF("id", "token")
>
> val dataset1 = Seq(
>   ("1", "playing"),
>   ("2", "with"),
>   ("3", "ShuffledHashJoinExec")
> ).toDF("id1", "token")
>
> dataset.join(dataset1, $"id" === $"id1", "inner").foreach(t => println(t))
> {code}
> My expectation is that Spark will use 'shuffleHashJoin', but both the Spark UI 
> and explain() show that Spark uses 'sortMergeJoin'.






[jira] [Created] (SPARK-29147) Spark sortMergeJoin is not changing to shuffleHashJoin

2019-09-18 Thread Artsiom Yudovin (Jira)
Artsiom Yudovin created SPARK-29147:
---

Summary: Spark sortMergeJoin is not changing to shuffleHashJoin
Key: SPARK-29147
URL: https://issues.apache.org/jira/browse/SPARK-29147
Project: Spark
Issue Type: Question
Components: Spark Core, SQL
Affects Versions: 2.4.4, 2.4.3
Reporter: Artsiom Yudovin


I ran the following code:
{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("ShuffleHashJoin")
  .master("local[*]")
  // Set the broadcast threshold to 0 so a broadcast join is not chosen
  .config("spark.sql.autoBroadcastJoinThreshold", 0)
  // Ask the planner not to prefer sort-merge join
  .config("spark.sql.join.preferSortMergeJoin", value = false)
  .getOrCreate()

import spark.implicits._
val dataset = Seq(
  ("1", "playing"),
  ("2", "with"),
  ("3", "ShuffledHashJoinExec")
).toDF("id", "token")

val dataset1 = Seq(
  ("1", "playing"),
  ("2", "with"),
  ("3", "ShuffledHashJoinExec")
).toDF("id1", "token")

dataset.join(dataset1, $"id" === $"id1", "inner").foreach(t => println(t))
{code}
My expectation is that Spark will use 'shuffleHashJoin', but both the Spark UI 
and explain() show that Spark uses 'sortMergeJoin'.






[jira] [Commented] (SPARK-25713) Implement copy() for ColumnarArray

2019-01-24 Thread Artsiom Yudovin (JIRA)


[ https://issues.apache.org/jira/browse/SPARK-25713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751028#comment-16751028 ]

Artsiom Yudovin commented on SPARK-25713:
-

[~cloud_fan] 

> Implement copy() for ColumnarArray
> --
>
> Key: SPARK-25713
> URL: https://issues.apache.org/jira/browse/SPARK-25713
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 2.4.0
> Reporter: Liwen Sun
> Priority: Minor
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
