[jira] [Assigned] (SPARK-37481) Disappearance of skipped stages misleads bug hunting

2021-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37481:


Assignee: (was: Apache Spark)

> Disappearance of skipped stages misleads bug hunting
> 
>
> Key: SPARK-37481
> URL: https://issues.apache.org/jira/browse/SPARK-37481
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.1.2, 3.2.0, 3.3.0
>Reporter: Kent Yao
>Priority: Major
>
> ## With FetchFailedException and Map Stage Retries
> When rerunning spark-sql shell with the original SQL in 
> [https://gist.github.com/yaooqinn/6acb7b74b343a6a6dffe8401f6b7b45c#gistcomment-3977315]
> !https://user-images.githubusercontent.com/8326978/143821530-ff498caa-abce-483d-a24b-315aacf7e0a0.png!
> 1. stage 3 threw FetchFailedException and caused itself and its parent 
> stage (stage 2) to retry
> 2. stage 2 was skipped before but its attemptId was still 0, so when its 
> retry happened it got removed from `Skipped Stages` 
> The DAG of Job 2 doesn't show that stage 2 is skipped anymore.
> !https://user-images.githubusercontent.com/8326978/143824666-6390b64a-a45b-4bc8-b05d-c5abbb28cdef.png!
> Besides, a retried stage usually runs only a subset of the tasks from the 
> original stage. If we mark it as the original one, its metrics can be 
> misleading.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37481) Disappearance of skipped stages misleads bug hunting

2021-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37481:


Assignee: Apache Spark

> Disappearance of skipped stages misleads bug hunting
> 
>
> Key: SPARK-37481
> URL: https://issues.apache.org/jira/browse/SPARK-37481
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.1.2, 3.2.0, 3.3.0
>Reporter: Kent Yao
>Assignee: Apache Spark
>Priority: Major
>
> ## With FetchFailedException and Map Stage Retries
> When rerunning spark-sql shell with the original SQL in 
> [https://gist.github.com/yaooqinn/6acb7b74b343a6a6dffe8401f6b7b45c#gistcomment-3977315]
> !https://user-images.githubusercontent.com/8326978/143821530-ff498caa-abce-483d-a24b-315aacf7e0a0.png!
> 1. stage 3 threw FetchFailedException and caused itself and its parent 
> stage (stage 2) to retry
> 2. stage 2 was skipped before but its attemptId was still 0, so when its 
> retry happened it got removed from `Skipped Stages` 
> The DAG of Job 2 doesn't show that stage 2 is skipped anymore.
> !https://user-images.githubusercontent.com/8326978/143824666-6390b64a-a45b-4bc8-b05d-c5abbb28cdef.png!
> Besides, a retried stage usually runs only a subset of the tasks from the 
> original stage. If we mark it as the original one, its metrics can be 
> misleading.






[jira] [Commented] (SPARK-37481) Disappearance of skipped stages misleads bug hunting

2021-11-28 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450235#comment-17450235
 ] 

Apache Spark commented on SPARK-37481:
--

User 'yaooqinn' has created a pull request for this issue:
https://github.com/apache/spark/pull/34735

> Disappearance of skipped stages misleads bug hunting
> 
>
> Key: SPARK-37481
> URL: https://issues.apache.org/jira/browse/SPARK-37481
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.1.2, 3.2.0, 3.3.0
>Reporter: Kent Yao
>Priority: Major
>
> ## With FetchFailedException and Map Stage Retries
> When rerunning spark-sql shell with the original SQL in 
> [https://gist.github.com/yaooqinn/6acb7b74b343a6a6dffe8401f6b7b45c#gistcomment-3977315]
> !https://user-images.githubusercontent.com/8326978/143821530-ff498caa-abce-483d-a24b-315aacf7e0a0.png!
> 1. stage 3 threw FetchFailedException and caused itself and its parent 
> stage (stage 2) to retry
> 2. stage 2 was skipped before but its attemptId was still 0, so when its 
> retry happened it got removed from `Skipped Stages` 
> The DAG of Job 2 doesn't show that stage 2 is skipped anymore.
> !https://user-images.githubusercontent.com/8326978/143824666-6390b64a-a45b-4bc8-b05d-c5abbb28cdef.png!
> Besides, a retried stage usually runs only a subset of the tasks from the 
> original stage. If we mark it as the original one, its metrics can be 
> misleading.






[jira] [Updated] (SPARK-37481) Disappearance of skipped stages misleads bug hunting

2021-11-28 Thread Kent Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kent Yao updated SPARK-37481:
-
Description: 
## With FetchFailedException and Map Stage Retries

When rerunning spark-sql shell with the original SQL in 
[https://gist.github.com/yaooqinn/6acb7b74b343a6a6dffe8401f6b7b45c#gistcomment-3977315]

!https://user-images.githubusercontent.com/8326978/143821530-ff498caa-abce-483d-a24b-315aacf7e0a0.png!

1. stage 3 threw FetchFailedException and caused itself and its parent 
stage (stage 2) to retry
2. stage 2 was skipped before but its attemptId was still 0, so when its retry 
happened it got removed from `Skipped Stages` 

The DAG of Job 2 doesn't show that stage 2 is skipped anymore.

!https://user-images.githubusercontent.com/8326978/143824666-6390b64a-a45b-4bc8-b05d-c5abbb28cdef.png!

Besides, a retried stage usually runs only a subset of the tasks from the 
original stage. If we mark it as the original one, its metrics can be misleading.

  was:
## With FetchFailedException and Map Stage Retries

When rerunning spark-sql shell with the original SQL in 
https://gist.github.com/yaooqinn/6acb7b74b343a6a6dffe8401f6b7b45c#gistcomment-3977315

![image](https://user-images.githubusercontent.com/8326978/143821530-ff498caa-abce-483d-a24b-315aacf7e0a0.png)

1. stage 3 threw FetchFailedException and caused itself and its parent 
stage (stage 2) to retry
2. stage 2 was skipped before but its attemptId was still 0, so when its retry 
happened it got removed from `Skipped Stages` 

The DAG of Job 2 doesn't show that stage 2 is skipped anymore.

![image](https://user-images.githubusercontent.com/8326978/143824666-6390b64a-a45b-4bc8-b05d-c5abbb28cdef.png)


Besides, a retried stage usually runs only a subset of the tasks from the 
original stage. If we mark it as the original one, its metrics can be misleading.


> Disappearance of skipped stages misleads bug hunting
> 
>
> Key: SPARK-37481
> URL: https://issues.apache.org/jira/browse/SPARK-37481
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.1.2, 3.2.0, 3.3.0
>Reporter: Kent Yao
>Priority: Major
>
> ## With FetchFailedException and Map Stage Retries
> When rerunning spark-sql shell with the original SQL in 
> [https://gist.github.com/yaooqinn/6acb7b74b343a6a6dffe8401f6b7b45c#gistcomment-3977315]
> !https://user-images.githubusercontent.com/8326978/143821530-ff498caa-abce-483d-a24b-315aacf7e0a0.png!
> 1. stage 3 threw FetchFailedException and caused itself and its parent 
> stage (stage 2) to retry
> 2. stage 2 was skipped before but its attemptId was still 0, so when its 
> retry happened it got removed from `Skipped Stages` 
> The DAG of Job 2 doesn't show that stage 2 is skipped anymore.
> !https://user-images.githubusercontent.com/8326978/143824666-6390b64a-a45b-4bc8-b05d-c5abbb28cdef.png!
> Besides, a retried stage usually runs only a subset of the tasks from the 
> original stage. If we mark it as the original one, its metrics can be 
> misleading.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37481) Disappearance of skipped stages misleads bug hunting

2021-11-28 Thread Kent Yao (Jira)
Kent Yao created SPARK-37481:


 Summary: Disappearance of skipped stages misleads bug hunting
 Key: SPARK-37481
 URL: https://issues.apache.org/jira/browse/SPARK-37481
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 3.2.0, 3.1.2, 3.3.0
Reporter: Kent Yao


## With FetchFailedException and Map Stage Retries

When rerunning spark-sql shell with the original SQL in 
https://gist.github.com/yaooqinn/6acb7b74b343a6a6dffe8401f6b7b45c#gistcomment-3977315

![image](https://user-images.githubusercontent.com/8326978/143821530-ff498caa-abce-483d-a24b-315aacf7e0a0.png)

1. stage 3 threw FetchFailedException and caused itself and its parent 
stage (stage 2) to retry
2. stage 2 was skipped before but its attemptId was still 0, so when its retry 
happened it got removed from `Skipped Stages` 

The DAG of Job 2 doesn't show that stage 2 is skipped anymore.

![image](https://user-images.githubusercontent.com/8326978/143824666-6390b64a-a45b-4bc8-b05d-c5abbb28cdef.png)


Besides, a retried stage usually runs only a subset of the tasks from the 
original stage. If we mark it as the original one, its metrics can be misleading.
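For readers who want to see the "Skipped Stages" setup without the full gist, here is a minimal, hypothetical PySpark sketch (it is not the reporter's original SQL): the second job reuses the first job's shuffle output, so its map stage is displayed as skipped.

{code:python}
# Minimal sketch, assuming a pyspark shell where `sc` is the SparkContext.
rdd = (
    sc.parallelize(range(1000), 10)
      .map(lambda i: (i % 100, i))
      .reduceByKey(lambda a, b: a + b)  # introduces a shuffle dependency
)

rdd.count()  # Job 1: runs the shuffle map stage and the result stage
rdd.count()  # Job 2: the map stage is skipped (shuffle output is reused)

# If a FetchFailedException now hit Job 2's result stage (e.g. the executor
# holding the shuffle blocks was lost), the skipped map stage would be
# resubmitted while its attemptId is still 0 -- the condition described above.
{code}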






[jira] [Commented] (SPARK-37480) Configurations in docs/running-on-kubernetes.md are not up to date

2021-11-28 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450233#comment-17450233
 ] 

Apache Spark commented on SPARK-37480:
--

User 'Yikun' has created a pull request for this issue:
https://github.com/apache/spark/pull/34734

> Configurations in docs/running-on-kubernetes.md are not up to date
> 
>
> Key: SPARK-37480
> URL: https://issues.apache.org/jira/browse/SPARK-37480
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.3.0
>Reporter: Yikun Jiang
>Priority: Minor
>







[jira] [Assigned] (SPARK-37480) Configurations in docs/running-on-kubernetes.md are not up to date

2021-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37480:


Assignee: (was: Apache Spark)

> Configurations in docs/running-on-kubernetes.md are not up to date
> 
>
> Key: SPARK-37480
> URL: https://issues.apache.org/jira/browse/SPARK-37480
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.3.0
>Reporter: Yikun Jiang
>Priority: Minor
>







[jira] [Assigned] (SPARK-37480) Configurations in docs/running-on-kubernetes.md are not up to date

2021-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37480:


Assignee: Apache Spark

> Configurations in docs/running-on-kubernetes.md are not up to date
> 
>
> Key: SPARK-37480
> URL: https://issues.apache.org/jira/browse/SPARK-37480
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.3.0
>Reporter: Yikun Jiang
>Assignee: Apache Spark
>Priority: Minor
>







[jira] [Created] (SPARK-37480) Configurations in docs/running-on-kubernetes.md are not up to date

2021-11-28 Thread Yikun Jiang (Jira)
Yikun Jiang created SPARK-37480:
---

 Summary: Configurations in docs/running-on-kubernetes.md are not 
up to date
 Key: SPARK-37480
 URL: https://issues.apache.org/jira/browse/SPARK-37480
 Project: Spark
  Issue Type: Bug
  Components: Kubernetes
Affects Versions: 3.3.0
Reporter: Yikun Jiang









[jira] [Commented] (SPARK-36525) DS V2 Index Support

2021-11-28 Thread Huaxin Gao (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450224#comment-17450224
 ] 

Huaxin Gao commented on SPARK-36525:


The major reason I work on index support is that I have customers who need 
this in Iceberg. I don't have any plan to make FileTable implement SupportsIndex, 
because Parquet and ORC don't support indexes. 

> DS V2 Index Support
> ---
>
> Key: SPARK-36525
> URL: https://issues.apache.org/jira/browse/SPARK-36525
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Huaxin Gao
>Priority: Major
>
> Many data sources support indexes to improve query performance. In order to 
> take advantage of the index support in data sources, the following APIs will 
> be added for working with indexes:
> {code:java}
>  public interface SupportsIndex extends Table {
>
>   /**
>    * Creates an index.
>    *
>    * @param indexName the name of the index to be created
>    * @param indexType the type of the index to be created. If this is not
>    *                  specified, Spark will use an empty String.
>    * @param columns the columns on which the index is to be created
>    * @param columnsProperties the properties of the columns on which the
>    *                          index is to be created
>    * @param properties the properties of the index to be created
>    * @throws IndexAlreadyExistsException If the index already exists.
>    */
>   void createIndex(String indexName,
>       String indexType,
>       NamedReference[] columns,
>       Map<NamedReference, Map<String, String>> columnsProperties,
>       Map<String, String> properties)
>       throws IndexAlreadyExistsException;
>
>   /**
>    * Drops the index with the given name.
>    *
>    * @param indexName the name of the index to be dropped.
>    * @throws NoSuchIndexException If the index does not exist.
>    */
>   void dropIndex(String indexName) throws NoSuchIndexException;
>
>   /**
>    * Checks whether an index exists in this table.
>    *
>    * @param indexName the name of the index
>    * @return true if the index exists, false otherwise
>    */
>   boolean indexExists(String indexName);
>
>   /**
>    * Lists all the indexes in this table.
>    */
>   TableIndex[] listIndexes();
> }
> {code}






[jira] [Commented] (SPARK-36346) Support TimestampNTZ type in Orc file source

2021-11-28 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450219#comment-17450219
 ] 

Apache Spark commented on SPARK-36346:
--

User 'dongjoon-hyun' has created a pull request for this issue:
https://github.com/apache/spark/pull/34733

> Support TimestampNTZ type in Orc file source
> 
>
> Key: SPARK-36346
> URL: https://issues.apache.org/jira/browse/SPARK-36346
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: jiaan.geng
>Assignee: jiaan.geng
>Priority: Major
> Fix For: 3.3.0
>
>
> As per https://orc.apache.org/docs/types.html, ORC supports both 
> TIMESTAMP_NTZ and TIMESTAMP_LTZ (the latter being Spark's current default 
> timestamp type):
> * TIMESTAMP => TIMESTAMP_LTZ
> * Timestamp with local time zone => TIMESTAMP_NTZ
> In Spark 3.1 or prior, Spark only considered TIMESTAMP.
> Since 3.2, with the support of the timestamp without time zone type:
> * The ORC writer follows this definition and uses "Timestamp with local time 
> zone" when writing TIMESTAMP_NTZ.
> * The ORC reader converts "Timestamp with local time zone" to TIMESTAMP_NTZ.
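To illustrate the behaviour described above, here is a hedged PySpark sketch; it assumes a Spark build containing this change and support for the TIMESTAMP_NTZ literal, with `spark` an active SparkSession.

{code:python}
# Write a TIMESTAMP_NTZ column to ORC and read it back; with this change the
# type should round-trip instead of degrading to the default timestamp type.
df = spark.sql("SELECT TIMESTAMP_NTZ '2021-11-28 12:00:00' AS ts")
df.write.mode("overwrite").orc("/tmp/ts_ntz_orc")

spark.read.orc("/tmp/ts_ntz_orc").printSchema()
# expected: ts of type timestamp_ntz (previously it read back as timestamp)
{code}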






[jira] [Updated] (SPARK-37392) Catalyst optimizer very time-consuming and memory-intensive with some "explode(array)"

2021-11-28 Thread Francois MARTIN (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francois MARTIN updated SPARK-37392:

Description: 
The problem occurs with the simple code below:
{code:java}
import session.implicits._

Seq(
  (1, "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", 
"x", "x", "x", "x", "x", "x")
).toDF()
  .checkpoint() // or save and reload to truncate lineage
  .createOrReplaceTempView("sub")

session.sql("""
  SELECT
*
  FROM
  (
SELECT
  EXPLODE( ARRAY( * ) ) result
FROM
(
  SELECT
_1 a, _2 b, _3 c, _4 d, _5 e, _6 f, _7 g, _8 h, _9 i, _10 j, _11 k, _12 
l, _13 m, _14 n, _15 o, _16 p, _17 q, _18 r, _19 s, _20 t, _21 u
  FROM
sub
)
  )
  WHERE
result != ''
  """).show() {code}
It takes several minutes and causes very high Java heap usage, when it should 
be immediate.

It does not occur when the single integer value (1) is replaced with a string 
value ({_}"x"{_}).

All the time is spent in the _PruneFilters_ optimization rule.

Not reproduced in Spark 2.4.1.

  was:
The problem occurs with the simple code below:
{code:java}
import session.implicits._

Seq(
  (1, "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", 
"x", "x", "x", "x", "x", "x")
).toDF()
  .checkpoint() // or save and reload to truncate lineage
  .createOrReplaceTempView("sub")

session.sql("""
  SELECT
*
  FROM
  (
SELECT
  EXPLODE( ARRAY( * ) ) result
FROM
(
  SELECT
_1 a, _2 b, _3 c, _4 d, _5 e, _6 f, _7 g, _8 h, _9 i, _10 j, _11 k, _12 
l, _13 m, _14 n, _15 o, _16 p, _17 q, _18 r, _19 s, _20 t, _21 u
  FROM
sub
)
  )
  WHERE
result != ''
  """).show() {code}
It takes several minutes and causes very high Java heap usage, when it should 
be immediate.

It does not occur when the single integer value ({_}1{_}) is replaced with a 
string value ({_}"x"{_}).

All the time is spent in the _PruneFilters_ optimization rule.

Not reproduced in Spark 2.4.1.


> Catalyst optimizer very time-consuming and memory-intensive with some 
> "explode(array)" 
> ---
>
> Key: SPARK-37392
> URL: https://issues.apache.org/jira/browse/SPARK-37392
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.1.2
>Reporter: Francois MARTIN
>Priority: Major
>
> The problem occurs with the simple code below:
> {code:java}
> import session.implicits._
> Seq(
>   (1, "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", 
> "x", "x", "x", "x", "x", "x")
> ).toDF()
>   .checkpoint() // or save and reload to truncate lineage
>   .createOrReplaceTempView("sub")
> session.sql("""
>   SELECT
> *
>   FROM
>   (
> SELECT
>   EXPLODE( ARRAY( * ) ) result
> FROM
> (
>   SELECT
> _1 a, _2 b, _3 c, _4 d, _5 e, _6 f, _7 g, _8 h, _9 i, _10 j, _11 k, 
> _12 l, _13 m, _14 n, _15 o, _16 p, _17 q, _18 r, _19 s, _20 t, _21 u
>   FROM
> sub
> )
>   )
>   WHERE
> result != ''
>   """).show() {code}
> It takes several minutes and causes very high Java heap usage, when it should 
> be immediate.
> It does not occur when the single integer value (1) is replaced with a string 
> value ({_}"x"{_}).
> All the time is spent in the _PruneFilters_ optimization rule.
> Not reproduced in Spark 2.4.1.
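A possible mitigation, not taken from the ticket: if PruneFilters is excludable in your Spark version, removing it from the optimizer sidesteps the pathological planning time for this query shape, at the cost of losing that optimization for the whole session.

{code:python}
# Hedged workaround sketch, assuming an active `spark` session and that
# PruneFilters is not on the non-excludable rule list in your version.
spark.conf.set(
    "spark.sql.optimizer.excludedRules",
    "org.apache.spark.sql.catalyst.optimizer.PruneFilters",
)
{code}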






[jira] [Commented] (SPARK-37291) PySpark init SparkSession should copy conf to sharedState

2021-11-28 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450214#comment-17450214
 ] 

Apache Spark commented on SPARK-37291:
--

User 'AngersZh' has created a pull request for this issue:
https://github.com/apache/spark/pull/34732

>  PySpark init SparkSession should copy conf to sharedState
> --
>
> Key: SPARK-37291
> URL: https://issues.apache.org/jira/browse/SPARK-37291
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark, SQL
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
> Fix For: 3.3.0
>
>
> PySpark SparkSession.config should respect enableHiveSupport
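For context, a minimal sketch of the kind of session setup this affects (illustrative only; the precise failure mode is described in the linked pull request):

{code:python}
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
        .enableHiveSupport()  # sets spark.sql.catalogImplementation=hive
        .config("spark.sql.warehouse.dir", "/tmp/warehouse")
        .getOrCreate()
)

# With the fix, builder options reach the JVM-side SharedState:
print(spark.conf.get("spark.sql.catalogImplementation"))  # expected: hive
{code}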






[jira] [Commented] (SPARK-37291) PySpark init SparkSession should copy conf to sharedState

2021-11-28 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450213#comment-17450213
 ] 

Apache Spark commented on SPARK-37291:
--

User 'AngersZh' has created a pull request for this issue:
https://github.com/apache/spark/pull/34732

>  PySpark init SparkSession should copy conf to sharedState
> --
>
> Key: SPARK-37291
> URL: https://issues.apache.org/jira/browse/SPARK-37291
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark, SQL
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
> Fix For: 3.3.0
>
>
> PySpark SparkSession.config should respect enableHiveSupport






[jira] [Commented] (SPARK-37055) Apply 'compute.eager_check' across all the codebase

2021-11-28 Thread dch nguyen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450210#comment-17450210
 ] 

dch nguyen commented on SPARK-37055:


Thanks! I will try to address them.

> Apply 'compute.eager_check' across all the codebase
> ---
>
> Key: SPARK-37055
> URL: https://issues.apache.org/jira/browse/SPARK-37055
> Project: Spark
>  Issue Type: Umbrella
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>
> As [~hyukjin.kwon] guides:
> 1. Make every input validation like this covered by the new configuration. 
> For example:
> {code:python}
> - a == b
> + def eager_check(f):  # Utility function
> +     return not config.compute.eager_check and f()
> + 
> + eager_check(lambda: a == b)
> {code}
> 2. We should check whether the output makes sense even though the behaviour 
> does not match pandas'. If the output does not make sense, we shouldn't 
> cover it with this configuration.
> 3. Make this configuration enabled by default so that we match pandas' 
> behaviour by default.
>  
> We have to make sure to list which APIs are affected in the description of 
> 'compute.eager_check'.
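To make the diff above concrete, here is a hedged sketch that wires it to the pandas-on-Spark option API; `eager_check` is an illustrative helper following the pseudocode as written, not the actual utility in the codebase.

{code:python}
from pyspark import pandas as ps

def eager_check(f):
    # Gate the (potentially expensive) validation callback `f` on the
    # 'compute.eager_check' option, mirroring the diff in the description.
    return not ps.get_option("compute.eager_check") and f()

# Usage: replaces a bare `a == b` input validation.
# valid = eager_check(lambda: a == b)
{code}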






[jira] [Commented] (SPARK-36525) DS V2 Index Support

2021-11-28 Thread Yang Jie (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450207#comment-17450207
 ] 

Yang Jie commented on SPARK-36525:
--

Do we plan to make FileTable support the SupportsIndex trait?

> DS V2 Index Support
> ---
>
> Key: SPARK-36525
> URL: https://issues.apache.org/jira/browse/SPARK-36525
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Huaxin Gao
>Priority: Major
>
> Many data sources support indexes to improve query performance. In order to 
> take advantage of the index support in data sources, the following APIs will 
> be added for working with indexes:
> {code:java}
>  public interface SupportsIndex extends Table {
>
>   /**
>    * Creates an index.
>    *
>    * @param indexName the name of the index to be created
>    * @param indexType the type of the index to be created. If this is not
>    *                  specified, Spark will use an empty String.
>    * @param columns the columns on which the index is to be created
>    * @param columnsProperties the properties of the columns on which the
>    *                          index is to be created
>    * @param properties the properties of the index to be created
>    * @throws IndexAlreadyExistsException If the index already exists.
>    */
>   void createIndex(String indexName,
>       String indexType,
>       NamedReference[] columns,
>       Map<NamedReference, Map<String, String>> columnsProperties,
>       Map<String, String> properties)
>       throws IndexAlreadyExistsException;
>
>   /**
>    * Drops the index with the given name.
>    *
>    * @param indexName the name of the index to be dropped.
>    * @throws NoSuchIndexException If the index does not exist.
>    */
>   void dropIndex(String indexName) throws NoSuchIndexException;
>
>   /**
>    * Checks whether an index exists in this table.
>    *
>    * @param indexName the name of the index
>    * @return true if the index exists, false otherwise
>    */
>   boolean indexExists(String indexName);
>
>   /**
>    * Lists all the indexes in this table.
>    */
>   TableIndex[] listIndexes();
> }
> {code}






[jira] [Assigned] (SPARK-37443) Provide a profiler for Python/Pandas UDFs

2021-11-28 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-37443:


Assignee: Takuya Ueshin

> Provide a profiler for Python/Pandas UDFs
> -
>
> Key: SPARK-37443
> URL: https://issues.apache.org/jira/browse/SPARK-37443
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: Takuya Ueshin
>Assignee: Takuya Ueshin
>Priority: Major
>
> Currently a profiler is provided for only {{RDD}} operations, but providing a 
> profiler for Python/Pandas UDFs would be great.
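For context, the existing RDD-level profiler mentioned above can be enabled as follows (a hedged sketch of the current facility; the Python/Pandas UDF profiler this ticket proposes is not shown):

{code:python}
from pyspark import SparkConf, SparkContext

# Enable the Python worker profiler for RDD operations.
conf = SparkConf().set("spark.python.profile", "true")
sc = SparkContext(master="local[*]", appName="profile-demo", conf=conf)

sc.parallelize(range(1000)).map(lambda x: x * 2).count()
sc.show_profiles()  # prints accumulated cProfile stats per RDD
{code}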






[jira] [Resolved] (SPARK-37443) Provide a profiler for Python/Pandas UDFs

2021-11-28 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-37443.
--
Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved by pull request 34685
[https://github.com/apache/spark/pull/34685]

> Provide a profiler for Python/Pandas UDFs
> -
>
> Key: SPARK-37443
> URL: https://issues.apache.org/jira/browse/SPARK-37443
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: Takuya Ueshin
>Assignee: Takuya Ueshin
>Priority: Major
> Fix For: 3.3.0
>
>
> Currently a profiler is provided for only {{RDD}} operations, but providing a 
> profiler for Python/Pandas UDFs would be great.






[jira] [Commented] (SPARK-37055) Apply 'compute.eager_check' across all the codebase

2021-11-28 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450197#comment-17450197
 ] 

Hyukjin Kwon commented on SPARK-37055:
--

equals is the same too: 
https://github.com/apache/spark/blob/master/python/pyspark/pandas/series.py#L5842

> Apply 'compute.eager_check' across all the codebase
> ---
>
> Key: SPARK-37055
> URL: https://issues.apache.org/jira/browse/SPARK-37055
> Project: Spark
>  Issue Type: Umbrella
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>
> As [~hyukjin.kwon] guides:
> 1. Make every input validation like this covered by the new configuration. 
> For example:
> {code:python}
> - a == b
> + def eager_check(f):  # Utility function
> +     return not config.compute.eager_check and f()
> + 
> + eager_check(lambda: a == b)
> {code}
> 2. We should check whether the output makes sense even though the behaviour 
> does not match pandas'. If the output does not make sense, we shouldn't 
> cover it with this configuration.
> 3. Make this configuration enabled by default so that we match pandas' 
> behaviour by default.
>  
> We have to make sure to list which APIs are affected in the description of 
> 'compute.eager_check'.






[jira] [Commented] (SPARK-37055) Apply 'compute.eager_check' across all the codebase

2021-11-28 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450196#comment-17450196
 ] 

Hyukjin Kwon commented on SPARK-37055:
--

You can, for example, find some instances relying on 
is_monotonically_increasing 
(https://github.com/apache/spark/blob/2fe9af8b2b91d0a46782dd6fff57eca8609be105/python/pyspark/pandas/base.py#L703-L758),
which is super expensive, e.g. 
https://github.com/apache/spark/blob/master/python/pyspark/pandas/series.py#L5219
 

> Apply 'compute.eager_check' across all the codebase
> ---
>
> Key: SPARK-37055
> URL: https://issues.apache.org/jira/browse/SPARK-37055
> Project: Spark
>  Issue Type: Umbrella
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>
> As [~hyukjin.kwon] guides:
> 1. Make every input validation like this covered by the new configuration. 
> For example:
> {code:python}
> - a == b
> + def eager_check(f):  # Utility function
> +     return not config.compute.eager_check and f()
> + 
> + eager_check(lambda: a == b)
> {code}
> 2. We should check whether the output makes sense even though the behaviour 
> does not match pandas'. If the output does not make sense, we shouldn't 
> cover it with this configuration.
> 3. Make this configuration enabled by default so that we match pandas' 
> behaviour by default.
>  
> We have to make sure to list which APIs are affected in the description of 
> 'compute.eager_check'.






[jira] [Commented] (SPARK-37055) Apply 'compute.eager_check' across all the codebase

2021-11-28 Thread dch nguyen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450191#comment-17450191
 ] 

dch nguyen commented on SPARK-37055:


[~hyukjin.kwon], no, not at the moment. I did not find anywhere else to apply 
this conf :(

> Apply 'compute.eager_check' across all the codebase
> ---
>
> Key: SPARK-37055
> URL: https://issues.apache.org/jira/browse/SPARK-37055
> Project: Spark
>  Issue Type: Umbrella
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>
> As [~hyukjin.kwon] guides:
> 1. Make every input validation like this covered by the new configuration. 
> For example:
> {code:python}
> - a == b
> + def eager_check(f):  # Utility function
> +     return not config.compute.eager_check and f()
> + 
> + eager_check(lambda: a == b)
> {code}
> 2. We should check whether the output makes sense even though the behaviour 
> does not match pandas'. If the output does not make sense, we shouldn't 
> cover it with this configuration.
> 3. Make this configuration enabled by default so that we match pandas' 
> behaviour by default.
>  
> We have to make sure to list which APIs are affected in the description of 
> 'compute.eager_check'.






[jira] [Commented] (SPARK-37479) Migrate DROP NAMESPACE to use V2 command by default

2021-11-28 Thread dch nguyen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450189#comment-17450189
 ] 

dch nguyen commented on SPARK-37479:


working on this

> Migrate DROP NAMESPACE to use V2 command by default
> ---
>
> Key: SPARK-37479
> URL: https://issues.apache.org/jira/browse/SPARK-37479
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>







[jira] [Created] (SPARK-37479) Migrate DROP NAMESPACE to use V2 command by default

2021-11-28 Thread dch nguyen (Jira)
dch nguyen created SPARK-37479:
--

 Summary: Migrate DROP NAMESPACE to use V2 command by default
 Key: SPARK-37479
 URL: https://issues.apache.org/jira/browse/SPARK-37479
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.3.0
Reporter: dch nguyen









[jira] [Commented] (SPARK-37478) Unify v1 and v2 DROP NAMESPACE tests

2021-11-28 Thread dch nguyen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450188#comment-17450188
 ] 

dch nguyen commented on SPARK-37478:


working on this

> Unify v1 and v2 DROP NAMESPACE tests
> 
>
> Key: SPARK-37478
> URL: https://issues.apache.org/jira/browse/SPARK-37478
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>







[jira] [Created] (SPARK-37478) Unify v1 and v2 DROP NAMESPACE tests

2021-11-28 Thread dch nguyen (Jira)
dch nguyen created SPARK-37478:
--

 Summary: Unify v1 and v2 DROP NAMESPACE tests
 Key: SPARK-37478
 URL: https://issues.apache.org/jira/browse/SPARK-37478
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.3.0
Reporter: dch nguyen









[jira] [Commented] (SPARK-37055) Apply 'compute.eager_check' across all the codebase

2021-11-28 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450187#comment-17450187
 ] 

Hyukjin Kwon commented on SPARK-37055:
--

[~dchvn], just checking - are you working on this?

> Apply 'compute.eager_check' across all the codebase
> ---
>
> Key: SPARK-37055
> URL: https://issues.apache.org/jira/browse/SPARK-37055
> Project: Spark
>  Issue Type: Umbrella
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>
> As [~hyukjin.kwon] guides:
> 1. Make every input validation like this covered by the new configuration. 
> For example:
> {code:python}
> - a == b
> + def eager_check(f):  # Utility function
> +     return not config.compute.eager_check and f()
> + 
> + eager_check(lambda: a == b)
> {code}
> 2. We should check whether the output makes sense even though the behaviour 
> does not match pandas'. If the output does not make sense, we shouldn't 
> cover it with this configuration.
> 3. Make this configuration enabled by default so that we match pandas' 
> behaviour by default.
>  
> We have to make sure to list which APIs are affected in the description of 
> 'compute.eager_check'.






[jira] [Commented] (SPARK-37153) Inline type hints for python/pyspark/profiler.py

2021-11-28 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450180#comment-17450180
 ] 

Apache Spark commented on SPARK-37153:
--

User 'dchvn' has created a pull request for this issue:
https://github.com/apache/spark/pull/34731

> Inline type hints for python/pyspark/profiler.py
> 
>
> Key: SPARK-37153
> URL: https://issues.apache.org/jira/browse/SPARK-37153
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.2.0
>Reporter: Byron Hsu
>Priority: Major
>







[jira] [Assigned] (SPARK-37153) Inline type hints for python/pyspark/profiler.py

2021-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37153:


Assignee: Apache Spark

> Inline type hints for python/pyspark/profiler.py
> 
>
> Key: SPARK-37153
> URL: https://issues.apache.org/jira/browse/SPARK-37153
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.2.0
>Reporter: Byron Hsu
>Assignee: Apache Spark
>Priority: Major
>







[jira] [Assigned] (SPARK-37153) Inline type hints for python/pyspark/profiler.py

2021-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37153:


Assignee: (was: Apache Spark)

> Inline type hints for python/pyspark/profiler.py
> 
>
> Key: SPARK-37153
> URL: https://issues.apache.org/jira/browse/SPARK-37153
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.2.0
>Reporter: Byron Hsu
>Priority: Major
>







[jira] [Created] (SPARK-37477) Migrate SHOW CREATE TABLE to use V2 command by default

2021-11-28 Thread PengLei (Jira)
PengLei created SPARK-37477:
---

 Summary: Migrate SHOW CREATE TABLE to use V2 command by default
 Key: SPARK-37477
 URL: https://issues.apache.org/jira/browse/SPARK-37477
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.3.0
Reporter: PengLei
 Fix For: 3.3.0









[jira] [Resolved] (SPARK-37461) yarn-client mode client's appid value is null

2021-11-28 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen resolved SPARK-37461.
--
Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved by pull request 34710
[https://github.com/apache/spark/pull/34710]

> yarn-client mode client's appid value is null
> -
>
> Key: SPARK-37461
> URL: https://issues.apache.org/jira/browse/SPARK-37461
> Project: Spark
>  Issue Type: Task
>  Components: YARN
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Minor
> Fix For: 3.3.0
>
>







[jira] [Updated] (SPARK-37461) yarn-client mode client's appid value is null

2021-11-28 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen updated SPARK-37461:
-
Priority: Minor  (was: Major)

> yarn-client mode client's appid value is null
> -
>
> Key: SPARK-37461
> URL: https://issues.apache.org/jira/browse/SPARK-37461
> Project: Spark
>  Issue Type: Task
>  Components: YARN
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Minor
>







[jira] [Assigned] (SPARK-37461) yarn-client mode client's appid value is null

2021-11-28 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen reassigned SPARK-37461:


Assignee: angerszhu

> yarn-client mode client's appid value is null
> -
>
> Key: SPARK-37461
> URL: https://issues.apache.org/jira/browse/SPARK-37461
> Project: Spark
>  Issue Type: Task
>  Components: YARN
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
>







[jira] [Commented] (SPARK-37213) In the latest version release (Spark.3.2.O) in the Apache Spark documentation, the "O" at the end feels wrong, and it is written as the English letter "O"

2021-11-28 Thread liu zhuang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450027#comment-17450027
 ] 

liu zhuang commented on SPARK-37213:


OK, thank you.

> In the latest version release (Spark.3.2.O) in the Apache Spark 
> documentation, the "O" at the end feels wrong, and it is written as the 
> English letter "O"
> --
>
> Key: SPARK-37213
> URL: https://issues.apache.org/jira/browse/SPARK-37213
> Project: Spark
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 3.2.0
>Reporter: liu zhuang
>Priority: Major
> Attachments: Spark3.2.0.png
>
>
> In the latest version release (Spark.3.2.O) shown in the Apache Spark 
> documentation, the character at the end looks wrong: it is written as the 
> English letter "O" rather than the digit "0".


