[jira] [Assigned] (SPARK-37481) Disappearance of skipped stages misleads bug hunting

2021-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37481:


Assignee: (was: Apache Spark)

> Disappearance of skipped stages misleads bug hunting
> 
>
> Key: SPARK-37481
> URL: https://issues.apache.org/jira/browse/SPARK-37481
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.1.2, 3.2.0, 3.3.0
>Reporter: Kent Yao
>Priority: Major
>
> ## With FetchFailedException and Map Stage Retries
> When rerunning spark-sql shell with the original SQL in 
> [https://gist.github.com/yaooqinn/6acb7b74b343a6a6dffe8401f6b7b45c#gistcomment-3977315]
> !https://user-images.githubusercontent.com/8326978/143821530-ff498caa-abce-483d-a24b-315aacf7e0a0.png!
> 1. stage 3 threw FetchFailedException and caused itself and its parent 
> stage (stage 2) to retry
> 2. stage 2 was skipped before but its attemptId was still 0, so when its 
> retry happened it got removed from `Skipped Stages` 
> The DAG of Job 2 doesn't show that stage 2 is skipped anymore.
> !https://user-images.githubusercontent.com/8326978/143824666-6390b64a-a45b-4bc8-b05d-c5abbb28cdef.png!
> Besides, a retried stage usually runs only a subset of the tasks from the 
> original stage. If we mark it as the original one, its metrics can be 
> misleading.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37481) Disappearance of skipped stages misleads bug hunting

2021-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37481:


Assignee: Apache Spark

> Disappearance of skipped stages misleads bug hunting
> 
>
> Key: SPARK-37481
> URL: https://issues.apache.org/jira/browse/SPARK-37481
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.1.2, 3.2.0, 3.3.0
>Reporter: Kent Yao
>Assignee: Apache Spark
>Priority: Major
>
> ## With FetchFailedException and Map Stage Retries
> When rerunning spark-sql shell with the original SQL in 
> [https://gist.github.com/yaooqinn/6acb7b74b343a6a6dffe8401f6b7b45c#gistcomment-3977315]
> !https://user-images.githubusercontent.com/8326978/143821530-ff498caa-abce-483d-a24b-315aacf7e0a0.png!
> 1. stage 3 threw FetchFailedException and caused itself and its parent 
> stage (stage 2) to retry
> 2. stage 2 was skipped before but its attemptId was still 0, so when its 
> retry happened it got removed from `Skipped Stages` 
> The DAG of Job 2 doesn't show that stage 2 is skipped anymore.
> !https://user-images.githubusercontent.com/8326978/143824666-6390b64a-a45b-4bc8-b05d-c5abbb28cdef.png!
> Besides, a retried stage usually runs only a subset of the tasks from the 
> original stage. If we mark it as the original one, its metrics can be 
> misleading.






[jira] [Commented] (SPARK-37481) Disappearance of skipped stages misleads bug hunting

2021-11-28 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450235#comment-17450235
 ] 

Apache Spark commented on SPARK-37481:
--

User 'yaooqinn' has created a pull request for this issue:
https://github.com/apache/spark/pull/34735

> Disappearance of skipped stages misleads bug hunting
> 
>
> Key: SPARK-37481
> URL: https://issues.apache.org/jira/browse/SPARK-37481
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.1.2, 3.2.0, 3.3.0
>Reporter: Kent Yao
>Priority: Major
>
> ## With FetchFailedException and Map Stage Retries
> When rerunning spark-sql shell with the original SQL in 
> [https://gist.github.com/yaooqinn/6acb7b74b343a6a6dffe8401f6b7b45c#gistcomment-3977315]
> !https://user-images.githubusercontent.com/8326978/143821530-ff498caa-abce-483d-a24b-315aacf7e0a0.png!
> 1. stage 3 threw FetchFailedException and caused itself and its parent 
> stage (stage 2) to retry
> 2. stage 2 was skipped before but its attemptId was still 0, so when its 
> retry happened it got removed from `Skipped Stages` 
> The DAG of Job 2 doesn't show that stage 2 is skipped anymore.
> !https://user-images.githubusercontent.com/8326978/143824666-6390b64a-a45b-4bc8-b05d-c5abbb28cdef.png!
> Besides, a retried stage usually runs only a subset of the tasks from the 
> original stage. If we mark it as the original one, its metrics can be 
> misleading.






[jira] [Updated] (SPARK-37481) Disappearance of skipped stages misleads bug hunting

2021-11-28 Thread Kent Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kent Yao updated SPARK-37481:
-
Description: 
## With FetchFailedException and Map Stage Retries

When rerunning spark-sql shell with the original SQL in 
[https://gist.github.com/yaooqinn/6acb7b74b343a6a6dffe8401f6b7b45c#gistcomment-3977315]

!https://user-images.githubusercontent.com/8326978/143821530-ff498caa-abce-483d-a24b-315aacf7e0a0.png!

1. stage 3 threw FetchFailedException and caused itself and its parent 
stage (stage 2) to retry
2. stage 2 was skipped before but its attemptId was still 0, so when its retry 
happened it got removed from `Skipped Stages` 

The DAG of Job 2 doesn't show that stage 2 is skipped anymore.

!https://user-images.githubusercontent.com/8326978/143824666-6390b64a-a45b-4bc8-b05d-c5abbb28cdef.png!

Besides, a retried stage usually runs only a subset of the tasks from the 
original stage. If we mark it as the original one, its metrics can be misleading.

  was:
## With FetchFailedException and Map Stage Retries

When rerunning spark-sql shell with the original SQL in 
https://gist.github.com/yaooqinn/6acb7b74b343a6a6dffe8401f6b7b45c#gistcomment-3977315

![image](https://user-images.githubusercontent.com/8326978/143821530-ff498caa-abce-483d-a24b-315aacf7e0a0.png)

1. stage 3 threw FetchFailedException and caused itself and its parent 
stage (stage 2) to retry
2. stage 2 was skipped before but its attemptId was still 0, so when its retry 
happened it got removed from `Skipped Stages` 

The DAG of Job 2 doesn't show that stage 2 is skipped anymore.

![image](https://user-images.githubusercontent.com/8326978/143824666-6390b64a-a45b-4bc8-b05d-c5abbb28cdef.png)


Besides, a retried stage usually runs only a subset of the tasks from the 
original stage. If we mark it as the original one, its metrics can be misleading.


> Disappearance of skipped stages misleads bug hunting
> 
>
> Key: SPARK-37481
> URL: https://issues.apache.org/jira/browse/SPARK-37481
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.1.2, 3.2.0, 3.3.0
>Reporter: Kent Yao
>Priority: Major
>
> ## With FetchFailedException and Map Stage Retries
> When rerunning spark-sql shell with the original SQL in 
> [https://gist.github.com/yaooqinn/6acb7b74b343a6a6dffe8401f6b7b45c#gistcomment-3977315]
> !https://user-images.githubusercontent.com/8326978/143821530-ff498caa-abce-483d-a24b-315aacf7e0a0.png!
> 1. stage 3 threw FetchFailedException and caused itself and its parent 
> stage (stage 2) to retry
> 2. stage 2 was skipped before but its attemptId was still 0, so when its 
> retry happened it got removed from `Skipped Stages` 
> The DAG of Job 2 doesn't show that stage 2 is skipped anymore.
> !https://user-images.githubusercontent.com/8326978/143824666-6390b64a-a45b-4bc8-b05d-c5abbb28cdef.png!
> Besides, a retried stage usually runs only a subset of the tasks from the 
> original stage. If we mark it as the original one, its metrics can be 
> misleading.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37481) Disappearance of skipped stages misleads bug hunting

2021-11-28 Thread Kent Yao (Jira)
Kent Yao created SPARK-37481:


 Summary: Disappearance of skipped stages misleads bug hunting
 Key: SPARK-37481
 URL: https://issues.apache.org/jira/browse/SPARK-37481
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 3.2.0, 3.1.2, 3.3.0
Reporter: Kent Yao


## With FetchFailedException and Map Stage Retries

When rerunning spark-sql shell with the original SQL in 
https://gist.github.com/yaooqinn/6acb7b74b343a6a6dffe8401f6b7b45c#gistcomment-3977315

![image](https://user-images.githubusercontent.com/8326978/143821530-ff498caa-abce-483d-a24b-315aacf7e0a0.png)

1. stage 3 threw FetchFailedException and caused itself and its parent 
stage (stage 2) to retry
2. stage 2 was skipped before but its attemptId was still 0, so when its retry 
happened it got removed from `Skipped Stages` 

The DAG of Job 2 doesn't show that stage 2 is skipped anymore.

![image](https://user-images.githubusercontent.com/8326978/143824666-6390b64a-a45b-4bc8-b05d-c5abbb28cdef.png)


Besides, a retried stage usually runs only a subset of the tasks from the 
original stage. If we mark it as the original one, its metrics can be misleading.
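For readers who want to see the "Skipped Stages" setup without the full gist, here is a minimal, hypothetical PySpark sketch (it is not the reporter's original SQL): the second job reuses the first job's shuffle output, so its map stage is displayed as skipped.

{code:python}
# Minimal sketch, assuming a pyspark shell where `sc` is the SparkContext.
rdd = (
    sc.parallelize(range(1000), 10)
      .map(lambda i: (i % 100, i))
      .reduceByKey(lambda a, b: a + b)  # introduces a shuffle dependency
)

rdd.count()  # Job 1: runs the shuffle map stage and the result stage
rdd.count()  # Job 2: the map stage is skipped (shuffle output is reused)

# If a FetchFailedException now hit Job 2's result stage (e.g. the executor
# holding the shuffle blocks was lost), the skipped map stage would be
# resubmitted while its attemptId is still 0 -- the condition described above.
{code}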






[jira] [Commented] (SPARK-37480) Configurations in docs/running-on-kubernetes.md are not up to date

2021-11-28 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450233#comment-17450233
 ] 

Apache Spark commented on SPARK-37480:
--

User 'Yikun' has created a pull request for this issue:
https://github.com/apache/spark/pull/34734

> Configurations in docs/running-on-kubernetes.md are not up to date
> 
>
> Key: SPARK-37480
> URL: https://issues.apache.org/jira/browse/SPARK-37480
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.3.0
>Reporter: Yikun Jiang
>Priority: Minor
>







[jira] [Assigned] (SPARK-37480) Configurations in docs/running-on-kubernetes.md are not up to date

2021-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37480:


Assignee: (was: Apache Spark)

> Configurations in docs/running-on-kubernetes.md are not up to date
> 
>
> Key: SPARK-37480
> URL: https://issues.apache.org/jira/browse/SPARK-37480
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.3.0
>Reporter: Yikun Jiang
>Priority: Minor
>







[jira] [Assigned] (SPARK-37480) Configurations in docs/running-on-kubernetes.md are not up to date

2021-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37480:


Assignee: Apache Spark

> Configurations in docs/running-on-kubernetes.md are not up to date
> 
>
> Key: SPARK-37480
> URL: https://issues.apache.org/jira/browse/SPARK-37480
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.3.0
>Reporter: Yikun Jiang
>Assignee: Apache Spark
>Priority: Minor
>







[jira] [Created] (SPARK-37480) Configurations in docs/running-on-kubernetes.md are not up to date

2021-11-28 Thread Yikun Jiang (Jira)
Yikun Jiang created SPARK-37480:
---

 Summary: Configurations in docs/running-on-kubernetes.md are not 
up to date
 Key: SPARK-37480
 URL: https://issues.apache.org/jira/browse/SPARK-37480
 Project: Spark
  Issue Type: Bug
  Components: Kubernetes
Affects Versions: 3.3.0
Reporter: Yikun Jiang









[jira] [Commented] (SPARK-36525) DS V2 Index Support

2021-11-28 Thread Huaxin Gao (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450224#comment-17450224
 ] 

Huaxin Gao commented on SPARK-36525:


The major reason I work on index support is that I have customers who need 
this in Iceberg. I don't have any plan to make FileTable implement SupportsIndex, 
because Parquet and ORC don't support indexes. 

> DS V2 Index Support
> ---
>
> Key: SPARK-36525
> URL: https://issues.apache.org/jira/browse/SPARK-36525
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Huaxin Gao
>Priority: Major
>
> Many data sources support indexes to improve query performance. In order to 
> take advantage of the index support in data sources, the following APIs will 
> be added for working with indexes:
> {code:java}
>  public interface SupportsIndex extends Table {
>
>   /**
>    * Creates an index.
>    *
>    * @param indexName the name of the index to be created
>    * @param indexType the type of the index to be created. If this is not
>    *                  specified, Spark will use an empty String.
>    * @param columns the columns on which the index is to be created
>    * @param columnsProperties the properties of the columns on which the
>    *                          index is to be created
>    * @param properties the properties of the index to be created
>    * @throws IndexAlreadyExistsException If the index already exists.
>    */
>   void createIndex(String indexName,
>       String indexType,
>       NamedReference[] columns,
>       Map<NamedReference, Map<String, String>> columnsProperties,
>       Map<String, String> properties)
>       throws IndexAlreadyExistsException;
>
>   /**
>    * Drops the index with the given name.
>    *
>    * @param indexName the name of the index to be dropped.
>    * @throws NoSuchIndexException If the index does not exist.
>    */
>   void dropIndex(String indexName) throws NoSuchIndexException;
>
>   /**
>    * Checks whether an index exists in this table.
>    *
>    * @param indexName the name of the index
>    * @return true if the index exists, false otherwise
>    */
>   boolean indexExists(String indexName);
>
>   /**
>    * Lists all the indexes in this table.
>    */
>   TableIndex[] listIndexes();
> }
> {code}






[jira] [Commented] (SPARK-36346) Support TimestampNTZ type in Orc file source

2021-11-28 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450219#comment-17450219
 ] 

Apache Spark commented on SPARK-36346:
--

User 'dongjoon-hyun' has created a pull request for this issue:
https://github.com/apache/spark/pull/34733

> Support TimestampNTZ type in Orc file source
> 
>
> Key: SPARK-36346
> URL: https://issues.apache.org/jira/browse/SPARK-36346
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: jiaan.geng
>Assignee: jiaan.geng
>Priority: Major
> Fix For: 3.3.0
>
>
> As per https://orc.apache.org/docs/types.html, ORC supports both 
> TIMESTAMP_NTZ and TIMESTAMP_LTZ (the latter being Spark's current default 
> timestamp type):
> * TIMESTAMP => TIMESTAMP_LTZ
> * Timestamp with local time zone => TIMESTAMP_NTZ
> In Spark 3.1 or prior, Spark only considered TIMESTAMP.
> Since 3.2, with the support of the timestamp without time zone type:
> * The ORC writer follows this definition and uses "Timestamp with local time 
> zone" when writing TIMESTAMP_NTZ.
> * The ORC reader converts "Timestamp with local time zone" to TIMESTAMP_NTZ.
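To illustrate the behaviour described above, here is a hedged PySpark sketch; it assumes a Spark build containing this change and support for the TIMESTAMP_NTZ literal, with `spark` an active SparkSession.

{code:python}
# Write a TIMESTAMP_NTZ column to ORC and read it back; with this change the
# type should round-trip instead of degrading to the default timestamp type.
df = spark.sql("SELECT TIMESTAMP_NTZ '2021-11-28 12:00:00' AS ts")
df.write.mode("overwrite").orc("/tmp/ts_ntz_orc")

spark.read.orc("/tmp/ts_ntz_orc").printSchema()
# expected: ts of type timestamp_ntz (previously it read back as timestamp)
{code}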






[jira] [Updated] (SPARK-37392) Catalyst optimizer very time-consuming and memory-intensive with some "explode(array)"

2021-11-28 Thread Francois MARTIN (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francois MARTIN updated SPARK-37392:

Description: 
The problem occurs with the simple code below:
{code:java}
import session.implicits._

Seq(
  (1, "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", 
"x", "x", "x", "x", "x", "x")
).toDF()
  .checkpoint() // or save and reload to truncate lineage
  .createOrReplaceTempView("sub")

session.sql("""
  SELECT
*
  FROM
  (
SELECT
  EXPLODE( ARRAY( * ) ) result
FROM
(
  SELECT
_1 a, _2 b, _3 c, _4 d, _5 e, _6 f, _7 g, _8 h, _9 i, _10 j, _11 k, _12 
l, _13 m, _14 n, _15 o, _16 p, _17 q, _18 r, _19 s, _20 t, _21 u
  FROM
sub
)
  )
  WHERE
result != ''
  """).show() {code}
It takes several minutes and causes very high Java heap usage, when it should 
be immediate.

It does not occur when the single integer value (1) is replaced with a string 
value ({_}"x"{_}).

All the time is spent in the _PruneFilters_ optimization rule.

Not reproduced in Spark 2.4.1.

  was:
The problem occurs with the simple code below:
{code:java}
import session.implicits._

Seq(
  (1, "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", 
"x", "x", "x", "x", "x", "x")
).toDF()
  .checkpoint() // or save and reload to truncate lineage
  .createOrReplaceTempView("sub")

session.sql("""
  SELECT
*
  FROM
  (
SELECT
  EXPLODE( ARRAY( * ) ) result
FROM
(
  SELECT
_1 a, _2 b, _3 c, _4 d, _5 e, _6 f, _7 g, _8 h, _9 i, _10 j, _11 k, _12 
l, _13 m, _14 n, _15 o, _16 p, _17 q, _18 r, _19 s, _20 t, _21 u
  FROM
sub
)
  )
  WHERE
result != ''
  """).show() {code}
It takes several minutes and causes very high Java heap usage, when it should 
be immediate.

It does not occur when the single integer value ({_}1{_}) is replaced with a 
string value ({_}"x"{_}).

All the time is spent in the _PruneFilters_ optimization rule.

Not reproduced in Spark 2.4.1.


> Catalyst optimizer very time-consuming and memory-intensive with some 
> "explode(array)" 
> ---
>
> Key: SPARK-37392
> URL: https://issues.apache.org/jira/browse/SPARK-37392
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.1.2
>Reporter: Francois MARTIN
>Priority: Major
>
> The problem occurs with the simple code below:
> {code:java}
> import session.implicits._
> Seq(
>   (1, "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", "x", 
> "x", "x", "x", "x", "x", "x")
> ).toDF()
>   .checkpoint() // or save and reload to truncate lineage
>   .createOrReplaceTempView("sub")
> session.sql("""
>   SELECT
> *
>   FROM
>   (
> SELECT
>   EXPLODE( ARRAY( * ) ) result
> FROM
> (
>   SELECT
> _1 a, _2 b, _3 c, _4 d, _5 e, _6 f, _7 g, _8 h, _9 i, _10 j, _11 k, 
> _12 l, _13 m, _14 n, _15 o, _16 p, _17 q, _18 r, _19 s, _20 t, _21 u
>   FROM
> sub
> )
>   )
>   WHERE
> result != ''
>   """).show() {code}
> It takes several minutes and causes very high Java heap usage, when it should 
> be immediate.
> It does not occur when the single integer value (1) is replaced with a string 
> value ({_}"x"{_}).
> All the time is spent in the _PruneFilters_ optimization rule.
> Not reproduced in Spark 2.4.1.
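A possible mitigation, not taken from the ticket: if PruneFilters is excludable in your Spark version, removing it from the optimizer sidesteps the pathological planning time for this query shape, at the cost of losing that optimization for the whole session.

{code:python}
# Hedged workaround sketch, assuming an active `spark` session and that
# PruneFilters is not on the non-excludable rule list in your version.
spark.conf.set(
    "spark.sql.optimizer.excludedRules",
    "org.apache.spark.sql.catalyst.optimizer.PruneFilters",
)
{code}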






[jira] [Commented] (SPARK-37291) PySpark init SparkSession should copy conf to sharedState

2021-11-28 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450214#comment-17450214
 ] 

Apache Spark commented on SPARK-37291:
--

User 'AngersZh' has created a pull request for this issue:
https://github.com/apache/spark/pull/34732

>  PySpark init SparkSession should copy conf to sharedState
> --
>
> Key: SPARK-37291
> URL: https://issues.apache.org/jira/browse/SPARK-37291
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark, SQL
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
> Fix For: 3.3.0
>
>
> PySpark SparkSession.config should respect enableHiveSupport
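For context, a minimal sketch of the kind of session setup this affects (illustrative only; the precise failure mode is described in the linked pull request):

{code:python}
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
        .enableHiveSupport()  # sets spark.sql.catalogImplementation=hive
        .config("spark.sql.warehouse.dir", "/tmp/warehouse")
        .getOrCreate()
)

# With the fix, builder options reach the JVM-side SharedState:
print(spark.conf.get("spark.sql.catalogImplementation"))  # expected: hive
{code}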






[jira] [Commented] (SPARK-37291) PySpark init SparkSession should copy conf to sharedState

2021-11-28 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450213#comment-17450213
 ] 

Apache Spark commented on SPARK-37291:
--

User 'AngersZh' has created a pull request for this issue:
https://github.com/apache/spark/pull/34732

>  PySpark init SparkSession should copy conf to sharedState
> --
>
> Key: SPARK-37291
> URL: https://issues.apache.org/jira/browse/SPARK-37291
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark, SQL
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
> Fix For: 3.3.0
>
>
> PySpark SparkSession.config should respect enableHiveSupport






[jira] [Commented] (SPARK-37055) Apply 'compute.eager_check' across all the codebase

2021-11-28 Thread dch nguyen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450210#comment-17450210
 ] 

dch nguyen commented on SPARK-37055:


Thanks! I will try to address them.

> Apply 'compute.eager_check' across all the codebase
> ---
>
> Key: SPARK-37055
> URL: https://issues.apache.org/jira/browse/SPARK-37055
> Project: Spark
>  Issue Type: Umbrella
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>
> As [~hyukjin.kwon] guides:
> 1. Make every input validation like this covered by the new configuration. 
> For example:
> {code:python}
> - a == b
> + def eager_check(f):  # Utility function
> +     return not config.compute.eager_check and f()
> + 
> + eager_check(lambda: a == b)
> {code}
> 2. We should check whether the output makes sense even though the behaviour 
> does not match pandas'. If the output does not make sense, we shouldn't 
> cover it with this configuration.
> 3. Make this configuration enabled by default so that we match pandas' 
> behaviour by default.
>  
> We have to make sure to list which APIs are affected in the description of 
> 'compute.eager_check'.
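To make the diff above concrete, here is a hedged sketch that wires it to the pandas-on-Spark option API; `eager_check` is an illustrative helper following the pseudocode as written, not the actual utility in the codebase.

{code:python}
from pyspark import pandas as ps

def eager_check(f):
    # Gate the (potentially expensive) validation callback `f` on the
    # 'compute.eager_check' option, mirroring the diff in the description.
    return not ps.get_option("compute.eager_check") and f()

# Usage: replaces a bare `a == b` input validation.
# valid = eager_check(lambda: a == b)
{code}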






[jira] [Commented] (SPARK-36525) DS V2 Index Support

2021-11-28 Thread Yang Jie (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450207#comment-17450207
 ] 

Yang Jie commented on SPARK-36525:
--

Do we plan to make FileTable support the SupportsIndex trait?

> DS V2 Index Support
> ---
>
> Key: SPARK-36525
> URL: https://issues.apache.org/jira/browse/SPARK-36525
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Huaxin Gao
>Priority: Major
>
> Many data sources support indexes to improve query performance. In order to 
> take advantage of the index support in data sources, the following APIs will 
> be added for working with indexes:
> {code:java}
>  public interface SupportsIndex extends Table {
>
>   /**
>    * Creates an index.
>    *
>    * @param indexName the name of the index to be created
>    * @param indexType the type of the index to be created. If this is not
>    *                  specified, Spark will use an empty String.
>    * @param columns the columns on which the index is to be created
>    * @param columnsProperties the properties of the columns on which the
>    *                          index is to be created
>    * @param properties the properties of the index to be created
>    * @throws IndexAlreadyExistsException If the index already exists.
>    */
>   void createIndex(String indexName,
>       String indexType,
>       NamedReference[] columns,
>       Map<NamedReference, Map<String, String>> columnsProperties,
>       Map<String, String> properties)
>       throws IndexAlreadyExistsException;
>
>   /**
>    * Drops the index with the given name.
>    *
>    * @param indexName the name of the index to be dropped.
>    * @throws NoSuchIndexException If the index does not exist.
>    */
>   void dropIndex(String indexName) throws NoSuchIndexException;
>
>   /**
>    * Checks whether an index exists in this table.
>    *
>    * @param indexName the name of the index
>    * @return true if the index exists, false otherwise
>    */
>   boolean indexExists(String indexName);
>
>   /**
>    * Lists all the indexes in this table.
>    */
>   TableIndex[] listIndexes();
> }
> {code}






[jira] [Assigned] (SPARK-37443) Provide a profiler for Python/Pandas UDFs

2021-11-28 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-37443:


Assignee: Takuya Ueshin

> Provide a profiler for Python/Pandas UDFs
> -
>
> Key: SPARK-37443
> URL: https://issues.apache.org/jira/browse/SPARK-37443
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: Takuya Ueshin
>Assignee: Takuya Ueshin
>Priority: Major
>
> Currently a profiler is provided for only {{RDD}} operations, but providing a 
> profiler for Python/Pandas UDFs would be great.
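For context, the existing RDD-level profiler mentioned above can be enabled as follows (a hedged sketch of the current facility; the Python/Pandas UDF profiler this ticket proposes is not shown):

{code:python}
from pyspark import SparkConf, SparkContext

# Enable the Python worker profiler for RDD operations.
conf = SparkConf().set("spark.python.profile", "true")
sc = SparkContext(master="local[*]", appName="profile-demo", conf=conf)

sc.parallelize(range(1000)).map(lambda x: x * 2).count()
sc.show_profiles()  # prints accumulated cProfile stats per RDD
{code}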






[jira] [Resolved] (SPARK-37443) Provide a profiler for Python/Pandas UDFs

2021-11-28 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-37443.
--
Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved by pull request 34685
[https://github.com/apache/spark/pull/34685]

> Provide a profiler for Python/Pandas UDFs
> -
>
> Key: SPARK-37443
> URL: https://issues.apache.org/jira/browse/SPARK-37443
> Project: Spark
>  Issue Type: Improvement
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: Takuya Ueshin
>Assignee: Takuya Ueshin
>Priority: Major
> Fix For: 3.3.0
>
>
> Currently a profiler is provided for only {{RDD}} operations, but providing a 
> profiler for Python/Pandas UDFs would be great.






[jira] [Commented] (SPARK-37055) Apply 'compute.eager_check' across all the codebase

2021-11-28 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450197#comment-17450197
 ] 

Hyukjin Kwon commented on SPARK-37055:
--

equals is the same too: 
https://github.com/apache/spark/blob/master/python/pyspark/pandas/series.py#L5842

> Apply 'compute.eager_check' across all the codebase
> ---
>
> Key: SPARK-37055
> URL: https://issues.apache.org/jira/browse/SPARK-37055
> Project: Spark
>  Issue Type: Umbrella
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>
> As [~hyukjin.kwon] guides:
> 1. Make every input validation like this covered by the new configuration. 
> For example:
> {code:python}
> - a == b
> + def eager_check(f):  # Utility function
> +     return not config.compute.eager_check and f()
> + 
> + eager_check(lambda: a == b)
> {code}
> 2. We should check whether the output makes sense even though the behaviour 
> does not match pandas'. If the output does not make sense, we shouldn't 
> cover it with this configuration.
> 3. Make this configuration enabled by default so that we match pandas' 
> behaviour by default.
>  
> We have to make sure to list which APIs are affected in the description of 
> 'compute.eager_check'.






[jira] [Commented] (SPARK-37055) Apply 'compute.eager_check' across all the codebase

2021-11-28 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450196#comment-17450196
 ] 

Hyukjin Kwon commented on SPARK-37055:
--

You can, for example, find some instances relying on 
is_monotonically_increasing 
(https://github.com/apache/spark/blob/2fe9af8b2b91d0a46782dd6fff57eca8609be105/python/pyspark/pandas/base.py#L703-L758),
which is super expensive, e.g. 
https://github.com/apache/spark/blob/master/python/pyspark/pandas/series.py#L5219
 

> Apply 'compute.eager_check' across all the codebase
> ---
>
> Key: SPARK-37055
> URL: https://issues.apache.org/jira/browse/SPARK-37055
> Project: Spark
>  Issue Type: Umbrella
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>
> As [~hyukjin.kwon] guides:
> 1. Make every input validation like this covered by the new configuration. 
> For example:
> {code:python}
> - a == b
> + def eager_check(f):  # Utility function
> +     return not config.compute.eager_check and f()
> + 
> + eager_check(lambda: a == b)
> {code}
> 2. We should check whether the output makes sense even though the behaviour 
> does not match pandas'. If the output does not make sense, we shouldn't 
> cover it with this configuration.
> 3. Make this configuration enabled by default so that we match pandas' 
> behaviour by default.
>  
> We have to make sure to list which APIs are affected in the description of 
> 'compute.eager_check'.






[jira] [Commented] (SPARK-37055) Apply 'compute.eager_check' across all the codebase

2021-11-28 Thread dch nguyen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450191#comment-17450191
 ] 

dch nguyen commented on SPARK-37055:


[~hyukjin.kwon], no, not at the moment. I did not find anywhere else to apply 
this conf :(

> Apply 'compute.eager_check' across all the codebase
> ---
>
> Key: SPARK-37055
> URL: https://issues.apache.org/jira/browse/SPARK-37055
> Project: Spark
>  Issue Type: Umbrella
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>
> As [~hyukjin.kwon] guides:
> 1. Make every input validation like this covered by the new configuration. 
> For example:
> {code:python}
> - a == b
> + def eager_check(f):  # Utility function
> +     return not config.compute.eager_check and f()
> + 
> + eager_check(lambda: a == b)
> {code}
> 2. We should check whether the output makes sense even though the behaviour 
> does not match pandas'. If the output does not make sense, we shouldn't 
> cover it with this configuration.
> 3. Make this configuration enabled by default so that we match pandas' 
> behaviour by default.
>  
> We have to make sure to list which APIs are affected in the description of 
> 'compute.eager_check'.






[jira] [Commented] (SPARK-37479) Migrate DROP NAMESPACE to use V2 command by default

2021-11-28 Thread dch nguyen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450189#comment-17450189
 ] 

dch nguyen commented on SPARK-37479:


working on this

> Migrate DROP NAMESPACE to use V2 command by default
> ---
>
> Key: SPARK-37479
> URL: https://issues.apache.org/jira/browse/SPARK-37479
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>







[jira] [Created] (SPARK-37479) Migrate DROP NAMESPACE to use V2 command by default

2021-11-28 Thread dch nguyen (Jira)
dch nguyen created SPARK-37479:
--

 Summary: Migrate DROP NAMESPACE to use V2 command by default
 Key: SPARK-37479
 URL: https://issues.apache.org/jira/browse/SPARK-37479
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.3.0
Reporter: dch nguyen









[jira] [Commented] (SPARK-37478) Unify v1 and v2 DROP NAMESPACE tests

2021-11-28 Thread dch nguyen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450188#comment-17450188
 ] 

dch nguyen commented on SPARK-37478:


working on this

> Unify v1 and v2 DROP NAMESPACE tests
> 
>
> Key: SPARK-37478
> URL: https://issues.apache.org/jira/browse/SPARK-37478
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>







[jira] [Created] (SPARK-37478) Unify v1 and v2 DROP NAMESPACE tests

2021-11-28 Thread dch nguyen (Jira)
dch nguyen created SPARK-37478:
--

 Summary: Unify v1 and v2 DROP NAMESPACE tests
 Key: SPARK-37478
 URL: https://issues.apache.org/jira/browse/SPARK-37478
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.3.0
Reporter: dch nguyen









[jira] [Commented] (SPARK-37055) Apply 'compute.eager_check' across all the codebase

2021-11-28 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450187#comment-17450187
 ] 

Hyukjin Kwon commented on SPARK-37055:
--

[~dchvn], just checking - are you working on this?

> Apply 'compute.eager_check' across all the codebase
> ---
>
> Key: SPARK-37055
> URL: https://issues.apache.org/jira/browse/SPARK-37055
> Project: Spark
>  Issue Type: Umbrella
>  Components: PySpark
>Affects Versions: 3.3.0
>Reporter: dch nguyen
>Priority: Major
>
> As [~hyukjin.kwon] guides:
> 1. Make every input validation like this covered by the new configuration. 
> For example:
> {code:python}
> - a == b
> + def eager_check(f):  # Utility function
> +     return not config.compute.eager_check and f()
> + 
> + eager_check(lambda: a == b)
> {code}
> 2. We should check whether the output makes sense even though the behaviour 
> does not match pandas'. If the output does not make sense, we shouldn't 
> cover it with this configuration.
> 3. Make this configuration enabled by default so that we match pandas' 
> behaviour by default.
>  
> We have to make sure to list which APIs are affected in the description of 
> 'compute.eager_check'.






[jira] [Commented] (SPARK-37153) Inline type hints for python/pyspark/profiler.py

2021-11-28 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450180#comment-17450180
 ] 

Apache Spark commented on SPARK-37153:
--

User 'dchvn' has created a pull request for this issue:
https://github.com/apache/spark/pull/34731

> Inline type hints for python/pyspark/profiler.py
> 
>
> Key: SPARK-37153
> URL: https://issues.apache.org/jira/browse/SPARK-37153
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.2.0
>Reporter: Byron Hsu
>Priority: Major
>







[jira] [Assigned] (SPARK-37153) Inline type hints for python/pyspark/profiler.py

2021-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37153:


Assignee: Apache Spark

> Inline type hints for python/pyspark/profiler.py
> 
>
> Key: SPARK-37153
> URL: https://issues.apache.org/jira/browse/SPARK-37153
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.2.0
>Reporter: Byron Hsu
>Assignee: Apache Spark
>Priority: Major
>







[jira] [Assigned] (SPARK-37153) Inline type hints for python/pyspark/profiler.py

2021-11-28 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37153:


Assignee: (was: Apache Spark)

> Inline type hints for python/pyspark/profiler.py
> 
>
> Key: SPARK-37153
> URL: https://issues.apache.org/jira/browse/SPARK-37153
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 3.2.0
>Reporter: Byron Hsu
>Priority: Major
>







[jira] [Created] (SPARK-37477) Migrate SHOW CREATE TABLE to use V2 command by default

2021-11-28 Thread PengLei (Jira)
PengLei created SPARK-37477:
---

 Summary: Migrate SHOW CREATE TABLE to use V2 command by default
 Key: SPARK-37477
 URL: https://issues.apache.org/jira/browse/SPARK-37477
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.3.0
Reporter: PengLei
 Fix For: 3.3.0









[jira] [Resolved] (SPARK-37461) yarn-client mode client's appid value is null

2021-11-28 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen resolved SPARK-37461.
--
Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved by pull request 34710
[https://github.com/apache/spark/pull/34710]

> yarn-client mode client's appid value is null
> -
>
> Key: SPARK-37461
> URL: https://issues.apache.org/jira/browse/SPARK-37461
> Project: Spark
>  Issue Type: Task
>  Components: YARN
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Minor
> Fix For: 3.3.0
>
>







[jira] [Updated] (SPARK-37461) yarn-client mode client's appid value is null

2021-11-28 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen updated SPARK-37461:
-
Priority: Minor  (was: Major)

> yarn-client mode client's appid value is null
> -
>
> Key: SPARK-37461
> URL: https://issues.apache.org/jira/browse/SPARK-37461
> Project: Spark
>  Issue Type: Task
>  Components: YARN
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Minor
>







[jira] [Assigned] (SPARK-37461) yarn-client mode client's appid value is null

2021-11-28 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen reassigned SPARK-37461:


Assignee: angerszhu

> yarn-client mode client's appid value is null
> -
>
> Key: SPARK-37461
> URL: https://issues.apache.org/jira/browse/SPARK-37461
> Project: Spark
>  Issue Type: Task
>  Components: YARN
>Affects Versions: 3.2.0
>Reporter: angerszhu
>Assignee: angerszhu
>Priority: Major
>







[jira] [Commented] (SPARK-37213) In the latest version release (Spark.3.2.O) in the Apache Spark documentation, the "O" at the end feels wrong, and it is written as the English letter "O"

2021-11-28 Thread liu zhuang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-37213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450027#comment-17450027
 ] 

liu zhuang commented on SPARK-37213:


OK, thank you.

> In the latest version release (Spark.3.2.O) in the Apache Spark 
> documentation, the "O" at the end feels wrong, and it is written as the 
> English letter "O"
> --
>
> Key: SPARK-37213
> URL: https://issues.apache.org/jira/browse/SPARK-37213
> Project: Spark
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 3.2.0
>Reporter: liu zhuang
>Priority: Major
> Attachments: Spark3.2.0.png
>
>
> In the latest version release (Spark.3.2.O) shown in the Apache Spark 
> documentation, the character at the end looks wrong: it is written as the 
> English letter "O" rather than the digit "0".


