[jira] [Assigned] (SPARK-41677) Protobuf serializer for StreamingQueryProgressWrapper

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41677:


Assignee: Apache Spark  (was: Yang Jie)

> Protobuf serializer for StreamingQueryProgressWrapper
> -
>
> Key: SPARK-41677
> URL: https://issues.apache.org/jira/browse/SPARK-41677
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Major
> Fix For: 3.4.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-41677) Protobuf serializer for StreamingQueryProgressWrapper

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41677:


Assignee: Yang Jie  (was: Apache Spark)

> Protobuf serializer for StreamingQueryProgressWrapper
> -
>
> Key: SPARK-41677
> URL: https://issues.apache.org/jira/browse/SPARK-41677
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
> Fix For: 3.4.0
>
>







[jira] [Commented] (SPARK-42143) Handle null string values in RDDStorageInfo/RDDDataDistribution/RDDPartitionInfo

2023-01-20 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679416#comment-17679416
 ] 

Apache Spark commented on SPARK-42143:
--

User 'gengliangwang' has created a pull request for this issue:
https://github.com/apache/spark/pull/39686

> Handle null string values in 
> RDDStorageInfo/RDDDataDistribution/RDDPartitionInfo
> 
>
> Key: SPARK-42143
> URL: https://issues.apache.org/jira/browse/SPARK-42143
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Gengliang Wang
>Priority: Major
>







[jira] [Commented] (SPARK-42143) Handle null string values in RDDStorageInfo/RDDDataDistribution/RDDPartitionInfo

2023-01-20 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679415#comment-17679415
 ] 

Apache Spark commented on SPARK-42143:
--

User 'gengliangwang' has created a pull request for this issue:
https://github.com/apache/spark/pull/39686

> Handle null string values in 
> RDDStorageInfo/RDDDataDistribution/RDDPartitionInfo
> 
>
> Key: SPARK-42143
> URL: https://issues.apache.org/jira/browse/SPARK-42143
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Gengliang Wang
>Priority: Major
>







[jira] [Assigned] (SPARK-42143) Handle null string values in RDDStorageInfo/RDDDataDistribution/RDDPartitionInfo

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42143:


Assignee: Gengliang Wang  (was: Apache Spark)

> Handle null string values in 
> RDDStorageInfo/RDDDataDistribution/RDDPartitionInfo
> 
>
> Key: SPARK-42143
> URL: https://issues.apache.org/jira/browse/SPARK-42143
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Gengliang Wang
>Priority: Major
>







[jira] [Assigned] (SPARK-42143) Handle null string values in RDDStorageInfo/RDDDataDistribution/RDDPartitionInfo

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42143:


Assignee: Apache Spark  (was: Gengliang Wang)

> Handle null string values in 
> RDDStorageInfo/RDDDataDistribution/RDDPartitionInfo
> 
>
> Key: SPARK-42143
> URL: https://issues.apache.org/jira/browse/SPARK-42143
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Major
>







[jira] [Assigned] (SPARK-42143) Handle null string values in RDDStorageInfo/RDDDataDistribution/RDDPartitionInfo

2023-01-20 Thread Gengliang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gengliang Wang reassigned SPARK-42143:
--

Assignee: Gengliang Wang

> Handle null string values in 
> RDDStorageInfo/RDDDataDistribution/RDDPartitionInfo
> 
>
> Key: SPARK-42143
> URL: https://issues.apache.org/jira/browse/SPARK-42143
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Gengliang Wang
>Priority: Major
>







[jira] [Resolved] (SPARK-41777) Add Integration Tests

2023-01-20 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-41777.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 39637
[https://github.com/apache/spark/pull/39637]

> Add Integration Tests
> -
>
> Key: SPARK-41777
> URL: https://issues.apache.org/jira/browse/SPARK-41777
> Project: Spark
>  Issue Type: Sub-task
>  Components: ML, PySpark
>Affects Versions: 3.4.0
>Reporter: Rithwik Ediga Lakhamsani
>Assignee: Rithwik Ediga Lakhamsani
>Priority: Major
> Fix For: 3.4.0
>
>
> This requires us to add PyTorch as a testing dependency.
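Since the PyTorch dependency is for testing only, such integration tests typically guard on its presence rather than hard-requiring it. A minimal sketch of that guarded-dependency pattern (class and test names here are illustrative, not the actual Spark test suite):

```python
import importlib.util
import unittest

# Detect the optional dependency without importing it outright.
have_torch = importlib.util.find_spec("torch") is not None


@unittest.skipUnless(have_torch, "PyTorch is required for this integration test")
class DistributorIntegrationTest(unittest.TestCase):
    """Skipped cleanly on machines where torch is not installed."""

    def test_placeholder(self):
        self.assertTrue(True)
```

Running the suite on a machine without PyTorch reports the test as skipped instead of erroring.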






[jira] [Assigned] (SPARK-41777) Add Integration Tests

2023-01-20 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-41777:


Assignee: Rithwik Ediga Lakhamsani

> Add Integration Tests
> -
>
> Key: SPARK-41777
> URL: https://issues.apache.org/jira/browse/SPARK-41777
> Project: Spark
>  Issue Type: Sub-task
>  Components: ML, PySpark
>Affects Versions: 3.4.0
>Reporter: Rithwik Ediga Lakhamsani
>Assignee: Rithwik Ediga Lakhamsani
>Priority: Major
>
> This requires us to add PyTorch as a testing dependency.






[jira] [Assigned] (SPARK-41593) Implement logging from the executor nodes

2023-01-20 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-41593:


Assignee: Rithwik Ediga Lakhamsani

> Implement logging from the executor nodes
> -
>
> Key: SPARK-41593
> URL: https://issues.apache.org/jira/browse/SPARK-41593
> Project: Spark
>  Issue Type: Sub-task
>  Components: ML
>Affects Versions: 3.4.0
>Reporter: Rithwik Ediga Lakhamsani
>Assignee: Rithwik Ediga Lakhamsani
>Priority: Major
>







[jira] [Resolved] (SPARK-41593) Implement logging from the executor nodes

2023-01-20 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-41593.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 39299
[https://github.com/apache/spark/pull/39299]

> Implement logging from the executor nodes
> -
>
> Key: SPARK-41593
> URL: https://issues.apache.org/jira/browse/SPARK-41593
> Project: Spark
>  Issue Type: Sub-task
>  Components: ML
>Affects Versions: 3.4.0
>Reporter: Rithwik Ediga Lakhamsani
>Assignee: Rithwik Ediga Lakhamsani
>Priority: Major
> Fix For: 3.4.0
>
>







[jira] [Assigned] (SPARK-40264) Add helper function for DL model inference in pyspark.ml.functions

2023-01-20 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-40264:


Assignee: Lee Yang

> Add helper function for DL model inference in pyspark.ml.functions
> --
>
> Key: SPARK-40264
> URL: https://issues.apache.org/jira/browse/SPARK-40264
> Project: Spark
>  Issue Type: New Feature
>  Components: ML
>Affects Versions: 3.2.2
>Reporter: Lee Yang
>Assignee: Lee Yang
>Priority: Minor
>
> Add a helper function to create a pandas_udf for inference on a given DL 
> model, where the user provides a predict function that is responsible for 
> loading the model and inferring on a batch of numpy inputs.
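The contract described above can be sketched in plain Python: the user supplies a factory that loads the model once and returns a per-batch predict function, and the helper drives it over fixed-size batches. The names below (`make_predict_fn`, `run_inference`) are illustrative, not the actual `pyspark.ml.functions` API, and plain lists stand in for the numpy batches the real UDF would pass:

```python
def make_predict_fn():
    # Pretend this is an expensive-to-load DL model; a trivial doubler
    # keeps the sketch self-contained.
    model = lambda batch: [2 * x for x in batch]

    def predict(batch):
        # In Spark this would receive one numpy batch per input column.
        return model(batch)

    return predict


def run_inference(rows, batch_size, predict_fn):
    # Minimal stand-in for the batching the generated pandas_udf does
    # internally before handing data to the user's predict function.
    out = []
    for i in range(0, len(rows), batch_size):
        out.extend(predict_fn(rows[i:i + batch_size]))
    return out


predict = make_predict_fn()
run_inference([1, 2, 3, 4, 5], batch_size=2, predict_fn=predict)  # → [2, 4, 6, 8, 10]
```

The key design point is that model loading happens once per factory call rather than once per row, which is what makes the pattern viable for heavyweight DL models.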






[jira] [Resolved] (SPARK-40264) Add helper function for DL model inference in pyspark.ml.functions

2023-01-20 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-40264.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 39628
[https://github.com/apache/spark/pull/39628]

> Add helper function for DL model inference in pyspark.ml.functions
> --
>
> Key: SPARK-40264
> URL: https://issues.apache.org/jira/browse/SPARK-40264
> Project: Spark
>  Issue Type: New Feature
>  Components: ML
>Affects Versions: 3.2.2
>Reporter: Lee Yang
>Assignee: Lee Yang
>Priority: Minor
> Fix For: 3.4.0
>
>
> Add a helper function to create a pandas_udf for inference on a given DL 
> model, where the user provides a predict function that is responsible for 
> loading the model and inferring on a batch of numpy inputs.






[jira] [Commented] (SPARK-42142) Handle null string values in CachedQuantile/ExecutorSummary/PoolData

2023-01-20 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679411#comment-17679411
 ] 

Apache Spark commented on SPARK-42142:
--

User 'gengliangwang' has created a pull request for this issue:
https://github.com/apache/spark/pull/39685

> Handle null string values in CachedQuantile/ExecutorSummary/PoolData
> 
>
> Key: SPARK-42142
> URL: https://issues.apache.org/jira/browse/SPARK-42142
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Gengliang Wang
>Priority: Major
>







[jira] [Commented] (SPARK-42142) Handle null string values in CachedQuantile/ExecutorSummary/PoolData

2023-01-20 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679410#comment-17679410
 ] 

Apache Spark commented on SPARK-42142:
--

User 'gengliangwang' has created a pull request for this issue:
https://github.com/apache/spark/pull/39685

> Handle null string values in CachedQuantile/ExecutorSummary/PoolData
> 
>
> Key: SPARK-42142
> URL: https://issues.apache.org/jira/browse/SPARK-42142
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Gengliang Wang
>Priority: Major
>







[jira] [Assigned] (SPARK-42142) Handle null string values in CachedQuantile/ExecutorSummary/PoolData

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42142:


Assignee: Gengliang Wang  (was: Apache Spark)

> Handle null string values in CachedQuantile/ExecutorSummary/PoolData
> 
>
> Key: SPARK-42142
> URL: https://issues.apache.org/jira/browse/SPARK-42142
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Gengliang Wang
>Priority: Major
>







[jira] [Assigned] (SPARK-42142) Handle null string values in CachedQuantile/ExecutorSummary/PoolData

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42142:


Assignee: Apache Spark  (was: Gengliang Wang)

> Handle null string values in CachedQuantile/ExecutorSummary/PoolData
> 
>
> Key: SPARK-42142
> URL: https://issues.apache.org/jira/browse/SPARK-42142
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Major
>







[jira] [Commented] (SPARK-42143) Handle null string values in RDDStorageInfo/RDDDataDistribution/RDDPartitionInfo

2023-01-20 Thread Gengliang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679409#comment-17679409
 ] 

Gengliang Wang commented on SPARK-42143:


I am working on this one

> Handle null string values in 
> RDDStorageInfo/RDDDataDistribution/RDDPartitionInfo
> 
>
> Key: SPARK-42143
> URL: https://issues.apache.org/jira/browse/SPARK-42143
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Major
>







[jira] [Assigned] (SPARK-42140) Handle null string values in ApplicationEnvironmentInfoWrapper/ApplicationInfoWrapper

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42140:


Assignee: (was: Apache Spark)

> Handle null string values in 
> ApplicationEnvironmentInfoWrapper/ApplicationInfoWrapper
> -
>
> Key: SPARK-42140
> URL: https://issues.apache.org/jira/browse/SPARK-42140
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Major
>







[jira] [Assigned] (SPARK-42140) Handle null string values in ApplicationEnvironmentInfoWrapper/ApplicationInfoWrapper

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42140:


Assignee: Apache Spark

> Handle null string values in 
> ApplicationEnvironmentInfoWrapper/ApplicationInfoWrapper
> -
>
> Key: SPARK-42140
> URL: https://issues.apache.org/jira/browse/SPARK-42140
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Major
>







[jira] [Commented] (SPARK-42140) Handle null string values in ApplicationEnvironmentInfoWrapper/ApplicationInfoWrapper

2023-01-20 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679408#comment-17679408
 ] 

Apache Spark commented on SPARK-42140:
--

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/39684

> Handle null string values in 
> ApplicationEnvironmentInfoWrapper/ApplicationInfoWrapper
> -
>
> Key: SPARK-42140
> URL: https://issues.apache.org/jira/browse/SPARK-42140
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Major
>







[jira] [Assigned] (SPARK-42056) Add missing options for Protobuf functions.

2023-01-20 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-42056:


Assignee: Raghu Angadi

> Add missing options for Protobuf functions.
> ---
>
> Key: SPARK-42056
> URL: https://issues.apache.org/jira/browse/SPARK-42056
> Project: Spark
>  Issue Type: Improvement
>  Components: Protobuf
>Affects Versions: 3.4.0
>Reporter: Raghu Angadi
>Assignee: Raghu Angadi
>Priority: Major
> Fix For: 3.4.0
>
>
> We should be able to pass options for both {{from_protobuf()}} and 
> {{to_protobuf()}}.
> Currently there are some gaps:
>  * In Scala {{to_protobuf()}} does not have a way to pass options.
>  * In Scala {{from_protobuf()}} that takes Java class name does not allow 
> options.
>  * In Python, {{from_protobuf()}} that uses Java class name does not 
> propagate options.
>  * In Python {{to_protobuf()}} does not pass options.
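The fix for all four gaps is the same plumbing: every user-facing entry point must forward its options map to the underlying converter instead of dropping it. A hedged pure-Python sketch of that plumbing (the names `parse_protobuf` and `_converter` are hypothetical, not Spark's real API):

```python
def _converter(payload, message_name, options):
    # Stand-in for the real Protobuf deserializer; it just reports which
    # options reached it, so the forwarding is visible.
    return {"message": message_name, "options": dict(options or {})}


def parse_protobuf(payload, message_name, options=None):
    # The pattern the issue asks for: the entry point accepts an
    # options dict and forwards it rather than silently discarding it.
    return _converter(payload, message_name, options)


result = parse_protobuf(b"...", "Person", {"recursive.fields.max.depth": "2"})
```

With the forwarding in place, `result["options"]` contains the caller's settings; before the fix, the equivalent call paths delivered an empty map to the converter.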






[jira] [Resolved] (SPARK-42056) Add missing options for Protobuf functions.

2023-01-20 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-42056.
--
Resolution: Fixed

Issue resolved by pull request 39550
[https://github.com/apache/spark/pull/39550]

> Add missing options for Protobuf functions.
> ---
>
> Key: SPARK-42056
> URL: https://issues.apache.org/jira/browse/SPARK-42056
> Project: Spark
>  Issue Type: Improvement
>  Components: Protobuf
>Affects Versions: 3.4.0
>Reporter: Raghu Angadi
>Assignee: Raghu Angadi
>Priority: Major
> Fix For: 3.4.0
>
>
> We should be able to pass options for both {{from_protobuf()}} and 
> {{to_protobuf()}}.
> Currently there are some gaps:
>  * In Scala {{to_protobuf()}} does not have a way to pass options.
>  * In Scala {{from_protobuf()}} that takes Java class name does not allow 
> options.
>  * In Python, {{from_protobuf()}} that uses Java class name does not 
> propagate options.
>  * In Python {{to_protobuf()}} does not pass options.






[jira] [Commented] (SPARK-42142) Handle null string values in CachedQuantile/ExecutorSummary/PoolData

2023-01-20 Thread Gengliang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679406#comment-17679406
 ] 

Gengliang Wang commented on SPARK-42142:


I am working on this one

> Handle null string values in CachedQuantile/ExecutorSummary/PoolData
> 
>
> Key: SPARK-42142
> URL: https://issues.apache.org/jira/browse/SPARK-42142
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Major
>







[jira] [Assigned] (SPARK-42142) Handle null string values in CachedQuantile/ExecutorSummary/PoolData

2023-01-20 Thread Gengliang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gengliang Wang reassigned SPARK-42142:
--

Assignee: Gengliang Wang

> Handle null string values in CachedQuantile/ExecutorSummary/PoolData
> 
>
> Key: SPARK-42142
> URL: https://issues.apache.org/jira/browse/SPARK-42142
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Gengliang Wang
>Priority: Major
>







[jira] [Commented] (SPARK-42140) Handle null string values in ApplicationEnvironmentInfoWrapper/ApplicationInfoWrapper

2023-01-20 Thread Gengliang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679405#comment-17679405
 ] 

Gengliang Wang commented on SPARK-42140:


PairStrings can be null
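The underlying issue is that Protobuf string fields cannot hold null, so the serializer must skip assignment for None and map the unset field back to null on read. A hedged sketch of that pattern (the `PairStrings` class below is an illustrative stand-in, not Spark's generated message class):

```python
class PairStrings:
    """Stand-in for a generated Protobuf message: two string fields
    that default to the empty string when unset."""

    def __init__(self):
        self._1 = ""
        self._2 = ""


def serialize_pair(key, value):
    msg = PairStrings()
    # Guard each assignment: Protobuf rejects None for string fields.
    if key is not None:
        msg._1 = key
    if value is not None:
        msg._2 = value
    return msg


def deserialize_pair(msg):
    # Read the unset sentinel ("") back as None. This is lossless only
    # when empty and null need not be distinguished; a serializer that
    # must distinguish them would track presence explicitly.
    return (msg._1 or None, msg._2 or None)


pair = serialize_pair("spark.app.name", None)  # → round-trips to ("spark.app.name", None)
```

The guard-on-write / sentinel-on-read split is the same shape applied across the sibling SPARK-421xx sub-tasks in this thread.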

> Handle null string values in 
> ApplicationEnvironmentInfoWrapper/ApplicationInfoWrapper
> -
>
> Key: SPARK-42140
> URL: https://issues.apache.org/jira/browse/SPARK-42140
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Major
>







[jira] [Resolved] (SPARK-42138) Handle null string values in JobData/TaskDataWrapper/ExecutorStageSummaryWrapper

2023-01-20 Thread Gengliang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gengliang Wang resolved SPARK-42138.

Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 39680
[https://github.com/apache/spark/pull/39680]

> Handle null string values in 
> JobData/TaskDataWrapper/ExecutorStageSummaryWrapper
> 
>
> Key: SPARK-42138
> URL: https://issues.apache.org/jira/browse/SPARK-42138
> Project: Spark
>  Issue Type: Sub-task
>  Components: Web UI
>Affects Versions: 3.4.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
> Fix For: 3.4.0
>
>







[jira] [Comment Edited] (SPARK-42140) Handle null string values in ApplicationEnvironmentInfoWrapper/ApplicationInfoWrapper

2023-01-20 Thread Yang Jie (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679374#comment-17679374
 ] 

Yang Jie edited comment on SPARK-42140 at 1/21/23 6:30 AM:
---

PairStrings should not be a null String; can they be special-cased? 
[~Gengliang.Wang] 


was (Author: luciferyang):
RuntimeInfo and PairStrings should not be a null String; can they be 
special-cased? [~Gengliang.Wang] 

> Handle null string values in 
> ApplicationEnvironmentInfoWrapper/ApplicationInfoWrapper
> -
>
> Key: SPARK-42140
> URL: https://issues.apache.org/jira/browse/SPARK-42140
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Major
>







[jira] [Commented] (SPARK-42141) Handle null string values in ApplicationInfo/ApplicationAttemptInfo

2023-01-20 Thread Yang Jie (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679402#comment-17679402
 ] 

Yang Jie commented on SPARK-42141:
--

Too small; merging into SPARK-42140.

> Handle null string values in ApplicationInfo/ApplicationAttemptInfo
> ---
>
> Key: SPARK-42141
> URL: https://issues.apache.org/jira/browse/SPARK-42141
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Major
>







[jira] [Resolved] (SPARK-42141) Handle null string values in ApplicationInfo/ApplicationAttemptInfo

2023-01-20 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie resolved SPARK-42141.
--
Resolution: Duplicate

> Handle null string values in ApplicationInfo/ApplicationAttemptInfo
> ---
>
> Key: SPARK-42141
> URL: https://issues.apache.org/jira/browse/SPARK-42141
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Major
>







[jira] [Commented] (SPARK-42140) Handle null string values in ApplicationEnvironmentInfoWrapper/ApplicationInfoWrapper

2023-01-20 Thread Yang Jie (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679401#comment-17679401
 ] 

Yang Jie commented on SPARK-42140:
--

working on this

> Handle null string values in 
> ApplicationEnvironmentInfoWrapper/ApplicationInfoWrapper
> -
>
> Key: SPARK-42140
> URL: https://issues.apache.org/jira/browse/SPARK-42140
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Major
>







[jira] [Updated] (SPARK-42140) Handle null string values in ApplicationEnvironmentInfoWrapper/ApplicationInfoWrapper

2023-01-20 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-42140:
-
Summary: Handle null string values in 
ApplicationEnvironmentInfoWrapper/ApplicationInfoWrapper  (was: Handle null 
string values in ApplicationEnvironmentInfoWrapper)

> Handle null string values in 
> ApplicationEnvironmentInfoWrapper/ApplicationInfoWrapper
> -
>
> Key: SPARK-42140
> URL: https://issues.apache.org/jira/browse/SPARK-42140
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Major
>







[jira] [Updated] (SPARK-42140) Handle null string values in ApplicationEnvironmentInfoWrapper

2023-01-20 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-42140:
-
Summary: Handle null string values in ApplicationEnvironmentInfoWrapper  
(was: Handle null string values in 
ApplicationEnvironmentInfo/RuntimeInfo/PairStrings/ExecutorResourceRequest/TaskResourceRequest)

> Handle null string values in ApplicationEnvironmentInfoWrapper
> --
>
> Key: SPARK-42140
> URL: https://issues.apache.org/jira/browse/SPARK-42140
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Major
>







[jira] [Assigned] (SPARK-42144) Handle null string values in StageData/StreamBlockData/StreamingQueryData

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42144:


Assignee: (was: Apache Spark)

> Handle null string values in StageData/StreamBlockData/StreamingQueryData
> -
>
> Key: SPARK-42144
> URL: https://issues.apache.org/jira/browse/SPARK-42144
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Minor
>







[jira] [Commented] (SPARK-42144) Handle null string values in StageData/StreamBlockData/StreamingQueryData

2023-01-20 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679397#comment-17679397
 ] 

Apache Spark commented on SPARK-42144:
--

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/39683

> Handle null string values in StageData/StreamBlockData/StreamingQueryData
> -
>
> Key: SPARK-42144
> URL: https://issues.apache.org/jira/browse/SPARK-42144
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-42144) Handle null string values in StageData/StreamBlockData/StreamingQueryData

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42144:


Assignee: Apache Spark

> Handle null string values in StageData/StreamBlockData/StreamingQueryData
> -
>
> Key: SPARK-42144
> URL: https://issues.apache.org/jira/browse/SPARK-42144
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-42144) Handle null string values in StageData/StreamBlockData/StreamingQueryData

2023-01-20 Thread Yang Jie (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679389#comment-17679389
 ] 

Yang Jie commented on SPARK-42144:
--

Working on this.

> Handle null string values in StageData/StreamBlockData/StreamingQueryData
> -
>
> Key: SPARK-42144
> URL: https://issues.apache.org/jira/browse/SPARK-42144
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-41415) SASL Request Retries

2023-01-20 Thread Mridul Muralidharan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mridul Muralidharan updated SPARK-41415:

Fix Version/s: 3.2.4
   3.3.2

> SASL Request Retries
> 
>
> Key: SPARK-41415
> URL: https://issues.apache.org/jira/browse/SPARK-41415
> Project: Spark
>  Issue Type: Task
>  Components: Shuffle
>Affects Versions: 3.2.4
>Reporter: Aravind Patnam
>Assignee: Aravind Patnam
>Priority: Major
> Fix For: 3.2.4, 3.3.2, 3.4.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-42145) Handle null string values in SparkPlanGraphNode/SparkPlanGraphClusterWrapper

2023-01-20 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie resolved SPARK-42145.
--
Resolution: Duplicate

> Handle null string values in SparkPlanGraphNode/SparkPlanGraphClusterWrapper
> 
>
> Key: SPARK-42145
> URL: https://issues.apache.org/jira/browse/SPARK-42145
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-42145) Handle null string values in SparkPlanGraphNode/SparkPlanGraphClusterWrapper

2023-01-20 Thread Yang Jie (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679388#comment-17679388
 ] 

Yang Jie commented on SPARK-42145:
--

Too small on its own; this will be completed together with SPARK-42139.

 

> Handle null string values in SparkPlanGraphNode/SparkPlanGraphClusterWrapper
> 
>
> Key: SPARK-42145
> URL: https://issues.apache.org/jira/browse/SPARK-42145
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-42139) Handle null string values in SQLExecutionUIData/SQLPlanMetric/SparkPlanGraphWrapper

2023-01-20 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-42139:
-
Summary: Handle null string values in 
SQLExecutionUIData/SQLPlanMetric/SparkPlanGraphWrapper  (was: Handle null 
string values in SQLExecutionUIData/SQLPlanMetric)

> Handle null string values in 
> SQLExecutionUIData/SQLPlanMetric/SparkPlanGraphWrapper
> ---
>
> Key: SPARK-42139
> URL: https://issues.apache.org/jira/browse/SPARK-42139
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-42139) Handle null string values in SQLExecutionUIData/SQLPlanMetric

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42139:


Assignee: (was: Apache Spark)

> Handle null string values in SQLExecutionUIData/SQLPlanMetric
> -
>
> Key: SPARK-42139
> URL: https://issues.apache.org/jira/browse/SPARK-42139
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-42139) Handle null string values in SQLExecutionUIData/SQLPlanMetric

2023-01-20 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679386#comment-17679386
 ] 

Apache Spark commented on SPARK-42139:
--

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/39682

> Handle null string values in SQLExecutionUIData/SQLPlanMetric
> -
>
> Key: SPARK-42139
> URL: https://issues.apache.org/jira/browse/SPARK-42139
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-42139) Handle null string values in SQLExecutionUIData/SQLPlanMetric

2023-01-20 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679387#comment-17679387
 ] 

Apache Spark commented on SPARK-42139:
--

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/39682

> Handle null string values in SQLExecutionUIData/SQLPlanMetric
> -
>
> Key: SPARK-42139
> URL: https://issues.apache.org/jira/browse/SPARK-42139
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-42139) Handle null string values in SQLExecutionUIData/SQLPlanMetric

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42139:


Assignee: Apache Spark

> Handle null string values in SQLExecutionUIData/SQLPlanMetric
> -
>
> Key: SPARK-42139
> URL: https://issues.apache.org/jira/browse/SPARK-42139
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-42134) Fix getPartitionFiltersAndDataFilters() to handle filters without referenced attributes

2023-01-20 Thread Huaxin Gao (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huaxin Gao resolved SPARK-42134.

Fix Version/s: 3.3.2
   3.4.0
 Assignee: Peter Toth
   Resolution: Fixed

> Fix getPartitionFiltersAndDataFilters() to handle filters without referenced 
> attributes
> ---
>
> Key: SPARK-42134
> URL: https://issues.apache.org/jira/browse/SPARK-42134
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Peter Toth
>Assignee: Peter Toth
>Priority: Major
> Fix For: 3.3.2, 3.4.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18011) SparkR serialize "NA" throws exception

2023-01-20 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-18011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679385#comment-17679385
 ] 

Apache Spark commented on SPARK-18011:
--

User 'joveyuan-db' has created a pull request for this issue:
https://github.com/apache/spark/pull/39681

> SparkR serialize "NA" throws exception
> --
>
> Key: SPARK-18011
> URL: https://issues.apache.org/jira/browse/SPARK-18011
> Project: Spark
>  Issue Type: Bug
>  Components: SparkR
>Reporter: Miao Wang
>Priority: Major
>  Labels: bulk-closed
>
> For some versions of R, if a Date vector contains an "NA" value, the backend 
> throws a NegativeArraySizeException.
> To reproduce the problem:
> {code}
> > a <- as.Date(c("2016-11-11", "NA"))
> > b <- as.data.frame(a)
> > c <- createDataFrame(b)
> > dim(c)
> 16/10/19 10:31:24 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1)
> java.lang.NegativeArraySizeException
>   at org.apache.spark.api.r.SerDe$.readStringBytes(SerDe.scala:110)
>   at org.apache.spark.api.r.SerDe$.readString(SerDe.scala:119)
>   at org.apache.spark.api.r.SerDe$.readDate(SerDe.scala:128)
>   at org.apache.spark.api.r.SerDe$.readTypedObject(SerDe.scala:77)
>   at org.apache.spark.api.r.SerDe$.readObject(SerDe.scala:61)
>   at 
> org.apache.spark.sql.api.r.SQLUtils$$anonfun$bytesToRow$1.apply(SQLUtils.scala:161)
>   at 
> org.apache.spark.sql.api.r.SQLUtils$$anonfun$bytesToRow$1.apply(SQLUtils.scala:160)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.immutable.Range.foreach(Range.scala:160)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>   at org.apache.spark.sql.api.r.SQLUtils$.bytesToRow(SQLUtils.scala:160)
>   at 
> org.apache.spark.sql.api.r.SQLUtils$$anonfun$5.apply(SQLUtils.scala:138)
>   at 
> org.apache.spark.sql.api.r.SQLUtils$$anonfun$5.apply(SQLUtils.scala:138)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithoutKey$(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:372)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>   at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
>   at org.apache.spark.scheduler.Task.run(Task.scala:99)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
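The stack trace shows `SerDe.readStringBytes` allocating a byte array from a length read off the wire, and that length arriving negative when the R value is NA. A minimal Python sketch of a length-prefixed string reader, assuming (hypothetically) that NA is encoded as length -1 — the guard illustrates one way a backend can avoid the negative allocation; it is not Spark's actual SerDe code:

```python
import struct

NA_LENGTH = -1  # hypothetical NA sentinel; the real SparkR wire format may differ


def read_string_bytes(buf, offset):
    """Read one length-prefixed UTF-8 string from buf.

    A naive reader would allocate `length` bytes unconditionally and crash on a
    negative length, mirroring the NegativeArraySizeException in the report.
    """
    (length,) = struct.unpack_from(">i", buf, offset)  # big-endian int32 prefix
    offset += 4
    if length == NA_LENGTH:
        return None, offset  # map NA to None instead of allocating -1 bytes
    value = buf[offset:offset + length].decode("utf-8")
    return value, offset + length


# One normal string followed by one NA marker.
payload = struct.pack(">i", 3) + b"foo" + struct.pack(">i", NA_LENGTH)
s1, off = read_string_bytes(payload, 0)
s2, off = read_string_bytes(payload, off)
# s1 == "foo", s2 is None
```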



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-41987) createDataFrame supports column with map type.

2023-01-20 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng resolved SPARK-41987.
---
Resolution: Resolved

> createDataFrame supports column with map type.
> --
>
> Key: SPARK-41987
> URL: https://issues.apache.org/jira/browse/SPARK-41987
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: jiaan.geng
>Priority: Major
>
> Currently, the Connect API createDataFrame does not support creating a dataframe 
> with a map-type column.
> For example, 
> {code:java}
> >>> df = spark.createDataFrame(
> ... [(1, ["foo", "bar"], {"x": 1.0}), (2, [], {}), (3, None, None)],
> ... ("id", "an_array", "a_map")
> ... )
> {code}
> The above code wants to create a dataframe whose column 'a_map' has map type.
> But pyarrow recognizes {"x": 1.0} as a struct, not a map; pyarrow expects map 
> data in the list-of-tuples format [('x', 1.0)].
> Because the dataframe's schema is then incorrect, subsequent operators are also 
> impacted.
> For example:
> {code:java}
> df.select("id", "a_map", posexplode_outer("an_array")).show()
> {code}
> Expected:
> {code:java}
> +---+----------+----+----+
> | id|     a_map| pos| col|
> +---+----------+----+----+
> |  1|{x -> 1.0}|   0| foo|
> |  1|{x -> 1.0}|   1| bar|
> |  2|        {}|null|null|
> |  3|      null|null|null|
> +---+----------+----+----+
> {code}
> Got:
> {code:java}
> +---+------+----+----+
> | id| a_map| pos| col|
> +---+------+----+----+
> |  1| {1.0}|   0| foo|
> |  1| {1.0}|   1| bar|
> |  2|{null}|null|null|
> |  3|  null|null|null|
> +---+------+----+----+
> 
> {code}
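As the description notes, pyarrow wants map data as a list of key-value tuples, while a Python dict is inferred as a struct. A hypothetical helper (the name and shape are illustrative, not Spark's actual conversion code) sketching the dict-to-tuples step for the sample rows above:

```python
def dict_to_map_items(d):
    """Convert a Python dict (which pyarrow would infer as a struct) into the
    list-of-tuples layout pyarrow's map type expects; None passes through."""
    return None if d is None else list(d.items())


# The sample rows from the reproduction: (id, an_array, a_map).
rows = [(1, ["foo", "bar"], {"x": 1.0}), (2, [], {}), (3, None, None)]
converted = [(rid, arr, dict_to_map_items(m)) for rid, arr, m in rows]

print(converted[0][2])  # [('x', 1.0)]
print(converted[1][2])  # []
print(converted[2][2])  # None
```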



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-42139) Handle null string values in SQLExecutionUIData/SQLPlanMetric

2023-01-20 Thread Yang Jie (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679384#comment-17679384
 ] 

Yang Jie commented on SPARK-42139:
--

Working on this one.

> Handle null string values in SQLExecutionUIData/SQLPlanMetric
> -
>
> Key: SPARK-42139
> URL: https://issues.apache.org/jira/browse/SPARK-42139
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-41845) Fix `count(expr("*"))` function

2023-01-20 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng resolved SPARK-41845.
---
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 39622
[https://github.com/apache/spark/pull/39622]

> Fix `count(expr("*"))` function
> ---
>
> Key: SPARK-41845
> URL: https://issues.apache.org/jira/browse/SPARK-41845
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, PySpark
>Affects Versions: 3.4.0
>Reporter: Sandeep Singh
>Assignee: Ruifeng Zheng
>Priority: Major
> Fix For: 3.4.0
>
>
> {code:java}
> File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/functions.py", 
> line 801, in pyspark.sql.connect.functions.count
> Failed example:
>     df.select(count(expr("*")), count(df.alphabets)).show()
> Expected:
>     +--------+----------------+
>     |count(1)|count(alphabets)|
>     +--------+----------------+
>     |       4|               3|
>     +--------+----------------+
> Got:
>     +----------------+----------------+
>     |count(alphabets)|count(alphabets)|
>     +----------------+----------------+
>     |               3|               3|
>     +----------------+----------------+
>      {code}
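The expected 4-versus-3 split follows standard SQL count semantics: count(1) (what count(*) resolves to) counts every row, while count(column) counts only non-null values. A plain-Python illustration over four sample rows, one of which has a null value in the column (column name taken from the doctest above):

```python
# Four sample rows; one has a null "alphabets" value.
rows = [{"alphabets": "a"}, {"alphabets": "b"}, {"alphabets": "c"}, {"alphabets": None}]

count_star = len(rows)  # count(1): every row, nulls included
count_alphabets = sum(r["alphabets"] is not None for r in rows)  # count(col): non-null only

print(count_star, count_alphabets)  # 4 3
```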



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-42099) Make `count(*)` work correctly

2023-01-20 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng reassigned SPARK-42099:
-

Assignee: Ruifeng Zheng

> Make `count(*)` work correctly
> --
>
> Key: SPARK-42099
> URL: https://issues.apache.org/jira/browse/SPARK-42099
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, PySpark
>Affects Versions: 3.4.0
>Reporter: Ruifeng Zheng
>Assignee: Ruifeng Zheng
>Priority: Major
>
> cdf.select(CF.count("*"), CF.count(cdf.alphabets)).collect()
> {code:java}
> pyspark.sql.connect.client.SparkConnectAnalysisException: 
> [UNRESOLVED_COLUMN.WITH_SUGGESTION] A column or function parameter with name 
> `*` cannot be resolved. Did you mean one of the following? [`alphabets`]
> Plan: 'Aggregate [unresolvedalias('count('*), None), count(alphabets#32) AS 
> count(alphabets)#35L]
> +- Project [alphabets#30 AS alphabets#32]
>+- LocalRelation [alphabets#30]
> {code}
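The analyzer fails here because `*` is treated as an ordinary column name. One common resolution, consistent with the `count(1)` column name in the related SPARK-41845 output, is to rewrite the star to a literal before the plan is analyzed. A hypothetical sketch of that rewrite (names are illustrative, not Spark Connect's actual code path):

```python
def resolve_star_in_count(function_name, args):
    """Hypothetical pre-analysis rewrite: count(*) has no real column to
    resolve, so replace the star with the literal 1 (count(1) counts rows)."""
    if function_name == "count" and args == ["*"]:
        return ("count", ["1"])
    return (function_name, args)


# A star inside count() is rewritten; real column names pass through untouched.
print(resolve_star_in_count("count", ["*"]))          # ('count', ['1'])
print(resolve_star_in_count("count", ["alphabets"]))  # ('count', ['alphabets'])
```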



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-41845) Fix `count(expr("*"))` function

2023-01-20 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng reassigned SPARK-41845:
-

Assignee: Ruifeng Zheng

> Fix `count(expr("*"))` function
> ---
>
> Key: SPARK-41845
> URL: https://issues.apache.org/jira/browse/SPARK-41845
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, PySpark
>Affects Versions: 3.4.0
>Reporter: Sandeep Singh
>Assignee: Ruifeng Zheng
>Priority: Major
>
> {code:java}
> File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/functions.py", 
> line 801, in pyspark.sql.connect.functions.count
> Failed example:
>     df.select(count(expr("*")), count(df.alphabets)).show()
> Expected:
>     +--------+----------------+
>     |count(1)|count(alphabets)|
>     +--------+----------------+
>     |       4|               3|
>     +--------+----------------+
> Got:
>     +----------------+----------------+
>     |count(alphabets)|count(alphabets)|
>     +----------------+----------------+
>     |               3|               3|
>     +----------------+----------------+
>      {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-42099) Make `count(*)` work correctly

2023-01-20 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng resolved SPARK-42099.
---
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 39622
[https://github.com/apache/spark/pull/39622]

> Make `count(*)` work correctly
> --
>
> Key: SPARK-42099
> URL: https://issues.apache.org/jira/browse/SPARK-42099
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, PySpark
>Affects Versions: 3.4.0
>Reporter: Ruifeng Zheng
>Assignee: Ruifeng Zheng
>Priority: Major
> Fix For: 3.4.0
>
>
> cdf.select(CF.count("*"), CF.count(cdf.alphabets)).collect()
> {code:java}
> pyspark.sql.connect.client.SparkConnectAnalysisException: 
> [UNRESOLVED_COLUMN.WITH_SUGGESTION] A column or function parameter with name 
> `*` cannot be resolved. Did you mean one of the following? [`alphabets`]
> Plan: 'Aggregate [unresolvedalias('count('*), None), count(alphabets#32) AS 
> count(alphabets)#35L]
> +- Project [alphabets#30 AS alphabets#32]
>+- LocalRelation [alphabets#30]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-42145) Handle null string values in SparkPlanGraphNode/SparkPlanGraphClusterWrapper

2023-01-20 Thread Yang Jie (Jira)
Yang Jie created SPARK-42145:


 Summary: Handle null string values in 
SparkPlanGraphNode/SparkPlanGraphClusterWrapper
 Key: SPARK-42145
 URL: https://issues.apache.org/jira/browse/SPARK-42145
 Project: Spark
  Issue Type: Sub-task
  Components: SQL, Web UI
Affects Versions: 3.4.0
Reporter: Yang Jie






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Reopened] (SPARK-41677) Protobuf serializer for StreamingQueryProgressWrapper

2023-01-20 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie reopened SPARK-41677:
--

Restore this one

> Protobuf serializer for StreamingQueryProgressWrapper
> -
>
> Key: SPARK-41677
> URL: https://issues.apache.org/jira/browse/SPARK-41677
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
> Fix For: 3.4.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-42144) Handle null string values in StageData/StreamBlockData/StreamingQueryData

2023-01-20 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-42144:
-
Summary: Handle null string values in 
StageData/StreamBlockData/StreamingQueryData  (was: Handle null string values 
in StageData/StreamBlockData)

> Handle null string values in StageData/StreamBlockData/StreamingQueryData
> -
>
> Key: SPARK-42144
> URL: https://issues.apache.org/jira/browse/SPARK-42144
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-42144) Handle null string values in StageData/StreamBlockData/StreamingQueryData

2023-01-20 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-42144:
-
Component/s: SQL
 Web UI

> Handle null string values in StageData/StreamBlockData/StreamingQueryData
> -
>
> Key: SPARK-42144
> URL: https://issues.apache.org/jira/browse/SPARK-42144
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, SQL, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-42144) Handle null string values in StageData/StreamBlockData

2023-01-20 Thread Yang Jie (Jira)
Yang Jie created SPARK-42144:


 Summary: Handle null string values in StageData/StreamBlockData
 Key: SPARK-42144
 URL: https://issues.apache.org/jira/browse/SPARK-42144
 Project: Spark
  Issue Type: Sub-task
  Components: Spark Core
Affects Versions: 3.4.0
Reporter: Yang Jie






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-42143) Handle null string values in RDDStorageInfo/RDDDataDistribution/RDDPartitionInfo

2023-01-20 Thread Yang Jie (Jira)
Yang Jie created SPARK-42143:


 Summary: Handle null string values in 
RDDStorageInfo/RDDDataDistribution/RDDPartitionInfo
 Key: SPARK-42143
 URL: https://issues.apache.org/jira/browse/SPARK-42143
 Project: Spark
  Issue Type: Sub-task
  Components: Spark Core, Web UI
Affects Versions: 3.4.0
Reporter: Yang Jie






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-42142) Handle null string values in CachedQuantile/ExecutorSummary/PoolData

2023-01-20 Thread Yang Jie (Jira)
Yang Jie created SPARK-42142:


 Summary: Handle null string values in 
CachedQuantile/ExecutorSummary/PoolData
 Key: SPARK-42142
 URL: https://issues.apache.org/jira/browse/SPARK-42142
 Project: Spark
  Issue Type: Sub-task
  Components: Spark Core
Affects Versions: 3.4.0
Reporter: Yang Jie






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-42137) Enable spark.kryo.unsafe by default

2023-01-20 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-42137:
-

Assignee: Dongjoon Hyun

> Enable spark.kryo.unsafe by default
> ---
>
> Key: SPARK-42137
> URL: https://issues.apache.org/jira/browse/SPARK-42137
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.4.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-42137) Enable spark.kryo.unsafe by default

2023-01-20 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-42137.
---
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 39679
[https://github.com/apache/spark/pull/39679]

> Enable spark.kryo.unsafe by default
> ---
>
> Key: SPARK-42137
> URL: https://issues.apache.org/jira/browse/SPARK-42137
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.4.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
> Fix For: 3.4.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-42141) Handle null string values in ApplicationInfo/ApplicationAttemptInfo

2023-01-20 Thread Yang Jie (Jira)
Yang Jie created SPARK-42141:


 Summary: Handle null string values in 
ApplicationInfo/ApplicationAttemptInfo
 Key: SPARK-42141
 URL: https://issues.apache.org/jira/browse/SPARK-42141
 Project: Spark
  Issue Type: Sub-task
  Components: Spark Core
Affects Versions: 3.4.0
Reporter: Yang Jie






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-42140) Handle null string values in ApplicationEnvironmentInfo/RuntimeInfo/PairStrings/ExecutorResourceRequest/TaskResourceRequest

2023-01-20 Thread Yang Jie (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679374#comment-17679374
 ] 

Yang Jie commented on SPARK-42140:
--

RuntimeInfo and PairStrings should never contain null Strings; can they be treated 
as special cases? [~Gengliang.Wang] 

> Handle null string values in 
> ApplicationEnvironmentInfo/RuntimeInfo/PairStrings/ExecutorResourceRequest/TaskResourceRequest
> ---
>
> Key: SPARK-42140
> URL: https://issues.apache.org/jira/browse/SPARK-42140
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Web UI
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-42140) Handle null string values in ApplicationEnvironmentInfo/RuntimeInfo/PairStrings/ExecutorResourceRequest/TaskResourceRequest

2023-01-20 Thread Yang Jie (Jira)
Yang Jie created SPARK-42140:


 Summary: Handle null string values in 
ApplicationEnvironmentInfo/RuntimeInfo/PairStrings/ExecutorResourceRequest/TaskResourceRequest
 Key: SPARK-42140
 URL: https://issues.apache.org/jira/browse/SPARK-42140
 Project: Spark
  Issue Type: Sub-task
  Components: Spark Core, Web UI
Affects Versions: 3.4.0
Reporter: Yang Jie






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-42139) Handle null string values in SQLExecutionUIData/SQLPlanMetric

2023-01-20 Thread Yang Jie (Jira)
Yang Jie created SPARK-42139:


 Summary: Handle null string values in 
SQLExecutionUIData/SQLPlanMetric
 Key: SPARK-42139
 URL: https://issues.apache.org/jira/browse/SPARK-42139
 Project: Spark
  Issue Type: Sub-task
  Components: SQL, Web UI
Affects Versions: 3.4.0
Reporter: Yang Jie






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-42138) Handle null string values in JobData/TaskDataWrapper/ExecutorStageSummaryWrapper

2023-01-20 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679359#comment-17679359
 ] 

Apache Spark commented on SPARK-42138:
--

User 'gengliangwang' has created a pull request for this issue:
https://github.com/apache/spark/pull/39680

> Handle null string values in 
> JobData/TaskDataWrapper/ExecutorStageSummaryWrapper
> 
>
> Key: SPARK-42138
> URL: https://issues.apache.org/jira/browse/SPARK-42138
> Project: Spark
>  Issue Type: Sub-task
>  Components: Web UI
>Affects Versions: 3.4.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>







[jira] [Assigned] (SPARK-42138) Handle null string values in JobData/TaskDataWrapper/ExecutorStageSummaryWrapper

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42138:


Assignee: Apache Spark  (was: Gengliang Wang)

> Handle null string values in 
> JobData/TaskDataWrapper/ExecutorStageSummaryWrapper
> 
>
> Key: SPARK-42138
> URL: https://issues.apache.org/jira/browse/SPARK-42138
> Project: Spark
>  Issue Type: Sub-task
>  Components: Web UI
>Affects Versions: 3.4.0
>Reporter: Gengliang Wang
>Assignee: Apache Spark
>Priority: Major
>







[jira] [Commented] (SPARK-42138) Handle null string values in JobData/TaskDataWrapper/ExecutorStageSummaryWrapper

2023-01-20 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679358#comment-17679358
 ] 

Apache Spark commented on SPARK-42138:
--

User 'gengliangwang' has created a pull request for this issue:
https://github.com/apache/spark/pull/39680

> Handle null string values in 
> JobData/TaskDataWrapper/ExecutorStageSummaryWrapper
> 
>
> Key: SPARK-42138
> URL: https://issues.apache.org/jira/browse/SPARK-42138
> Project: Spark
>  Issue Type: Sub-task
>  Components: Web UI
>Affects Versions: 3.4.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>







[jira] [Assigned] (SPARK-42138) Handle null string values in JobData/TaskDataWrapper/ExecutorStageSummaryWrapper

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42138:


Assignee: Gengliang Wang  (was: Apache Spark)

> Handle null string values in 
> JobData/TaskDataWrapper/ExecutorStageSummaryWrapper
> 
>
> Key: SPARK-42138
> URL: https://issues.apache.org/jira/browse/SPARK-42138
> Project: Spark
>  Issue Type: Sub-task
>  Components: Web UI
>Affects Versions: 3.4.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>







[jira] [Created] (SPARK-42138) Handle null string values in JobData/TaskDataWrapper/ExecutorStageSummaryWrapper

2023-01-20 Thread Gengliang Wang (Jira)
Gengliang Wang created SPARK-42138:
--

 Summary: Handle null string values in 
JobData/TaskDataWrapper/ExecutorStageSummaryWrapper
 Key: SPARK-42138
 URL: https://issues.apache.org/jira/browse/SPARK-42138
 Project: Spark
  Issue Type: Sub-task
  Components: Web UI
Affects Versions: 3.4.0
Reporter: Gengliang Wang
Assignee: Gengliang Wang









[jira] [Updated] (SPARK-40817) Remote spark.jars URIs ignored for Spark on Kubernetes in cluster mode

2023-01-20 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-40817:
--
Fix Version/s: 3.2.4

> Remote spark.jars URIs ignored for Spark on Kubernetes in cluster mode 
> ---
>
> Key: SPARK-40817
> URL: https://issues.apache.org/jira/browse/SPARK-40817
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes, Spark Submit
>Affects Versions: 3.0.0, 3.1.3, 3.3.0, 3.2.2, 3.4.0
> Environment: Spark 3.1.3
> Kubernetes 1.21
> Ubuntu 20.04.1
>Reporter: Anton Ippolitov
>Assignee: Anton Ippolitov
>Priority: Major
> Fix For: 3.2.4, 3.3.2, 3.4.0
>
> Attachments: image-2022-10-17-10-44-46-862.png
>
>
> I discovered that remote URIs in {{spark.jars}} get discarded when launching 
> Spark on Kubernetes in cluster mode via spark-submit.
> h1. Reproduction
> Here is an example reproduction with S3 being used for remote JAR storage: 
> I first created 2 JARs:
>  * {{/opt/my-local-jar.jar}} on the host where I'm running spark-submit
>  * {{s3://$BUCKET_NAME/my-remote-jar.jar}} in an S3 bucket I own
> I then ran the following spark-submit command with {{spark.jars}} pointing to 
> both the local JAR and the remote JAR:
> {code:java}
>  spark-submit \
>   --master k8s://https://$KUBERNETES_API_SERVER_URL:443 \
>   --deploy-mode cluster \
>   --name=spark-submit-test \
>   --class org.apache.spark.examples.SparkPi \
>   --conf 
> spark.jars=/opt/my-local-jar.jar,s3a://$BUCKET_NAME/my-remote-jar.jar \
>   --conf spark.kubernetes.file.upload.path=s3a://$BUCKET_NAME/my-upload-path/ 
> \
>   [...]
>   /opt/spark/examples/jars/spark-examples_2.12-3.1.3.jar
> {code}
> Once the driver and the executors started, I confirmed that there was no 
> trace of {{my-remote-jar.jar}} anymore. For example, looking at the Spark 
> History Server, I could see that {{spark.jars}} got transformed into this:
> !image-2022-10-17-10-44-46-862.png|width=991,height=80!
> There was no mention of {{my-remote-jar.jar}} on the classpath or anywhere 
> else.
> Note that I ran all tests with Spark 3.1.3; however, the code that handles 
> those dependencies seems to be the same for more recent versions of Spark as 
> well.
> h1. Root cause description
> I believe the issue comes from [this 
> logic|https://github.com/apache/spark/blob/d1f8a503a26bcfb4e466d9accc5fa241a7933667/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L163-L186]
>  in {{{}BasicDriverFeatureStep.getAdditionalPodSystemProperties(){}}}.
> Specifically, this logic takes all URIs in {{{}spark.jars{}}}, [filters only 
> on local 
> URIs,|https://github.com/apache/spark/blob/d1f8a503a26bcfb4e466d9accc5fa241a7933667/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L165]
>  
> [uploads|https://github.com/apache/spark/blob/d1f8a503a26bcfb4e466d9accc5fa241a7933667/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L173]
> those local files to {{spark.kubernetes.file.upload.path}} and then 
> [*replaces*|https://github.com/apache/spark/blob/d1f8a503a26bcfb4e466d9accc5fa241a7933667/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L182]
>  the value of {{spark.jars}} with those newly uploaded JARs. By overwriting 
> the previous value of {{{}spark.jars{}}}, we are losing all mention of remote 
> JARs that were previously specified there. 
> Consequently, when the Spark driver starts afterwards, it only downloads JARs 
> from {{{}spark.kubernetes.file.upload.path{}}}.
> h1. Possible solution
> I think a possible fix would be to not fully overwrite the value of 
> {{spark.jars}} but to make sure that we keep remote URIs there.
> The new logic would look something like this:
> {code:java}
> Seq(JARS, FILES, ARCHIVES, SUBMIT_PYTHON_FILES).foreach { key =>
>   val uris = conf.get(key).filter(uri => 
> KubernetesUtils.isLocalAndResolvable(uri))
>   // Save remote URIs
>   val remoteUris = conf.get(key).filter(uri => 
> !KubernetesUtils.isLocalAndResolvable(uri))
>   val value = {
> if (key == ARCHIVES) {
>   uris.map(UriBuilder.fromUri(_).fragment(null).build()).map(_.toString)
> } else {
>   uris
> }
>   }
>   val resolved = KubernetesUtils.uploadAndTransformFileUris(value, 
> Some(conf.sparkConf))
>   if (resolved.nonEmpty) {
> val resolvedValue = if (key == ARCHIVES) {
>   uris.zip(resolved).map { case (uri, r) =>
> UriBuilder.fromUri(r).fragment(new 
> java.net.URI(uri).getFragment).build().toString
>   }
> } else {
>   resolved
> }
> // don't forg
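The proposed fix quoted above (its code block is cut off) boils down to partitioning the URIs in spark.jars into local and remote ones, uploading only the local files, and keeping the remote URIs in the rewritten value instead of discarding them. A minimal, self-contained Python sketch of that idea, with illustrative helper names (the real fix belongs in BasicDriverFeatureStep.scala and KubernetesUtils):

```python
def is_local_and_resolvable(uri):
    # Simplified stand-in for KubernetesUtils.isLocalAndResolvable:
    # treat scheme-less paths and file:// URIs as local.
    return "://" not in uri or uri.startswith("file://")

def upload(uri, upload_path):
    # Stand-in for KubernetesUtils.uploadAndTransformFileUris: pretend the
    # local file was copied under spark.kubernetes.file.upload.path.
    name = uri.rstrip("/").split("/")[-1]
    return upload_path.rstrip("/") + "/" + name

def resolve_jars(jars, upload_path):
    local = [u for u in jars if is_local_and_resolvable(u)]
    remote = [u for u in jars if not is_local_and_resolvable(u)]
    uploaded = [upload(u, upload_path) for u in local]
    # The bug: returning only `uploaded`, which drops remote URIs.
    # The fix: keep `remote` alongside the uploaded local JARs.
    return uploaded + remote

jars = ["/opt/my-local-jar.jar", "s3a://bucket/my-remote-jar.jar"]
print(resolve_jars(jars, "s3a://bucket/my-upload-path"))
# ['s3a://bucket/my-upload-path/my-local-jar.jar', 's3a://bucket/my-remote-jar.jar']
```

With this shape, the reproduction in the report keeps s3a://$BUCKET_NAME/my-remote-jar.jar on the driver's classpath instead of losing it.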

[jira] [Assigned] (SPARK-40817) Remote spark.jars URIs ignored for Spark on Kubernetes in cluster mode

2023-01-20 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-40817:
-

Assignee: Anton Ippolitov

> Remote spark.jars URIs ignored for Spark on Kubernetes in cluster mode 
> ---
>
> Key: SPARK-40817
> URL: https://issues.apache.org/jira/browse/SPARK-40817
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes, Spark Submit
>Affects Versions: 3.0.0, 3.1.3, 3.3.0, 3.2.2, 3.4.0
> Environment: Spark 3.1.3
> Kubernetes 1.21
> Ubuntu 20.04.1
>Reporter: Anton Ippolitov
>Assignee: Anton Ippolitov
>Priority: Major
> Fix For: 3.3.2, 3.4.0
>
> Attachments: image-2022-10-17-10-44-46-862.png
>
>
> I discovered that remote URIs in {{spark.jars}} get discarded when launching 
> Spark on Kubernetes in cluster mode via spark-submit.
> h1. Reproduction
> Here is an example reproduction with S3 being used for remote JAR storage: 
> I first created 2 JARs:
>  * {{/opt/my-local-jar.jar}} on the host where I'm running spark-submit
>  * {{s3://$BUCKET_NAME/my-remote-jar.jar}} in an S3 bucket I own
> I then ran the following spark-submit command with {{spark.jars}} pointing to 
> both the local JAR and the remote JAR:
> {code:java}
>  spark-submit \
>   --master k8s://https://$KUBERNETES_API_SERVER_URL:443 \
>   --deploy-mode cluster \
>   --name=spark-submit-test \
>   --class org.apache.spark.examples.SparkPi \
>   --conf 
> spark.jars=/opt/my-local-jar.jar,s3a://$BUCKET_NAME/my-remote-jar.jar \
>   --conf spark.kubernetes.file.upload.path=s3a://$BUCKET_NAME/my-upload-path/ 
> \
>   [...]
>   /opt/spark/examples/jars/spark-examples_2.12-3.1.3.jar
> {code}
> Once the driver and the executors started, I confirmed that there was no 
> trace of {{my-remote-jar.jar}} anymore. For example, looking at the Spark 
> History Server, I could see that {{spark.jars}} got transformed into this:
> !image-2022-10-17-10-44-46-862.png|width=991,height=80!
> There was no mention of {{my-remote-jar.jar}} on the classpath or anywhere 
> else.
> Note that I ran all tests with Spark 3.1.3; however, the code that handles 
> those dependencies seems to be the same for more recent versions of Spark as 
> well.
> h1. Root cause description
> I believe the issue comes from [this 
> logic|https://github.com/apache/spark/blob/d1f8a503a26bcfb4e466d9accc5fa241a7933667/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L163-L186]
>  in {{{}BasicDriverFeatureStep.getAdditionalPodSystemProperties(){}}}.
> Specifically, this logic takes all URIs in {{{}spark.jars{}}}, [filters only 
> on local 
> URIs,|https://github.com/apache/spark/blob/d1f8a503a26bcfb4e466d9accc5fa241a7933667/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L165]
>  
> [uploads|https://github.com/apache/spark/blob/d1f8a503a26bcfb4e466d9accc5fa241a7933667/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L173]
> those local files to {{spark.kubernetes.file.upload.path}} and then 
> [*replaces*|https://github.com/apache/spark/blob/d1f8a503a26bcfb4e466d9accc5fa241a7933667/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L182]
>  the value of {{spark.jars}} with those newly uploaded JARs. By overwriting 
> the previous value of {{{}spark.jars{}}}, we are losing all mention of remote 
> JARs that were previously specified there. 
> Consequently, when the Spark driver starts afterwards, it only downloads JARs 
> from {{{}spark.kubernetes.file.upload.path{}}}.
> h1. Possible solution
> I think a possible fix would be to not fully overwrite the value of 
> {{spark.jars}} but to make sure that we keep remote URIs there.
> The new logic would look something like this:
> {code:java}
> Seq(JARS, FILES, ARCHIVES, SUBMIT_PYTHON_FILES).foreach { key =>
>   val uris = conf.get(key).filter(uri => 
> KubernetesUtils.isLocalAndResolvable(uri))
>   // Save remote URIs
>   val remoteUris = conf.get(key).filter(uri => 
> !KubernetesUtils.isLocalAndResolvable(uri))
>   val value = {
> if (key == ARCHIVES) {
>   uris.map(UriBuilder.fromUri(_).fragment(null).build()).map(_.toString)
> } else {
>   uris
> }
>   }
>   val resolved = KubernetesUtils.uploadAndTransformFileUris(value, 
> Some(conf.sparkConf))
>   if (resolved.nonEmpty) {
> val resolvedValue = if (key == ARCHIVES) {
>   uris.zip(resolved).map { case (uri, r) =>
> UriBuilder.fromUri(r).fragment(new 
> java.net.URI(uri).getFragment).build().toString
>   }
> } else {
>   resolved
> }
> // don't

[jira] [Resolved] (SPARK-40817) Remote spark.jars URIs ignored for Spark on Kubernetes in cluster mode

2023-01-20 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-40817.
---
Fix Version/s: 3.3.2
   3.4.0
   Resolution: Fixed

> Remote spark.jars URIs ignored for Spark on Kubernetes in cluster mode 
> ---
>
> Key: SPARK-40817
> URL: https://issues.apache.org/jira/browse/SPARK-40817
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes, Spark Submit
>Affects Versions: 3.0.0, 3.1.3, 3.3.0, 3.2.2, 3.4.0
> Environment: Spark 3.1.3
> Kubernetes 1.21
> Ubuntu 20.04.1
>Reporter: Anton Ippolitov
>Priority: Major
> Fix For: 3.3.2, 3.4.0
>
> Attachments: image-2022-10-17-10-44-46-862.png
>
>
> I discovered that remote URIs in {{spark.jars}} get discarded when launching 
> Spark on Kubernetes in cluster mode via spark-submit.
> h1. Reproduction
> Here is an example reproduction with S3 being used for remote JAR storage: 
> I first created 2 JARs:
>  * {{/opt/my-local-jar.jar}} on the host where I'm running spark-submit
>  * {{s3://$BUCKET_NAME/my-remote-jar.jar}} in an S3 bucket I own
> I then ran the following spark-submit command with {{spark.jars}} pointing to 
> both the local JAR and the remote JAR:
> {code:java}
>  spark-submit \
>   --master k8s://https://$KUBERNETES_API_SERVER_URL:443 \
>   --deploy-mode cluster \
>   --name=spark-submit-test \
>   --class org.apache.spark.examples.SparkPi \
>   --conf 
> spark.jars=/opt/my-local-jar.jar,s3a://$BUCKET_NAME/my-remote-jar.jar \
>   --conf spark.kubernetes.file.upload.path=s3a://$BUCKET_NAME/my-upload-path/ 
> \
>   [...]
>   /opt/spark/examples/jars/spark-examples_2.12-3.1.3.jar
> {code}
> Once the driver and the executors started, I confirmed that there was no 
> trace of {{my-remote-jar.jar}} anymore. For example, looking at the Spark 
> History Server, I could see that {{spark.jars}} got transformed into this:
> !image-2022-10-17-10-44-46-862.png|width=991,height=80!
> There was no mention of {{my-remote-jar.jar}} on the classpath or anywhere 
> else.
> Note that I ran all tests with Spark 3.1.3; however, the code that handles 
> those dependencies seems to be the same for more recent versions of Spark as 
> well.
> h1. Root cause description
> I believe the issue comes from [this 
> logic|https://github.com/apache/spark/blob/d1f8a503a26bcfb4e466d9accc5fa241a7933667/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L163-L186]
>  in {{{}BasicDriverFeatureStep.getAdditionalPodSystemProperties(){}}}.
> Specifically, this logic takes all URIs in {{{}spark.jars{}}}, [filters only 
> on local 
> URIs,|https://github.com/apache/spark/blob/d1f8a503a26bcfb4e466d9accc5fa241a7933667/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L165]
>  
> [uploads|https://github.com/apache/spark/blob/d1f8a503a26bcfb4e466d9accc5fa241a7933667/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L173]
> those local files to {{spark.kubernetes.file.upload.path}} and then 
> [*replaces*|https://github.com/apache/spark/blob/d1f8a503a26bcfb4e466d9accc5fa241a7933667/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L182]
>  the value of {{spark.jars}} with those newly uploaded JARs. By overwriting 
> the previous value of {{{}spark.jars{}}}, we are losing all mention of remote 
> JARs that were previously specified there. 
> Consequently, when the Spark driver starts afterwards, it only downloads JARs 
> from {{{}spark.kubernetes.file.upload.path{}}}.
> h1. Possible solution
> I think a possible fix would be to not fully overwrite the value of 
> {{spark.jars}} but to make sure that we keep remote URIs there.
> The new logic would look something like this:
> {code:java}
> Seq(JARS, FILES, ARCHIVES, SUBMIT_PYTHON_FILES).foreach { key =>
>   val uris = conf.get(key).filter(uri => 
> KubernetesUtils.isLocalAndResolvable(uri))
>   // Save remote URIs
>   val remoteUris = conf.get(key).filter(uri => 
> !KubernetesUtils.isLocalAndResolvable(uri))
>   val value = {
> if (key == ARCHIVES) {
>   uris.map(UriBuilder.fromUri(_).fragment(null).build()).map(_.toString)
> } else {
>   uris
> }
>   }
>   val resolved = KubernetesUtils.uploadAndTransformFileUris(value, 
> Some(conf.sparkConf))
>   if (resolved.nonEmpty) {
> val resolvedValue = if (key == ARCHIVES) {
>   uris.zip(resolved).map { case (uri, r) =>
> UriBuilder.fromUri(r).fragment(new 
> java.net.URI(uri).getFragment).build().toString
>   }
> } else {
>   resolved
> }
> // don'

[jira] [Assigned] (SPARK-42137) Enable spark.kryo.unsafe by default

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42137:


Assignee: (was: Apache Spark)

> Enable spark.kryo.unsafe by default
> ---
>
> Key: SPARK-42137
> URL: https://issues.apache.org/jira/browse/SPARK-42137
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.4.0
>Reporter: Dongjoon Hyun
>Priority: Major
>







[jira] [Assigned] (SPARK-42137) Enable spark.kryo.unsafe by default

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42137:


Assignee: Apache Spark

> Enable spark.kryo.unsafe by default
> ---
>
> Key: SPARK-42137
> URL: https://issues.apache.org/jira/browse/SPARK-42137
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.4.0
>Reporter: Dongjoon Hyun
>Assignee: Apache Spark
>Priority: Major
>







[jira] [Commented] (SPARK-42137) Enable spark.kryo.unsafe by default

2023-01-20 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679340#comment-17679340
 ] 

Apache Spark commented on SPARK-42137:
--

User 'dongjoon-hyun' has created a pull request for this issue:
https://github.com/apache/spark/pull/39679

> Enable spark.kryo.unsafe by default
> ---
>
> Key: SPARK-42137
> URL: https://issues.apache.org/jira/browse/SPARK-42137
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.4.0
>Reporter: Dongjoon Hyun
>Priority: Major
>







[jira] [Created] (SPARK-42137) Enable spark.kryo.unsafe by default

2023-01-20 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-42137:
-

 Summary: Enable spark.kryo.unsafe by default
 Key: SPARK-42137
 URL: https://issues.apache.org/jira/browse/SPARK-42137
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 3.4.0
Reporter: Dongjoon Hyun









[jira] [Resolved] (SPARK-42130) Handle null string values in AccumulableInfo and ProcessSummary

2023-01-20 Thread Gengliang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gengliang Wang resolved SPARK-42130.

Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 39666
[https://github.com/apache/spark/pull/39666]

> Handle null string values in AccumulableInfo and ProcessSummary
> ---
>
> Key: SPARK-42130
> URL: https://issues.apache.org/jira/browse/SPARK-42130
> Project: Spark
>  Issue Type: Sub-task
>  Components: Web UI
>Affects Versions: 3.4.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
> Fix For: 3.4.0
>
>
> Use optional string for string fields so that we can serialize/deserialize 
> null strings correctly.
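The one-line description above is the crux of this group of null-string subtasks: a plain proto3 string field defaults to the empty string, so a serializer cannot round-trip a null Scala string unless the field tracks presence (for example via proto3 `optional`). A toy Python sketch of the presence-based round trip, not Spark's actual protobuf code:

```python
# Toy model of field presence: with presence tracking, a null value is
# encoded by omitting the field entirely, while "" is encoded explicitly.
# This mimics what proto3 `optional string` buys the Spark serializers.

def serialize(value):
    # value is a str or None; omit the field when the value is null.
    return {} if value is None else {"name": value}

def deserialize(msg):
    # Absent field -> null; present field -> its value (possibly "").
    return msg["name"] if "name" in msg else None

for original in [None, "", "app-123"]:
    assert deserialize(serialize(original)) == original
```

Without presence tracking, both None and "" would serialize to the same bytes, and deserialization could only ever produce one of them.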






[jira] [Updated] (SPARK-41916) Address General Fixes

2023-01-20 Thread Rithwik Ediga Lakhamsani (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rithwik Ediga Lakhamsani updated SPARK-41916:
-
Description: 
We want the distributor to have the ability to run multiple torchrun processes 
per task if task.gpu.amount > 1.

We want to add a check that `import torch` succeeds, since the 
TorchDistributor requires torch. If the import raises an ImportError, we will 
give the user a more detailed error message.

  was:We want the distributor to have the ability to run multiple torchrun 
processes per task if task.gpu.amount > 1.


> Address General Fixes
> -
>
> Key: SPARK-41916
> URL: https://issues.apache.org/jira/browse/SPARK-41916
> Project: Spark
>  Issue Type: Sub-task
>  Components: ML, PySpark
>Affects Versions: 3.4.0
>Reporter: Rithwik Ediga Lakhamsani
>Priority: Major
>
> We want the distributor to have the ability to run multiple torchrun 
> processes per task if task.gpu.amount > 1.
> We want to add a check that `import torch` succeeds, since the 
> TorchDistributor requires torch. If the import raises an ImportError, we 
> will give the user a more detailed error message.
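The import check described above can be sketched as a small guard. The helper name `require_torch` and the injectable importer are hypothetical, added here only so the failure path is easy to exercise; this is not the actual PySpark code:

```python
import importlib

def require_torch(importer=importlib.import_module):
    # Fail fast with an actionable message when torch is missing, instead
    # of letting a bare ImportError surface later inside the training job.
    try:
        importer("torch")
    except ImportError as e:
        raise ImportError(
            "TorchDistributor requires PyTorch. "
            "Install it with `pip install torch` and retry."
        ) from e
```

Chaining with `from e` keeps the original ImportError in the traceback, so the user sees both the friendly message and the underlying cause.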






[jira] [Updated] (SPARK-41916) Address General Fizes

2023-01-20 Thread Rithwik Ediga Lakhamsani (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rithwik Ediga Lakhamsani updated SPARK-41916:
-
Summary: Address General Fizes  (was: Address 
`spark.task.resource.gpu.amount > 1`)

> Address General Fizes
> -
>
> Key: SPARK-41916
> URL: https://issues.apache.org/jira/browse/SPARK-41916
> Project: Spark
>  Issue Type: Sub-task
>  Components: ML, PySpark
>Affects Versions: 3.4.0
>Reporter: Rithwik Ediga Lakhamsani
>Priority: Major
>
> We want the distributor to have the ability to run multiple torchrun 
> processes per task if task.gpu.amount > 1.






[jira] [Resolved] (SPARK-41776) Implement support for PyTorch Lightning

2023-01-20 Thread Rithwik Ediga Lakhamsani (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rithwik Ediga Lakhamsani resolved SPARK-41776.
--
Resolution: Fixed

Not needed, since we are now using `torch.distributed.run`

> Implement support for PyTorch Lightning
> ---
>
> Key: SPARK-41776
> URL: https://issues.apache.org/jira/browse/SPARK-41776
> Project: Spark
>  Issue Type: Sub-task
>  Components: ML, PySpark
>Affects Versions: 3.4.0
>Reporter: Rithwik Ediga Lakhamsani
>Priority: Major
>
> This requires us to just call train() on each spark task separately without 
> much preprocessing or postprocessing because PyTorch Lightning handles that 
> by itself.






[jira] [Updated] (SPARK-41916) Address `spark.task.resource.gpu.amount > 1`

2023-01-20 Thread Rithwik Ediga Lakhamsani (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rithwik Ediga Lakhamsani updated SPARK-41916:
-
Description: We want the distributor to have the ability to run multiple 
torchrun processes per task if task.gpu.amount > 1.  (was: We want the 
distributor to have the ability to run multiple torchrun processes per task if 
task.gpu.amount > 1 + address formatting comments on 
https://github.com/apache/spark/pull/39188#discussion_r1068903058)

> Address `spark.task.resource.gpu.amount > 1`
> 
>
> Key: SPARK-41916
> URL: https://issues.apache.org/jira/browse/SPARK-41916
> Project: Spark
>  Issue Type: Sub-task
>  Components: ML, PySpark
>Affects Versions: 3.4.0
>Reporter: Rithwik Ediga Lakhamsani
>Priority: Major
>
> We want the distributor to have the ability to run multiple torchrun 
> processes per task if task.gpu.amount > 1.






[jira] [Updated] (SPARK-41916) Address General Fixes

2023-01-20 Thread Rithwik Ediga Lakhamsani (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rithwik Ediga Lakhamsani updated SPARK-41916:
-
Summary: Address General Fixes  (was: Address General Fizes)

> Address General Fixes
> -
>
> Key: SPARK-41916
> URL: https://issues.apache.org/jira/browse/SPARK-41916
> Project: Spark
>  Issue Type: Sub-task
>  Components: ML, PySpark
>Affects Versions: 3.4.0
>Reporter: Rithwik Ediga Lakhamsani
>Priority: Major
>
> We want the distributor to have the ability to run multiple torchrun 
> processes per task if task.gpu.amount > 1.






[jira] [Assigned] (SPARK-16484) Incremental Cardinality estimation operations with Hyperloglog

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-16484:


Assignee: (was: Apache Spark)

> Incremental Cardinality estimation operations with Hyperloglog
> --
>
> Key: SPARK-16484
> URL: https://issues.apache.org/jira/browse/SPARK-16484
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Reporter: Yongjia Wang
>Priority: Major
>  Labels: bulk-closed
>
> Efficient cardinality estimation is very important, and SparkSQL has had 
> approxCountDistinct based on Hyperloglog for quite some time. However, there 
> isn't a way to do incremental estimation. For example, if we want to get 
> updated distinct counts of the last 90 days, we need to do the aggregation 
> for the entire window over and over again. The more efficient way involves 
> serializing the counter for smaller time windows (such as hourly) so the 
> counts can be efficiently updated in an incremental fashion for any time 
> window.
> With the support of custom UDAF, Binary DataType and the HyperloglogPlusPlus 
> implementation in the current Spark version, it's easy enough to extend the 
> functionality to include incremental counting, and even other general set 
> operations such as intersection and set difference. Spark API is already as 
> elegant as it can be, but it still takes quite some effort to do a custom 
> implementation of the aforementioned operations which are supposed to be in 
> high demand. I have been searching but failed to find a usable existing 
> solution or any ongoing effort for this. The closest I got is the following 
> but it does not work with Spark 1.6 due to API changes. 
> https://github.com/collectivemedia/spark-hyperloglog/blob/master/src/main/scala/org/apache/spark/sql/hyperloglog/aggregates.scala
> I wonder if it's worth integrating such operations into SparkSQL. The only 
> problem I see is that it depends on serialization of a specific HLL 
> implementation and introduces compatibility issues. But as long as the user 
> is aware of such issues, it should be fine.
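The incremental pattern described above (serialize one sketch per small time window, then union sketches over any larger window) can be illustrated with a toy HyperLogLog in Python. Constants are simplified and this is not Spark's HyperLogLogPlusPlus implementation; the point is only that the union is a register-wise max, which is what makes incremental counting possible:

```python
import hashlib
import math

P = 12                       # 2^12 registers
M = 1 << P
ALPHA = 0.7213 / (1 + 1.079 / M)

def new_sketch():
    return [0] * M

def add(sketch, item):
    # 64-bit hash: low P bits pick a register, the rest gives the rank.
    h = int.from_bytes(hashlib.sha256(str(item).encode()).digest()[:8], "big")
    idx = h & (M - 1)
    rest = h >> P
    rank = (64 - P) - rest.bit_length() + 1  # leading zeros of the rest, +1
    sketch[idx] = max(sketch[idx], rank)

def merge(a, b):
    # Union of two sketches: register-wise max. Merging hourly sketches
    # yields the sketch of the combined window without rescanning data.
    return [max(x, y) for x, y in zip(a, b)]

def estimate(sketch):
    raw = ALPHA * M * M / sum(2.0 ** -r for r in sketch)
    zeros = sketch.count(0)
    if raw <= 2.5 * M and zeros:             # small-range correction
        return M * math.log(M / zeros)
    return raw

hour1, hour2 = new_sketch(), new_sketch()
for i in range(5000):
    add(hour1, i)
for i in range(2500, 7500):                  # overlaps hour1
    add(hour2, i)
day = merge(hour1, hour2)
print(round(estimate(day)))                  # ~7500 distinct values
```

Persisting each hourly sketch (e.g. as a Binary column) and merging on demand is exactly the "serialize the counter for smaller time windows" idea in the description.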






[jira] [Commented] (SPARK-16484) Incremental Cardinality estimation operations with Hyperloglog

2023-01-20 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679252#comment-17679252
 ] 

Apache Spark commented on SPARK-16484:
--

User 'RyanBerti' has created a pull request for this issue:
https://github.com/apache/spark/pull/39678

> Incremental Cardinality estimation operations with Hyperloglog
> --
>
> Key: SPARK-16484
> URL: https://issues.apache.org/jira/browse/SPARK-16484
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Reporter: Yongjia Wang
>Priority: Major
>  Labels: bulk-closed
>
> Efficient cardinality estimation is very important, and SparkSQL has had 
> approxCountDistinct based on Hyperloglog for quite some time. However, there 
> isn't a way to do incremental estimation. For example, if we want to get 
> updated distinct counts of the last 90 days, we need to do the aggregation 
> for the entire window over and over again. The more efficient way involves 
> serializing the counter for smaller time windows (such as hourly) so the 
> counts can be efficiently updated in an incremental fashion for any time 
> window.
> With the support of custom UDAF, Binary DataType and the HyperloglogPlusPlus 
> implementation in the current Spark version, it's easy enough to extend the 
> functionality to include incremental counting, and even other general set 
> operations such as intersection and set difference. Spark API is already as 
> elegant as it can be, but it still takes quite some effort to do a custom
> implementation of the aforementioned operations, which should be in high
> demand. I have searched but failed to find a usable existing solution or any
> ongoing effort for this. The closest I found is the following, but it does
> not work with Spark 1.6 due to API changes.
> https://github.com/collectivemedia/spark-hyperloglog/blob/master/src/main/scala/org/apache/spark/sql/hyperloglog/aggregates.scala
> I wonder whether it is worth integrating such operations into SparkSQL. The
> only problem I see is that it depends on the serialization of a specific HLL
> implementation and could introduce compatibility issues. But as long as the
> user is aware of that, it should be fine.
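The incremental-counting idea described above — persist one serialized sketch per small time window and merge sketches to answer any larger window — can be illustrated with a toy HyperLogLog in plain Python. This is an illustrative sketch only, not Spark's HyperLogLogPlusPlus; the register count, hash function, and helper names are arbitrary choices for the example:

```python
import hashlib
import math

P = 12                 # 2**P registers; more registers => lower relative error
M = 1 << P

def _hash64(value):
    # Stable 64-bit hash of the value's string form.
    digest = hashlib.sha256(str(value).encode()).digest()
    return int.from_bytes(digest[:8], "big")

def sketch(values):
    """Build an HLL register array (the 'serialized counter' for one window)."""
    regs = [0] * M
    for v in values:
        h = _hash64(v)
        idx = h & (M - 1)                       # low P bits pick a register
        rest = h >> P                           # remaining 64-P bits give the rank
        rank = (64 - P) - rest.bit_length() + 1 # leading zeros in `rest`, plus one
        regs[idx] = max(regs[idx], rank)
    return regs

def merge(a, b):
    """Element-wise max: merge(sketch(A), sketch(B)) equals sketch(A | B) exactly."""
    return [max(x, y) for x, y in zip(a, b)]

def estimate(regs):
    """Standard HLL estimator with the small-range (linear counting) correction."""
    alpha = 0.7213 / (1 + 1.079 / M)
    raw = alpha * M * M / sum(2.0 ** -r for r in regs)
    zeros = regs.count(0)
    if raw <= 2.5 * M and zeros:
        return M * math.log(M / zeros)
    return raw
```

Because merging is exact with respect to set union, hourly sketches stored as binary columns can be combined into a 90-day estimate that is identical to the estimate from one sketch built over the entire 90 days.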



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-16484) Incremental Cardinality estimation operations with Hyperloglog

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-16484:


Assignee: Apache Spark

> Incremental Cardinality estimation operations with Hyperloglog
> --
>
> Key: SPARK-16484
> URL: https://issues.apache.org/jira/browse/SPARK-16484
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Reporter: Yongjia Wang
>Assignee: Apache Spark
>Priority: Major
>  Labels: bulk-closed
>
> Efficient cardinality estimation is very important, and SparkSQL has had 
> approxCountDistinct based on Hyperloglog for quite some time. However, there 
> isn't a way to do incremental estimation. For example, if we want to get 
> updated distinct counts of the last 90 days, we need to do the aggregation 
> for the entire window over and over again. The more efficient way involves 
> serializing the counter for smaller time windows (such as hourly) so the 
> counts can be efficiently updated in an incremental fashion for any time 
> window.
> With the support of custom UDAF, Binary DataType and the HyperloglogPlusPlus 
> implementation in the current Spark version, it's easy enough to extend the 
> functionality to include incremental counting, and even other general set 
> operations such as intersection and set difference. Spark API is already as 
> elegant as it can be, but it still takes quite some effort to do a custom
> implementation of the aforementioned operations, which should be in high
> demand. I have searched but failed to find a usable existing solution or any
> ongoing effort for this. The closest I found is the following, but it does
> not work with Spark 1.6 due to API changes.
> https://github.com/collectivemedia/spark-hyperloglog/blob/master/src/main/scala/org/apache/spark/sql/hyperloglog/aggregates.scala
> I wonder whether it is worth integrating such operations into SparkSQL. The
> only problem I see is that it depends on the serialization of a specific HLL
> implementation and could introduce compatibility issues. But as long as the
> user is aware of that, it should be fine.






[jira] [Assigned] (SPARK-40303) The performance will be worse after codegen

2023-01-20 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-40303:
-

Assignee: Yuming Wang

> The performance will be worse after codegen
> ---
>
> Key: SPARK-40303
> URL: https://issues.apache.org/jira/browse/SPARK-40303
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Yuming Wang
>Assignee: Yuming Wang
>Priority: Major
> Attachments: TestApiBenchmark.scala, TestApis.java, 
> TestParameters.java
>
>
> {code:scala}
> import org.apache.spark.benchmark.Benchmark
> val dir = "/tmp/spark/benchmark"
> val N = 200
> val columns = Range(0, 100).map(i => s"id % $i AS id$i")
> spark.range(N).selectExpr(columns: _*).write.mode("Overwrite").parquet(dir)
> // Seq(1, 2, 5, 10, 15, 25, 40, 60, 100)
> Seq(60).foreach { cnt =>
>   val selectExps = columns.take(cnt).map(_.split(" ").last).map(c => s"count(distinct $c)")
>   val benchmark = new Benchmark("Benchmark count distinct", N, minNumIters = 1)
>   benchmark.addCase(s"$cnt count distinct with codegen") { _ =>
>     withSQLConf(
>       "spark.sql.codegen.wholeStage" -> "true",
>       "spark.sql.codegen.factoryMode" -> "FALLBACK") {
>       spark.read.parquet(dir).selectExpr(selectExps: _*).write.format("noop").mode("Overwrite").save()
>     }
>   }
>   benchmark.addCase(s"$cnt count distinct without codegen") { _ =>
>     withSQLConf(
>       "spark.sql.codegen.wholeStage" -> "false",
>       "spark.sql.codegen.factoryMode" -> "NO_CODEGEN") {
>       spark.read.parquet(dir).selectExpr(selectExps: _*).write.format("noop").mode("Overwrite").save()
>     }
>   }
>   benchmark.run()
> }
> {code}
> {noformat}
> Java HotSpot(TM) 64-Bit Server VM 1.8.0_281-b09 on Mac OS X 10.15.7
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
> Benchmark count distinct:          Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
> -----------------------------------------------------------------------------------------------------------
> 60 count distinct with codegen            628146        628146          0        0.0     314072.8      1.0X
> 60 count distinct without codegen         147635        147635          0        0.0      73817.5      4.3X
> {noformat}
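The quoted benchmark toggles `spark.sql.codegen.wholeStage` per case and compares best times; the "Relative" column reports each case's speed against the first case, so the codegen-disabled run above comes out 4.3X faster. The measurement pattern itself (register named cases, run each several times, keep the best time, report a relative factor) is easy to mirror outside Spark. A minimal stand-alone analogue in Python follows; the helper names are hypothetical and this is not Spark's `Benchmark` API:

```python
import time

def run_benchmark(cases, min_num_iters=3):
    """Minimal analogue of Spark's Benchmark: map case name -> thunk, return best seconds per case."""
    best = {}
    for name, fn in cases.items():
        times = []
        for _ in range(min_num_iters):
            start = time.perf_counter()
            fn()
            times.append(time.perf_counter() - start)
        best[name] = min(times)   # best-of-N, as in the 'Best Time' column
    return best

def relative(results):
    """'Relative' column: the first case's best time divided by each case's best time."""
    base = next(iter(results.values()))
    return {name: base / t for name, t in results.items()}
```

With two cases registered, `relative(run_benchmark(cases))` gives 1.0 for the first case and a speedup factor for the rest, mirroring the table above.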






[jira] [Resolved] (SPARK-40303) The performance will be worse after codegen

2023-01-20 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-40303.
---
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 39671
[https://github.com/apache/spark/pull/39671]

> The performance will be worse after codegen
> ---
>
> Key: SPARK-40303
> URL: https://issues.apache.org/jira/browse/SPARK-40303
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Yuming Wang
>Assignee: Yuming Wang
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: TestApiBenchmark.scala, TestApis.java, 
> TestParameters.java
>
>
> {code:scala}
> import org.apache.spark.benchmark.Benchmark
> val dir = "/tmp/spark/benchmark"
> val N = 200
> val columns = Range(0, 100).map(i => s"id % $i AS id$i")
> spark.range(N).selectExpr(columns: _*).write.mode("Overwrite").parquet(dir)
> // Seq(1, 2, 5, 10, 15, 25, 40, 60, 100)
> Seq(60).foreach { cnt =>
>   val selectExps = columns.take(cnt).map(_.split(" ").last).map(c => s"count(distinct $c)")
>   val benchmark = new Benchmark("Benchmark count distinct", N, minNumIters = 1)
>   benchmark.addCase(s"$cnt count distinct with codegen") { _ =>
>     withSQLConf(
>       "spark.sql.codegen.wholeStage" -> "true",
>       "spark.sql.codegen.factoryMode" -> "FALLBACK") {
>       spark.read.parquet(dir).selectExpr(selectExps: _*).write.format("noop").mode("Overwrite").save()
>     }
>   }
>   benchmark.addCase(s"$cnt count distinct without codegen") { _ =>
>     withSQLConf(
>       "spark.sql.codegen.wholeStage" -> "false",
>       "spark.sql.codegen.factoryMode" -> "NO_CODEGEN") {
>       spark.read.parquet(dir).selectExpr(selectExps: _*).write.format("noop").mode("Overwrite").save()
>     }
>   }
>   benchmark.run()
> }
> {code}
> {noformat}
> Java HotSpot(TM) 64-Bit Server VM 1.8.0_281-b09 on Mac OS X 10.15.7
> Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
> Benchmark count distinct:          Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
> -----------------------------------------------------------------------------------------------------------
> 60 count distinct with codegen            628146        628146          0        0.0     314072.8      1.0X
> 60 count distinct without codegen         147635        147635          0        0.0      73817.5      4.3X
> {noformat}






[jira] [Commented] (SPARK-42136) Refactor BroadcastHashJoinExec output partitioning generation

2023-01-20 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679220#comment-17679220
 ] 

Apache Spark commented on SPARK-42136:
--

User 'peter-toth' has created a pull request for this issue:
https://github.com/apache/spark/pull/38038

> Refactor BroadcastHashJoinExec output partitioning generation
> -
>
> Key: SPARK-42136
> URL: https://issues.apache.org/jira/browse/SPARK-42136
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Peter Toth
>Priority: Major
>







[jira] [Commented] (SPARK-42043) Basic Scala Client Result Implementation

2023-01-20 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679219#comment-17679219
 ] 

Apache Spark commented on SPARK-42043:
--

User 'zhenlineo' has created a pull request for this issue:
https://github.com/apache/spark/pull/39677

> Basic Scala Client Result Implementation 
> -
>
> Key: SPARK-42043
> URL: https://issues.apache.org/jira/browse/SPARK-42043
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Zhen Li
>Assignee: Zhen Li
>Priority: Major
> Fix For: 3.4.0
>
>
> Add the basic Scala client Result implementation. Add tests to verify that
> results are received correctly.






[jira] [Assigned] (SPARK-42136) Refactor BroadcastHashJoinExec output partitioning generation

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42136:


Assignee: Apache Spark

> Refactor BroadcastHashJoinExec output partitioning generation
> -
>
> Key: SPARK-42136
> URL: https://issues.apache.org/jira/browse/SPARK-42136
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Peter Toth
>Assignee: Apache Spark
>Priority: Major
>







[jira] [Assigned] (SPARK-42136) Refactor BroadcastHashJoinExec output partitioning generation

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42136:


Assignee: (was: Apache Spark)

> Refactor BroadcastHashJoinExec output partitioning generation
> -
>
> Key: SPARK-42136
> URL: https://issues.apache.org/jira/browse/SPARK-42136
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Peter Toth
>Priority: Major
>







[jira] [Created] (SPARK-42136) Refactor BroadcastHashJoinExec output partitioning generation

2023-01-20 Thread Peter Toth (Jira)
Peter Toth created SPARK-42136:
--

 Summary: Refactor BroadcastHashJoinExec output partitioning 
generation
 Key: SPARK-42136
 URL: https://issues.apache.org/jira/browse/SPARK-42136
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.4.0
Reporter: Peter Toth









[jira] [Created] (SPARK-42135) Scala Client Proper logging for the client

2023-01-20 Thread Zhen Li (Jira)
Zhen Li created SPARK-42135:
---

 Summary: Scala Client Proper logging for the client
 Key: SPARK-42135
 URL: https://issues.apache.org/jira/browse/SPARK-42135
 Project: Spark
  Issue Type: Improvement
  Components: Connect
Affects Versions: 3.4.0
Reporter: Zhen Li


Introduce proper logging for the client and change 
[https://github.com/apache/spark/pull/39541/files/2a589543bdec80f4cf806af0a8566d2de8c04140#r1082062813]
 to use the client logging.






[jira] [Assigned] (SPARK-42134) Fix getPartitionFiltersAndDataFilters() to handle filters without referenced attributes

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42134:


Assignee: Apache Spark

> Fix getPartitionFiltersAndDataFilters() to handle filters without referenced 
> attributes
> ---
>
> Key: SPARK-42134
> URL: https://issues.apache.org/jira/browse/SPARK-42134
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Peter Toth
>Assignee: Apache Spark
>Priority: Major
>







[jira] [Commented] (SPARK-42134) Fix getPartitionFiltersAndDataFilters() to handle filters without referenced attributes

2023-01-20 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679156#comment-17679156
 ] 

Apache Spark commented on SPARK-42134:
--

User 'peter-toth' has created a pull request for this issue:
https://github.com/apache/spark/pull/39676

> Fix getPartitionFiltersAndDataFilters() to handle filters without referenced 
> attributes
> ---
>
> Key: SPARK-42134
> URL: https://issues.apache.org/jira/browse/SPARK-42134
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Peter Toth
>Priority: Major
>







[jira] [Assigned] (SPARK-42134) Fix getPartitionFiltersAndDataFilters() to handle filters without referenced attributes

2023-01-20 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42134:


Assignee: (was: Apache Spark)

> Fix getPartitionFiltersAndDataFilters() to handle filters without referenced 
> attributes
> ---
>
> Key: SPARK-42134
> URL: https://issues.apache.org/jira/browse/SPARK-42134
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Peter Toth
>Priority: Major
>







[jira] [Created] (SPARK-42134) Fix getPartitionFiltersAndDataFilters() to handle filters without referenced attributes

2023-01-20 Thread Peter Toth (Jira)
Peter Toth created SPARK-42134:
--

 Summary: Fix getPartitionFiltersAndDataFilters() to handle filters 
without referenced attributes
 Key: SPARK-42134
 URL: https://issues.apache.org/jira/browse/SPARK-42134
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 3.4.0
Reporter: Peter Toth









[jira] [Assigned] (SPARK-42129) Upgrade rocksdbjni to 7.9.2

2023-01-20 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-42129:
-

Assignee: Yang Jie

> Upgrade rocksdbjni to 7.9.2
> ---
>
> Key: SPARK-42129
> URL: https://issues.apache.org/jira/browse/SPARK-42129
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.4.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>
> https://github.com/facebook/rocksdb/releases/tag/v7.9.2





