[jira] [Updated] (SPARK-47085) Performance issue on Thrift API

2024-02-19 Thread Izek Greenfield (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Izek Greenfield updated SPARK-47085:

Description: 
This new complexity was introduced in SPARK-39041.

In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
{code:scala}
...
 while (i < rowSize) {
  val row = rows(i)
  ...
{code}
It can be easily converted back into _*O(n)*_ complexity.
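For context, the quadratic cost comes from positional indexing on a linear-access Seq. The following is a hypothetical sketch (not the actual RowSetUtils code; the helper names are illustrative) of why the indexed loop degrades to O(n^2) and how an iterator restores O(n):

```scala
// Hypothetical sketch, not the actual RowSetUtils code. If `rows` is a
// linear-access Seq such as List, every rows(i) lookup walks i elements,
// so the indexed while-loop costs O(n^2) overall.
def indexedSum(rows: Seq[Long]): Long = {
  var i = 0
  var acc = 0L
  val rowSize = rows.size     // also O(n) on a List
  while (i < rowSize) {
    val row = rows(i)         // O(i) per access on a List
    acc += row
    i += 1
  }
  acc
}

// O(n) variant: traverse once with an iterator instead of indexing.
def iteratedSum(rows: Seq[Long]): Long = {
  var acc = 0L
  val it = rows.iterator
  while (it.hasNext) {
    acc += it.next()
  }
  acc
}
```

Both functions produce the same result; only the traversal strategy differs, which is the essence of the conversion described above.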

 

 

  was:
This new complexity was introduced in SPARK-39041.

In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
{code:scala}
...
 while (i < rowSize) {
  val row = rows(i)
  ...
{code}
It can be easily converted back into _*O(n)*_ complexity.

 

 


> Performance issue on Thrift API
> ---
>
> Key: SPARK-47085
> URL: https://issues.apache.org/jira/browse/SPARK-47085
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.1, 3.5.0
>Reporter: Izek Greenfield
>Assignee: Izek Greenfield
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> This new complexity was introduced in SPARK-39041.
> In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
> {code:scala}
> ...
>  while (i < rowSize) {
>   val row = rows(i)
>   ...
> {code}
> It can be easily converted back into _*O(n)*_ complexity.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47085) Performance issue on Thrift API

2024-02-19 Thread Izek Greenfield (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Izek Greenfield updated SPARK-47085:

Description: 
This new complexity was introduced in SPARK-39041.

In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
{code:scala}
...
 while (i < rowSize) {
  val row = rows(i)
  ...
{code}
It can be easily converted back into _*O(n)*_ complexity.

 

 

  was:
In class `RowSetUtils` there is a loop that has O(n^2) complexity:


{code:scala}
...
 while (i < rowSize) {
  val row = rows(i)
  ...
{code}

It can be easily converted into O(n) complexity.


> Performance issue on Thrift API
> ---
>
> Key: SPARK-47085
> URL: https://issues.apache.org/jira/browse/SPARK-47085
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.1, 3.5.0
>Reporter: Izek Greenfield
>Assignee: Izek Greenfield
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> This new complexity was introduced in SPARK-39041.
> In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity:
> {code:scala}
> ...
>  while (i < rowSize) {
>   val row = rows(i)
>   ...
> {code}
> It can be easily converted back into _*O(n)*_ complexity.
>  
>  






[jira] [Updated] (SPARK-47085) Performance issue on Thrift API

2024-02-19 Thread Izek Greenfield (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Izek Greenfield updated SPARK-47085:

Issue Type: Bug  (was: Improvement)

> Performance issue on Thrift API
> ---
>
> Key: SPARK-47085
> URL: https://issues.apache.org/jira/browse/SPARK-47085
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.1, 3.5.0
>Reporter: Izek Greenfield
>Assignee: Izek Greenfield
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> In class `RowSetUtils` there is a loop that has O(n^2) complexity:
> {code:scala}
> ...
>  while (i < rowSize) {
>   val row = rows(i)
>   ...
> {code}
> It can be easily converted into O(n) complexity.






[jira] [Updated] (SPARK-45615) Remove redundant "Auto-application to `()` is deprecated" compile suppression rules.

2024-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-45615:
---
Labels: pull-request-available  (was: )

> Remove redundant "Auto-application to `()` is deprecated" compile suppression 
> rules.
> ---
>
> Key: SPARK-45615
> URL: https://issues.apache.org/jira/browse/SPARK-45615
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Minor
>  Labels: pull-request-available
>
> Due to the issue https://github.com/scalatest/scalatest/issues/2297, we need 
> to wait until we upgrade the scalatest version (maybe to 3.2.18) before 
> removing these suppression rules.






[jira] [Resolved] (SPARK-45789) Support DESCRIBE TABLE for clustering columns

2024-02-19 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-45789.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45077
[https://github.com/apache/spark/pull/45077]

> Support DESCRIBE TABLE for clustering columns
> -
>
> Key: SPARK-45789
> URL: https://issues.apache.org/jira/browse/SPARK-45789
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Terry Kim
>Assignee: Terry Kim
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-45789) Support DESCRIBE TABLE for clustering columns

2024-02-19 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-45789:
---

Assignee: Terry Kim

> Support DESCRIBE TABLE for clustering columns
> -
>
> Key: SPARK-45789
> URL: https://issues.apache.org/jira/browse/SPARK-45789
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Terry Kim
>Assignee: Terry Kim
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Assigned] (SPARK-47080) Fix `HistoryServerSuite` to get `getNumJobs` in `eventually`

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47080:
-

Assignee: Dongjoon Hyun

> Fix `HistoryServerSuite` to get `getNumJobs` in `eventually`
> 
>
> Key: SPARK-47080
> URL: https://issues.apache.org/jira/browse/SPARK-47080
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Tests
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Minor
>  Labels: pull-request-available
>







[jira] [Resolved] (SPARK-47080) Fix `HistoryServerSuite` to get `getNumJobs` in `eventually`

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47080.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45147
[https://github.com/apache/spark/pull/45147]

> Fix `HistoryServerSuite` to get `getNumJobs` in `eventually`
> 
>
> Key: SPARK-47080
> URL: https://issues.apache.org/jira/browse/SPARK-47080
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Tests
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Resolved] (SPARK-47016) Upgrade scalatest related dependencies to the 3.2.18 series

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47016.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45174
[https://github.com/apache/spark/pull/45174]

> Upgrade scalatest related dependencies to the 3.2.18 series
> ---
>
> Key: SPARK-47016
> URL: https://issues.apache.org/jira/browse/SPARK-47016
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Updated] (SPARK-47016) Upgrade scalatest related dependencies to the 3.2.18 series

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47016:
--
Parent: SPARK-47046
Issue Type: Sub-task  (was: Improvement)

> Upgrade scalatest related dependencies to the 3.2.18 series
> ---
>
> Key: SPARK-47016
> URL: https://issues.apache.org/jira/browse/SPARK-47016
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-47016) Upgrade scalatest related dependencies to the 3.2.18 series

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47016:
-

Assignee: Yang Jie

> Upgrade scalatest related dependencies to the 3.2.18 series
> ---
>
> Key: SPARK-47016
> URL: https://issues.apache.org/jira/browse/SPARK-47016
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Updated] (SPARK-45376) [CORE] Add netty-tcnative-boringssl-static dependency

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-45376:
--
Parent: SPARK-47046
Issue Type: Sub-task  (was: Task)

> [CORE] Add netty-tcnative-boringssl-static dependency
> -
>
> Key: SPARK-45376
> URL: https://issues.apache.org/jira/browse/SPARK-45376
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Hasnain Lakhani
>Assignee: Hasnain Lakhani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Add the boringssl dependency, which is needed for SSL functionality to work, 
> and provide the network-common test helper to other test modules that need 
> to test SSL functionality.






[jira] [Assigned] (SPARK-47100) Upgrade netty to 4.1.107.Final and netty-tcnative to 2.0.62.Final

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47100:
-

Assignee: Dongjoon Hyun

> Upgrade netty to 4.1.107.Final and netty-tcnative to 2.0.62.Final
> -
>
> Key: SPARK-47100
> URL: https://issues.apache.org/jira/browse/SPARK-47100
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Updated] (SPARK-45376) [CORE] Add netty-tcnative-boringssl-static dependency

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-45376:
--
Epic Link: (was: SPARK-44937)

> [CORE] Add netty-tcnative-boringssl-static dependency
> -
>
> Key: SPARK-45376
> URL: https://issues.apache.org/jira/browse/SPARK-45376
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Hasnain Lakhani
>Assignee: Hasnain Lakhani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Add the boringssl dependency, which is needed for SSL functionality to work, 
> and provide the network-common test helper to other test modules that need 
> to test SSL functionality.






[jira] [Updated] (SPARK-47100) Upgrade netty to 4.1.107.Final and netty-tcnative to 2.0.62.Final

2024-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-47100:
---
Labels: pull-request-available  (was: )

> Upgrade netty to 4.1.107.Final and netty-tcnative to 2.0.62.Final
> -
>
> Key: SPARK-47100
> URL: https://issues.apache.org/jira/browse/SPARK-47100
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Updated] (SPARK-47100) Upgrade netty to 4.1.107.Final and netty-tcnative to 2.0.62.Final

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47100:
--
Summary: Upgrade netty to 4.1.107.Final and netty-tcnative to 2.0.62.Final  
(was: Upgrade Netty to 4.1.107.Final)

> Upgrade netty to 4.1.107.Final and netty-tcnative to 2.0.62.Final
> -
>
> Key: SPARK-47100
> URL: https://issues.apache.org/jira/browse/SPARK-47100
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Priority: Major
>







[jira] [Created] (SPARK-47100) Upgrade Netty to 4.1.107.Final

2024-02-19 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-47100:
-

 Summary: Upgrade Netty to 4.1.107.Final
 Key: SPARK-47100
 URL: https://issues.apache.org/jira/browse/SPARK-47100
 Project: Spark
  Issue Type: Sub-task
  Components: Build
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun









[jira] [Updated] (SPARK-46432) Upgrade Netty to 4.1.106.Final

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-46432:
--
Parent: SPARK-47046
Issue Type: Sub-task  (was: Improvement)

> Upgrade Netty to 4.1.106.Final
> --
>
> Key: SPARK-46432
> URL: https://issues.apache.org/jira/browse/SPARK-46432
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Commented] (SPARK-46934) Unable to create Hive View from certain Spark Dataframe StructType

2024-02-19 Thread Jungtaek Lim (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-46934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818634#comment-17818634
 ] 

Jungtaek Lim commented on SPARK-46934:
--

Maybe the priority also has to be updated - we could say it's a release 
blocker if we have a consensus that this is a blocker. It looks like we don't.

> Unable to create Hive View from certain Spark Dataframe StructType
> --
>
> Key: SPARK-46934
> URL: https://issues.apache.org/jira/browse/SPARK-46934
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0, 3.3.2, 3.3.4
> Environment: Tested in Spark 3.3.0, 3.3.2.
>Reporter: Yu-Ting LIN
>Assignee: Kent Yao
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> We are trying to create a Hive view using the following SQL command: "CREATE OR 
> REPLACE VIEW yuting AS SELECT INFO_ANN FROM table_2611810".
> Our table_2611810 has certain columns that contain special characters such as "/". 
> Here is the schema of this table.
> {code:java}
> contigName              string
> start                   bigint
> end                     bigint
> names                   array
> referenceAllele         string
> alternateAlleles        array
> qual                    double
> filters                 array
> splitFromMultiAllelic    boolean
> INFO_NCAMP              int
> INFO_ODDRATIO           double
> INFO_NM                 double
> INFO_DBSNP_CAF          array
> INFO_SPANPAIR           int
> INFO_TLAMP              int
> INFO_PSTD               double
> INFO_QSTD               double
> INFO_SBF                double
> INFO_AF                 array
> INFO_QUAL               double
> INFO_SHIFT3             int
> INFO_VARBIAS            string
> INFO_HICOV              int
> INFO_PMEAN              double
> INFO_MSI                double
> INFO_VD                 int
> INFO_DP                 int
> INFO_HICNT              int
> INFO_ADJAF              double
> INFO_SVLEN              int
> INFO_RSEQ               string
> INFO_MSigDb             array
> INFO_NMD                array
> INFO_ANN                
> array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>
> INFO_BIAS               string
> INFO_MQ                 double
> INFO_HIAF               double
> INFO_END                int
> INFO_SPLITREAD          int
> INFO_GDAMP              int
> INFO_LSEQ               string
> INFO_LOF                array
> INFO_SAMPLE             string
> INFO_AMPFLAG            int
> INFO_SN                 double
> INFO_SVTYPE             string
> INFO_TYPE               string
> INFO_MSILEN             double
> INFO_DUPRATE            double
> INFO_DBSNP_COMMON       int
> INFO_REFBIAS            string
> genotypes               
> array,ALD:array,AF:array,phased:boolean,calls:array,VD:int,depth:int,RD:array>>
>  {code}
> You can see that the column INFO_ANN is an array of structs and it contains 
> fields whose names include "/", such as "cDNA_pos/cDNA_length". 
> We believe this is the root cause of the following SparkException:
> {code:java}
> scala> val schema = spark.sql("CREATE OR REPLACE VIEW yuting AS SELECT 
> INFO_ANN FROM table_2611810")
> 24/01/31 07:50:02.658 [main] WARN  o.a.spark.sql.catalyst.util.package - 
> Truncated the string representation of a plan since it was too large. This 
> behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
> org.apache.spark.SparkException: Cannot recognize hive type string: 
> array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>,
>  column: INFO_ANN
>   at 
> org.apache.spark.sql.errors.QueryExecutionErrors$.cannotRecognizeHiveTypeError(QueryExecutionErrors.scala:1455)
>   at 
> org.apache.spark.sql.hive.client.HiveClientImpl$.getSparkSQLDataType(HiveClientImpl.scala:1022)
>   at 
> org.apache.spark.sql.hive.client.HiveClientImpl$.$anonfun$verifyColumnDataType$1(HiveClientImpl.scala:1037)
>   at scala.collection.Iterator.foreach(Iterator.scala:943)
>   at scala.collection.Iterator.foreach$(Iterator.scala:943)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
>   at scala.collection.IterableLike.foreach(IterableLike.scala:74)
>   at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
>   at 
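A minimal reproduction sketch of the failure described above (hypothetical: it assumes a Hive-enabled SparkSession named `spark`, and reuses the table, view, and column names from the report; it is not a verified test case):

```scala
// Hypothetical repro sketch, assuming a Hive-enabled SparkSession `spark`.
// A nested struct field whose name contains "/" is fine in the underlying
// table, but persisting a view pushes the schema through Hive's type-string
// parser, which cannot parse the "/" and raises the SparkException above.
val df = spark.range(1).selectExpr(
  "array(named_struct('cDNA_pos/cDNA_length', id)) AS INFO_ANN"
)
df.write.saveAsTable("table_2611810")   // writing the table itself succeeds

spark.sql(
  "CREATE OR REPLACE VIEW yuting AS SELECT INFO_ANN FROM table_2611810"
)   // reported to fail on 3.3.x with "Cannot recognize hive type string"
```

This sketch requires a running Spark deployment with Hive support, so it is illustrative rather than standalone-runnable.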

[jira] [Commented] (SPARK-46934) Unable to create Hive View from certain Spark Dataframe StructType

2024-02-19 Thread Yu-Ting LIN (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-46934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818633#comment-17818633
 ] 

Yu-Ting LIN commented on SPARK-46934:
-

[~dongjoon] What do you mean by "regression"?

> Unable to create Hive View from certain Spark Dataframe StructType
> --
>
> Key: SPARK-46934
> URL: https://issues.apache.org/jira/browse/SPARK-46934
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0, 3.3.2, 3.3.4
> Environment: Tested in Spark 3.3.0, 3.3.2.
>Reporter: Yu-Ting LIN
>Assignee: Kent Yao
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> We are trying to create a Hive view using the following SQL command: "CREATE OR 
> REPLACE VIEW yuting AS SELECT INFO_ANN FROM table_2611810".
> Our table_2611810 has certain columns that contain special characters such as "/". 
> Here is the schema of this table.
> {code:java}
> contigName              string
> start                   bigint
> end                     bigint
> names                   array
> referenceAllele         string
> alternateAlleles        array
> qual                    double
> filters                 array
> splitFromMultiAllelic    boolean
> INFO_NCAMP              int
> INFO_ODDRATIO           double
> INFO_NM                 double
> INFO_DBSNP_CAF          array
> INFO_SPANPAIR           int
> INFO_TLAMP              int
> INFO_PSTD               double
> INFO_QSTD               double
> INFO_SBF                double
> INFO_AF                 array
> INFO_QUAL               double
> INFO_SHIFT3             int
> INFO_VARBIAS            string
> INFO_HICOV              int
> INFO_PMEAN              double
> INFO_MSI                double
> INFO_VD                 int
> INFO_DP                 int
> INFO_HICNT              int
> INFO_ADJAF              double
> INFO_SVLEN              int
> INFO_RSEQ               string
> INFO_MSigDb             array
> INFO_NMD                array
> INFO_ANN                
> array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>
> INFO_BIAS               string
> INFO_MQ                 double
> INFO_HIAF               double
> INFO_END                int
> INFO_SPLITREAD          int
> INFO_GDAMP              int
> INFO_LSEQ               string
> INFO_LOF                array
> INFO_SAMPLE             string
> INFO_AMPFLAG            int
> INFO_SN                 double
> INFO_SVTYPE             string
> INFO_TYPE               string
> INFO_MSILEN             double
> INFO_DUPRATE            double
> INFO_DBSNP_COMMON       int
> INFO_REFBIAS            string
> genotypes               
> array,ALD:array,AF:array,phased:boolean,calls:array,VD:int,depth:int,RD:array>>
>  {code}
> You can see that the column INFO_ANN is an array of structs and it contains 
> fields whose names include "/", such as "cDNA_pos/cDNA_length". 
> We believe this is the root cause of the following SparkException:
> {code:java}
> scala> val schema = spark.sql("CREATE OR REPLACE VIEW yuting AS SELECT 
> INFO_ANN FROM table_2611810")
> 24/01/31 07:50:02.658 [main] WARN  o.a.spark.sql.catalyst.util.package - 
> Truncated the string representation of a plan since it was too large. This 
> behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
> org.apache.spark.SparkException: Cannot recognize hive type string: 
> array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>,
>  column: INFO_ANN
>   at 
> org.apache.spark.sql.errors.QueryExecutionErrors$.cannotRecognizeHiveTypeError(QueryExecutionErrors.scala:1455)
>   at 
> org.apache.spark.sql.hive.client.HiveClientImpl$.getSparkSQLDataType(HiveClientImpl.scala:1022)
>   at 
> org.apache.spark.sql.hive.client.HiveClientImpl$.$anonfun$verifyColumnDataType$1(HiveClientImpl.scala:1037)
>   at scala.collection.Iterator.foreach(Iterator.scala:943)
>   at scala.collection.Iterator.foreach$(Iterator.scala:943)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
>   at scala.collection.IterableLike.foreach(IterableLike.scala:74)
>   at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
>   at org.apache.spark.sql.types.StructType.foreach(StructType.scala:102)
>   at 
> 

[jira] [Commented] (SPARK-46934) Unable to create Hive View from certain Spark Dataframe StructType

2024-02-19 Thread Yu-Ting LIN (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-46934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818629#comment-17818629
 ] 

Yu-Ting LIN commented on SPARK-46934:
-

[~dongjoon] We are mainly using Spark 3.3 and plan to migrate to Spark 3.5, 
so it would be good for us to have this fix in Spark 3.5. Many thanks.

> Unable to create Hive View from certain Spark Dataframe StructType
> --
>
> Key: SPARK-46934
> URL: https://issues.apache.org/jira/browse/SPARK-46934
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0, 3.3.2, 3.3.4
> Environment: Tested in Spark 3.3.0, 3.3.2.
>Reporter: Yu-Ting LIN
>Assignee: Kent Yao
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> We are trying to create a Hive view using the following SQL command: "CREATE OR 
> REPLACE VIEW yuting AS SELECT INFO_ANN FROM table_2611810".
> Our table_2611810 has certain columns that contain special characters such as "/". 
> Here is the schema of this table.
> {code:java}
> contigName              string
> start                   bigint
> end                     bigint
> names                   array
> referenceAllele         string
> alternateAlleles        array
> qual                    double
> filters                 array
> splitFromMultiAllelic    boolean
> INFO_NCAMP              int
> INFO_ODDRATIO           double
> INFO_NM                 double
> INFO_DBSNP_CAF          array
> INFO_SPANPAIR           int
> INFO_TLAMP              int
> INFO_PSTD               double
> INFO_QSTD               double
> INFO_SBF                double
> INFO_AF                 array
> INFO_QUAL               double
> INFO_SHIFT3             int
> INFO_VARBIAS            string
> INFO_HICOV              int
> INFO_PMEAN              double
> INFO_MSI                double
> INFO_VD                 int
> INFO_DP                 int
> INFO_HICNT              int
> INFO_ADJAF              double
> INFO_SVLEN              int
> INFO_RSEQ               string
> INFO_MSigDb             array
> INFO_NMD                array
> INFO_ANN                
> array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>
> INFO_BIAS               string
> INFO_MQ                 double
> INFO_HIAF               double
> INFO_END                int
> INFO_SPLITREAD          int
> INFO_GDAMP              int
> INFO_LSEQ               string
> INFO_LOF                array
> INFO_SAMPLE             string
> INFO_AMPFLAG            int
> INFO_SN                 double
> INFO_SVTYPE             string
> INFO_TYPE               string
> INFO_MSILEN             double
> INFO_DUPRATE            double
> INFO_DBSNP_COMMON       int
> INFO_REFBIAS            string
> genotypes               
> array,ALD:array,AF:array,phased:boolean,calls:array,VD:int,depth:int,RD:array>>
>  {code}
> You can see that the column INFO_ANN is an array of structs and it contains 
> fields whose names include "/", such as "cDNA_pos/cDNA_length". 
> We believe this is the root cause of the following SparkException:
> {code:java}
> scala> val schema = spark.sql("CREATE OR REPLACE VIEW yuting AS SELECT 
> INFO_ANN FROM table_2611810")
> 24/01/31 07:50:02.658 [main] WARN  o.a.spark.sql.catalyst.util.package - 
> Truncated the string representation of a plan since it was too large. This 
> behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
> org.apache.spark.SparkException: Cannot recognize hive type string: 
> array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>,
>  column: INFO_ANN
>   at 
> org.apache.spark.sql.errors.QueryExecutionErrors$.cannotRecognizeHiveTypeError(QueryExecutionErrors.scala:1455)
>   at 
> org.apache.spark.sql.hive.client.HiveClientImpl$.getSparkSQLDataType(HiveClientImpl.scala:1022)
>   at 
> org.apache.spark.sql.hive.client.HiveClientImpl$.$anonfun$verifyColumnDataType$1(HiveClientImpl.scala:1037)
>   at scala.collection.Iterator.foreach(Iterator.scala:943)
>   at scala.collection.Iterator.foreach$(Iterator.scala:943)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
>   at scala.collection.IterableLike.foreach(IterableLike.scala:74)
>   at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
>   at 
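> The root-cause claim above can be illustrated with a toy type-string splitter 
> (purely illustrative — this is not Spark's `HiveClientImpl.getSparkSQLDataType` 
> logic): a strict identifier rule rejects a field name such as 
> `cDNA_pos/cDNA_length`, while names without "/" pass.
> {code:python}
> import re
>
> IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")  # strict field-name rule
>
> def top_level_field_names(struct_body):
>     """Split a struct<...> body into fields at nesting depth 0 and
>     return their names (the part before the first ':')."""
>     fields, depth, buf = [], 0, []
>     for ch in struct_body:
>         if ch == "<":
>             depth += 1
>         elif ch == ">":
>             depth -= 1
>         if ch == "," and depth == 0:
>             fields.append("".join(buf))
>             buf = []
>         else:
>             buf.append(ch)
>     if buf:
>         fields.append("".join(buf))
>     return [f.split(":", 1)[0] for f in fields]
>
> names = top_level_field_names(
>     "Allele:string,cDNA_pos/cDNA_length:struct<a:int,b:int>,Distance:int")
> rejected = [n for n in names if not IDENT.match(n)]
> {code}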

[jira] [Updated] (SPARK-47080) Fix `HistoryServerSuite` to get `getNumJobs` in `eventually`

2024-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-47080:
---
Labels: pull-request-available  (was: )

> Fix `HistoryServerSuite` to get `getNumJobs` in `eventually`
> 
>
> Key: SPARK-47080
> URL: https://issues.apache.org/jira/browse/SPARK-47080
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core, Tests
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Priority: Minor
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47099) The `start` value of `paramIndex` for the error class `UNEXPECTED_INPUT_TYPE` should be `1`

2024-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-47099:
---
Labels: pull-request-available  (was: )

> The `start` value of `paramIndex` for the error class `UNEXPECTED_INPUT_TYPE` 
> should be `1`
> 
>
> Key: SPARK-47099
> URL: https://issues.apache.org/jira/browse/SPARK-47099
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
>







[jira] [Created] (SPARK-47099) The `start` value of `paramIndex` for the error class `UNEXPECTED_INPUT_TYPE` should be `1`

2024-02-19 Thread BingKun Pan (Jira)
BingKun Pan created SPARK-47099:
---

 Summary: The `start` value of `paramIndex` for the error class 
`UNEXPECTED_INPUT_TYPE` should be `1`
 Key: SPARK-47099
 URL: https://issues.apache.org/jira/browse/SPARK-47099
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 4.0.0
Reporter: BingKun Pan









[jira] [Assigned] (SPARK-47097) Deflake "interrupt tag" at SparkSessionE2ESuite

2024-02-19 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-47097:


Assignee: Hyukjin Kwon

> Deflake "interrupt tag" at SparkSessionE2ESuite
> ---
>
> Key: SPARK-47097
> URL: https://issues.apache.org/jira/browse/SPARK-47097
> Project: Spark
>  Issue Type: Test
>  Components: Connect
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
>
> {code}
> - interrupt tag *** FAILED ***
>   The code passed to eventually never returned normally. Attempted 30 times 
> over 20.03742146498 seconds. Last failure message: 
> ListBuffer("2beba4ac-a994-45f5-bd46-fca3e43fb5ef") had length 1 instead of 
> expected length 2 Interrupted operations: 
> ListBuffer(2beba4ac-a994-45f5-bd46-fca3e43fb5ef).. 
> (SparkSessionE2ESuite.scala:216)
> {code}
> https://github.com/apache/spark/actions/runs/7959951623/job/21727929211
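> The deflake pattern referenced here — re-fetching the current state and 
> re-running the whole assertion inside `eventually` until it converges — can be 
> sketched with a generic retry helper (a hypothetical stand-in for ScalaTest's 
> `eventually`, not Spark's code):
> {code:python}
> import time
>
> def eventually(check, timeout=5.0, interval=0.05):
>     """Retry `check` until it stops raising AssertionError or `timeout`
>     elapses: the ScalaTest `eventually` idiom used to deflake tests."""
>     deadline = time.monotonic() + timeout
>     while True:
>         try:
>             return check()
>         except AssertionError:
>             if time.monotonic() >= deadline:
>                 raise
>             time.sleep(interval)
>
> # Simulated flaky condition: the interrupted-operations list only reaches
> # the expected length 2 after a short delay, as in the quoted failure.
> interrupted = []
> start = time.monotonic()
>
> def check():
>     if time.monotonic() - start > 0.1 and len(interrupted) < 2:
>         interrupted.append("op-%d" % len(interrupted))
>     assert len(interrupted) == 2, (
>         "had length %d instead of expected length 2" % len(interrupted))
>     return list(interrupted)
>
> result = eventually(check)
> {code}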






[jira] [Resolved] (SPARK-47097) Deflake "interrupt tag" at SparkSessionE2ESuite

2024-02-19 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-47097.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45173
[https://github.com/apache/spark/pull/45173]

> Deflake "interrupt tag" at SparkSessionE2ESuite
> ---
>
> Key: SPARK-47097
> URL: https://issues.apache.org/jira/browse/SPARK-47097
> Project: Spark
>  Issue Type: Test
>  Components: Connect
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> {code}
> - interrupt tag *** FAILED ***
>   The code passed to eventually never returned normally. Attempted 30 times 
> over 20.03742146498 seconds. Last failure message: 
> ListBuffer("2beba4ac-a994-45f5-bd46-fca3e43fb5ef") had length 1 instead of 
> expected length 2 Interrupted operations: 
> ListBuffer(2beba4ac-a994-45f5-bd46-fca3e43fb5ef).. 
> (SparkSessionE2ESuite.scala:216)
> {code}
> https://github.com/apache/spark/actions/runs/7959951623/job/21727929211






[jira] [Updated] (SPARK-46973) Add table cache for V2 tables

2024-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-46973:
---
Labels: pull-request-available  (was: )

> Add table cache for V2 tables
> -
>
> Key: SPARK-46973
> URL: https://issues.apache.org/jira/browse/SPARK-46973
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Allison Wang
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Updated] (SPARK-47016) Upgrade scalatest related dependencies to the 3.2.18 series

2024-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-47016:
---
Labels: pull-request-available  (was: )

> Upgrade scalatest related dependencies to the 3.2.18 series
> ---
>
> Key: SPARK-47016
> URL: https://issues.apache.org/jira/browse/SPARK-47016
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Resolved] (SPARK-46820) Fix error message regression by restoring new_msg

2024-02-19 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-46820.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 44859
[https://github.com/apache/spark/pull/44859]

> Fix error message regression by restoring new_msg
> -
>
> Key: SPARK-46820
> URL: https://issues.apache.org/jira/browse/SPARK-46820
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 4.0.0
>Reporter: Haejoon Lee
>Assignee: Haejoon Lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> >>> from pyspark.sql.types import StructType, StructField, StringType, 
> >>> IntegerType
> >>> schema = StructType([
> ...     StructField("name", StringType(), nullable=True),
> ...     StructField("age", IntegerType(), nullable=False)
> ... ])
> >>> df = spark.createDataFrame([("asd", None)], schema)
> pyspark.errors.exceptions.base.PySparkValueError: [CANNOT_BE_NONE] Argument 
> `obj` cannot be None.
>  
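> The regression is about which argument name the error reports. A toy 
> validation sketch (hypothetical helper, not PySpark's internals) shows the 
> shape of a check that produces a field-specific `CANNOT_BE_NONE` message:
> {code:python}
> # (field name, nullable) pairs — a stand-in for the StructType above.
> schema = [("name", True), ("age", False)]
>
> def validate_row(row, schema):
>     """Raise if a non-nullable field receives None, naming the field."""
>     for value, (field, nullable) in zip(row, schema):
>         if value is None and not nullable:
>             raise ValueError(
>                 "[CANNOT_BE_NONE] Argument `%s` cannot be None." % field)
>
> try:
>     validate_row(("asd", None), schema)
>     msg = None
> except ValueError as exc:
>     msg = str(exc)
> {code}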






[jira] [Assigned] (SPARK-46820) Fix error message regression by restoring new_msg

2024-02-19 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-46820:


Assignee: Haejoon Lee

> Fix error message regression by restoring new_msg
> -
>
> Key: SPARK-46820
> URL: https://issues.apache.org/jira/browse/SPARK-46820
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 4.0.0
>Reporter: Haejoon Lee
>Assignee: Haejoon Lee
>Priority: Major
>  Labels: pull-request-available
>
> >>> from pyspark.sql.types import StructType, StructField, StringType, 
> >>> IntegerType
> >>> schema = StructType([
> ...     StructField("name", StringType(), nullable=True),
> ...     StructField("age", IntegerType(), nullable=False)
> ... ])
> >>> df = spark.createDataFrame([("asd", None)], schema)
> pyspark.errors.exceptions.base.PySparkValueError: [CANNOT_BE_NONE] Argument 
> `obj` cannot be None.
>  






[jira] [Resolved] (SPARK-47093) Upgrade `mockito` to 5.10.0

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47093.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45169
[https://github.com/apache/spark/pull/45169]

> Upgrade `mockito` to 5.10.0
> ---
>
> Key: SPARK-47093
> URL: https://issues.apache.org/jira/browse/SPARK-47093
> Project: Spark
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Updated] (SPARK-47097) Deflake "interrupt tag" at SparkSessionE2ESuite

2024-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-47097:
---
Labels: pull-request-available  (was: )

> Deflake "interrupt tag" at SparkSessionE2ESuite
> ---
>
> Key: SPARK-47097
> URL: https://issues.apache.org/jira/browse/SPARK-47097
> Project: Spark
>  Issue Type: Test
>  Components: Connect
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
>
> {code}
> - interrupt tag *** FAILED ***
>   The code passed to eventually never returned normally. Attempted 30 times 
> over 20.03742146498 seconds. Last failure message: 
> ListBuffer("2beba4ac-a994-45f5-bd46-fca3e43fb5ef") had length 1 instead of 
> expected length 2 Interrupted operations: 
> ListBuffer(2beba4ac-a994-45f5-bd46-fca3e43fb5ef).. 
> (SparkSessionE2ESuite.scala:216)
> {code}
> https://github.com/apache/spark/actions/runs/7959951623/job/21727929211






[jira] [Updated] (SPARK-47097) Deflake "interrupt tag" at SparkSessionE2ESuite

2024-02-19 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-47097:
-
Issue Type: Test  (was: Improvement)

> Deflake "interrupt tag" at SparkSessionE2ESuite
> ---
>
> Key: SPARK-47097
> URL: https://issues.apache.org/jira/browse/SPARK-47097
> Project: Spark
>  Issue Type: Test
>  Components: Connect
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Priority: Major
>
> {code}
> - interrupt tag *** FAILED ***
>   The code passed to eventually never returned normally. Attempted 30 times 
> over 20.03742146498 seconds. Last failure message: 
> ListBuffer("2beba4ac-a994-45f5-bd46-fca3e43fb5ef") had length 1 instead of 
> expected length 2 Interrupted operations: 
> ListBuffer(2beba4ac-a994-45f5-bd46-fca3e43fb5ef).. 
> (SparkSessionE2ESuite.scala:216)
> {code}
> https://github.com/apache/spark/actions/runs/7959951623/job/21727929211






[jira] [Created] (SPARK-47097) Deflake "interrupt tag" at SparkSessionE2ESuite

2024-02-19 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-47097:


 Summary: Deflake "interrupt tag" at SparkSessionE2ESuite
 Key: SPARK-47097
 URL: https://issues.apache.org/jira/browse/SPARK-47097
 Project: Spark
  Issue Type: Improvement
  Components: Connect
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon


{code}
- interrupt tag *** FAILED ***
  The code passed to eventually never returned normally. Attempted 30 times 
over 20.03742146498 seconds. Last failure message: 
ListBuffer("2beba4ac-a994-45f5-bd46-fca3e43fb5ef") had length 1 instead of 
expected length 2 Interrupted operations: 
ListBuffer(2beba4ac-a994-45f5-bd46-fca3e43fb5ef).. 
(SparkSessionE2ESuite.scala:216)
{code}

https://github.com/apache/spark/actions/runs/7959951623/job/21727929211






[jira] [Resolved] (SPARK-47062) Make Spark Connect Plugins Java Compatible

2024-02-19 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-47062.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45114
[https://github.com/apache/spark/pull/45114]

> Make Spark Connect Plugins Java Compatible
> --
>
> Key: SPARK-47062
> URL: https://issues.apache.org/jira/browse/SPARK-47062
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect
>Affects Versions: 3.5.0
>Reporter: Martin Grund
>Assignee: Martin Grund
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Make Spark Connect Plugins Java Compatible






[jira] [Assigned] (SPARK-37434) Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64`

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-37434:
-

Assignee: Yang Jie  (was: Apache Spark)

> Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64`
> 
>
> Key: SPARK-37434
> URL: https://issues.apache.org/jira/browse/SPARK-37434
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> After SPARK-37272  and SPARK-37282,  we can manually add
> {code:java}
> -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest
>  {code}
> when running mvn test or sbt test to disable unsupported UTs on macOS with Apple 
> Silicon.
>  
> We can add a profile to activate this property automatically when running 
> UTs on macOS with Apple Silicon.
>  
>  
>  
>  
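> The profile idea sketched above could look roughly like the following pom.xml 
> fragment — a sketch only: the profile id and exact activation keys are 
> assumptions, not Spark's actual build file.
> {code:xml}
> <!-- Hypothetical profile: set the exclude tags automatically on
>      Apple Silicon macOS. -->
> <profile>
>   <id>macos-aarch64</id>
>   <activation>
>     <os>
>       <family>mac</family>
>       <arch>aarch64</arch>
>     </os>
>   </activation>
>   <properties>
>     <test.exclude.tags>org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest</test.exclude.tags>
>   </properties>
> </profile>
> {code}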






[jira] [Updated] (SPARK-47096) Upgrade Python to 3.11 in Maven builds

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47096:
--
Parent: SPARK-44111
Issue Type: Sub-task  (was: Improvement)

> Upgrade Python to 3.11 in Maven builds
> --
>
> Key: SPARK-47096
> URL: https://issues.apache.org/jira/browse/SPARK-47096
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> {code}
>   Error: dyld[4738]: Library not loaded: 
> /usr/local/opt/gettext/lib/libintl.8.dylib
> Referenced from:Error: -3B8E-94C6-6649527BFDBE> 
> /Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9
> Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such 
> file), 
> '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib'
>  (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), 
> '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' 
> (no such file, not in dyld cache)
>   ./setup.sh: line 52:  4738 Abort trap: 6   ./python -m ensurepip
> {code}
> https://github.com/apache/spark/actions/runs/7964626045/job/21742574260
> https://github.com/actions/runner-images/blob/main/images/macos/macos-14-Readme.md
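> The fix amounts to pinning the Maven build to an interpreter whose toolcache 
> build works on macos-14 (arm64). A sketch of the relevant GitHub Actions step 
> (the step name is an assumption; `actions/setup-python` and its 
> `python-version` input are the standard API):
> {code:yaml}
> # Hypothetical workflow step: pin Python 3.11, whose macos-14 (arm64)
> # toolcache build does not depend on the missing gettext dylib.
> - name: Install Python
>   uses: actions/setup-python@v5
>   with:
>     python-version: '3.11'
> {code}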






[jira] [Updated] (SPARK-47096) Upgrade Python to 3.11 in Maven builds

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47096:
--
Summary: Upgrade Python to 3.11 in Maven builds  (was: Upgrade Python 
version in Maven builds)

> Upgrade Python to 3.11 in Maven builds
> --
>
> Key: SPARK-47096
> URL: https://issues.apache.org/jira/browse/SPARK-47096
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> {code}
>   Error: dyld[4738]: Library not loaded: 
> /usr/local/opt/gettext/lib/libintl.8.dylib
> Referenced from:Error: -3B8E-94C6-6649527BFDBE> 
> /Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9
> Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such 
> file), 
> '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib'
>  (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), 
> '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' 
> (no such file, not in dyld cache)
>   ./setup.sh: line 52:  4738 Abort trap: 6   ./python -m ensurepip
> {code}
> https://github.com/apache/spark/actions/runs/7964626045/job/21742574260
> https://github.com/actions/runner-images/blob/main/images/macos/macos-14-Readme.md






[jira] [Resolved] (SPARK-47096) Upgrade Python version in Maven build in macos-14 build

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47096.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45172
[https://github.com/apache/spark/pull/45172]

> Upgrade Python version in Maven build in macos-14 build
> ---
>
> Key: SPARK-47096
> URL: https://issues.apache.org/jira/browse/SPARK-47096
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> {code}
>   Error: dyld[4738]: Library not loaded: 
> /usr/local/opt/gettext/lib/libintl.8.dylib
> Referenced from:Error: -3B8E-94C6-6649527BFDBE> 
> /Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9
> Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such 
> file), 
> '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib'
>  (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), 
> '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' 
> (no such file, not in dyld cache)
>   ./setup.sh: line 52:  4738 Abort trap: 6   ./python -m ensurepip
> {code}
> https://github.com/apache/spark/actions/runs/7964626045/job/21742574260
> https://github.com/actions/runner-images/blob/main/images/macos/macos-14-Readme.md






[jira] [Assigned] (SPARK-47096) Upgrade Python version in Maven build in macos-14 build

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47096:
-

Assignee: Hyukjin Kwon

> Upgrade Python version in Maven build in macos-14 build
> ---
>
> Key: SPARK-47096
> URL: https://issues.apache.org/jira/browse/SPARK-47096
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
>
> {code}
>   Error: dyld[4738]: Library not loaded: 
> /usr/local/opt/gettext/lib/libintl.8.dylib
> Referenced from:Error: -3B8E-94C6-6649527BFDBE> 
> /Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9
> Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such 
> file), 
> '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib'
>  (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), 
> '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' 
> (no such file, not in dyld cache)
>   ./setup.sh: line 52:  4738 Abort trap: 6   ./python -m ensurepip
> {code}
> https://github.com/apache/spark/actions/runs/7964626045/job/21742574260
> https://github.com/actions/runner-images/blob/main/images/macos/macos-14-Readme.md






[jira] [Updated] (SPARK-47096) Upgrade Python version in Maven builds

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47096:
--
Summary: Upgrade Python version in Maven builds  (was: Upgrade Python 
version in Maven build in macos-14 build)

> Upgrade Python version in Maven builds
> --
>
> Key: SPARK-47096
> URL: https://issues.apache.org/jira/browse/SPARK-47096
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> {code}
>   Error: dyld[4738]: Library not loaded: 
> /usr/local/opt/gettext/lib/libintl.8.dylib
> Referenced from:Error: -3B8E-94C6-6649527BFDBE> 
> /Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9
> Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such 
> file), 
> '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib'
>  (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), 
> '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' 
> (no such file, not in dyld cache)
>   ./setup.sh: line 52:  4738 Abort trap: 6   ./python -m ensurepip
> {code}
> https://github.com/apache/spark/actions/runs/7964626045/job/21742574260
> https://github.com/actions/runner-images/blob/main/images/macos/macos-14-Readme.md






[jira] [Updated] (SPARK-47096) Upgrade Python version in Maven build in macos-14 build

2024-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-47096:
---
Labels: pull-request-available  (was: )

> Upgrade Python version in Maven build in macos-14 build
> ---
>
> Key: SPARK-47096
> URL: https://issues.apache.org/jira/browse/SPARK-47096
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
>
> {code}
>   Error: dyld[4738]: Library not loaded: 
> /usr/local/opt/gettext/lib/libintl.8.dylib
> Referenced from:Error: -3B8E-94C6-6649527BFDBE> 
> /Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9
> Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such 
> file), 
> '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib'
>  (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), 
> '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' 
> (no such file, not in dyld cache)
>   ./setup.sh: line 52:  4738 Abort trap: 6   ./python -m ensurepip
> {code}
> https://github.com/apache/spark/actions/runs/7964626045/job/21742574260
> https://github.com/actions/runner-images/blob/main/images/macos/macos-14-Readme.md






[jira] [Updated] (SPARK-47096) Upgrade Python version in Maven build in macos-14 build

2024-02-19 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-47096:
-
Description: 
{code}
  Error: dyld[4738]: Library not loaded: 
/usr/local/opt/gettext/lib/libintl.8.dylib
Referenced from:  
/Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9
Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), 
'/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib'
 (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), 
'/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' (no 
such file, not in dyld cache)
  ./setup.sh: line 52:  4738 Abort trap: 6   ./python -m ensurepip
{code}

https://github.com/apache/spark/actions/runs/7964626045/job/21742574260

https://github.com/actions/runner-images/blob/main/images/macos/macos-14-Readme.md

  was:
{code}
  Error: dyld[4738]: Library not loaded: 
/usr/local/opt/gettext/lib/libintl.8.dylib
Referenced from:  
/Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9
Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), 
'/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib'
 (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), 
'/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' (no 
such file, not in dyld cache)
  ./setup.sh: line 52:  4738 Abort trap: 6   ./python -m ensurepip
{code}

https://github.com/apache/spark/actions/runs/7964626045/job/21742574260


> Upgrade Python version in Maven build in macos-14 build
> ---
>
> Key: SPARK-47096
> URL: https://issues.apache.org/jira/browse/SPARK-47096
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Priority: Major
>
> {code}
>   Error: dyld[4738]: Library not loaded: 
> /usr/local/opt/gettext/lib/libintl.8.dylib
> Referenced from:Error: -3B8E-94C6-6649527BFDBE> 
> /Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9
> Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such 
> file), 
> '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib'
>  (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), 
> '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' 
> (no such file, not in dyld cache)
>   ./setup.sh: line 52:  4738 Abort trap: 6   ./python -m ensurepip
> {code}
> https://github.com/apache/spark/actions/runs/7964626045/job/21742574260
> https://github.com/actions/runner-images/blob/main/images/macos/macos-14-Readme.md






[jira] [Updated] (SPARK-47096) Upgrade Python version in Maven build in macos-14 build

2024-02-19 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-47096:
-
Summary: Upgrade Python version in Maven build in macos-14 build  (was: 
Upgrade Python version in Maven build for macos compatibility)

> Upgrade Python version in Maven build in macos-14 build
> ---
>
> Key: SPARK-47096
> URL: https://issues.apache.org/jira/browse/SPARK-47096
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Priority: Major
>
> {code}
>   Error: dyld[4738]: Library not loaded: 
> /usr/local/opt/gettext/lib/libintl.8.dylib
> Referenced from:Error: -3B8E-94C6-6649527BFDBE> 
> /Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9
> Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such 
> file), 
> '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib'
>  (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), 
> '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' 
> (no such file, not in dyld cache)
>   ./setup.sh: line 52:  4738 Abort trap: 6   ./python -m ensurepip
> {code}
> https://github.com/apache/spark/actions/runs/7964626045/job/21742574260






[jira] [Created] (SPARK-47096) Upgrade Python version in Maven build for macos compatibility

2024-02-19 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-47096:


 Summary: Upgrade Python version in Maven build for macos 
compatibility
 Key: SPARK-47096
 URL: https://issues.apache.org/jira/browse/SPARK-47096
 Project: Spark
  Issue Type: Improvement
  Components: Project Infra
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon


{code}
  Error: dyld[4738]: Library not loaded: 
/usr/local/opt/gettext/lib/libintl.8.dylib
Referenced from:  
/Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9
Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), 
'/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib'
 (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), 
'/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' (no 
such file, not in dyld cache)
  ./setup.sh: line 52:  4738 Abort trap: 6   ./python -m ensurepip
{code}

https://github.com/apache/spark/actions/runs/7964626045/job/21742574260
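The failure above happens because the cached Python 3.9 build links against Homebrew's gettext keg, which is absent on the newer macOS runner image. A minimal diagnostic sketch under stated assumptions (the `formula_for_dylib` helper is our own illustration, and it assumes Homebrew's `/usr/local/opt/<formula>/` keg layout on Intel macOS runners; it is not part of the proposed fix, which is to upgrade the Python build itself):

```shell
#!/bin/sh
# Given a dylib path taken from a dyld "Library not loaded" error, recover
# the Homebrew formula that provides it. Assumes the /usr/local/opt/<formula>/
# keg layout used by Homebrew on Intel macOS runners.
formula_for_dylib() {
  printf '%s\n' "$1" | sed -n 's|^/usr/local/opt/\([^/]*\)/.*|\1|p'
}

missing="/usr/local/opt/gettext/lib/libintl.8.dylib"
formula_for_dylib "$missing"   # prints: gettext
# A one-off remediation would then be: brew install gettext
# (the durable fix, as this issue proposes, is a Python build that does not
# link against the Homebrew keg at all).
```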



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-37434) Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64`

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-37434.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45170
[https://github.com/apache/spark/pull/45170]

> Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64`
> 
>
> Key: SPARK-37434
> URL: https://issues.apache.org/jira/browse/SPARK-37434
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> After SPARK-37272 and SPARK-37282, we can manually add
> {code:java}
> -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest
>  {code}
> when running mvn test or sbt test to disable unsupported UTs on macOS using 
> Apple Silicon.
>  
> We can add a profile to activate this property automatically when running 
> UTs on macOS using Apple Silicon.
>  
>  
>  
>  
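The quoted description boils down to passing the exclusion property only on Apple Silicon macOS. A sketch of that idea as a shell helper (the `exclude_tags` function and the uname-based detection are our illustration; the actual change adds a Maven profile that activates automatically):

```shell
#!/bin/sh
# Emit the system property that excludes LevelDB/RocksDB-tagged tests, but
# only on macOS with Apple Silicon (uname -s = Darwin, uname -m = arm64).
exclude_tags() {
  if [ "$1" = "Darwin" ] && [ "$2" = "arm64" ]; then
    printf '%s\n' "-Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest"
  fi
}

# usage: mvn test $(exclude_tags "$(uname -s)" "$(uname -m)")
exclude_tags Darwin arm64
```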



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-37434) Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64`

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-37434:
-

Assignee: Apache Spark

> Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64`
> 
>
> Key: SPARK-37434
> URL: https://issues.apache.org/jira/browse/SPARK-37434
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Minor
>  Labels: pull-request-available
>
> After SPARK-37272 and SPARK-37282, we can manually add
> {code:java}
> -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest
>  {code}
> when running mvn test or sbt test to disable unsupported UTs on macOS using 
> Apple Silicon.
>  
> We can add a profile to activate this property automatically when running 
> UTs on macOS using Apple Silicon.
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47095) Uses proper options for command script in macos-14 build

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47095.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45171
[https://github.com/apache/spark/pull/45171]

> Uses proper options for command script in macos-14 build
> 
>
> Key: SPARK-47095
> URL: https://issues.apache.org/jira/browse/SPARK-47095
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> https://github.com/apache/spark/actions/runs/7964626045/job/21742573399
> It fails as below:
> {code}
> Run # Fix for TTY related issues when launching the Ammonite REPL in tests.
> /usr/bin/script: illegal option -- c
> usage: script [-aeFkpqr] [-t time] [file [command ...]]
>script -p [-deq] [-T fmt] [file]
> Error: Process completed with exit code 1.
> {code}
> See https://man.freebsd.org/cgi/man.cgi?script(1)
> https://man7.org/linux/man-pages/man1/script.1.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47095) Uses proper options for command script in macos-14 build

2024-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-47095:
---
Labels: pull-request-available  (was: )

> Uses proper options for command script in macos-14 build
> 
>
> Key: SPARK-47095
> URL: https://issues.apache.org/jira/browse/SPARK-47095
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
>
> https://github.com/apache/spark/actions/runs/7964626045/job/21742573399
> It fails as below:
> {code}
> Run # Fix for TTY related issues when launching the Ammonite REPL in tests.
> /usr/bin/script: illegal option -- c
> usage: script [-aeFkpqr] [-t time] [file [command ...]]
>script -p [-deq] [-T fmt] [file]
> Error: Process completed with exit code 1.
> {code}
> See https://man.freebsd.org/cgi/man.cgi?script(1)
> https://man7.org/linux/man-pages/man1/script.1.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47095) Uses proper options for command script in macos-14 build

2024-02-19 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-47095:
-
Description: 
https://github.com/apache/spark/actions/runs/7964626045/job/21742573399

It fails as below:

{code}
Run # Fix for TTY related issues when launching the Ammonite REPL in tests.
/usr/bin/script: illegal option -- c
usage: script [-aeFkpqr] [-t time] [file [command ...]]
   script -p [-deq] [-T fmt] [file]
Error: Process completed with exit code 1.
{code}

See https://man.freebsd.org/cgi/man.cgi?script(1)
https://man7.org/linux/man-pages/man1/script.1.html

  was:
https://github.com/apache/spark/actions/runs/7964626045/job/21742573399

It fails as below:

{code}
Run # Fix for TTY related issues when launching the Ammonite REPL in tests.
/usr/bin/script: illegal option -- c
usage: script [-aeFkpqr] [-t time] [file [command ...]]
   script -p [-deq] [-T fmt] [file]
Error: Process completed with exit code 1.
{code}


> Uses proper options for command script in macos-14 build
> 
>
> Key: SPARK-47095
> URL: https://issues.apache.org/jira/browse/SPARK-47095
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>
> https://github.com/apache/spark/actions/runs/7964626045/job/21742573399
> It fails as below:
> {code}
> Run # Fix for TTY related issues when launching the Ammonite REPL in tests.
> /usr/bin/script: illegal option -- c
> usage: script [-aeFkpqr] [-t time] [file [command ...]]
>script -p [-deq] [-T fmt] [file]
> Error: Process completed with exit code 1.
> {code}
> See https://man.freebsd.org/cgi/man.cgi?script(1)
> https://man7.org/linux/man-pages/man1/script.1.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47095) Uses proper options for command script for macos-14 build

2024-02-19 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-47095:
-
Summary: Uses proper options for command script for macos-14 build  (was: 
Uses Mac dedicated options for command script for macos-14 build)

> Uses proper options for command script for macos-14 build
> -
>
> Key: SPARK-47095
> URL: https://issues.apache.org/jira/browse/SPARK-47095
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Priority: Major
>
> https://github.com/apache/spark/actions/runs/7964626045/job/21742573399
> It fails as below:
> {code}
> Run # Fix for TTY related issues when launching the Ammonite REPL in tests.
> /usr/bin/script: illegal option -- c
> usage: script [-aeFkpqr] [-t time] [file [command ...]]
>script -p [-deq] [-T fmt] [file]
> Error: Process completed with exit code 1.
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47095) Uses proper options for command script for macos-14 build

2024-02-19 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-47095:


Assignee: Hyukjin Kwon

> Uses proper options for command script for macos-14 build
> -
>
> Key: SPARK-47095
> URL: https://issues.apache.org/jira/browse/SPARK-47095
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>
> https://github.com/apache/spark/actions/runs/7964626045/job/21742573399
> It fails as below:
> {code}
> Run # Fix for TTY related issues when launching the Ammonite REPL in tests.
> /usr/bin/script: illegal option -- c
> usage: script [-aeFkpqr] [-t time] [file [command ...]]
>script -p [-deq] [-T fmt] [file]
> Error: Process completed with exit code 1.
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47095) Uses Mac dedicated options for command script for macos-14 build

2024-02-19 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-47095:
-
Summary: Uses Mac dedicated options for command script for macos-14 build  
(was: Uses Mac dedicated options for command script)

> Uses Mac dedicated options for command script for macos-14 build
> 
>
> Key: SPARK-47095
> URL: https://issues.apache.org/jira/browse/SPARK-47095
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Priority: Major
>
> https://github.com/apache/spark/actions/runs/7964626045/job/21742573399
> It fails as below:
> {code}
> Run # Fix for TTY related issues when launching the Ammonite REPL in tests.
> /usr/bin/script: illegal option -- c
> usage: script [-aeFkpqr] [-t time] [file [command ...]]
>script -p [-deq] [-T fmt] [file]
> Error: Process completed with exit code 1.
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47095) Uses proper options for command script in macos-14 build

2024-02-19 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-47095:
-
Summary: Uses proper options for command script in macos-14 build  (was: 
Uses proper options for command script for macos-14 build)

> Uses proper options for command script in macos-14 build
> 
>
> Key: SPARK-47095
> URL: https://issues.apache.org/jira/browse/SPARK-47095
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>
> https://github.com/apache/spark/actions/runs/7964626045/job/21742573399
> It fails as below:
> {code}
> Run # Fix for TTY related issues when launching the Ammonite REPL in tests.
> /usr/bin/script: illegal option -- c
> usage: script [-aeFkpqr] [-t time] [file [command ...]]
>script -p [-deq] [-T fmt] [file]
> Error: Process completed with exit code 1.
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47095) Uses Mac dedicated options for command script

2024-02-19 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-47095:


 Summary: Uses Mac dedicated options for command script
 Key: SPARK-47095
 URL: https://issues.apache.org/jira/browse/SPARK-47095
 Project: Spark
  Issue Type: Improvement
  Components: Project Infra
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon


https://github.com/apache/spark/actions/runs/7964626045/job/21742573399

It fails as below:

{code}
Run # Fix for TTY related issues when launching the Ammonite REPL in tests.
/usr/bin/script: illegal option -- c
usage: script [-aeFkpqr] [-t time] [file [command ...]]
   script -p [-deq] [-T fmt] [file]
Error: Process completed with exit code 1.
{code}
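The error above comes from macOS's BSD `script`, which does not accept the util-linux `-c` option used on Linux; the two implementations take the command in different positions. A minimal sketch of a portable invocation builder (the `script_cmd` helper name and the uname-based dispatch are our own illustration, not necessarily the fix that was merged):

```shell
#!/bin/sh
# Build the platform-appropriate `script` invocation for running a command
# under a pseudo-TTY. BSD script (macOS) takes the command as trailing
# arguments; util-linux script (Linux) takes it via the -c option.
script_cmd() {
  os="$1"   # output of `uname`
  cmd="$2"  # command to run under a TTY
  if [ "$os" = "Darwin" ]; then
    printf 'script -q /dev/null %s\n' "$cmd"
  else
    printf 'script -qec "%s" /dev/null\n' "$cmd"
  fi
}

script_cmd Darwin "./bin/spark-shell"   # prints: script -q /dev/null ./bin/spark-shell
script_cmd Linux  "./bin/spark-shell"   # prints: script -qec "./bin/spark-shell" /dev/null
```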



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47092) Add `getUriBuilder` to `o.a.s.u.Utils` and use it

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47092.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45168
[https://github.com/apache/spark/pull/45168]

> Add `getUriBuilder` to `o.a.s.u.Utils` and use it
> -
>
> Key: SPARK-47092
> URL: https://issues.apache.org/jira/browse/SPARK-47092
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47094) SPJ : Dynamically rebalance number of buckets when they are not equal

2024-02-19 Thread Himadri Pal (Jira)
Himadri Pal created SPARK-47094:
---

 Summary: SPJ : Dynamically rebalance number of buckets when they 
are not equal
 Key: SPARK-47094
 URL: https://issues.apache.org/jira/browse/SPARK-47094
 Project: Spark
  Issue Type: New Feature
  Components: Spark Core
Affects Versions: 3.4.0, 3.3.0
Reporter: Himadri Pal


SPJ: Storage Partition Join works with Iceberg tables when both tables have the 
same number of buckets. As part of this feature request, we would like Spark to 
gather the bucket counts from both tables and dynamically rebalance them via 
coalesce or repartition so that SPJ still works. In this case we would still 
have to shuffle, but it would be better than no SPJ.

Use case:

Often we do not control the input tables, so it is not possible to change their 
partitioning scheme. As consumers, we would still like them to participate in 
SPJ when joined with other tables and output tables that have a different 
number of buckets.

In that scenario we would need to read those tables and rewrite them with a 
matching number of buckets for SPJ to work; this extra step could outweigh the 
benefit of the reduced shuffle from SPJ. And when multiple different tables are 
joined, each would need to be rewritten with a matching bucket count.

If this feature is implemented, SPJ functionality will be more powerful.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37434) Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64`

2024-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-37434:
---
Labels: pull-request-available  (was: )

> Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64`
> 
>
> Key: SPARK-37434
> URL: https://issues.apache.org/jira/browse/SPARK-37434
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Minor
>  Labels: pull-request-available
>
> After SPARK-37272 and SPARK-37282, we can manually add
> {code:java}
> -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest
>  {code}
> when running mvn test or sbt test to disable unsupported UTs on macOS using 
> Apple Silicon.
>  
> We can add a profile to activate this property automatically when running 
> UTs on macOS using Apple Silicon.
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37434) Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64`

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-37434:
--
Parent: SPARK-44111
Issue Type: Sub-task  (was: Test)

> Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64`
> 
>
> Key: SPARK-37434
> URL: https://issues.apache.org/jira/browse/SPARK-37434
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Minor
>
> After SPARK-37272 and SPARK-37282, we can manually add
> {code:java}
> -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest
>  {code}
> when running mvn test or sbt test to disable unsupported UTs on macOS using 
> Apple Silicon.
>  
> We can add a profile to activate this property automatically when running 
> UTs on macOS using Apple Silicon.
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37434) Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64`

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-37434:
--
Summary: Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64`  
(was: Disable `ExtendedLevelDBTest` on `MacOS/aarch64`)

> Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64`
> 
>
> Key: SPARK-37434
> URL: https://issues.apache.org/jira/browse/SPARK-37434
> Project: Spark
>  Issue Type: Test
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Minor
>
> After SPARK-37272 and SPARK-37282, we can manually add
> {code:java}
> -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest
>  {code}
> when running mvn test or sbt test to disable unsupported UTs on macOS using 
> Apple Silicon.
>  
> We can add a profile to activate this property automatically when running 
> UTs on macOS using Apple Silicon.
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37434) Disable `ExtendedLevelDBTest` on `MacOS/aarch64`

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-37434:
--
Affects Version/s: 4.0.0
   (was: 3.3.0)

> Disable `ExtendedLevelDBTest` on `MacOS/aarch64`
> 
>
> Key: SPARK-37434
> URL: https://issues.apache.org/jira/browse/SPARK-37434
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Major
>
> After SPARK-37272 and SPARK-37282, we can manually add
> {code:java}
> -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest
>  {code}
> when running mvn test or sbt test to disable unsupported UTs on macOS using 
> Apple Silicon.
>  
> We can add a profile to activate this property automatically when running 
> UTs on macOS using Apple Silicon.
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37434) Disable `ExtendedLevelDBTest` on `MacOS/aarch64`

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-37434:
--
Summary: Disable `ExtendedLevelDBTest` on `MacOS/aarch64`  (was: Add a new 
profile to auto disable unsupported UTs on Macos using Apple Silicon)

> Disable `ExtendedLevelDBTest` on `MacOS/aarch64`
> 
>
> Key: SPARK-37434
> URL: https://issues.apache.org/jira/browse/SPARK-37434
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.3.0
>Reporter: Yang Jie
>Priority: Major
>
> After SPARK-37272 and SPARK-37282, we can manually add
> {code:java}
> -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest
>  {code}
> when running mvn test or sbt test to disable unsupported UTs on macOS using 
> Apple Silicon.
>  
> We can add a profile to activate this property automatically when running 
> UTs on macOS using Apple Silicon.
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37434) Disable `ExtendedLevelDBTest` on `MacOS/aarch64`

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-37434:
--
Priority: Minor  (was: Major)

> Disable `ExtendedLevelDBTest` on `MacOS/aarch64`
> 
>
> Key: SPARK-37434
> URL: https://issues.apache.org/jira/browse/SPARK-37434
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Minor
>
> After SPARK-37272 and SPARK-37282, we can manually add
> {code:java}
> -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest
>  {code}
> when running mvn test or sbt test to disable unsupported UTs on macOS using 
> Apple Silicon.
>  
> We can add a profile to activate this property automatically when running 
> UTs on macOS using Apple Silicon.
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37434) Disable `ExtendedLevelDBTest` on `MacOS/aarch64`

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-37434:
--
Issue Type: Test  (was: Improvement)

> Disable `ExtendedLevelDBTest` on `MacOS/aarch64`
> 
>
> Key: SPARK-37434
> URL: https://issues.apache.org/jira/browse/SPARK-37434
> Project: Spark
>  Issue Type: Test
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Priority: Minor
>
> After SPARK-37272 and SPARK-37282, we can manually add
> {code:java}
> -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest
>  {code}
> when running mvn test or sbt test to disable unsupported UTs on macOS using 
> Apple Silicon.
>  
> We can add a profile to activate this property automatically when running 
> UTs on macOS using Apple Silicon.
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47093) Upgrade `mockito` to 5.10.0

2024-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-47093:
---
Labels: pull-request-available  (was: )

> Upgrade `mockito` to 5.10.0
> ---
>
> Key: SPARK-47093
> URL: https://issues.apache.org/jira/browse/SPARK-47093
> Project: Spark
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-46934) Unable to create Hive View from certain Spark Dataframe StructType

2024-02-19 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-46934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818565#comment-17818565
 ] 

Dongjoon Hyun commented on SPARK-46934:
---

This is resolved at Apache Spark 4.0.0.

Do you think this is a regression from some old Spark versions or a blocker for 
Apache Spark 3.5.1 release, [~yutinglin] ?

> Unable to create Hive View from certain Spark Dataframe StructType
> --
>
> Key: SPARK-46934
> URL: https://issues.apache.org/jira/browse/SPARK-46934
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0, 3.3.2, 3.3.4
> Environment: Tested in Spark 3.3.0, 3.3.2.
>Reporter: Yu-Ting LIN
>Assignee: Kent Yao
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> We are trying to create a Hive View using the following SQL command: "CREATE 
> OR REPLACE VIEW yuting AS SELECT INFO_ANN FROM table_2611810".
> Our table_2611810 has certain columns that contain special characters such as 
> "/". Here is the schema of this table.
> {code:java}
> contigName              string
> start                   bigint
> end                     bigint
> names                   array
> referenceAllele         string
> alternateAlleles        array
> qual                    double
> filters                 array
> splitFromMultiAllelic    boolean
> INFO_NCAMP              int
> INFO_ODDRATIO           double
> INFO_NM                 double
> INFO_DBSNP_CAF          array
> INFO_SPANPAIR           int
> INFO_TLAMP              int
> INFO_PSTD               double
> INFO_QSTD               double
> INFO_SBF                double
> INFO_AF                 array
> INFO_QUAL               double
> INFO_SHIFT3             int
> INFO_VARBIAS            string
> INFO_HICOV              int
> INFO_PMEAN              double
> INFO_MSI                double
> INFO_VD                 int
> INFO_DP                 int
> INFO_HICNT              int
> INFO_ADJAF              double
> INFO_SVLEN              int
> INFO_RSEQ               string
> INFO_MSigDb             array
> INFO_NMD                array
> INFO_ANN                
> array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>
> INFO_BIAS               string
> INFO_MQ                 double
> INFO_HIAF               double
> INFO_END                int
> INFO_SPLITREAD          int
> INFO_GDAMP              int
> INFO_LSEQ               string
> INFO_LOF                array
> INFO_SAMPLE             string
> INFO_AMPFLAG            int
> INFO_SN                 double
> INFO_SVTYPE             string
> INFO_TYPE               string
> INFO_MSILEN             double
> INFO_DUPRATE            double
> INFO_DBSNP_COMMON       int
> INFO_REFBIAS            string
> genotypes               
> array,ALD:array,AF:array,phased:boolean,calls:array,VD:int,depth:int,RD:array>>
>  {code}
> You can see that column INFO_ANN is an array of structs and contains fields 
> whose names include "/", such as "cDNA_pos/cDNA_length". 
> We believe this is the root cause of the following SparkException:
> {code:java}
> scala> val schema = spark.sql("CREATE OR REPLACE VIEW yuting AS SELECT 
> INFO_ANN FROM table_2611810")
> 24/01/31 07:50:02.658 [main] WARN  o.a.spark.sql.catalyst.util.package - 
> Truncated the string representation of a plan since it was too large. This 
> behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
> org.apache.spark.SparkException: Cannot recognize hive type string: 
> array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>,
>  column: INFO_ANN
>   at 
> org.apache.spark.sql.errors.QueryExecutionErrors$.cannotRecognizeHiveTypeError(QueryExecutionErrors.scala:1455)
>   at 
> org.apache.spark.sql.hive.client.HiveClientImpl$.getSparkSQLDataType(HiveClientImpl.scala:1022)
>   at 
> org.apache.spark.sql.hive.client.HiveClientImpl$.$anonfun$verifyColumnDataType$1(HiveClientImpl.scala:1037)
>   at scala.collection.Iterator.foreach(Iterator.scala:943)
>   at scala.collection.Iterator.foreach$(Iterator.scala:943)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
>   at scala.collection.IterableLike.foreach(IterableLike.scala:74)
>   at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
>   at 

[jira] [Comment Edited] (SPARK-46934) Unable to create Hive View from certain Spark Dataframe StructType

2024-02-19 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-46934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818565#comment-17818565
 ] 

Dongjoon Hyun edited comment on SPARK-46934 at 2/19/24 7:57 PM:


This is resolved at Apache Spark 4.0.0.

Do you think this is a regression from some old Spark versions or a blocker for 
Apache Spark 3.5.1 release, [~yutinglin] and [~yao] ?


was (Author: dongjoon):
This is resolved at Apache Spark 4.0.0.

Do you think this is a regression from some old Spark versions or a blocker for 
Apache Spark 3.5.1 release, [~yutinglin] ?

> Unable to create Hive View from certain Spark Dataframe StructType
> --
>
> Key: SPARK-46934
> URL: https://issues.apache.org/jira/browse/SPARK-46934
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0, 3.3.2, 3.3.4
> Environment: Tested in Spark 3.3.0, 3.3.2.
>Reporter: Yu-Ting LIN
>Assignee: Kent Yao
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> We are trying to create a Hive View using the following SQL command: "CREATE 
> OR REPLACE VIEW yuting AS SELECT INFO_ANN FROM table_2611810".
> Our table_2611810 has certain columns that contain special characters such as 
> "/". Here is the schema of this table.
> {code:java}
> contigName              string
> start                   bigint
> end                     bigint
> names                   array
> referenceAllele         string
> alternateAlleles        array
> qual                    double
> filters                 array
> splitFromMultiAllelic    boolean
> INFO_NCAMP              int
> INFO_ODDRATIO           double
> INFO_NM                 double
> INFO_DBSNP_CAF          array
> INFO_SPANPAIR           int
> INFO_TLAMP              int
> INFO_PSTD               double
> INFO_QSTD               double
> INFO_SBF                double
> INFO_AF                 array
> INFO_QUAL               double
> INFO_SHIFT3             int
> INFO_VARBIAS            string
> INFO_HICOV              int
> INFO_PMEAN              double
> INFO_MSI                double
> INFO_VD                 int
> INFO_DP                 int
> INFO_HICNT              int
> INFO_ADJAF              double
> INFO_SVLEN              int
> INFO_RSEQ               string
> INFO_MSigDb             array
> INFO_NMD                array
> INFO_ANN                
> array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>
> INFO_BIAS               string
> INFO_MQ                 double
> INFO_HIAF               double
> INFO_END                int
> INFO_SPLITREAD          int
> INFO_GDAMP              int
> INFO_LSEQ               string
> INFO_LOF                array
> INFO_SAMPLE             string
> INFO_AMPFLAG            int
> INFO_SN                 double
> INFO_SVTYPE             string
> INFO_TYPE               string
> INFO_MSILEN             double
> INFO_DUPRATE            double
> INFO_DBSNP_COMMON       int
> INFO_REFBIAS            string
> genotypes               
> array,ALD:array,AF:array,phased:boolean,calls:array,VD:int,depth:int,RD:array>>
>  {code}
> You can see that column INFO_ANN is an array of structs, and it contains columns 
> whose names include "/", such as "cDNA_pos/cDNA_length". 
> We believe this is the root cause of the following SparkException:
> {code:java}
> scala> val schema = spark.sql("CREATE OR REPLACE VIEW yuting AS SELECT 
> INFO_ANN FROM table_2611810")
> 24/01/31 07:50:02.658 [main] WARN  o.a.spark.sql.catalyst.util.package - 
> Truncated the string representation of a plan since it was too large. This 
> behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
> org.apache.spark.SparkException: Cannot recognize hive type string: 
> array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>,
>  column: INFO_ANN
>   at 
> org.apache.spark.sql.errors.QueryExecutionErrors$.cannotRecognizeHiveTypeError(QueryExecutionErrors.scala:1455)
>   at 
> org.apache.spark.sql.hive.client.HiveClientImpl$.getSparkSQLDataType(HiveClientImpl.scala:1022)
>   at 
> org.apache.spark.sql.hive.client.HiveClientImpl$.$anonfun$verifyColumnDataType$1(HiveClientImpl.scala:1037)
>   at scala.collection.Iterator.foreach(Iterator.scala:943)
>   at scala.collection.Iterator.foreach$(Iterator.scala:943)
>   
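As a client-side workaround sketch for Spark versions before the fix, one option is to rename nested fields whose names contain characters the Hive type-string parser rejects (such as "/") before creating the view. The sketch below models the schema as plain Python dicts purely for illustration; the function name and the dict-based schema representation are hypothetical, not a Spark or PySpark API. With PySpark, the same idea would be applied to `df.schema` by casting to a sanitized struct type.

```python
# Illustrative only: sanitize nested field names that Hive's type-string
# parser cannot handle (e.g. "/"), using a dict-based stand-in for a schema.
# A struct is a dict {field_name: field_type}, an array is a one-element list,
# and a primitive is a type-name string.

def sanitize_field_names(schema, bad_chars="/"):
    """Recursively replace offending characters in struct field names with '_'."""
    if isinstance(schema, dict):  # struct: rename keys, recurse into values
        return {
            "".join("_" if c in bad_chars else c for c in name):
                sanitize_field_names(field_type, bad_chars)
            for name, field_type in schema.items()
        }
    if isinstance(schema, list):  # array: recurse into the element type
        return [sanitize_field_names(schema[0], bad_chars)]
    return schema  # primitive type name, returned unchanged


# A tiny fragment of the INFO_ANN schema from the report, as an example.
info_ann = [{"Allele": "string",
             "cDNA_pos/cDNA_length": {"pos": "int", "length": "int"}}]
cleaned = sanitize_field_names(info_ann)
```

After sanitizing, a view created over the renamed columns no longer triggers the `Cannot recognize hive type string` error, at the cost of the "/" characters in the field names.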

[jira] [Resolved] (SPARK-46934) Unable to create Hive View from certain Spark Dataframe StructType

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-46934.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45039
[https://github.com/apache/spark/pull/45039]

> Unable to create Hive View from certain Spark Dataframe StructType
> --
>
> Key: SPARK-46934
> URL: https://issues.apache.org/jira/browse/SPARK-46934
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0, 3.3.2, 3.3.4
> Environment: Tested in Spark 3.3.0, 3.3.2.
>Reporter: Yu-Ting LIN
>Assignee: Kent Yao
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> We are trying to create a Hive View using the following SQL command: "CREATE OR 
> REPLACE VIEW yuting AS SELECT INFO_ANN FROM table_2611810".
> Our table_2611810 has certain columns that contain special characters such as "/". 
> Here is the schema of this table.
> {code:java}
> contigName              string
> start                   bigint
> end                     bigint
> names                   array
> referenceAllele         string
> alternateAlleles        array
> qual                    double
> filters                 array
> splitFromMultiAllelic    boolean
> INFO_NCAMP              int
> INFO_ODDRATIO           double
> INFO_NM                 double
> INFO_DBSNP_CAF          array
> INFO_SPANPAIR           int
> INFO_TLAMP              int
> INFO_PSTD               double
> INFO_QSTD               double
> INFO_SBF                double
> INFO_AF                 array
> INFO_QUAL               double
> INFO_SHIFT3             int
> INFO_VARBIAS            string
> INFO_HICOV              int
> INFO_PMEAN              double
> INFO_MSI                double
> INFO_VD                 int
> INFO_DP                 int
> INFO_HICNT              int
> INFO_ADJAF              double
> INFO_SVLEN              int
> INFO_RSEQ               string
> INFO_MSigDb             array
> INFO_NMD                array
> INFO_ANN                
> array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>
> INFO_BIAS               string
> INFO_MQ                 double
> INFO_HIAF               double
> INFO_END                int
> INFO_SPLITREAD          int
> INFO_GDAMP              int
> INFO_LSEQ               string
> INFO_LOF                array
> INFO_SAMPLE             string
> INFO_AMPFLAG            int
> INFO_SN                 double
> INFO_SVTYPE             string
> INFO_TYPE               string
> INFO_MSILEN             double
> INFO_DUPRATE            double
> INFO_DBSNP_COMMON       int
> INFO_REFBIAS            string
> genotypes               
> array,ALD:array,AF:array,phased:boolean,calls:array,VD:int,depth:int,RD:array>>
>  {code}
> You can see that column INFO_ANN is an array of structs, and it contains columns 
> whose names include "/", such as "cDNA_pos/cDNA_length". 
> We believe this is the root cause of the following SparkException:
> {code:java}
> scala> val schema = spark.sql("CREATE OR REPLACE VIEW yuting AS SELECT 
> INFO_ANN FROM table_2611810")
> 24/01/31 07:50:02.658 [main] WARN  o.a.spark.sql.catalyst.util.package - 
> Truncated the string representation of a plan since it was too large. This 
> behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
> org.apache.spark.SparkException: Cannot recognize hive type string: 
> array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>,
>  column: INFO_ANN
>   at 
> org.apache.spark.sql.errors.QueryExecutionErrors$.cannotRecognizeHiveTypeError(QueryExecutionErrors.scala:1455)
>   at 
> org.apache.spark.sql.hive.client.HiveClientImpl$.getSparkSQLDataType(HiveClientImpl.scala:1022)
>   at 
> org.apache.spark.sql.hive.client.HiveClientImpl$.$anonfun$verifyColumnDataType$1(HiveClientImpl.scala:1037)
>   at scala.collection.Iterator.foreach(Iterator.scala:943)
>   at scala.collection.Iterator.foreach$(Iterator.scala:943)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
>   at scala.collection.IterableLike.foreach(IterableLike.scala:74)
>   at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
>   at org.apache.spark.sql.types.StructType.foreach(StructType.scala:102)
>   at 
> 

[jira] [Assigned] (SPARK-46934) Unable to create Hive View from certain Spark Dataframe StructType

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-46934:
-

Assignee: Kent Yao

> Unable to create Hive View from certain Spark Dataframe StructType
> --
>
> Key: SPARK-46934
> URL: https://issues.apache.org/jira/browse/SPARK-46934
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0, 3.3.2, 3.3.4
> Environment: Tested in Spark 3.3.0, 3.3.2.
>Reporter: Yu-Ting LIN
>Assignee: Kent Yao
>Priority: Blocker
>  Labels: pull-request-available
>
> We are trying to create a Hive View using the following SQL command: "CREATE OR 
> REPLACE VIEW yuting AS SELECT INFO_ANN FROM table_2611810".
> Our table_2611810 has certain columns that contain special characters such as "/". 
> Here is the schema of this table.
> {code:java}
> contigName              string
> start                   bigint
> end                     bigint
> names                   array
> referenceAllele         string
> alternateAlleles        array
> qual                    double
> filters                 array
> splitFromMultiAllelic    boolean
> INFO_NCAMP              int
> INFO_ODDRATIO           double
> INFO_NM                 double
> INFO_DBSNP_CAF          array
> INFO_SPANPAIR           int
> INFO_TLAMP              int
> INFO_PSTD               double
> INFO_QSTD               double
> INFO_SBF                double
> INFO_AF                 array
> INFO_QUAL               double
> INFO_SHIFT3             int
> INFO_VARBIAS            string
> INFO_HICOV              int
> INFO_PMEAN              double
> INFO_MSI                double
> INFO_VD                 int
> INFO_DP                 int
> INFO_HICNT              int
> INFO_ADJAF              double
> INFO_SVLEN              int
> INFO_RSEQ               string
> INFO_MSigDb             array
> INFO_NMD                array
> INFO_ANN                
> array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>
> INFO_BIAS               string
> INFO_MQ                 double
> INFO_HIAF               double
> INFO_END                int
> INFO_SPLITREAD          int
> INFO_GDAMP              int
> INFO_LSEQ               string
> INFO_LOF                array
> INFO_SAMPLE             string
> INFO_AMPFLAG            int
> INFO_SN                 double
> INFO_SVTYPE             string
> INFO_TYPE               string
> INFO_MSILEN             double
> INFO_DUPRATE            double
> INFO_DBSNP_COMMON       int
> INFO_REFBIAS            string
> genotypes               
> array,ALD:array,AF:array,phased:boolean,calls:array,VD:int,depth:int,RD:array>>
>  {code}
> You can see that column INFO_ANN is an array of structs, and it contains columns 
> whose names include "/", such as "cDNA_pos/cDNA_length". 
> We believe this is the root cause of the following SparkException:
> {code:java}
> scala> val schema = spark.sql("CREATE OR REPLACE VIEW yuting AS SELECT 
> INFO_ANN FROM table_2611810")
> 24/01/31 07:50:02.658 [main] WARN  o.a.spark.sql.catalyst.util.package - 
> Truncated the string representation of a plan since it was too large. This 
> behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
> org.apache.spark.SparkException: Cannot recognize hive type string: 
> array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>,
>  column: INFO_ANN
>   at 
> org.apache.spark.sql.errors.QueryExecutionErrors$.cannotRecognizeHiveTypeError(QueryExecutionErrors.scala:1455)
>   at 
> org.apache.spark.sql.hive.client.HiveClientImpl$.getSparkSQLDataType(HiveClientImpl.scala:1022)
>   at 
> org.apache.spark.sql.hive.client.HiveClientImpl$.$anonfun$verifyColumnDataType$1(HiveClientImpl.scala:1037)
>   at scala.collection.Iterator.foreach(Iterator.scala:943)
>   at scala.collection.Iterator.foreach$(Iterator.scala:943)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
>   at scala.collection.IterableLike.foreach(IterableLike.scala:74)
>   at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
>   at org.apache.spark.sql.types.StructType.foreach(StructType.scala:102)
>   at 
> org.apache.spark.sql.hive.client.HiveClientImpl$.org$apache$spark$sql$hive$client$HiveClientImpl$$verifyColumnDataType(HiveClientImpl.scala:1037)
>   at 
> 

[jira] [Resolved] (SPARK-44826) Resolve testing timeout issue from Spark Connect

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-44826.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45166
[https://github.com/apache/spark/pull/45166]

> Resolve testing timeout issue from Spark Connect
> 
>
> Key: SPARK-44826
> URL: https://issues.apache.org/jira/browse/SPARK-44826
> Project: Spark
>  Issue Type: Bug
>  Components: Connect, Pandas API on Spark, Tests
>Affects Versions: 4.0.0
>Reporter: Haejoon Lee
>Assignee: Haejoon Lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> DiffFramesParitySetItemSeriesTests.test_series_iloc_setitem is failing on 
> Spark Connect due to unexpected timeout issue: 
> https://github.com/itholic/spark/actions/runs/5850534247/job/15860127608



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-44826) Resolve testing timeout issue from Spark Connect

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-44826:
-

Assignee: Haejoon Lee

> Resolve testing timeout issue from Spark Connect
> 
>
> Key: SPARK-44826
> URL: https://issues.apache.org/jira/browse/SPARK-44826
> Project: Spark
>  Issue Type: Bug
>  Components: Connect, Pandas API on Spark, Tests
>Affects Versions: 4.0.0
>Reporter: Haejoon Lee
>Assignee: Haejoon Lee
>Priority: Major
>  Labels: pull-request-available
>
> DiffFramesParitySetItemSeriesTests.test_series_iloc_setitem is failing on 
> Spark Connect due to unexpected timeout issue: 
> https://github.com/itholic/spark/actions/runs/5850534247/job/15860127608






[jira] [Assigned] (SPARK-47092) Add `getUriBuilder` to `o.a.s.u.Utils` and use it

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47092:
-

Assignee: Dongjoon Hyun

> Add `getUriBuilder` to `o.a.s.u.Utils` and use it
> -
>
> Key: SPARK-47092
> URL: https://issues.apache.org/jira/browse/SPARK-47092
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Resolved] (SPARK-47089) Migrate mockito 4 to mockito5

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47089.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45158
[https://github.com/apache/spark/pull/45158]

> Migrate mockito 4 to mockito5
> -
>
> Key: SPARK-47089
> URL: https://issues.apache.org/jira/browse/SPARK-47089
> Project: Spark
>  Issue Type: Improvement
>  Components: Build, Tests
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: BingKun Pan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-47089) Migrate mockito 4 to mockito5

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47089:
-

Assignee: BingKun Pan

> Migrate mockito 4 to mockito5
> -
>
> Key: SPARK-47089
> URL: https://issues.apache.org/jira/browse/SPARK-47089
> Project: Spark
>  Issue Type: Improvement
>  Components: Build, Tests
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: BingKun Pan
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Updated] (SPARK-47089) Migrate mockito 4 to mockito5

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47089:
--
Parent: SPARK-47046
Issue Type: Sub-task  (was: Improvement)

> Migrate mockito 4 to mockito5
> -
>
> Key: SPARK-47089
> URL: https://issues.apache.org/jira/browse/SPARK-47089
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build, Tests
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: BingKun Pan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Updated] (SPARK-47092) Add `getUriBuilder` to `o.a.s.u.Utils` and use it

2024-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-47092:
---
Labels: pull-request-available  (was: )

> Add `getUriBuilder` to `o.a.s.u.Utils` and use it
> -
>
> Key: SPARK-47092
> URL: https://issues.apache.org/jira/browse/SPARK-47092
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Updated] (SPARK-47092) Add `getUriBuilder` to `o.a.s.u.Utils` and use it

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47092:
--
Summary: Add `getUriBuilder` to `o.a.s.u.Utils` and use it  (was: Add 
`getUriBuilder` to `o.a.s.u.Utils`)

> Add `getUriBuilder` to `o.a.s.u.Utils` and use it
> -
>
> Key: SPARK-47092
> URL: https://issues.apache.org/jira/browse/SPARK-47092
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Priority: Major
>







[jira] [Created] (SPARK-47092) Add `getUriBuilder` to `o.a.s.u.Utils`

2024-02-19 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-47092:
-

 Summary: Add `getUriBuilder` to `o.a.s.u.Utils`
 Key: SPARK-47092
 URL: https://issues.apache.org/jira/browse/SPARK-47092
 Project: Spark
  Issue Type: Sub-task
  Components: Spark Core
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun









[jira] [Resolved] (SPARK-47067) Add Daily Apple Silicon Github Action Job (Java/Scala)

2024-02-19 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47067.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45162
[https://github.com/apache/spark/pull/45162]

> Add Daily Apple Silicon Github Action Job (Java/Scala)
> --
>
> Key: SPARK-47067
> URL: https://issues.apache.org/jira/browse/SPARK-47067
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Resolved] (SPARK-47087) Raise Spark's exception with an error class in config value check

2024-02-19 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk resolved SPARK-47087.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45156
[https://github.com/apache/spark/pull/45156]

> Raise Spark's exception with an error class in config value check
> -
>
> Key: SPARK-47087
> URL: https://issues.apache.org/jira/browse/SPARK-47087
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Max Gekk
>Assignee: Max Gekk
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Currently, Spark throws *IllegalArgumentException* in `checkValue` of 
> ConfigBuilder. Need to overload `checkValue` to throw 
> `SparkIllegalArgumentException` with an error class. This should improve user 
> experience with Spark SQL, and impressions of Spark's errors.
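A minimal sketch of the idea in Python (the class and method names below are hypothetical stand-ins, not Spark's actual `ConfigBuilder` API): instead of raising a bare `IllegalArgumentException`, the config validator raises a structured exception carrying an error class and message parameters that tooling and tests can match on.

```python
# Illustrative sketch, assuming a simplified config-entry model.
class SparkIllegalArgumentException(ValueError):
    """Exception carrying an error class and message parameters (hypothetical)."""
    def __init__(self, error_class, message_parameters):
        self.error_class = error_class
        self.message_parameters = message_parameters
        super().__init__(f"[{error_class}] {message_parameters}")


class ConfigEntry:
    def __init__(self, key):
        self.key = key
        self._validator = None

    def check_value(self, validator, error_class):
        """Attach a validator; failures raise an error-classed exception
        instead of a bare IllegalArgumentException."""
        def checked(value):
            if not validator(value):
                raise SparkIllegalArgumentException(
                    error_class,
                    {"confName": self.key, "confValue": str(value)},
                )
            return value
        self._validator = checked
        return self

    def set(self, value):
        return self._validator(value) if self._validator else value


# Usage: a positive-only config value with a hypothetical error class name.
entry = ConfigEntry("spark.sql.shuffle.partitions").check_value(
    lambda v: v > 0, "INVALID_CONF_VALUE.POSITIVE"
)
```

The benefit is that callers can assert on `error_class` and `message_parameters` rather than parsing free-form exception messages.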






[jira] [Updated] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2024-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-24578:
---
Labels: pull-request-available  (was: )

> Reading remote cache block behavior changes and causes timeout issue
> 
>
> Key: SPARK-24578
> URL: https://issues.apache.org/jira/browse/SPARK-24578
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.3.0, 2.3.1
>Reporter: Wenbo Zhao
>Assignee: Wenbo Zhao
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 2.3.2, 2.4.0
>
>
> After Spark 2.3, we observed lots of errors like the following in some of our 
> production jobs:
> {code:java}
> 18/06/15 20:59:42 ERROR TransportRequestHandler: Error sending result 
> ChunkFetchSuccess{streamChunkId=StreamChunkId{streamId=91672904003, 
> chunkIndex=0}, 
> buffer=org.apache.spark.storage.BlockManagerManagedBuffer@783a9324} to 
> /172.22.18.7:60865; closing connection
> java.io.IOException: Broken pipe
> at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
> at sun.nio.ch.IOUtil.write(IOUtil.java:65)
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
> at 
> org.apache.spark.network.protocol.MessageWithHeader.writeNioBuffer(MessageWithHeader.java:156)
> at 
> org.apache.spark.network.protocol.MessageWithHeader.copyByteBuf(MessageWithHeader.java:142)
> at 
> org.apache.spark.network.protocol.MessageWithHeader.transferTo(MessageWithHeader.java:123)
> at 
> io.netty.channel.socket.nio.NioSocketChannel.doWriteFileRegion(NioSocketChannel.java:355)
> at 
> io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:224)
> at 
> io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:382)
> at 
> io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:934)
> at 
> io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:362)
> at 
> io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:901)
> at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1321)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
> at 
> io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749)
> at 
> io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:115)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
> at 
> io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749)
> at io.netty.channel.ChannelDuplexHandler.flush(ChannelDuplexHandler.java:117)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
> at 
> io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749)
> at 
> io.netty.channel.DefaultChannelPipeline.flush(DefaultChannelPipeline.java:983)
> at io.netty.channel.AbstractChannel.flush(AbstractChannel.java:248)
> at 
> io.netty.channel.nio.AbstractNioByteChannel$1.run(AbstractNioByteChannel.java:284)
> at 
> io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
> at 
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
> {code}
>  
> Here is a small reproducible example for a cluster of 2 executors (say host-1 
> and host-2), each with 8 cores. The memory of the driver and executors is not 
> an important factor here, as long as it is big enough, say 20G. 
> {code:java}
> val n = 1
> val df0 = sc.parallelize(1 to n).toDF
> val df = df0.withColumn("x0", rand()).withColumn("x0", rand()
> ).withColumn("x1", rand()
> ).withColumn("x2", rand()
> ).withColumn("x3", rand()
> ).withColumn("x4", rand()
> ).withColumn("x5", rand()
> ).withColumn("x6", rand()
> ).withColumn("x7", rand()
> ).withColumn("x8", rand()
> ).withColumn("x9", rand())
> df.cache; 

[jira] [Updated] (SPARK-47088) Utilize BigDecimal to calculate the GPU resource

2024-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-47088:
---
Labels: pull-request-available  (was: )

> Utilize BigDecimal to calculate the GPU resource 
> -
>
> Key: SPARK-47088
> URL: https://issues.apache.org/jira/browse/SPARK-47088
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Bobby Wang
>Priority: Minor
>  Labels: pull-request-available
>
> To prevent precision errors, the current method of calculating GPU resources 
> involves multiplying by 1E16 to convert doubles to Longs. If needed, it will 
> also convert Longs back to doubles. This approach introduces redundancy in 
> the code, especially in test code.
> More details can be found at 
> https://github.com/apache/spark/pull/44690#discussion_r1482301112
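The precision problem motivating the issue can be shown in a few lines; here Python's `decimal.Decimal` stands in for Java's `BigDecimal` (this is an illustration of the rounding behavior, not Spark's resource-calculation code).

```python
# Summing fractional GPU amounts as binary doubles drifts: ten tasks each
# requesting 0.1 of a GPU do not sum to exactly 1.0. This is why the previous
# code multiplied by 1E16 to work in Longs, and why exact decimal arithmetic
# (BigDecimal in Scala/Java) removes that workaround entirely.
from decimal import Decimal

double_total = sum([0.1] * 10)                             # not exactly 1.0
decimal_total = sum([Decimal("0.1")] * 10, Decimal("0"))   # exactly 1

assert double_total != 1.0
assert decimal_total == Decimal("1")
```

With exact decimal arithmetic, the scale-by-1E16 conversion helpers and their duplicated test scaffolding become unnecessary.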






[jira] [Updated] (SPARK-44826) Resolve testing timeout issue from Spark Connect

2024-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-44826:
---
Labels: pull-request-available  (was: )

> Resolve testing timeout issue from Spark Connect
> 
>
> Key: SPARK-44826
> URL: https://issues.apache.org/jira/browse/SPARK-44826
> Project: Spark
>  Issue Type: Bug
>  Components: Connect, Pandas API on Spark, Tests
>Affects Versions: 4.0.0
>Reporter: Haejoon Lee
>Priority: Major
>  Labels: pull-request-available
>
> DiffFramesParitySetItemSeriesTests.test_series_iloc_setitem is failing on 
> Spark Connect due to unexpected timeout issue: 
> https://github.com/itholic/spark/actions/runs/5850534247/job/15860127608






[jira] [Resolved] (SPARK-47090) Skip JDK 17/21 Maven build in branch-3.4 scheduled job

2024-02-19 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-47090.
--
Resolution: Invalid

> Skip JDK 17/21 Maven build in branch-3.4 scheduled job
> --
>
> Key: SPARK-47090
> URL: https://issues.apache.org/jira/browse/SPARK-47090
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
>
> https://github.com/apache/spark/actions/runs/7928294496/job/21664443573






[jira] [Created] (SPARK-47091) An error occurs when executing the pyspark program

2024-02-19 Thread jackyjfhu (Jira)
jackyjfhu created SPARK-47091:
-

 Summary: An error occurs when executing the pyspark program
 Key: SPARK-47091
 URL: https://issues.apache.org/jira/browse/SPARK-47091
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 3.5.0, 3.1.3
Reporter: jackyjfhu


When I execute this code via pyspark:

spark._sc.textFile("/tmp/spark_data1").repartition(50).toDF().show

I get an error:

ERROR spark.TaskContextImpl: Error in TaskCompletionListener
io.netty.util.IllegalReferenceCountException: refCnt: 0, decrement: 1
  at io.netty.util.internal.ReferenceCountUpdater.toLiveRealRefCnt(ReferenceCountUpdater.java:74) ~[iceberg-spark-runtime-3.1_2.12-0.14.3-5-tencent.jar:?]
  at io.netty.util.internal.ReferenceCountUpdater.release(ReferenceCountUpdater.java:138) ~[iceberg-spark-runtime-3.1_2.12-0.14.3-5-tencent.jar:?]
  at io.netty.buffer.AbstractReferenceCountedByteBuf.release(AbstractReferenceCountedByteBuf.java:100) ~[netty-all-4.1.51.Final.jar:4.1.51.Final]
  at io.netty.buffer.AbstractDerivedByteBuf.release0(AbstractDerivedByteBuf.java:94) ~[netty-all-4.1.51.Final.jar:4.1.51.Final]
  at io.netty.buffer.AbstractDerivedByteBuf.release(AbstractDerivedByteBuf.java:90) ~[netty-all-4.1.51.Final.jar:4.1.51.Final]
  at org.apache.spark.network.buffer.NettyManagedBuffer.release(NettyManagedBuffer.java:62) ~[spark-network-common_2.12-3.1.3.jar:3.1.3]
  at org.apache.spark.storage.ShuffleBlockFetcherIterator.cleanup(ShuffleBlockFetcherIterator.scala:226) ~[spark-core_2.12-3.1.3.jar:3.1.3]
  at org.apache.spark.storage.ShuffleFetchCompletionListener.onTaskCompletion(ShuffleBlockFetcherIterator.scala:862) ~[spark-core_2.12-3.1.3.jar:3.1.3]
  at org.apache.spark.TaskContextImpl.$anonfun$markTaskCompleted$1(TaskContextImpl.scala:124) ~[spark-core_2.12-3.1.3.jar:3.1.3]
  at org.apache.spark.TaskContextImpl.$anonfun$markTaskCompleted$1$adapted(TaskContextImpl.scala:124) ~[spark-core_2.12-3.1.3.jar:3.1.3]
  at org.apache.spark.TaskContextImpl.$anonfun$invokeListeners$1(TaskContextImpl.scala:137) ~[spark-core_2.12-3.1.3.jar:3.1.3]
  at org.apache.spark.TaskContextImpl.$anonfun$invokeListeners$1$adapted(TaskContextImpl.scala:135) ~[spark-core_2.12-3.1.3.jar:3.1.3]
  at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) ~[scala-library-2.12.10.jar:?]
  at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) ~[scala-library-2.12.10.jar:?]
  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) ~[scala-library-2.12.10.jar:?]
  at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:135) ~[spark-core_2.12-3.1.3.jar:3.1.3]
  at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:124) ~[spark-core_2.12-3.1.3.jar:3.1.3]
  at org.apache.spark.scheduler.Task.run(Task.scala:147) ~[spark-core_2.12-3.1.3.jar:3.1.3]
  at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:498) ~[spark-core_2.12-3.1.3.jar:3.1.3]
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439) [spark-core_2.12-3.1.3.jar:3.1.3]
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:501) [spark-core_2.12-3.1.3.jar:3.1.3]
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_362]
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_362]
  at java.lang.Thread.run(Thread.java:750) [?:1.8.0_362]

24/02/19 11:26:53 ERROR executor.Executor: Exception in task 0.1 in stage 1.0 (TID 4001)
org.apache.spark.util.TaskCompletionListenerException: refCnt: 0, decrement: 1
  at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:145) ~[spark-core_2.12-3.1.3.jar:3.1.3]
  at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:124) ~[spark-core_2.12-3.1.3.jar:3.1.3]
  at org.apache.spark.scheduler.Task.run(Task.scala:147) ~[spark-core_2.12-3.1.3.jar:3.1.3]
  at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:498) ~[spark-core_2.12-3.1.3.jar:3.1.3]
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439) ~[spark-core_2.12-3.1.3.jar:3.1.3]
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:501) [spark-core_2.12-3.1.3.jar:3.1.3]
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_362]
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_362]
  at java.lang.Thread.run(Thread.java:750)

 
Note: there are 4000 small files in the directory /tmp/spark_data1.

With relatively few small files, no error is reported.
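For context, `IllegalReferenceCountException: refCnt: 0, decrement: 1` is Netty's guard against releasing a buffer whose reference count has already reached zero; the trace shows the shuffle-fetch completion listener releasing a `NettyManagedBuffer` that was already released. A toy Python sketch of that double-release failure mode (illustrative only; `RefCountedBuffer` is a made-up stand-in, not actual Spark or Netty code):

```python
class RefCountedBuffer:
    """Toy analog of Netty's reference-counted ByteBuf."""

    def __init__(self):
        self.ref_cnt = 1  # a freshly allocated buffer starts with one reference

    def retain(self):
        self.ref_cnt += 1

    def release(self):
        # Netty throws IllegalReferenceCountException in exactly this situation.
        if self.ref_cnt == 0:
            raise RuntimeError("refCnt: 0, decrement: 1")
        self.ref_cnt -= 1


buf = RefCountedBuffer()
buf.release()      # normal release (e.g. by the fetch iterator's cleanup)
try:
    buf.release()  # second release (e.g. by the task-completion listener)
except RuntimeError as e:
    print(e)       # refCnt: 0, decrement: 1
```

The fix for such bugs is usually to make one owner responsible for the final release, or to make the cleanup path idempotent, rather than letting two cleanup hooks race to release the same buffer.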



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Commented] (SPARK-9174) Add documentation for all public SQLConfs

2024-02-19 Thread Nandini (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-9174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818378#comment-17818378
 ] 

Nandini commented on SPARK-9174:


Hi Team,

I was not able to find the documentation for _spark.sql.retainGroupColumns_.

Can I add 'Whether to retain group by columns or not in GroupedData.agg.'?


> Add documentation for all public SQLConfs
> -
>
> Key: SPARK-9174
> URL: https://issues.apache.org/jira/browse/SPARK-9174
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Reporter: Reynold Xin
>Assignee: Reynold Xin
>Priority: Major
> Fix For: 1.5.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org