[jira] [Updated] (SPARK-47085) Performance issue on thrift API
[ https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Izek Greenfield updated SPARK-47085: Description: This new complexity was introduced in SPARK-39041. In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity: {code:scala} ... while (i < rowSize) { val row = rows(i) ... {code} It can be easily converted back into _*O(n)*_ complexity. was: This new complexity was introduced in SPARK-39041. In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity: {code:scala} ... while (i < rowSize) { val row = rows(i) ... {code} It can be easily converted back into _*O(n)*_ complexity. > Performance issue on thrift API > --- > > Key: SPARK-47085 > URL: https://issues.apache.org/jira/browse/SPARK-47085 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.1, 3.5.0 >Reporter: Izek Greenfield >Assignee: Izek Greenfield >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > This new complexity was introduced in SPARK-39041. > In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity: > {code:scala} > ... > while (i < rowSize) { > val row = rows(i) > ... > {code} > It can be easily converted back into _*O(n)*_ complexity. > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47085) Performance issue on thrift API
[ https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Izek Greenfield updated SPARK-47085: Description: This new complexity was introduced in SPARK-39041. In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity: {code:scala} ... while (i < rowSize) { val row = rows(i) ... {code} It can be easily converted back into _*O(n)*_ complexity. was: In class `RowSetUtils` there is a loop that has O(n^2) complexity: {code:scala} ... while (i < rowSize) { val row = rows(i) ... {code} It can be easily converted into O(n) complexity. > Performance issue on thrift API > --- > > Key: SPARK-47085 > URL: https://issues.apache.org/jira/browse/SPARK-47085 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.1, 3.5.0 >Reporter: Izek Greenfield >Assignee: Izek Greenfield >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > This new complexity was introduced in SPARK-39041. > In class `RowSetUtils` there is a loop that has _*O(n^2)*_ complexity: > {code:scala} > ... > while (i < rowSize) { > val row = rows(i) > ... > {code} > It can be easily converted back into _*O(n)*_ complexity. > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47085) Performance issue on thrift API
[ https://issues.apache.org/jira/browse/SPARK-47085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Izek Greenfield updated SPARK-47085: Issue Type: Bug (was: Improvement) > Performance issue on thrift API > --- > > Key: SPARK-47085 > URL: https://issues.apache.org/jira/browse/SPARK-47085 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.1, 3.5.0 >Reporter: Izek Greenfield >Assignee: Izek Greenfield >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > In class `RowSetUtils` there is a loop that has O(n^2) complexity: > {code:scala} > ... > while (i < rowSize) { > val row = rows(i) > ... > {code} > It can be easily converted into O(n) complexity. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
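The quadratic behavior described above comes from indexing into a `Seq` whose `apply` is linear time (such as a `List`), so each `rows(i)` walks the prefix again. A hypothetical sketch of the pattern and the O(n) fix, not the actual `RowSetUtils` code:

```scala
// Hypothetical illustration of the SPARK-47085 pattern; names are made up.
object RowLoopSketch {
  // O(n^2) when `rows` is a linear Seq (e.g. List): rows(i) walks i
  // elements on every iteration.
  def sumByIndex(rows: Seq[Int]): Int = {
    var total = 0
    var i = 0
    val rowSize = rows.length
    while (i < rowSize) {
      val row = rows(i) // linear-time apply on a List
      total += row
      i += 1
    }
    total
  }

  // O(n) regardless of the Seq implementation: a single forward pass.
  def sumByIterator(rows: Seq[Int]): Int = {
    var total = 0
    val it = rows.iterator
    while (it.hasNext) {
      total += it.next()
    }
    total
  }
}
```

Both methods compute the same result; only the iterator version stays linear when the caller passes a `List` instead of an `IndexedSeq`.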
[jira] [Updated] (SPARK-45615) Remove redundant "Auto-application to `()` is deprecated" compile suppression rules.
[ https://issues.apache.org/jira/browse/SPARK-45615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-45615: --- Labels: pull-request-available (was: ) > Remove redundant "Auto-application to `()` is deprecated" compile suppression > rules. > --- > > Key: SPARK-45615 > URL: https://issues.apache.org/jira/browse/SPARK-45615 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Minor > Labels: pull-request-available > > Due to the issue https://github.com/scalatest/scalatest/issues/2297, we need > to wait until we upgrade to a newer scalatest version (maybe 3.2.18) before > removing these suppression rules. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-45789) Support DESCRIBE TABLE for clustering columns
[ https://issues.apache.org/jira/browse/SPARK-45789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-45789. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45077 [https://github.com/apache/spark/pull/45077] > Support DESCRIBE TABLE for clustering columns > - > > Key: SPARK-45789 > URL: https://issues.apache.org/jira/browse/SPARK-45789 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Terry Kim >Assignee: Terry Kim >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-45789) Support DESCRIBE TABLE for clustering columns
[ https://issues.apache.org/jira/browse/SPARK-45789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-45789: --- Assignee: Terry Kim > Support DESCRIBE TABLE for clustering columns > - > > Key: SPARK-45789 > URL: https://issues.apache.org/jira/browse/SPARK-45789 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Terry Kim >Assignee: Terry Kim >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-47080) Fix `HistoryServerSuite` to get `getNumJobs` in `eventually`
[ https://issues.apache.org/jira/browse/SPARK-47080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-47080: - Assignee: Dongjoon Hyun > Fix `HistoryServerSuite` to get `getNumJobs` in `eventually` > > > Key: SPARK-47080 > URL: https://issues.apache.org/jira/browse/SPARK-47080 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, Tests >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47080) Fix `HistoryServerSuite` to get `getNumJobs` in `eventually`
[ https://issues.apache.org/jira/browse/SPARK-47080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-47080. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45147 [https://github.com/apache/spark/pull/45147] > Fix `HistoryServerSuite` to get `getNumJobs` in `eventually` > > > Key: SPARK-47080 > URL: https://issues.apache.org/jira/browse/SPARK-47080 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, Tests >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47016) Upgrade scalatest related dependencies to the 3.2.18 series
[ https://issues.apache.org/jira/browse/SPARK-47016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-47016. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45174 [https://github.com/apache/spark/pull/45174] > Upgrade scalatest related dependencies to the 3.2.18 series > --- > > Key: SPARK-47016 > URL: https://issues.apache.org/jira/browse/SPARK-47016 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47016) Upgrade scalatest related dependencies to the 3.2.18 series
[ https://issues.apache.org/jira/browse/SPARK-47016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-47016: -- Parent: SPARK-47046 Issue Type: Sub-task (was: Improvement) > Upgrade scalatest related dependencies to the 3.2.18 series > --- > > Key: SPARK-47016 > URL: https://issues.apache.org/jira/browse/SPARK-47016 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-47016) Upgrade scalatest related dependencies to the 3.2.18 series
[ https://issues.apache.org/jira/browse/SPARK-47016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-47016: - Assignee: Yang Jie > Upgrade scalatest related dependencies to the 3.2.18 series > --- > > Key: SPARK-47016 > URL: https://issues.apache.org/jira/browse/SPARK-47016 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45376) [CORE] Add netty-tcnative-boringssl-static dependency
[ https://issues.apache.org/jira/browse/SPARK-45376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-45376: -- Parent: SPARK-47046 Issue Type: Sub-task (was: Task) > [CORE] Add netty-tcnative-boringssl-static dependency > - > > Key: SPARK-45376 > URL: https://issues.apache.org/jira/browse/SPARK-45376 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Hasnain Lakhani >Assignee: Hasnain Lakhani >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Add the boringssl dependency which is needed for SSL functionality to work, > and provide the network common test helper to other test modules which need > to test SSL functionality -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-47100) Upgrade netty to 4.1.107.Final and netty-tcnative to 2.0.62.Final
[ https://issues.apache.org/jira/browse/SPARK-47100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-47100: - Assignee: Dongjoon Hyun > Upgrade netty to 4.1.107.Final and netty-tcnative to 2.0.62.Final > - > > Key: SPARK-47100 > URL: https://issues.apache.org/jira/browse/SPARK-47100 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-45376) [CORE] Add netty-tcnative-boringssl-static dependency
[ https://issues.apache.org/jira/browse/SPARK-45376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-45376: -- Epic Link: (was: SPARK-44937) > [CORE] Add netty-tcnative-boringssl-static dependency > - > > Key: SPARK-45376 > URL: https://issues.apache.org/jira/browse/SPARK-45376 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Hasnain Lakhani >Assignee: Hasnain Lakhani >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Add the boringssl dependency which is needed for SSL functionality to work, > and provide the network common test helper to other test modules which need > to test SSL functionality -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47100) Upgrade netty to 4.1.107.Final and netty-tcnative to 2.0.62.Final
[ https://issues.apache.org/jira/browse/SPARK-47100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47100: --- Labels: pull-request-available (was: ) > Upgrade netty to 4.1.107.Final and netty-tcnative to 2.0.62.Final > - > > Key: SPARK-47100 > URL: https://issues.apache.org/jira/browse/SPARK-47100 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47100) Upgrade netty to 4.1.107.Final and netty-tcnative to 2.0.62.Final
[ https://issues.apache.org/jira/browse/SPARK-47100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-47100: -- Summary: Upgrade netty to 4.1.107.Final and netty-tcnative to 2.0.62.Final (was: Upgrade Netty to 4.1.107.Final) > Upgrade netty to 4.1.107.Final and netty-tcnative to 2.0.62.Final > - > > Key: SPARK-47100 > URL: https://issues.apache.org/jira/browse/SPARK-47100 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47100) Upgrade Netty to 4.1.107.Final
Dongjoon Hyun created SPARK-47100: - Summary: Upgrade Netty to 4.1.107.Final Key: SPARK-47100 URL: https://issues.apache.org/jira/browse/SPARK-47100 Project: Spark Issue Type: Sub-task Components: Build Affects Versions: 4.0.0 Reporter: Dongjoon Hyun -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-46432) Upgrade Netty to 4.1.106.Final
[ https://issues.apache.org/jira/browse/SPARK-46432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-46432: -- Parent: SPARK-47046 Issue Type: Sub-task (was: Improvement) > Upgrade Netty to 4.1.106.Final > -- > > Key: SPARK-46432 > URL: https://issues.apache.org/jira/browse/SPARK-46432 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Assignee: BingKun Pan >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-46934) Unable to create Hive View from certain Spark Dataframe StructType
[ https://issues.apache.org/jira/browse/SPARK-46934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818634#comment-17818634 ] Jungtaek Lim commented on SPARK-46934: -- Maybe the priority also has to be updated as well - we could say it's a release blocker if we have a consensus this is a blocker. Looks like it doesn't. > Unable to create Hive View from certain Spark Dataframe StructType > -- > > Key: SPARK-46934 > URL: https://issues.apache.org/jira/browse/SPARK-46934 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0, 3.3.2, 3.3.4 > Environment: Tested in Spark 3.3.0, 3.3.2. >Reporter: Yu-Ting LIN >Assignee: Kent Yao >Priority: Blocker > Labels: pull-request-available > Fix For: 4.0.0 > > > We are trying to create a Hive View using following SQL command "CREATE OR > REPLACE VIEW yuting AS SELECT INFO_ANN FROM table_2611810". > Our table_2611810 has certain columns contain special characters such as "/". > Here is the schema of this table. 
> {code:java} > contigName string > start bigint > end bigint > names array > referenceAllele string > alternateAlleles array > qual double > filters array > splitFromMultiAllelic boolean > INFO_NCAMP int > INFO_ODDRATIO double > INFO_NM double > INFO_DBSNP_CAF array > INFO_SPANPAIR int > INFO_TLAMP int > INFO_PSTD double > INFO_QSTD double > INFO_SBF double > INFO_AF array > INFO_QUAL double > INFO_SHIFT3 int > INFO_VARBIAS string > INFO_HICOV int > INFO_PMEAN double > INFO_MSI double > INFO_VD int > INFO_DP int > INFO_HICNT int > INFO_ADJAF double > INFO_SVLEN int > INFO_RSEQ string > INFO_MSigDb array > INFO_NMD array > INFO_ANN > array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>> > INFO_BIAS string > INFO_MQ double > INFO_HIAF double > INFO_END int > INFO_SPLITREAD int > INFO_GDAMP int > INFO_LSEQ string > INFO_LOF array > INFO_SAMPLE string > INFO_AMPFLAG int > INFO_SN double > INFO_SVTYPE string > INFO_TYPE string > INFO_MSILEN double > INFO_DUPRATE double > INFO_DBSNP_COMMON int > INFO_REFBIAS string > genotypes > array,ALD:array,AF:array,phased:boolean,calls:array,VD:int,depth:int,RD:array>> > {code} > You can see that column INFO_ANN is an array of struct and it contains column > which has "/" inside such as "cDNA_pos/cDNA_length", etc. > We believe that it is the root cause that cause the following SparkException: > {code:java} > scala> val schema = spark.sql("CREATE OR REPLACE VIEW yuting AS SELECT > INFO_ANN FROM table_2611810") > 24/01/31 07:50:02.658 [main] WARN o.a.spark.sql.catalyst.util.package - > Truncated the string representation of a plan since it was too large. This > behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'. 
> org.apache.spark.SparkException: Cannot recognize hive type string: > array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>, > column: INFO_ANN > at > org.apache.spark.sql.errors.QueryExecutionErrors$.cannotRecognizeHiveTypeError(QueryExecutionErrors.scala:1455) > at > org.apache.spark.sql.hive.client.HiveClientImpl$.getSparkSQLDataType(HiveClientImpl.scala:1022) > at > org.apache.spark.sql.hive.client.HiveClientImpl$.$anonfun$verifyColumnDataType$1(HiveClientImpl.scala:1037) > at scala.collection.Iterator.foreach(Iterator.scala:943) > at scala.collection.Iterator.foreach$(Iterator.scala:943) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1431) > at scala.collection.IterableLike.foreach(IterableLike.scala:74) > at scala.collection.IterableLike.foreach$(IterableLike.scala:73) > at
[jira] [Commented] (SPARK-46934) Unable to create Hive View from certain Spark Dataframe StructType
[ https://issues.apache.org/jira/browse/SPARK-46934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818633#comment-17818633 ] Yu-Ting LIN commented on SPARK-46934: - [~dongjoon] What do you mean for regression ? > Unable to create Hive View from certain Spark Dataframe StructType > -- > > Key: SPARK-46934 > URL: https://issues.apache.org/jira/browse/SPARK-46934 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0, 3.3.2, 3.3.4 > Environment: Tested in Spark 3.3.0, 3.3.2. >Reporter: Yu-Ting LIN >Assignee: Kent Yao >Priority: Blocker > Labels: pull-request-available > Fix For: 4.0.0 > > > We are trying to create a Hive View using following SQL command "CREATE OR > REPLACE VIEW yuting AS SELECT INFO_ANN FROM table_2611810". > Our table_2611810 has certain columns contain special characters such as "/". > Here is the schema of this table. > {code:java} > contigName string > start bigint > end bigint > names array > referenceAllele string > alternateAlleles array > qual double > filters array > splitFromMultiAllelic boolean > INFO_NCAMP int > INFO_ODDRATIO double > INFO_NM double > INFO_DBSNP_CAF array > INFO_SPANPAIR int > INFO_TLAMP int > INFO_PSTD double > INFO_QSTD double > INFO_SBF double > INFO_AF array > INFO_QUAL double > INFO_SHIFT3 int > INFO_VARBIAS string > INFO_HICOV int > INFO_PMEAN double > INFO_MSI double > INFO_VD int > INFO_DP int > INFO_HICNT int > INFO_ADJAF double > INFO_SVLEN int > INFO_RSEQ string > INFO_MSigDb array > INFO_NMD array > INFO_ANN > array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>> > INFO_BIAS string > INFO_MQ double > INFO_HIAF double > INFO_END int > INFO_SPLITREAD int > INFO_GDAMP int > INFO_LSEQ string > INFO_LOF array > INFO_SAMPLE string > 
INFO_AMPFLAG int > INFO_SN double > INFO_SVTYPE string > INFO_TYPE string > INFO_MSILEN double > INFO_DUPRATE double > INFO_DBSNP_COMMON int > INFO_REFBIAS string > genotypes > array,ALD:array,AF:array,phased:boolean,calls:array,VD:int,depth:int,RD:array>> > {code} > You can see that column INFO_ANN is an array of struct and it contains column > which has "/" inside such as "cDNA_pos/cDNA_length", etc. > We believe that it is the root cause that cause the following SparkException: > {code:java} > scala> val schema = spark.sql("CREATE OR REPLACE VIEW yuting AS SELECT > INFO_ANN FROM table_2611810") > 24/01/31 07:50:02.658 [main] WARN o.a.spark.sql.catalyst.util.package - > Truncated the string representation of a plan since it was too large. This > behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'. > org.apache.spark.SparkException: Cannot recognize hive type string: > array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>, > column: INFO_ANN > at > org.apache.spark.sql.errors.QueryExecutionErrors$.cannotRecognizeHiveTypeError(QueryExecutionErrors.scala:1455) > at > org.apache.spark.sql.hive.client.HiveClientImpl$.getSparkSQLDataType(HiveClientImpl.scala:1022) > at > org.apache.spark.sql.hive.client.HiveClientImpl$.$anonfun$verifyColumnDataType$1(HiveClientImpl.scala:1037) > at scala.collection.Iterator.foreach(Iterator.scala:943) > at scala.collection.Iterator.foreach$(Iterator.scala:943) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1431) > at scala.collection.IterableLike.foreach(IterableLike.scala:74) > at scala.collection.IterableLike.foreach$(IterableLike.scala:73) > at org.apache.spark.sql.types.StructType.foreach(StructType.scala:102) > at >
[jira] [Commented] (SPARK-46934) Unable to create Hive View from certain Spark Dataframe StructType
[ https://issues.apache.org/jira/browse/SPARK-46934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818629#comment-17818629 ] Yu-Ting LIN commented on SPARK-46934: - [~dongjoon] we are mainly using Spark 3.3 and have a plan to migrate to Spark 3.5 so it would be good for us to have this fix in Spark 3.5. Many thanks. > Unable to create Hive View from certain Spark Dataframe StructType > -- > > Key: SPARK-46934 > URL: https://issues.apache.org/jira/browse/SPARK-46934 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0, 3.3.2, 3.3.4 > Environment: Tested in Spark 3.3.0, 3.3.2. >Reporter: Yu-Ting LIN >Assignee: Kent Yao >Priority: Blocker > Labels: pull-request-available > Fix For: 4.0.0 > > > We are trying to create a Hive View using following SQL command "CREATE OR > REPLACE VIEW yuting AS SELECT INFO_ANN FROM table_2611810". > Our table_2611810 has certain columns contain special characters such as "/". > Here is the schema of this table. 
> {code:java} > contigName string > start bigint > end bigint > names array > referenceAllele string > alternateAlleles array > qual double > filters array > splitFromMultiAllelic boolean > INFO_NCAMP int > INFO_ODDRATIO double > INFO_NM double > INFO_DBSNP_CAF array > INFO_SPANPAIR int > INFO_TLAMP int > INFO_PSTD double > INFO_QSTD double > INFO_SBF double > INFO_AF array > INFO_QUAL double > INFO_SHIFT3 int > INFO_VARBIAS string > INFO_HICOV int > INFO_PMEAN double > INFO_MSI double > INFO_VD int > INFO_DP int > INFO_HICNT int > INFO_ADJAF double > INFO_SVLEN int > INFO_RSEQ string > INFO_MSigDb array > INFO_NMD array > INFO_ANN > array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>> > INFO_BIAS string > INFO_MQ double > INFO_HIAF double > INFO_END int > INFO_SPLITREAD int > INFO_GDAMP int > INFO_LSEQ string > INFO_LOF array > INFO_SAMPLE string > INFO_AMPFLAG int > INFO_SN double > INFO_SVTYPE string > INFO_TYPE string > INFO_MSILEN double > INFO_DUPRATE double > INFO_DBSNP_COMMON int > INFO_REFBIAS string > genotypes > array,ALD:array,AF:array,phased:boolean,calls:array,VD:int,depth:int,RD:array>> > {code} > You can see that column INFO_ANN is an array of struct and it contains column > which has "/" inside such as "cDNA_pos/cDNA_length", etc. > We believe that it is the root cause that cause the following SparkException: > {code:java} > scala> val schema = spark.sql("CREATE OR REPLACE VIEW yuting AS SELECT > INFO_ANN FROM table_2611810") > 24/01/31 07:50:02.658 [main] WARN o.a.spark.sql.catalyst.util.package - > Truncated the string representation of a plan since it was too large. This > behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'. 
> org.apache.spark.SparkException: Cannot recognize hive type string: > array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>, > column: INFO_ANN > at > org.apache.spark.sql.errors.QueryExecutionErrors$.cannotRecognizeHiveTypeError(QueryExecutionErrors.scala:1455) > at > org.apache.spark.sql.hive.client.HiveClientImpl$.getSparkSQLDataType(HiveClientImpl.scala:1022) > at > org.apache.spark.sql.hive.client.HiveClientImpl$.$anonfun$verifyColumnDataType$1(HiveClientImpl.scala:1037) > at scala.collection.Iterator.foreach(Iterator.scala:943) > at scala.collection.Iterator.foreach$(Iterator.scala:943) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1431) > at scala.collection.IterableLike.foreach(IterableLike.scala:74) > at scala.collection.IterableLike.foreach$(IterableLike.scala:73) > at
[jira] [Updated] (SPARK-47080) Fix `HistoryServerSuite` to get `getNumJobs` in `eventually`
[ https://issues.apache.org/jira/browse/SPARK-47080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47080: --- Labels: pull-request-available (was: ) > Fix `HistoryServerSuite` to get `getNumJobs` in `eventually` > > > Key: SPARK-47080 > URL: https://issues.apache.org/jira/browse/SPARK-47080 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, Tests >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47099) The `start` value of `paramIndex` for the error class `UNEXPECTED_INPUT_TYPE` should be `1`
[ https://issues.apache.org/jira/browse/SPARK-47099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47099: --- Labels: pull-request-available (was: ) > The `start` value of `paramIndex` for the error class `UNEXPECTED_INPUT_TYPE` > should be `1` > > > Key: SPARK-47099 > URL: https://issues.apache.org/jira/browse/SPARK-47099 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.0.0 >Reporter: BingKun Pan >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47099) The `start` value of `paramIndex` for the error class `UNEXPECTED_INPUT_TYPE` should be `1`
BingKun Pan created SPARK-47099: --- Summary: The `start` value of `paramIndex` for the error class `UNEXPECTED_INPUT_TYPE` should be `1` Key: SPARK-47099 URL: https://issues.apache.org/jira/browse/SPARK-47099 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 4.0.0 Reporter: BingKun Pan -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-47097) Deflake "interrupt tag" at SparkSessionE2ESuite
[ https://issues.apache.org/jira/browse/SPARK-47097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-47097: Assignee: Hyukjin Kwon > Deflake "interrupt tag" at SparkSessionE2ESuite > --- > > Key: SPARK-47097 > URL: https://issues.apache.org/jira/browse/SPARK-47097 > Project: Spark > Issue Type: Test > Components: Connect >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > {code} > - interrupt tag *** FAILED *** > The code passed to eventually never returned normally. Attempted 30 times > over 20.03742146498 seconds. Last failure message: > ListBuffer("2beba4ac-a994-45f5-bd46-fca3e43fb5ef") had length 1 instead of > expected length 2 Interrupted operations: > ListBuffer(2beba4ac-a994-45f5-bd46-fca3e43fb5ef).. > (SparkSessionE2ESuite.scala:216) > {code} > https://github.com/apache/spark/actions/runs/7959951623/job/21727929211 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47097) Deflake "interrupt tag" at SparkSessionE2ESuite
[ https://issues.apache.org/jira/browse/SPARK-47097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-47097. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45173 [https://github.com/apache/spark/pull/45173] > Deflake "interrupt tag" at SparkSessionE2ESuite > --- > > Key: SPARK-47097 > URL: https://issues.apache.org/jira/browse/SPARK-47097 > Project: Spark > Issue Type: Test > Components: Connect >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > {code} > - interrupt tag *** FAILED *** > The code passed to eventually never returned normally. Attempted 30 times > over 20.03742146498 seconds. Last failure message: > ListBuffer("2beba4ac-a994-45f5-bd46-fca3e43fb5ef") had length 1 instead of > expected length 2 Interrupted operations: > ListBuffer(2beba4ac-a994-45f5-bd46-fca3e43fb5ef).. > (SparkSessionE2ESuite.scala:216) > {code} > https://github.com/apache/spark/actions/runs/7959951623/job/21727929211 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
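The "interrupt tag" flake quoted above comes from a polling assertion (ScalaTest's `eventually`) that retried 30 times over ~20 seconds and then gave up. A minimal Python sketch of that retry pattern, assuming nothing about the Spark test itself:

```python
import time

def eventually(assertion, timeout=20.0, interval=0.5):
    """Re-run `assertion` until it passes or the timeout elapses,
    mirroring the polling loop that produced the failure quoted above."""
    deadline = time.monotonic() + timeout
    attempts = 0
    while True:
        attempts += 1
        try:
            assertion()
            return attempts  # number of polls it took to pass
        except AssertionError:
            if time.monotonic() >= deadline:
                raise  # surface the last failure, as in the quoted log
            time.sleep(interval)

# A condition that only holds from the third poll onward.
state = {"calls": 0}
def flaky_check():
    state["calls"] += 1
    assert state["calls"] >= 3

print(eventually(flaky_check, timeout=5.0, interval=0.01))  # prints 3
```

Deflaking such a test usually means either making the polled condition deterministic or widening the timeout; the quoted failure shows only one of two expected operations was interrupted before the deadline.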
[jira] [Updated] (SPARK-46973) Add table cache for V2 tables
[ https://issues.apache.org/jira/browse/SPARK-46973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-46973: --- Labels: pull-request-available (was: ) > Add table cache for V2 tables > - > > Key: SPARK-46973 > URL: https://issues.apache.org/jira/browse/SPARK-46973 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Allison Wang >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47016) Upgrade scalatest related dependencies to the 3.2.18 series
[ https://issues.apache.org/jira/browse/SPARK-47016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47016: --- Labels: pull-request-available (was: ) > Upgrade scalatest related dependencies to the 3.2.18 series > --- > > Key: SPARK-47016 > URL: https://issues.apache.org/jira/browse/SPARK-47016 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-46820) Fix error message regression by restoring new_msg
[ https://issues.apache.org/jira/browse/SPARK-46820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-46820. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 44859 [https://github.com/apache/spark/pull/44859] > Fix error message regression by restoring new_msg > - > > Key: SPARK-46820 > URL: https://issues.apache.org/jira/browse/SPARK-46820 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Haejoon Lee >Assignee: Haejoon Lee >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > >>> from pyspark.sql.types import StructType, StructField, StringType, > >>> IntegerType > >>> schema = StructType([ > ... StructField("name", StringType(), nullable=True), > ... StructField("age", IntegerType(), nullable=False) > ... ]) > >>> df = spark.createDataFrame([("asd", None)], schema) > pyspark.errors.exceptions.base.PySparkValueError: [CANNOT_BE_NONE] Argument > `obj` cannot be None. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-46820) Fix error message regression by restoring new_msg
[ https://issues.apache.org/jira/browse/SPARK-46820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-46820: Assignee: Haejoon Lee > Fix error message regression by restoring new_msg > - > > Key: SPARK-46820 > URL: https://issues.apache.org/jira/browse/SPARK-46820 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 4.0.0 >Reporter: Haejoon Lee >Assignee: Haejoon Lee >Priority: Major > Labels: pull-request-available > > >>> from pyspark.sql.types import StructType, StructField, StringType, > >>> IntegerType > >>> schema = StructType([ > ... StructField("name", StringType(), nullable=True), > ... StructField("age", IntegerType(), nullable=False) > ... ]) > >>> df = spark.createDataFrame([("asd", None)], schema) > pyspark.errors.exceptions.base.PySparkValueError: [CANNOT_BE_NONE] Argument > `obj` cannot be None. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47093) Upgrade `mockito` to 5.10.0
[ https://issues.apache.org/jira/browse/SPARK-47093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-47093. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45169 [https://github.com/apache/spark/pull/45169] > Upgrade `mockito` to 5.10.0 > --- > > Key: SPARK-47093 > URL: https://issues.apache.org/jira/browse/SPARK-47093 > Project: Spark > Issue Type: Sub-task > Components: Tests >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47097) Deflake "interrupt tag" at SparkSessionE2ESuite
[ https://issues.apache.org/jira/browse/SPARK-47097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47097: --- Labels: pull-request-available (was: ) > Deflake "interrupt tag" at SparkSessionE2ESuite > --- > > Key: SPARK-47097 > URL: https://issues.apache.org/jira/browse/SPARK-47097 > Project: Spark > Issue Type: Test > Components: Connect >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > {code} > - interrupt tag *** FAILED *** > The code passed to eventually never returned normally. Attempted 30 times > over 20.03742146498 seconds. Last failure message: > ListBuffer("2beba4ac-a994-45f5-bd46-fca3e43fb5ef") had length 1 instead of > expected length 2 Interrupted operations: > ListBuffer(2beba4ac-a994-45f5-bd46-fca3e43fb5ef).. > (SparkSessionE2ESuite.scala:216) > {code} > https://github.com/apache/spark/actions/runs/7959951623/job/21727929211 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47097) Deflake "interrupt tag" at SparkSessionE2ESuite
[ https://issues.apache.org/jira/browse/SPARK-47097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-47097: - Issue Type: Test (was: Improvement) > Deflake "interrupt tag" at SparkSessionE2ESuite > --- > > Key: SPARK-47097 > URL: https://issues.apache.org/jira/browse/SPARK-47097 > Project: Spark > Issue Type: Test > Components: Connect >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > > {code} > - interrupt tag *** FAILED *** > The code passed to eventually never returned normally. Attempted 30 times > over 20.03742146498 seconds. Last failure message: > ListBuffer("2beba4ac-a994-45f5-bd46-fca3e43fb5ef") had length 1 instead of > expected length 2 Interrupted operations: > ListBuffer(2beba4ac-a994-45f5-bd46-fca3e43fb5ef).. > (SparkSessionE2ESuite.scala:216) > {code} > https://github.com/apache/spark/actions/runs/7959951623/job/21727929211 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47097) Deflake "interrupt tag" at SparkSessionE2ESuite
Hyukjin Kwon created SPARK-47097: Summary: Deflake "interrupt tag" at SparkSessionE2ESuite Key: SPARK-47097 URL: https://issues.apache.org/jira/browse/SPARK-47097 Project: Spark Issue Type: Improvement Components: Connect Affects Versions: 4.0.0 Reporter: Hyukjin Kwon {code} - interrupt tag *** FAILED *** The code passed to eventually never returned normally. Attempted 30 times over 20.03742146498 seconds. Last failure message: ListBuffer("2beba4ac-a994-45f5-bd46-fca3e43fb5ef") had length 1 instead of expected length 2 Interrupted operations: ListBuffer(2beba4ac-a994-45f5-bd46-fca3e43fb5ef).. (SparkSessionE2ESuite.scala:216) {code} https://github.com/apache/spark/actions/runs/7959951623/job/21727929211 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47062) Make Spark Connect Plugins Java Compatible
[ https://issues.apache.org/jira/browse/SPARK-47062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-47062. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45114 [https://github.com/apache/spark/pull/45114] > Make Spark Connect Plugins Java Compatible > -- > > Key: SPARK-47062 > URL: https://issues.apache.org/jira/browse/SPARK-47062 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.5.0 >Reporter: Martin Grund >Assignee: Martin Grund >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Make Spark Connect Plugins Java Compatible -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37434) Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64`
[ https://issues.apache.org/jira/browse/SPARK-37434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-37434: - Assignee: Yang Jie (was: Apache Spark) > Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64` > > > Key: SPARK-37434 > URL: https://issues.apache.org/jira/browse/SPARK-37434 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > > After SPARK-37272 and SPARK-37282, we can manually add > {code:java} > -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest > {code} > when running mvn test or sbt test to disable unsupported UTs on macOS with Apple > Silicon. > > We can add a profile that activates this property automatically when running > UTs on macOS with Apple Silicon. > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47096) Upgrade Python to 3.11 in Maven builds
[ https://issues.apache.org/jira/browse/SPARK-47096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-47096: -- Parent: SPARK-44111 Issue Type: Sub-task (was: Improvement) > Upgrade Python to 3.11 in Maven builds > -- > > Key: SPARK-47096 > URL: https://issues.apache.org/jira/browse/SPARK-47096 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > {code} > Error: dyld[4738]: Library not loaded: > /usr/local/opt/gettext/lib/libintl.8.dylib > Referenced from:Error: -3B8E-94C6-6649527BFDBE> > /Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9 > Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such > file), > '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib' > (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), > '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' > (no such file, not in dyld cache) > ./setup.sh: line 52: 4738 Abort trap: 6 ./python -m ensurepip > {code} > https://github.com/apache/spark/actions/runs/7964626045/job/21742574260 > https://github.com/actions/runner-images/blob/main/images/macos/macos-14-Readme.md -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47096) Upgrade Python to 3.11 in Maven builds
[ https://issues.apache.org/jira/browse/SPARK-47096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-47096: -- Summary: Upgrade Python to 3.11 in Maven builds (was: Upgrade Python version in Maven builds) > Upgrade Python to 3.11 in Maven builds > -- > > Key: SPARK-47096 > URL: https://issues.apache.org/jira/browse/SPARK-47096 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > {code} > Error: dyld[4738]: Library not loaded: > /usr/local/opt/gettext/lib/libintl.8.dylib > Referenced from:Error: -3B8E-94C6-6649527BFDBE> > /Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9 > Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such > file), > '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib' > (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), > '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' > (no such file, not in dyld cache) > ./setup.sh: line 52: 4738 Abort trap: 6 ./python -m ensurepip > {code} > https://github.com/apache/spark/actions/runs/7964626045/job/21742574260 > https://github.com/actions/runner-images/blob/main/images/macos/macos-14-Readme.md -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-47096) Upgrade Python version in Maven build in macos-14 build
[ https://issues.apache.org/jira/browse/SPARK-47096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-47096. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45172 [https://github.com/apache/spark/pull/45172] > Upgrade Python version in Maven build in macos-14 build > --- > > Key: SPARK-47096 > URL: https://issues.apache.org/jira/browse/SPARK-47096 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > {code} > Error: dyld[4738]: Library not loaded: > /usr/local/opt/gettext/lib/libintl.8.dylib > Referenced from:Error: -3B8E-94C6-6649527BFDBE> > /Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9 > Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such > file), > '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib' > (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), > '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' > (no such file, not in dyld cache) > ./setup.sh: line 52: 4738 Abort trap: 6 ./python -m ensurepip > {code} > https://github.com/apache/spark/actions/runs/7964626045/job/21742574260 > https://github.com/actions/runner-images/blob/main/images/macos/macos-14-Readme.md -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-47096) Upgrade Python version in Maven build in macos-14 build
[ https://issues.apache.org/jira/browse/SPARK-47096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-47096: - Assignee: Hyukjin Kwon > Upgrade Python version in Maven build in macos-14 build > --- > > Key: SPARK-47096 > URL: https://issues.apache.org/jira/browse/SPARK-47096 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > {code} > Error: dyld[4738]: Library not loaded: > /usr/local/opt/gettext/lib/libintl.8.dylib > Referenced from:Error: -3B8E-94C6-6649527BFDBE> > /Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9 > Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such > file), > '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib' > (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), > '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' > (no such file, not in dyld cache) > ./setup.sh: line 52: 4738 Abort trap: 6 ./python -m ensurepip > {code} > https://github.com/apache/spark/actions/runs/7964626045/job/21742574260 > https://github.com/actions/runner-images/blob/main/images/macos/macos-14-Readme.md -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47096) Upgrade Python version in Maven builds
[ https://issues.apache.org/jira/browse/SPARK-47096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-47096: -- Summary: Upgrade Python version in Maven builds (was: Upgrade Python version in Maven build in macos-14 build) > Upgrade Python version in Maven builds > -- > > Key: SPARK-47096 > URL: https://issues.apache.org/jira/browse/SPARK-47096 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > {code} > Error: dyld[4738]: Library not loaded: > /usr/local/opt/gettext/lib/libintl.8.dylib > Referenced from:Error: -3B8E-94C6-6649527BFDBE> > /Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9 > Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such > file), > '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib' > (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), > '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' > (no such file, not in dyld cache) > ./setup.sh: line 52: 4738 Abort trap: 6 ./python -m ensurepip > {code} > https://github.com/apache/spark/actions/runs/7964626045/job/21742574260 > https://github.com/actions/runner-images/blob/main/images/macos/macos-14-Readme.md -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47096) Upgrade Python version in Maven build in macos-14 build
[ https://issues.apache.org/jira/browse/SPARK-47096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47096: --- Labels: pull-request-available (was: ) > Upgrade Python version in Maven build in macos-14 build > --- > > Key: SPARK-47096 > URL: https://issues.apache.org/jira/browse/SPARK-47096 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > {code} > Error: dyld[4738]: Library not loaded: > /usr/local/opt/gettext/lib/libintl.8.dylib > Referenced from:Error: -3B8E-94C6-6649527BFDBE> > /Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9 > Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such > file), > '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib' > (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), > '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' > (no such file, not in dyld cache) > ./setup.sh: line 52: 4738 Abort trap: 6 ./python -m ensurepip > {code} > https://github.com/apache/spark/actions/runs/7964626045/job/21742574260 > https://github.com/actions/runner-images/blob/main/images/macos/macos-14-Readme.md -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47096) Upgrade Python version in Maven build in macos-14 build
[ https://issues.apache.org/jira/browse/SPARK-47096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-47096: - Description: {code} Error: dyld[4738]: Library not loaded: /usr/local/opt/gettext/lib/libintl.8.dylib Referenced from: /Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9 Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' (no such file, not in dyld cache) ./setup.sh: line 52: 4738 Abort trap: 6 ./python -m ensurepip {code} https://github.com/apache/spark/actions/runs/7964626045/job/21742574260 https://github.com/actions/runner-images/blob/main/images/macos/macos-14-Readme.md was: {code} Error: dyld[4738]: Library not loaded: /usr/local/opt/gettext/lib/libintl.8.dylib Referenced from: /Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9 Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' (no such file, not in dyld cache) ./setup.sh: line 52: 4738 Abort trap: 6 ./python -m ensurepip {code} https://github.com/apache/spark/actions/runs/7964626045/job/21742574260 > Upgrade Python version in Maven build in macos-14 build > --- > > Key: SPARK-47096 > URL: https://issues.apache.org/jira/browse/SPARK-47096 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > > {code} > Error: dyld[4738]: Library not loaded: > /usr/local/opt/gettext/lib/libintl.8.dylib > Referenced from:Error: -3B8E-94C6-6649527BFDBE> >
/Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9 > Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such > file), > '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib' > (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), > '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' > (no such file, not in dyld cache) > ./setup.sh: line 52: 4738 Abort trap: 6 ./python -m ensurepip > {code} > https://github.com/apache/spark/actions/runs/7964626045/job/21742574260 > https://github.com/actions/runner-images/blob/main/images/macos/macos-14-Readme.md -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47096) Upgrade Python version in Maven build in macos-14 build
[ https://issues.apache.org/jira/browse/SPARK-47096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-47096: - Summary: Upgrade Python version in Maven build in macos-14 build (was: Upgrade Python version in Maven build for macos compatibility) > Upgrade Python version in Maven build in macos-14 build > --- > > Key: SPARK-47096 > URL: https://issues.apache.org/jira/browse/SPARK-47096 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > > {code} > Error: dyld[4738]: Library not loaded: > /usr/local/opt/gettext/lib/libintl.8.dylib > Referenced from:Error: -3B8E-94C6-6649527BFDBE> > /Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9 > Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such > file), > '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib' > (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), > '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' > (no such file, not in dyld cache) > ./setup.sh: line 52: 4738 Abort trap: 6 ./python -m ensurepip > {code} > https://github.com/apache/spark/actions/runs/7964626045/job/21742574260 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47096) Upgrade Python version in Maven build for macos compatibility
Hyukjin Kwon created SPARK-47096: Summary: Upgrade Python version in Maven build for macos compatibility Key: SPARK-47096 URL: https://issues.apache.org/jira/browse/SPARK-47096 Project: Spark Issue Type: Improvement Components: Project Infra Affects Versions: 4.0.0 Reporter: Hyukjin Kwon {code} Error: dyld[4738]: Library not loaded: /usr/local/opt/gettext/lib/libintl.8.dylib Referenced from: /Users/runner/hostedtoolcache/Python/3.9.18/x64/bin/python3.9 Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' (no such file, not in dyld cache) ./setup.sh: line 52: 4738 Abort trap: 6 ./python -m ensurepip {code} https://github.com/apache/spark/actions/runs/7964626045/job/21742574260 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-37434) Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64`
[ https://issues.apache.org/jira/browse/SPARK-37434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-37434. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45170 [https://github.com/apache/spark/pull/45170] > Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64` > > > Key: SPARK-37434 > URL: https://issues.apache.org/jira/browse/SPARK-37434 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Apache Spark >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > > After SPARK-37272 and SPARK-37282, we can manually add > {code:java} > -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest > {code} > when running mvn test or sbt test to disable unsupported UTs on macOS with Apple > Silicon. > > We can add a profile that activates this property automatically when running > UTs on macOS with Apple Silicon. > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37434) Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64`
[ https://issues.apache.org/jira/browse/SPARK-37434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-37434: - Assignee: Apache Spark > Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64` > > > Key: SPARK-37434 > URL: https://issues.apache.org/jira/browse/SPARK-37434 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: Apache Spark >Priority: Minor > Labels: pull-request-available > > After SPARK-37272 and SPARK-37282, we can manually add > {code:java} > -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest > {code} > when running mvn test or sbt test to disable unsupported UTs on macOS with Apple > Silicon. > > We can add a profile that activates this property automatically when running > UTs on macOS with Apple Silicon. > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
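The SPARK-37434 description above excludes tagged suites via `-Dtest.exclude.tags`, with a Maven profile activating the property automatically on macOS with Apple Silicon. A hedged Python sketch of the host check such a `<activation><os>` profile performs (the tag names are the ones quoted above; the helper function itself is hypothetical):

```python
import platform

# Tag names quoted in SPARK-37434; the helper is a hypothetical stand-in
# for Maven's <activation><os> profile matching, not Spark build code.
EXCLUDED_ON_APPLE_SILICON = [
    "org.apache.spark.tags.ExtendedLevelDBTest",
    "org.apache.spark.tags.ExtendedRocksDBTest",
]

def exclude_tags_for_host(system=None, machine=None):
    """Return the test tags to exclude on hosts where the LevelDB/RocksDB
    native libraries are unavailable (macOS on Apple Silicon)."""
    system = system or platform.system()
    machine = machine or platform.machine()
    if system == "Darwin" and machine == "arm64":
        return EXCLUDED_ON_APPLE_SILICON
    return []

# On an Apple Silicon Mac this yields the -Dtest.exclude.tags value:
print(",".join(exclude_tags_for_host("Darwin", "arm64")))
```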
[jira] [Resolved] (SPARK-47095) Uses proper options for command script in macos-14 build
[ https://issues.apache.org/jira/browse/SPARK-47095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-47095. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45171 [https://github.com/apache/spark/pull/45171] > Uses proper options for command script in macos-14 build > > > Key: SPARK-47095 > URL: https://issues.apache.org/jira/browse/SPARK-47095 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > https://github.com/apache/spark/actions/runs/7964626045/job/21742573399 > It fails as below: > {code} > Run # Fix for TTY related issues when launching the Ammonite REPL in tests. > /usr/bin/script: illegal option -- c > usage: script [-aeFkpqr] [-t time] [file [command ...]] >script -p [-deq] [-T fmt] [file] > Error: Process completed with exit code 1. > {code} > See https://man.freebsd.org/cgi/man.cgi?script(1) > https://man7.org/linux/man-pages/man1/script.1.html -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47095) Uses proper options for command script in macos-14 build
[ https://issues.apache.org/jira/browse/SPARK-47095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47095: --- Labels: pull-request-available (was: ) > Uses proper options for command script in macos-14 build > > > Key: SPARK-47095 > URL: https://issues.apache.org/jira/browse/SPARK-47095 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > https://github.com/apache/spark/actions/runs/7964626045/job/21742573399 > It fails as below: > {code} > Run # Fix for TTY related issues when launching the Ammonite REPL in tests. > /usr/bin/script: illegal option -- c > usage: script [-aeFkpqr] [-t time] [file [command ...]] >script -p [-deq] [-T fmt] [file] > Error: Process completed with exit code 1. > {code} > See https://man.freebsd.org/cgi/man.cgi?script(1) > https://man7.org/linux/man-pages/man1/script.1.html -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47095) Uses proper options for command script in macos-14 build
[ https://issues.apache.org/jira/browse/SPARK-47095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-47095: - Description: https://github.com/apache/spark/actions/runs/7964626045/job/21742573399 It fails as below: {code} Run # Fix for TTY related issues when launching the Ammonite REPL in tests. /usr/bin/script: illegal option -- c usage: script [-aeFkpqr] [-t time] [file [command ...]] script -p [-deq] [-T fmt] [file] Error: Process completed with exit code 1. {code} See https://man.freebsd.org/cgi/man.cgi?script(1) https://man7.org/linux/man-pages/man1/script.1.html was: https://github.com/apache/spark/actions/runs/7964626045/job/21742573399 It fails as below: {code} Run # Fix for TTY related issues when launching the Ammonite REPL in tests. /usr/bin/script: illegal option -- c usage: script [-aeFkpqr] [-t time] [file [command ...]] script -p [-deq] [-T fmt] [file] Error: Process completed with exit code 1. {code} > Uses proper options for command script in macos-14 build > > > Key: SPARK-47095 > URL: https://issues.apache.org/jira/browse/SPARK-47095 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > > https://github.com/apache/spark/actions/runs/7964626045/job/21742573399 > It fails as below: > {code} > Run # Fix for TTY related issues when launching the Ammonite REPL in tests. > /usr/bin/script: illegal option -- c > usage: script [-aeFkpqr] [-t time] [file [command ...]] >script -p [-deq] [-T fmt] [file] > Error: Process completed with exit code 1. > {code} > See https://man.freebsd.org/cgi/man.cgi?script(1) > https://man7.org/linux/man-pages/man1/script.1.html -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47095) Uses proper options for command script for macos-14 build
[ https://issues.apache.org/jira/browse/SPARK-47095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-47095: - Summary: Uses proper options for command script for macos-14 build (was: Uses Mac dedicated options for command script for macos-14 build) > Uses proper options for command script for macos-14 build > - > > Key: SPARK-47095 > URL: https://issues.apache.org/jira/browse/SPARK-47095 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > > https://github.com/apache/spark/actions/runs/7964626045/job/21742573399 > It fails as below: > {code} > Run # Fix for TTY related issues when launching the Ammonite REPL in tests. > /usr/bin/script: illegal option -- c > usage: script [-aeFkpqr] [-t time] [file [command ...]] >script -p [-deq] [-T fmt] [file] > Error: Process completed with exit code 1. > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-47095) Uses proper options for command script for macos-14 build
[ https://issues.apache.org/jira/browse/SPARK-47095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-47095: Assignee: Hyukjin Kwon > Uses proper options for command script for macos-14 build > - > > Key: SPARK-47095 > URL: https://issues.apache.org/jira/browse/SPARK-47095 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > > https://github.com/apache/spark/actions/runs/7964626045/job/21742573399 > It fails as below: > {code} > Run # Fix for TTY related issues when launching the Ammonite REPL in tests. > /usr/bin/script: illegal option -- c > usage: script [-aeFkpqr] [-t time] [file [command ...]] >script -p [-deq] [-T fmt] [file] > Error: Process completed with exit code 1. > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47095) Uses Mac dedicated options for command script for macos-14 build
[ https://issues.apache.org/jira/browse/SPARK-47095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-47095: - Summary: Uses Mac dedicated options for command script for macos-14 build (was: Uses Mac dedicated options for command script) > Uses Mac dedicated options for command script for macos-14 build > > > Key: SPARK-47095 > URL: https://issues.apache.org/jira/browse/SPARK-47095 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > > https://github.com/apache/spark/actions/runs/7964626045/job/21742573399 > It fails as below: > {code} > Run # Fix for TTY related issues when launching the Ammonite REPL in tests. > /usr/bin/script: illegal option -- c > usage: script [-aeFkpqr] [-t time] [file [command ...]] >script -p [-deq] [-T fmt] [file] > Error: Process completed with exit code 1. > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47095) Uses proper options for command script in macos-14 build
[ https://issues.apache.org/jira/browse/SPARK-47095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-47095: - Summary: Uses proper options for command script in macos-14 build (was: Uses proper options for command script for macos-14 build) > Uses proper options for command script in macos-14 build > > > Key: SPARK-47095 > URL: https://issues.apache.org/jira/browse/SPARK-47095 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > > https://github.com/apache/spark/actions/runs/7964626045/job/21742573399 > It fails as below: > {code} > Run # Fix for TTY related issues when launching the Ammonite REPL in tests. > /usr/bin/script: illegal option -- c > usage: script [-aeFkpqr] [-t time] [file [command ...]] >script -p [-deq] [-T fmt] [file] > Error: Process completed with exit code 1. > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47095) Uses Mac dedicated options for command script
Hyukjin Kwon created SPARK-47095: Summary: Uses Mac dedicated options for command script Key: SPARK-47095 URL: https://issues.apache.org/jira/browse/SPARK-47095 Project: Spark Issue Type: Improvement Components: Project Infra Affects Versions: 4.0.0 Reporter: Hyukjin Kwon https://github.com/apache/spark/actions/runs/7964626045/job/21742573399 It fails as below: {code} Run # Fix for TTY related issues when launching the Ammonite REPL in tests. /usr/bin/script: illegal option -- c usage: script [-aeFkpqr] [-t time] [file [command ...]] script -p [-deq] [-T fmt] [file] Error: Process completed with exit code 1. {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
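The failure above comes from the two incompatible `script(1)` dialects: BSD/macOS `script` takes the log file followed directly by the command (no `-c`), while GNU (util-linux) `script` passes the command via `-c`. A minimal sketch of picking the right invocation per OS — the flag choices are taken from the usage strings quoted above, and the helper name is an assumption for illustration, not the exact change made in the Spark workflow:

```python
import platform

def script_argv(command, system=None, logfile="typescript"):
    """Build a `script(1)` argv that works on both BSD/macOS and GNU.

    BSD/macOS: script [-q] file command...   (no -c option)
    GNU:       script [-q] -e -c "command" file
    """
    if system is None:
        system = platform.system()
    if system == "Darwin":
        # BSD form: command words follow the log file directly.
        return ["script", "-q", logfile] + command.split()
    # GNU form: -e propagates the child's exit code, -c carries the command.
    return ["script", "-q", "-e", "-c", command, logfile]

print(script_argv("./build/sbt test", system="Darwin"))
```

Feeding the resulting argv to `subprocess.run` (or emitting it in a CI step) avoids the `illegal option -- c` error on the macos-14 runners.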
[jira] [Resolved] (SPARK-47092) Add `getUriBuilder` to `o.a.s.u.Utils` and use it
[ https://issues.apache.org/jira/browse/SPARK-47092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-47092. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45168 [https://github.com/apache/spark/pull/45168] > Add `getUriBuilder` to `o.a.s.u.Utils` and use it > - > > Key: SPARK-47092 > URL: https://issues.apache.org/jira/browse/SPARK-47092 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-47094) SPJ : Dynamically rebalance number of buckets when they are not equal
Himadri Pal created SPARK-47094: --- Summary: SPJ : Dynamically rebalance number of buckets when they are not equal Key: SPARK-47094 URL: https://issues.apache.org/jira/browse/SPARK-47094 Project: Spark Issue Type: New Feature Components: Spark Core Affects Versions: 3.4.0, 3.3.0 Reporter: Himadri Pal SPJ: Storage Partition Join works with Iceberg tables when both tables have the same number of buckets. As part of this feature request, we would like Spark to gather the bucket count information from both tables and dynamically rebalance the number of buckets via coalesce or repartition so that SPJ still works. In this case we would still have to shuffle, but it would be better than no SPJ at all. Use Case : Many times we do not have control over the input tables, so it is not possible to change the partitioning scheme on those tables. As a consumer, we would still like them to be usable with SPJ when joined with other tables and output tables that have a different number of buckets. In that scenario, we would need to read those tables and rewrite them with a matching number of buckets for SPJ to work; this extra step could outweigh the benefits of the shuffle saved by SPJ. Also, when multiple different tables are joined, each table needs to be rewritten with a matching number of buckets. If this feature is implemented, SPJ functionality will be more powerful. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
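One simple way to make two mismatched bucket layouts compatible, as the request above describes, is to coalesce both sides down to a common bucket count. The gcd rule below is only an illustration of that idea, not the proposal's exact mechanics or Spark's implementation:

```python
import math

def coalesced_buckets(left_buckets, right_buckets):
    """Pick a common bucket count for two bucketed tables.

    Coalescing each side to gcd(left, right) means every coarse bucket
    on one side corresponds to a whole group of original buckets on the
    other, so rows that hash together still land together.
    Returns (target, left_group_size, right_group_size).
    """
    g = math.gcd(left_buckets, right_buckets)
    return g, left_buckets // g, right_buckets // g

print(coalesced_buckets(8, 12))
```

For 8 and 12 buckets this yields a target of 4, merging pairs of buckets on the left and triples on the right; the join then needs only this local coalescing rather than a full shuffle of both sides.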
[jira] [Updated] (SPARK-37434) Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64`
[ https://issues.apache.org/jira/browse/SPARK-37434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-37434: --- Labels: pull-request-available (was: ) > Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64` > > > Key: SPARK-37434 > URL: https://issues.apache.org/jira/browse/SPARK-37434 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Minor > Labels: pull-request-available > > After SPARK-37272 and SPARK-37282, we can manually add > {code:java} > -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest > {code} > when running mvn test or sbt test to disable unsupported UTs on macOS with Apple > Silicon. > > We can add a profile to activate this property automatically when running > UTs on macOS with Apple Silicon. > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37434) Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64`
[ https://issues.apache.org/jira/browse/SPARK-37434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-37434: -- Parent: SPARK-44111 Issue Type: Sub-task (was: Test) > Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64` > > > Key: SPARK-37434 > URL: https://issues.apache.org/jira/browse/SPARK-37434 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Minor > > After SPARK-37272 and SPARK-37282, we can manually add > {code:java} > -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest > {code} > when run mvn test or sbt test to disable unsupported UTs on Macos using Apple > Silicon. > > We can add a profile to and activate this property automatically when run > UTs on Macos using Apple Silicon. > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37434) Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64`
[ https://issues.apache.org/jira/browse/SPARK-37434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-37434: -- Summary: Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64` (was: Disable `ExtendedLevelDBTest` on `MacOS/aarch64`) > Disable unsupported `ExtendedLevelDBTest` on `MacOS/aarch64` > > > Key: SPARK-37434 > URL: https://issues.apache.org/jira/browse/SPARK-37434 > Project: Spark > Issue Type: Test > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Minor > > After SPARK-37272 and SPARK-37282, we can manually add > {code:java} > -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest > {code} > when run mvn test or sbt test to disable unsupported UTs on Macos using Apple > Silicon. > > We can add a profile to and activate this property automatically when run > UTs on Macos using Apple Silicon. > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37434) Disable `ExtendedLevelDBTest` on `MacOS/aarch64`
[ https://issues.apache.org/jira/browse/SPARK-37434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-37434: -- Affects Version/s: 4.0.0 (was: 3.3.0) > Disable `ExtendedLevelDBTest` on `MacOS/aarch64` > > > Key: SPARK-37434 > URL: https://issues.apache.org/jira/browse/SPARK-37434 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Major > > After SPARK-37272 and SPARK-37282, we can manually add > {code:java} > -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest > {code} > when run mvn test or sbt test to disable unsupported UTs on Macos using Apple > Silicon. > > We can add a profile to and activate this property automatically when run > UTs on Macos using Apple Silicon. > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37434) Disable `ExtendedLevelDBTest` on `MacOS/aarch64`
[ https://issues.apache.org/jira/browse/SPARK-37434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-37434: -- Summary: Disable `ExtendedLevelDBTest` on `MacOS/aarch64` (was: Add a new profile to auto disable unsupported UTs on Macos using Apple Silicon) > Disable `ExtendedLevelDBTest` on `MacOS/aarch64` > > > Key: SPARK-37434 > URL: https://issues.apache.org/jira/browse/SPARK-37434 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.3.0 >Reporter: Yang Jie >Priority: Major > > After SPARK-37272 and SPARK-37282, we can manually add > {code:java} > -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest > {code} > when run mvn test or sbt test to disable unsupported UTs on Macos using Apple > Silicon. > > We can add a profile to and activate this property automatically when run > UTs on Macos using Apple Silicon. > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37434) Disable `ExtendedLevelDBTest` on `MacOS/aarch64`
[ https://issues.apache.org/jira/browse/SPARK-37434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-37434: -- Priority: Minor (was: Major) > Disable `ExtendedLevelDBTest` on `MacOS/aarch64` > > > Key: SPARK-37434 > URL: https://issues.apache.org/jira/browse/SPARK-37434 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Minor > > After SPARK-37272 and SPARK-37282, we can manually add > {code:java} > -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest > {code} > when run mvn test or sbt test to disable unsupported UTs on Macos using Apple > Silicon. > > We can add a profile to and activate this property automatically when run > UTs on Macos using Apple Silicon. > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37434) Disable `ExtendedLevelDBTest` on `MacOS/aarch64`
[ https://issues.apache.org/jira/browse/SPARK-37434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-37434: -- Issue Type: Test (was: Improvement) > Disable `ExtendedLevelDBTest` on `MacOS/aarch64` > > > Key: SPARK-37434 > URL: https://issues.apache.org/jira/browse/SPARK-37434 > Project: Spark > Issue Type: Test > Components: Build >Affects Versions: 4.0.0 >Reporter: Yang Jie >Priority: Minor > > After SPARK-37272 and SPARK-37282, we can manually add > {code:java} > -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest,org.apache.spark.tags.ExtendedRocksDBTest > {code} > when run mvn test or sbt test to disable unsupported UTs on Macos using Apple > Silicon. > > We can add a profile to and activate this property automatically when run > UTs on Macos using Apple Silicon. > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47093) Upgrade `mockito` to 5.10.0
[ https://issues.apache.org/jira/browse/SPARK-47093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47093: --- Labels: pull-request-available (was: ) > Upgrade `mockito` to 5.10.0 > --- > > Key: SPARK-47093 > URL: https://issues.apache.org/jira/browse/SPARK-47093 > Project: Spark > Issue Type: Sub-task > Components: Tests >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-46934) Unable to create Hive View from certain Spark Dataframe StructType
[ https://issues.apache.org/jira/browse/SPARK-46934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818565#comment-17818565 ] Dongjoon Hyun commented on SPARK-46934: --- This is resolved at Apache Spark 4.0.0. Do you think this is a regression from some old Spark versions or a blocker for Apache Spark 3.5.1 release, [~yutinglin] ? > Unable to create Hive View from certain Spark Dataframe StructType > -- > > Key: SPARK-46934 > URL: https://issues.apache.org/jira/browse/SPARK-46934 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0, 3.3.2, 3.3.4 > Environment: Tested in Spark 3.3.0, 3.3.2. >Reporter: Yu-Ting LIN >Assignee: Kent Yao >Priority: Blocker > Labels: pull-request-available > Fix For: 4.0.0 > > > We are trying to create a Hive View using following SQL command "CREATE OR > REPLACE VIEW yuting AS SELECT INFO_ANN FROM table_2611810". > Our table_2611810 has certain columns contain special characters such as "/". > Here is the schema of this table. 
> {code:java} > contigName string > start bigint > end bigint > names array > referenceAllele string > alternateAlleles array > qual double > filters array > splitFromMultiAllelic boolean > INFO_NCAMP int > INFO_ODDRATIO double > INFO_NM double > INFO_DBSNP_CAF array > INFO_SPANPAIR int > INFO_TLAMP int > INFO_PSTD double > INFO_QSTD double > INFO_SBF double > INFO_AF array > INFO_QUAL double > INFO_SHIFT3 int > INFO_VARBIAS string > INFO_HICOV int > INFO_PMEAN double > INFO_MSI double > INFO_VD int > INFO_DP int > INFO_HICNT int > INFO_ADJAF double > INFO_SVLEN int > INFO_RSEQ string > INFO_MSigDb array > INFO_NMD array > INFO_ANN > array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>> > INFO_BIAS string > INFO_MQ double > INFO_HIAF double > INFO_END int > INFO_SPLITREAD int > INFO_GDAMP int > INFO_LSEQ string > INFO_LOF array > INFO_SAMPLE string > INFO_AMPFLAG int > INFO_SN double > INFO_SVTYPE string > INFO_TYPE string > INFO_MSILEN double > INFO_DUPRATE double > INFO_DBSNP_COMMON int > INFO_REFBIAS string > genotypes > array,ALD:array,AF:array,phased:boolean,calls:array,VD:int,depth:int,RD:array>> > {code} > You can see that column INFO_ANN is an array of struct and it contains column > which has "/" inside such as "cDNA_pos/cDNA_length", etc. > We believe that it is the root cause that cause the following SparkException: > {code:java} > scala> val schema = spark.sql("CREATE OR REPLACE VIEW yuting AS SELECT > INFO_ANN FROM table_2611810") > 24/01/31 07:50:02.658 [main] WARN o.a.spark.sql.catalyst.util.package - > Truncated the string representation of a plan since it was too large. This > behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'. 
> org.apache.spark.SparkException: Cannot recognize hive type string: > array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>, > column: INFO_ANN > at > org.apache.spark.sql.errors.QueryExecutionErrors$.cannotRecognizeHiveTypeError(QueryExecutionErrors.scala:1455) > at > org.apache.spark.sql.hive.client.HiveClientImpl$.getSparkSQLDataType(HiveClientImpl.scala:1022) > at > org.apache.spark.sql.hive.client.HiveClientImpl$.$anonfun$verifyColumnDataType$1(HiveClientImpl.scala:1037) > at scala.collection.Iterator.foreach(Iterator.scala:943) > at scala.collection.Iterator.foreach$(Iterator.scala:943) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1431) > at scala.collection.IterableLike.foreach(IterableLike.scala:74) > at scala.collection.IterableLike.foreach$(IterableLike.scala:73) > at
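The `Cannot recognize hive type string` error above arises because `/` is not a legal character in an unquoted Hive field name, so a type string such as `struct<cDNA_pos/cDNA_length:int>` cannot be parsed back. A plausible mitigation is to backtick-quote such field names when generating the type string; the sketch below is illustrative only (it is not Spark's `HiveClientImpl` logic, and whether the upstream fix quotes names in exactly this way is an assumption):

```python
def quote_field(name):
    """Backtick-quote a field name if it contains non-identifier characters."""
    if all(c.isalnum() or c == "_" for c in name):
        return name
    # Escape embedded backticks by doubling them, then wrap in backticks.
    return "`" + name.replace("`", "``") + "`"

def struct_type_string(fields):
    """Render (name, hive_type) pairs as a Hive-style struct type string."""
    inner = ",".join("{}:{}".format(quote_field(n), t) for n, t in fields)
    return "struct<" + inner + ">"

print(struct_type_string([("cDNA_pos/cDNA_length", "int"), ("Distance", "int")]))
```

With quoting, `cDNA_pos/cDNA_length` round-trips unambiguously while ordinary identifiers like `Distance` are left bare.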
[jira] [Comment Edited] (SPARK-46934) Unable to create Hive View from certain Spark Dataframe StructType
[ https://issues.apache.org/jira/browse/SPARK-46934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818565#comment-17818565 ] Dongjoon Hyun edited comment on SPARK-46934 at 2/19/24 7:57 PM: This is resolved at Apache Spark 4.0.0. Do you think this is a regression from some old Spark versions or a blocker for Apache Spark 3.5.1 release, [~yutinglin] and [~yao] ? was (Author: dongjoon): This is resolved at Apache Spark 4.0.0. Do you think this is a regression from some old Spark versions or a blocker for Apache Spark 3.5.1 release, [~yutinglin] ? > Unable to create Hive View from certain Spark Dataframe StructType > -- > > Key: SPARK-46934 > URL: https://issues.apache.org/jira/browse/SPARK-46934 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0, 3.3.2, 3.3.4 > Environment: Tested in Spark 3.3.0, 3.3.2. >Reporter: Yu-Ting LIN >Assignee: Kent Yao >Priority: Blocker > Labels: pull-request-available > Fix For: 4.0.0 > > > We are trying to create a Hive View using following SQL command "CREATE OR > REPLACE VIEW yuting AS SELECT INFO_ANN FROM table_2611810". > Our table_2611810 has certain columns contain special characters such as "/". > Here is the schema of this table. 
> {code:java} > contigName string > start bigint > end bigint > names array > referenceAllele string > alternateAlleles array > qual double > filters array > splitFromMultiAllelic boolean > INFO_NCAMP int > INFO_ODDRATIO double > INFO_NM double > INFO_DBSNP_CAF array > INFO_SPANPAIR int > INFO_TLAMP int > INFO_PSTD double > INFO_QSTD double > INFO_SBF double > INFO_AF array > INFO_QUAL double > INFO_SHIFT3 int > INFO_VARBIAS string > INFO_HICOV int > INFO_PMEAN double > INFO_MSI double > INFO_VD int > INFO_DP int > INFO_HICNT int > INFO_ADJAF double > INFO_SVLEN int > INFO_RSEQ string > INFO_MSigDb array > INFO_NMD array > INFO_ANN > array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>> > INFO_BIAS string > INFO_MQ double > INFO_HIAF double > INFO_END int > INFO_SPLITREAD int > INFO_GDAMP int > INFO_LSEQ string > INFO_LOF array > INFO_SAMPLE string > INFO_AMPFLAG int > INFO_SN double > INFO_SVTYPE string > INFO_TYPE string > INFO_MSILEN double > INFO_DUPRATE double > INFO_DBSNP_COMMON int > INFO_REFBIAS string > genotypes > array,ALD:array,AF:array,phased:boolean,calls:array,VD:int,depth:int,RD:array>> > {code} > You can see that column INFO_ANN is an array of struct and it contains column > which has "/" inside such as "cDNA_pos/cDNA_length", etc. > We believe that it is the root cause that cause the following SparkException: > {code:java} > scala> val schema = spark.sql("CREATE OR REPLACE VIEW yuting AS SELECT > INFO_ANN FROM table_2611810") > 24/01/31 07:50:02.658 [main] WARN o.a.spark.sql.catalyst.util.package - > Truncated the string representation of a plan since it was too large. This > behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'. 
> org.apache.spark.SparkException: Cannot recognize hive type string: > array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>, > column: INFO_ANN > at > org.apache.spark.sql.errors.QueryExecutionErrors$.cannotRecognizeHiveTypeError(QueryExecutionErrors.scala:1455) > at > org.apache.spark.sql.hive.client.HiveClientImpl$.getSparkSQLDataType(HiveClientImpl.scala:1022) > at > org.apache.spark.sql.hive.client.HiveClientImpl$.$anonfun$verifyColumnDataType$1(HiveClientImpl.scala:1037) > at scala.collection.Iterator.foreach(Iterator.scala:943) > at scala.collection.Iterator.foreach$(Iterator.scala:943) >
[jira] [Resolved] (SPARK-46934) Unable to create Hive View from certain Spark Dataframe StructType
[ https://issues.apache.org/jira/browse/SPARK-46934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-46934. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45039 [https://github.com/apache/spark/pull/45039] > Unable to create Hive View from certain Spark Dataframe StructType > -- > > Key: SPARK-46934 > URL: https://issues.apache.org/jira/browse/SPARK-46934 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0, 3.3.2, 3.3.4 > Environment: Tested in Spark 3.3.0, 3.3.2. >Reporter: Yu-Ting LIN >Assignee: Kent Yao >Priority: Blocker > Labels: pull-request-available > Fix For: 4.0.0 > > > We are trying to create a Hive View using following SQL command "CREATE OR > REPLACE VIEW yuting AS SELECT INFO_ANN FROM table_2611810". > Our table_2611810 has certain columns contain special characters such as "/". > Here is the schema of this table. > {code:java} > contigName string > start bigint > end bigint > names array > referenceAllele string > alternateAlleles array > qual double > filters array > splitFromMultiAllelic boolean > INFO_NCAMP int > INFO_ODDRATIO double > INFO_NM double > INFO_DBSNP_CAF array > INFO_SPANPAIR int > INFO_TLAMP int > INFO_PSTD double > INFO_QSTD double > INFO_SBF double > INFO_AF array > INFO_QUAL double > INFO_SHIFT3 int > INFO_VARBIAS string > INFO_HICOV int > INFO_PMEAN double > INFO_MSI double > INFO_VD int > INFO_DP int > INFO_HICNT int > INFO_ADJAF double > INFO_SVLEN int > INFO_RSEQ string > INFO_MSigDb array > INFO_NMD array > INFO_ANN > array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>> > INFO_BIAS string > INFO_MQ double > INFO_HIAF double > INFO_END int > INFO_SPLITREAD int > INFO_GDAMP int > INFO_LSEQ string > 
INFO_LOF array > INFO_SAMPLE string > INFO_AMPFLAG int > INFO_SN double > INFO_SVTYPE string > INFO_TYPE string > INFO_MSILEN double > INFO_DUPRATE double > INFO_DBSNP_COMMON int > INFO_REFBIAS string > genotypes > array,ALD:array,AF:array,phased:boolean,calls:array,VD:int,depth:int,RD:array>> > {code} > You can see that column INFO_ANN is an array of struct and it contains column > which has "/" inside such as "cDNA_pos/cDNA_length", etc. > We believe that it is the root cause that cause the following SparkException: > {code:java} > scala> val schema = spark.sql("CREATE OR REPLACE VIEW yuting AS SELECT > INFO_ANN FROM table_2611810") > 24/01/31 07:50:02.658 [main] WARN o.a.spark.sql.catalyst.util.package - > Truncated the string representation of a plan since it was too large. This > behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'. > org.apache.spark.SparkException: Cannot recognize hive type string: > array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>, > column: INFO_ANN > at > org.apache.spark.sql.errors.QueryExecutionErrors$.cannotRecognizeHiveTypeError(QueryExecutionErrors.scala:1455) > at > org.apache.spark.sql.hive.client.HiveClientImpl$.getSparkSQLDataType(HiveClientImpl.scala:1022) > at > org.apache.spark.sql.hive.client.HiveClientImpl$.$anonfun$verifyColumnDataType$1(HiveClientImpl.scala:1037) > at scala.collection.Iterator.foreach(Iterator.scala:943) > at scala.collection.Iterator.foreach$(Iterator.scala:943) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1431) > at scala.collection.IterableLike.foreach(IterableLike.scala:74) > at scala.collection.IterableLike.foreach$(IterableLike.scala:73) > at org.apache.spark.sql.types.StructType.foreach(StructType.scala:102) > at >
[jira] [Assigned] (SPARK-46934) Unable to create Hive View from certain Spark Dataframe StructType
[ https://issues.apache.org/jira/browse/SPARK-46934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-46934: - Assignee: Kent Yao > Unable to create Hive View from certain Spark Dataframe StructType > -- > > Key: SPARK-46934 > URL: https://issues.apache.org/jira/browse/SPARK-46934 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0, 3.3.2, 3.3.4 > Environment: Tested in Spark 3.3.0, 3.3.2. >Reporter: Yu-Ting LIN >Assignee: Kent Yao >Priority: Blocker > Labels: pull-request-available > > We are trying to create a Hive View using following SQL command "CREATE OR > REPLACE VIEW yuting AS SELECT INFO_ANN FROM table_2611810". > Our table_2611810 has certain columns contain special characters such as "/". > Here is the schema of this table. > {code:java} > contigName string > start bigint > end bigint > names array > referenceAllele string > alternateAlleles array > qual double > filters array > splitFromMultiAllelic boolean > INFO_NCAMP int > INFO_ODDRATIO double > INFO_NM double > INFO_DBSNP_CAF array > INFO_SPANPAIR int > INFO_TLAMP int > INFO_PSTD double > INFO_QSTD double > INFO_SBF double > INFO_AF array > INFO_QUAL double > INFO_SHIFT3 int > INFO_VARBIAS string > INFO_HICOV int > INFO_PMEAN double > INFO_MSI double > INFO_VD int > INFO_DP int > INFO_HICNT int > INFO_ADJAF double > INFO_SVLEN int > INFO_RSEQ string > INFO_MSigDb array > INFO_NMD array > INFO_ANN > array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>> > INFO_BIAS string > INFO_MQ double > INFO_HIAF double > INFO_END int > INFO_SPLITREAD int > INFO_GDAMP int > INFO_LSEQ string > INFO_LOF array > INFO_SAMPLE string > INFO_AMPFLAG int > INFO_SN double > INFO_SVTYPE string > INFO_TYPE string > 
INFO_MSILEN double > INFO_DUPRATE double > INFO_DBSNP_COMMON int > INFO_REFBIAS string > genotypes > array,ALD:array,AF:array,phased:boolean,calls:array,VD:int,depth:int,RD:array>> > {code} > You can see that column INFO_ANN is an array of struct and it contains columns > which have "/" inside, such as "cDNA_pos/cDNA_length", etc. > We believe this is the root cause of the following SparkException: > {code:java} > scala> val schema = spark.sql("CREATE OR REPLACE VIEW yuting AS SELECT > INFO_ANN FROM table_2611810") > 24/01/31 07:50:02.658 [main] WARN o.a.spark.sql.catalyst.util.package - > Truncated the string representation of a plan since it was too large. This > behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'. > org.apache.spark.SparkException: Cannot recognize hive type string: > array,Annotation_Impact:string,Gene_Name:string,Gene_ID:string,Feature_Type:string,Feature_ID:string,Transcript_BioType:string,Rank:struct,HGVS_c:string,HGVS_p:string,cDNA_pos/cDNA_length:struct,CDS_pos/CDS_length:struct,AA_pos/AA_length:struct,Distance:int,ERRORS/WARNINGS/INFO:string>>, > column: INFO_ANN > at > org.apache.spark.sql.errors.QueryExecutionErrors$.cannotRecognizeHiveTypeError(QueryExecutionErrors.scala:1455) > at > org.apache.spark.sql.hive.client.HiveClientImpl$.getSparkSQLDataType(HiveClientImpl.scala:1022) > at > org.apache.spark.sql.hive.client.HiveClientImpl$.$anonfun$verifyColumnDataType$1(HiveClientImpl.scala:1037) > at scala.collection.Iterator.foreach(Iterator.scala:943) > at scala.collection.Iterator.foreach$(Iterator.scala:943) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1431) > at scala.collection.IterableLike.foreach(IterableLike.scala:74) > at scala.collection.IterableLike.foreach$(IterableLike.scala:73) > at org.apache.spark.sql.types.StructType.foreach(StructType.scala:102) > at >
org.apache.spark.sql.hive.client.HiveClientImpl$.org$apache$spark$sql$hive$client$HiveClientImpl$$verifyColumnDataType(HiveClientImpl.scala:1037) > at >
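The failure mode above can be illustrated without Spark: the Hive client validates each column's Hive type string, and an identifier-style rule rejects struct field names containing "/" such as `cDNA_pos/cDNA_length`. The sketch below is a hypothetical simplification (the regex and helper names are illustrative, not Spark's actual parser), showing why such names are rejected and how renaming them avoids the problem.

```python
import re

# Hypothetical identifier rule: letters, digits, underscores only.
# A name like "cDNA_pos/cDNA_length" fails this check, which is
# analogous to "Cannot recognize hive type string" above.
IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def check_struct_fields(fields):
    """Return the field names a naive identifier rule would reject."""
    return [name for name in fields if not IDENT.match(name)]

fields = ["Allele", "Annotation_Impact", "cDNA_pos/cDNA_length", "ERRORS/WARNINGS/INFO"]
rejected = check_struct_fields(fields)
print(rejected)  # ['cDNA_pos/cDNA_length', 'ERRORS/WARNINGS/INFO']

# One user-side workaround is to rename such fields before creating the
# view, e.g. replacing "/" with "_":
sanitized = [name.replace("/", "_") for name in fields]
assert not check_struct_fields(sanitized)
```
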
[jira] [Resolved] (SPARK-44826) Resolve testing timeout issue from Spark Connect
[ https://issues.apache.org/jira/browse/SPARK-44826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-44826. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45166 [https://github.com/apache/spark/pull/45166] > Resolve testing timeout issue from Spark Connect > > > Key: SPARK-44826 > URL: https://issues.apache.org/jira/browse/SPARK-44826 > Project: Spark > Issue Type: Bug > Components: Connect, Pandas API on Spark, Tests >Affects Versions: 4.0.0 >Reporter: Haejoon Lee >Assignee: Haejoon Lee >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > DiffFramesParitySetItemSeriesTests.test_series_iloc_setitem is failing on > Spark Connect due to unexpected timeout issue: > https://github.com/itholic/spark/actions/runs/5850534247/job/15860127608 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-44826) Resolve testing timeout issue from Spark Connect
[ https://issues.apache.org/jira/browse/SPARK-44826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-44826: - Assignee: Haejoon Lee > Resolve testing timeout issue from Spark Connect > > > Key: SPARK-44826 > URL: https://issues.apache.org/jira/browse/SPARK-44826 > Project: Spark > Issue Type: Bug > Components: Connect, Pandas API on Spark, Tests >Affects Versions: 4.0.0 >Reporter: Haejoon Lee >Assignee: Haejoon Lee >Priority: Major > Labels: pull-request-available > > DiffFramesParitySetItemSeriesTests.test_series_iloc_setitem is failing on > Spark Connect due to unexpected timeout issue: > https://github.com/itholic/spark/actions/runs/5850534247/job/15860127608
[jira] [Assigned] (SPARK-47092) Add `getUriBuilder` to `o.a.s.u.Utils` and use it
[ https://issues.apache.org/jira/browse/SPARK-47092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-47092: - Assignee: Dongjoon Hyun > Add `getUriBuilder` to `o.a.s.u.Utils` and use it > - > > Key: SPARK-47092 > URL: https://issues.apache.org/jira/browse/SPARK-47092 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available >
[jira] [Resolved] (SPARK-47089) Migrate mockito 4 to mockito5
[ https://issues.apache.org/jira/browse/SPARK-47089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-47089. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45158 [https://github.com/apache/spark/pull/45158] > Migrate mockito 4 to mockito5 > - > > Key: SPARK-47089 > URL: https://issues.apache.org/jira/browse/SPARK-47089 > Project: Spark > Issue Type: Improvement > Components: Build, Tests >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: BingKun Pan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > >
[jira] [Assigned] (SPARK-47089) Migrate mockito 4 to mockito5
[ https://issues.apache.org/jira/browse/SPARK-47089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-47089: - Assignee: BingKun Pan > Migrate mockito 4 to mockito5 > - > > Key: SPARK-47089 > URL: https://issues.apache.org/jira/browse/SPARK-47089 > Project: Spark > Issue Type: Improvement > Components: Build, Tests >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: BingKun Pan >Priority: Major > Labels: pull-request-available >
[jira] [Updated] (SPARK-47089) Migrate mockito 4 to mockito5
[ https://issues.apache.org/jira/browse/SPARK-47089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-47089: -- Parent: SPARK-47046 Issue Type: Sub-task (was: Improvement) > Migrate mockito 4 to mockito5 > - > > Key: SPARK-47089 > URL: https://issues.apache.org/jira/browse/SPARK-47089 > Project: Spark > Issue Type: Sub-task > Components: Build, Tests >Affects Versions: 4.0.0 >Reporter: Yang Jie >Assignee: BingKun Pan >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > >
[jira] [Updated] (SPARK-47092) Add `getUriBuilder` to `o.a.s.u.Utils` and use it
[ https://issues.apache.org/jira/browse/SPARK-47092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47092: --- Labels: pull-request-available (was: ) > Add `getUriBuilder` to `o.a.s.u.Utils` and use it > - > > Key: SPARK-47092 > URL: https://issues.apache.org/jira/browse/SPARK-47092 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Priority: Major > Labels: pull-request-available >
[jira] [Updated] (SPARK-47092) Add `getUriBuilder` to `o.a.s.u.Utils` and use it
[ https://issues.apache.org/jira/browse/SPARK-47092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-47092: -- Summary: Add `getUriBuilder` to `o.a.s.u.Utils` and use it (was: Add `getUriBuilder` to `o.a.s.u.Utils`) > Add `getUriBuilder` to `o.a.s.u.Utils` and use it > - > > Key: SPARK-47092 > URL: https://issues.apache.org/jira/browse/SPARK-47092 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Priority: Major >
[jira] [Created] (SPARK-47092) Add `getUriBuilder` to `o.a.s.u.Utils`
Dongjoon Hyun created SPARK-47092: - Summary: Add `getUriBuilder` to `o.a.s.u.Utils` Key: SPARK-47092 URL: https://issues.apache.org/jira/browse/SPARK-47092 Project: Spark Issue Type: Sub-task Components: Spark Core Affects Versions: 4.0.0 Reporter: Dongjoon Hyun
[jira] [Resolved] (SPARK-47067) Add Daily Apple Silicon Github Action Job (Java/Scala)
[ https://issues.apache.org/jira/browse/SPARK-47067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-47067. --- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45162 [https://github.com/apache/spark/pull/45162] > Add Daily Apple Silicon Github Action Job (Java/Scala) > -- > > Key: SPARK-47067 > URL: https://issues.apache.org/jira/browse/SPARK-47067 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Dongjoon Hyun >Assignee: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > >
[jira] [Resolved] (SPARK-47087) Raise Spark's exception with an error class in config value check
[ https://issues.apache.org/jira/browse/SPARK-47087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-47087. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 45156 [https://github.com/apache/spark/pull/45156] > Raise Spark's exception with an error class in config value check > - > > Key: SPARK-47087 > URL: https://issues.apache.org/jira/browse/SPARK-47087 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.0.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Currently, Spark throws *IllegalArgumentException* in `checkValue` of > ConfigBuilder. Need to overload `checkValue` to throw > `SparkIllegalArgumentException` with an error class. This should improve the user > experience with Spark SQL and the overall impression of Spark's errors.
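The idea behind SPARK-47087 can be sketched in a few lines. This is a minimal, hypothetical illustration (the class and function names are assumptions, not Spark's actual Scala API): a config value check raises an exception carrying a structured error class and message parameters instead of a bare argument error.

```python
# Sketch only: error-class-based validation instead of a bare ValueError,
# mirroring the ConfigBuilder.checkValue overload described above.

class SparkIllegalArgumentError(ValueError):
    """Carries an error class and message parameters, not just free text."""

    def __init__(self, error_class, message_parameters):
        self.error_class = error_class
        self.message_parameters = message_parameters
        super().__init__(f"[{error_class}] {message_parameters}")

def check_value(value, valid, error_class, message_parameters):
    """Validate a config value; raise a structured error on failure."""
    if not valid(value):
        raise SparkIllegalArgumentError(error_class, message_parameters)
    return value

# A passing check returns the value unchanged.
check_value(8, lambda v: v > 0, "INVALID_CONF_VALUE", {"confName": "spark.foo.bar"})

# A failing check surfaces a machine-readable error class.
try:
    check_value(-1, lambda v: v > 0, "INVALID_CONF_VALUE", {"confName": "spark.foo.bar"})
except SparkIllegalArgumentError as e:
    print(e.error_class)  # INVALID_CONF_VALUE
```

The benefit over a plain IllegalArgumentException is that tools and docs can key off the error class rather than parsing message text.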
[jira] [Updated] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-24578: --- Labels: pull-request-available (was: ) > Reading remote cache block behavior changes and causes timeout issue > > > Key: SPARK-24578 > URL: https://issues.apache.org/jira/browse/SPARK-24578 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.3.0, 2.3.1 >Reporter: Wenbo Zhao >Assignee: Wenbo Zhao >Priority: Blocker > Labels: pull-request-available > Fix For: 2.3.2, 2.4.0 > > > After Spark 2.3, we observed lots of errors like the following in some of our > production job > {code:java} > 18/06/15 20:59:42 ERROR TransportRequestHandler: Error sending result > ChunkFetchSuccess{streamChunkId=StreamChunkId{streamId=91672904003, > chunkIndex=0}, > buffer=org.apache.spark.storage.BlockManagerManagedBuffer@783a9324} to > /172.22.18.7:60865; closing connection > java.io.IOException: Broken pipe > at sun.nio.ch.FileDispatcherImpl.write0(Native Method) > at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) > at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) > at sun.nio.ch.IOUtil.write(IOUtil.java:65) > at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) > at > org.apache.spark.network.protocol.MessageWithHeader.writeNioBuffer(MessageWithHeader.java:156) > at > org.apache.spark.network.protocol.MessageWithHeader.copyByteBuf(MessageWithHeader.java:142) > at > org.apache.spark.network.protocol.MessageWithHeader.transferTo(MessageWithHeader.java:123) > at > io.netty.channel.socket.nio.NioSocketChannel.doWriteFileRegion(NioSocketChannel.java:355) > at > io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:224) > at > io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:382) > at > io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:934) > at > 
io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:362) > at > io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:901) > at > io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1321) > at > io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776) > at > io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768) > at > io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749) > at > io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:115) > at > io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776) > at > io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768) > at > io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749) > at io.netty.channel.ChannelDuplexHandler.flush(ChannelDuplexHandler.java:117) > at > io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776) > at > io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768) > at > io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749) > at > io.netty.channel.DefaultChannelPipeline.flush(DefaultChannelPipeline.java:983) > at io.netty.channel.AbstractChannel.flush(AbstractChannel.java:248) > at > io.netty.channel.nio.AbstractNioByteChannel$1.run(AbstractNioByteChannel.java:284) > at > io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) > at > io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403) > at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463) > at > 
io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) > at > io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138) > {code} > > Here is a small reproducible example for a small cluster of 2 executors (say host-1 > and host-2), each with 8 cores. The memory of the driver and executors is > not an important factor here as long as it is big enough, say 20G. > {code:java} > val n = 1 > val df0 = sc.parallelize(1 to n).toDF > val df = df0.withColumn("x0", rand()).withColumn("x0", rand() > ).withColumn("x1", rand() > ).withColumn("x2", rand() > ).withColumn("x3", rand() > ).withColumn("x4", rand() > ).withColumn("x5", rand() > ).withColumn("x6", rand() > ).withColumn("x7", rand() > ).withColumn("x8", rand() > ).withColumn("x9", rand()) > df.cache;
[jira] [Updated] (SPARK-47088) Utilize BigDecimal to calculate the GPU resource
[ https://issues.apache.org/jira/browse/SPARK-47088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-47088: --- Labels: pull-request-available (was: ) > Utilize BigDecimal to calculate the GPU resource > - > > Key: SPARK-47088 > URL: https://issues.apache.org/jira/browse/SPARK-47088 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Bobby Wang >Priority: Minor > Labels: pull-request-available > > To prevent precision errors, the current method of calculating GPU resources > involves multiplying by 1E16 to convert doubles to Longs. If needed, it will > also convert Longs back to doubles. This approach introduces redundancy in > the code, especially for test code. > More details can be found at > https://github.com/apache/spark/pull/44690#discussion_r1482301112
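The precision problem motivating SPARK-47088 is easy to demonstrate. The sketch below is illustrative only (plain Python rather than Spark's Scala code): accumulating a fractional GPU amount like 0.1 in binary floating point drifts, which is why the current code scales into Longs via 1E16 and back. Exact decimal arithmetic (Python's decimal here, analogous to Java's BigDecimal) removes the scale/unscale round trip.

```python
from decimal import Decimal

# Ten tasks each claiming 0.1 of a GPU should sum to exactly 1 GPU,
# but binary floating point cannot represent 0.1 exactly, so the sum drifts.
slots = [0.1] * 10
assert sum(slots) != 1.0            # float accumulation is inexact

# Decimal arithmetic (the BigDecimal analogue) stays exact, with no need
# to scale amounts by 1e16 into integers and back.
exact = sum(Decimal("0.1") for _ in range(10))
assert exact == Decimal("1.0")
```

Note that the Decimal values are built from strings; constructing `Decimal(0.1)` would inherit the float's rounding error.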
[jira] [Updated] (SPARK-44826) Resolve testing timeout issue from Spark Connect
[ https://issues.apache.org/jira/browse/SPARK-44826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SPARK-44826: --- Labels: pull-request-available (was: ) > Resolve testing timeout issue from Spark Connect > > > Key: SPARK-44826 > URL: https://issues.apache.org/jira/browse/SPARK-44826 > Project: Spark > Issue Type: Bug > Components: Connect, Pandas API on Spark, Tests >Affects Versions: 4.0.0 >Reporter: Haejoon Lee >Priority: Major > Labels: pull-request-available > > DiffFramesParitySetItemSeriesTests.test_series_iloc_setitem is failing on > Spark Connect due to unexpected timeout issue: > https://github.com/itholic/spark/actions/runs/5850534247/job/15860127608
[jira] [Resolved] (SPARK-47090) Skip JDK 17/21 Maven build in branch-3.4 scheduled job
[ https://issues.apache.org/jira/browse/SPARK-47090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-47090. -- Resolution: Invalid > Skip JDK 17/21 Maven build in branch-3.4 scheduled job > -- > > Key: SPARK-47090 > URL: https://issues.apache.org/jira/browse/SPARK-47090 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Priority: Major > Labels: pull-request-available > > https://github.com/apache/spark/actions/runs/7928294496/job/21664443573
[jira] [Created] (SPARK-47091) An error occurs when executing the pyspark program
jackyjfhu created SPARK-47091: - Summary: An error occurs when executing the pyspark program Key: SPARK-47091 URL: https://issues.apache.org/jira/browse/SPARK-47091 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.5.0, 3.1.3 Reporter: jackyjfhu When I execute this code via pyspark: spark._sc.textFile("/tmp/spark_data1").repartition(50).toDF().show I get an error: ERROR spark.TaskContextImpl: Error in TaskCompletionListener io.netty.util.IllegalReferenceCountException: refCnt: 0, decrement: 1 at io.netty.util.internal.ReferenceCountUpdater.toLiveRealRefCnt(ReferenceCountUpdater.java:74) ~[iceberg-spark-runtime-3.1_2.12-0.14.3-5-tencent.jar:?] at io.netty.util.internal.ReferenceCountUpdater.release(ReferenceCountUpdater.java:138) ~[iceberg-spark-runtime-3.1_2.12-0.14.3-5-tencent.jar:?] at io.netty.buffer.AbstractReferenceCountedByteBuf.release(AbstractReferenceCountedByteBuf.java:100) ~[netty-all-4.1.51.Final.jar:4.1.51.Final] at io.netty.buffer.AbstractDerivedByteBuf.release0(AbstractDerivedByteBuf.java:94) ~[netty-all-4.1.51.Final.jar:4.1.51.Final] at io.netty.buffer.AbstractDerivedByteBuf.release(AbstractDerivedByteBuf.java:90) ~[netty-all-4.1.51.Final.jar:4.1.51.Final] at org.apache.spark.network.buffer.NettyManagedBuffer.release(NettyManagedBuffer.java:62) ~[spark-network-common_2.12-3.1.3.jar:3.1.3] at org.apache.spark.storage.ShuffleBlockFetcherIterator.cleanup(ShuffleBlockFetcherIterator.scala:226) ~[spark-core_2.12-3.1.3.jar:3.1.3] at org.apache.spark.storage.ShuffleFetchCompletionListener.onTaskCompletion(ShuffleBlockFetcherIterator.scala:862) ~[spark-core_2.12-3.1.3.jar:3.1.3] at org.apache.spark.TaskContextImpl.$anonfun$markTaskCompleted$1(TaskContextImpl.scala:124) ~[spark-core_2.12-3.1.3.jar:3.1.3] at org.apache.spark.TaskContextImpl.$anonfun$markTaskCompleted$1$adapted(TaskContextImpl.scala:124) ~[spark-core_2.12-3.1.3.jar:3.1.3] at org.apache.spark.TaskContextImpl.$anonfun$invokeListeners$1(TaskContextImpl.scala:137) 
~[spark-core_2.12-3.1.3.jar:3.1.3] at org.apache.spark.TaskContextImpl.$anonfun$invokeListeners$1$adapted(TaskContextImpl.scala:135) ~[spark-core_2.12-3.1.3.jar:3.1.3] at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) ~[scala-library-2.12.10.jar:?] at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) ~[scala-library-2.12.10.jar:?] at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) ~[scala-library-2.12.10.jar:?] at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:135) ~[spark-core_2.12-3.1.3.jar:3.1.3] at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:124) ~[spark-core_2.12-3.1.3.jar:3.1.3] at org.apache.spark.scheduler.Task.run(Task.scala:147) ~[spark-core_2.12-3.1.3.jar:3.1.3] at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:498) ~[spark-core_2.12-3.1.3.jar:3.1.3] at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439) [spark-core_2.12-3.1.3.jar:3.1.3] at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:501) [spark-core_2.12-3.1.3.jar:3.1.3] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_362] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_362] at java.lang.Thread.run(Thread.java:750) [?:1.8.0_362] 24/02/19 11:26:53 ERROR executor.Executor: Exception in task 0.1 in stage 1.0 (TID 4001) org.apache.spark.util.TaskCompletionListenerException: refCnt: 0, decrement: 1 at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:145) ~[spark-core_2.12-3.1.3.jar:3.1.3] at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:124) ~[spark-core_2.12-3.1.3.jar:3.1.3] at org.apache.spark.scheduler.Task.run(Task.scala:147) ~[spark-core_2.12-3.1.3.jar:3.1.3] at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:498) ~[spark-core_2.12-3.1.3.jar:3.1.3] at 
org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439) ~[spark-core_2.12-3.1.3.jar:3.1.3] at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:501) [spark-core_2.12-3.1.3.jar:3.1.3] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_362] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_362] at java.lang.Thread.run(Thread.java:750) Note: There are 4000 small files in the directory /tmp/spark_data1. If there are relatively few small files, no error is reported.
[jira] [Commented] (SPARK-9174) Add documentation for all public SQLConfs
[ https://issues.apache.org/jira/browse/SPARK-9174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818378#comment-17818378 ] Nandini commented on SPARK-9174: Hi Team, I was not able to find the doc for _spark.sql.retainGroupColumns_. Can I add 'Whether to retain group by columns or not in GroupedData.agg.'? > Add documentation for all public SQLConfs > - > > Key: SPARK-9174 > URL: https://issues.apache.org/jira/browse/SPARK-9174 > Project: Spark > Issue Type: Improvement > Components: SQL >Reporter: Reynold Xin >Assignee: Reynold Xin >Priority: Major > Fix For: 1.5.0 > >