[jira] [Commented] (SPARK-36860) Create the external hive table for HBase failed
[ https://issues.apache.org/jira/browse/SPARK-36860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420483#comment-17420483 ] Kousuke Saruta commented on SPARK-36860: [~yimo_yym] Spark doesn't support creating Hive tables using storage handlers yet. Please see also http://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html#specifying-storage-format-for-hive-tables > Create the external hive table for HBase failed > > > Key: SPARK-36860 > URL: https://issues.apache.org/jira/browse/SPARK-36860 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.2 >Reporter: wineternity >Priority: Major > > We use the following SQL to create a Hive external table, which reads from HBase: > {code:java} > CREATE EXTERNAL TABLE if not exists dev.sanyu_spotlight_headline_material( >rowkey string COMMENT 'HBase主键', >content string COMMENT '图文正文') > USING HIVE > ROW FORMAT SERDE >'org.apache.hadoop.hive.hbase.HBaseSerDe' > STORED BY >'org.apache.hadoop.hive.hbase.HBaseStorageHandler' > WITH SERDEPROPERTIES ( >'hbase.columns.mapping'=':key, cf1:content' > ) > TBLPROPERTIES ( >'hbase.table.name'='spotlight_headline_material' > ); > {code} > But the SQL fails in Spark 3.1.2, which throws this exception: > {code:java} > 21/09/27 11:44:24 INFO scheduler.DAGScheduler: Asked to cancel job group > 26d7459f-7b58-4c18-9939-5f2737525ff2 > 21/09/27 11:44:24 ERROR thriftserver.SparkExecuteStatementOperation: Error > executing query with 26d7459f-7b58-4c18-9939-5f2737525ff2, currentState > RUNNING, > org.apache.spark.sql.catalyst.parser.ParseException: > Operation not allowed: Unexpected combination of ROW FORMAT SERDE > 'org.apache.hadoop.hive.hbase.HBaseSerDe' and STORED BY > 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ('hbase.columns.mapping'=':key, > cf1:content') (line 5, pos 0) > {code} > This check was introduced by this change: > [https://github.com/apache/spark/pull/28026] > > Could anyone give guidance on how to create an external table for HBase in Spark 3 now? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36856) Building by "./build/mvn" may be stuck on MacOS
[ https://issues.apache.org/jira/browse/SPARK-36856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420482#comment-17420482 ] Apache Spark commented on SPARK-36856: -- User 'copperybean' has created a pull request for this issue: https://github.com/apache/spark/pull/34111 > Building by "./build/mvn" may be stuck on MacOS > --- > > Key: SPARK-36856 > URL: https://issues.apache.org/jira/browse/SPARK-36856 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.0.0, 3.3.0 > Environment: MacOS 11.4 >Reporter: copperybean >Priority: Major > > Command "./build/mvn" gets stuck on my macOS 11.4 because it uses the wrong Java home. On my Mac, "/usr/bin/java" is a real file instead of a symbolic link, so the Java home is resolved to "/usr", which leaves the launched Maven process stuck with this wrong Java home. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
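The macOS failure mode described above can be sketched in a few lines. The `derive_java_home` helper below is hypothetical (Spark's `build/mvn` is a shell script and its real logic differs); it only models how deriving a Java home from the location of the `java` binary collapses to `/usr` when `/usr/bin/java` is a real file rather than a symlink:

```python
import posixpath

def derive_java_home(java_path, resolve):
    """Derive a JAVA_HOME candidate from the `java` binary's path.

    `resolve` stands in for symlink resolution (os.path.realpath);
    it is injected so the sketch runs without a real JDK installed.
    JAVA_HOME is conventionally two directories above .../bin/java.
    """
    real = resolve(java_path)
    return posixpath.dirname(posixpath.dirname(real))

# Typical Linux box: /usr/bin/java is a symlink into a JDK tree.
linux_home = derive_java_home(
    "/usr/bin/java", lambda p: "/usr/lib/jvm/java-11-openjdk/bin/java")

# Reporter's macOS 11.4: /usr/bin/java is a real file, so resolution
# is a no-op and the derived home collapses to the bogus "/usr".
macos_home = derive_java_home("/usr/bin/java", lambda p: p)
```

On macOS the usual remedy is to ask `/usr/libexec/java_home` for the JDK location instead of walking up from `/usr/bin/java`.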
[jira] [Commented] (SPARK-36856) Building by "./build/mvn" may be stuck on MacOS
[ https://issues.apache.org/jira/browse/SPARK-36856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420481#comment-17420481 ] Apache Spark commented on SPARK-36856: -- User 'copperybean' has created a pull request for this issue: https://github.com/apache/spark/pull/34111 > Building by "./build/mvn" may be stuck on MacOS > --- > > Key: SPARK-36856 > URL: https://issues.apache.org/jira/browse/SPARK-36856 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.0.0, 3.3.0 > Environment: MacOS 11.4 >Reporter: copperybean >Priority: Major > > Command "./build/mvn" gets stuck on my macOS 11.4 because it uses the wrong Java home. On my Mac, "/usr/bin/java" is a real file instead of a symbolic link, so the Java home is resolved to "/usr", which leaves the launched Maven process stuck with this wrong Java home. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36856) Building by "./build/mvn" may be stuck on MacOS
[ https://issues.apache.org/jira/browse/SPARK-36856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36856: Assignee: Apache Spark > Building by "./build/mvn" may be stuck on MacOS > --- > > Key: SPARK-36856 > URL: https://issues.apache.org/jira/browse/SPARK-36856 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.0.0, 3.3.0 > Environment: MacOS 11.4 >Reporter: copperybean >Assignee: Apache Spark >Priority: Major > > Command "./build/mvn" gets stuck on my macOS 11.4 because it uses the wrong Java home. On my Mac, "/usr/bin/java" is a real file instead of a symbolic link, so the Java home is resolved to "/usr", which leaves the launched Maven process stuck with this wrong Java home. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36856) Building by "./build/mvn" may be stuck on MacOS
[ https://issues.apache.org/jira/browse/SPARK-36856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36856: Assignee: (was: Apache Spark) > Building by "./build/mvn" may be stuck on MacOS > --- > > Key: SPARK-36856 > URL: https://issues.apache.org/jira/browse/SPARK-36856 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.0.0, 3.3.0 > Environment: MacOS 11.4 >Reporter: copperybean >Priority: Major > > Command "./build/mvn" gets stuck on my macOS 11.4 because it uses the wrong Java home. On my Mac, "/usr/bin/java" is a real file instead of a symbolic link, so the Java home is resolved to "/usr", which leaves the launched Maven process stuck with this wrong Java home. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36856) Building by "./build/mvn" may be stuck on MacOS
[ https://issues.apache.org/jira/browse/SPARK-36856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] copperybean updated SPARK-36856: Priority: Major (was: Minor) > Building by "./build/mvn" may be stuck on MacOS > --- > > Key: SPARK-36856 > URL: https://issues.apache.org/jira/browse/SPARK-36856 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.0.0, 3.3.0 > Environment: MacOS 11.4 >Reporter: copperybean >Priority: Major > > Command "./build/mvn" gets stuck on my macOS 11.4 because it uses the wrong Java home. On my Mac, "/usr/bin/java" is a real file instead of a symbolic link, so the Java home is resolved to "/usr", which leaves the launched Maven process stuck with this wrong Java home. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-36740) collection operators should handle duplicated NaN
[ https://issues.apache.org/jira/browse/SPARK-36740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-36740. - Fix Version/s: 3.2.0 Assignee: angerszhu Resolution: Fixed > collection operators should handle duplicated NaN > - > > Key: SPARK-36740 > URL: https://issues.apache.org/jira/browse/SPARK-36740 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Assignee: angerszhu >Priority: Major > Labels: correctness > Fix For: 3.2.0 > > > Collection operators should handle duplicated NaN values; the current OpenHashSet can't handle duplicated NaN. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
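The limitation quoted above has a direct analogue in plain Python, where hash-based sets also fail to deduplicate NaN because IEEE 754 defines NaN as unequal to itself. This illustrates the general problem only; Spark's OpenHashSet is a separate implementation:

```python
import math

nan_a = float("nan")
nan_b = float("nan")

# IEEE 754: NaN compares unequal to everything, including itself.
assert nan_a != nan_a

# A hash set keyed on equality therefore cannot collapse NaN values
# coming from different computations: both distinct NaN objects stay
# in the set instead of being deduplicated into one element.
dedup = {nan_a, nan_b}
assert all(math.isnan(x) for x in dedup)
```

This is why set-based collection operators need an explicit NaN normalization step before hashing.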
[jira] [Commented] (SPARK-36835) Spark 3.2.0 POMs are no longer "dependency reduced"
[ https://issues.apache.org/jira/browse/SPARK-36835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420473#comment-17420473 ] Apache Spark commented on SPARK-36835: -- User 'sunchao' has created a pull request for this issue: https://github.com/apache/spark/pull/34110 > Spark 3.2.0 POMs are no longer "dependency reduced" > --- > > Key: SPARK-36835 > URL: https://issues.apache.org/jira/browse/SPARK-36835 > Project: Spark > Issue Type: Bug > Components: Build >Affects Versions: 3.2.0 >Reporter: Josh Rosen >Assignee: Chao Sun >Priority: Blocker > Fix For: 3.2.0 > > > It looks like Spark 3.2.0's POMs are no longer "dependency reduced". As a > result, applications may pull in additional unnecessary dependencies when > depending on Spark. > Spark uses the Maven Shade plugin to create effective POMs and to bundle > shaded versions of certain libraries with Spark (namely, Jetty, Guava, and > JPMML). [By > default|https://maven.apache.org/plugins/maven-shade-plugin/shade-mojo.html#createDependencyReducedPom], > the Maven Shade plugin generates simplified POMs which remove dependencies > on artifacts that have been shaded. > SPARK-33212 / > [b6f46ca29742029efea2790af7fdefbc2fcf52de|https://github.com/apache/spark/commit/b6f46ca29742029efea2790af7fdefbc2fcf52de] > changed the configuration of the Maven Shade plugin, setting > {{createDependencyReducedPom}} to {{false}}. > As a result, the generated POMs now include compile-scope dependencies on the > shaded libraries. 
For example, compare the {{org.eclipse.jetty}} dependencies > in: > * Spark 3.1.2: > [https://repo1.maven.org/maven2/org/apache/spark/spark-core_2.12/3.1.2/spark-core_2.12-3.1.2.pom] > * Spark 3.2.0 RC2: > [https://repository.apache.org/content/repositories/orgapachespark-1390/org/apache/spark/spark-core_2.12/3.2.0/spark-core_2.12-3.2.0.pom] > I think we should revert back to generating "dependency reduced" POMs to > ensure that Spark declares a proper set of dependencies and to avoid "unknown > unknown" consequences of changing our generated POM format. > /cc [~csun] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
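For reference, the revert described above amounts to restoring the Maven Shade plugin's default behaviour. A minimal sketch of the relevant configuration follows; the surrounding plugin coordinates are standard, but this fragment is illustrative and not copied from Spark's actual pom.xml:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <!-- true is the plugin default: publish a "dependency reduced"
         POM with shaded artifacts removed from the dependency list.
         SPARK-33212 had flipped this to false. -->
    <createDependencyReducedPom>true</createDependencyReducedPom>
  </configuration>
</plugin>
```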
[jira] [Created] (SPARK-36860) Create the external hive table for HBase failed
wineternity created SPARK-36860: --- Summary: Create the external hive table for HBase failed Key: SPARK-36860 URL: https://issues.apache.org/jira/browse/SPARK-36860 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.1.2 Reporter: wineternity We use the following SQL to create a Hive external table, which reads from HBase: {code:java} CREATE EXTERNAL TABLE if not exists dev.sanyu_spotlight_headline_material( rowkey string COMMENT 'HBase主键', content string COMMENT '图文正文') USING HIVE ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe' STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ( 'hbase.columns.mapping'=':key, cf1:content' ) TBLPROPERTIES ( 'hbase.table.name'='spotlight_headline_material' ); {code} But the SQL fails in Spark 3.1.2, which throws this exception: {code:java} 21/09/27 11:44:24 INFO scheduler.DAGScheduler: Asked to cancel job group 26d7459f-7b58-4c18-9939-5f2737525ff2 21/09/27 11:44:24 ERROR thriftserver.SparkExecuteStatementOperation: Error executing query with 26d7459f-7b58-4c18-9939-5f2737525ff2, currentState RUNNING, org.apache.spark.sql.catalyst.parser.ParseException: Operation not allowed: Unexpected combination of ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe' and STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ('hbase.columns.mapping'=':key, cf1:content') (line 5, pos 0) {code} This check was introduced by this change: [https://github.com/apache/spark/pull/28026] Could anyone give guidance on how to create an external table for HBase in Spark 3 now? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36838) Improve InSet NaN check generated code performance
[ https://issues.apache.org/jira/browse/SPARK-36838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-36838: - Priority: Minor (was: Major) > Improve InSet NaN check generated code performance > -- > > Key: SPARK-36838 > URL: https://issues.apache.org/jira/browse/SPARK-36838 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.0.3, 3.1.2, 3.2.0 >Reporter: angerszhu >Assignee: Apache Spark >Priority: Minor > > A Set can't check whether a NaN value is contained in the current set. With codegen, we only need to check whether the value is NaN when the value set contains NaN; otherwise we just need to check whether the Set contains the value. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-36838) Improve InSet NaN check generated code performance
[ https://issues.apache.org/jira/browse/SPARK-36838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-36838. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34097 [https://github.com/apache/spark/pull/34097] > Improve InSet NaN check generated code performance > -- > > Key: SPARK-36838 > URL: https://issues.apache.org/jira/browse/SPARK-36838 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.0.3, 3.1.2, 3.2.0 >Reporter: angerszhu >Assignee: Apache Spark >Priority: Minor > Fix For: 3.3.0 > > > A Set can't check whether a NaN value is contained in the current set. With codegen, we only need to check whether the value is NaN when the value set contains NaN; otherwise we just need to check whether the Set contains the value. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
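The optimization described in the quoted summary can be sketched outside Spark: generate the per-row NaN check only when the literal IN-list actually contains NaN. The following is a plain-Python analogue of that idea, not the Java code Spark's codegen actually emits:

```python
import math

def build_inset_predicate(values):
    """Build an `x in values` predicate, mirroring the InSet idea:
    pay for a per-row NaN check only if the literal set has NaN."""
    has_nan = any(isinstance(v, float) and math.isnan(v) for v in values)
    value_set = set(values)
    if has_nan:
        # Slow path: NaN never equals itself, so test it explicitly.
        return lambda x: (isinstance(x, float) and math.isnan(x)) or x in value_set
    # Fast path: plain set membership, no NaN branch in the hot loop.
    return lambda x: x in value_set

pred_with_nan = build_inset_predicate([1.0, float("nan")])
pred_plain = build_inset_predicate([1.0, 2.0])
```

The branch is decided once at "codegen" time, so the common case (no NaN literal) pays nothing per row.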
[jira] [Updated] (SPARK-36838) Improve InSet NaN check generated code performance
[ https://issues.apache.org/jira/browse/SPARK-36838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-36838: - Issue Type: Improvement (was: Bug) > Improve InSet NaN check generated code performance > -- > > Key: SPARK-36838 > URL: https://issues.apache.org/jira/browse/SPARK-36838 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.0.3, 3.1.2, 3.2.0 >Reporter: angerszhu >Assignee: Apache Spark >Priority: Major > > A Set can't check whether a NaN value is contained in the current set. With codegen, we only need to check whether the value is NaN when the value set contains NaN; otherwise we just need to check whether the Set contains the value. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36853) Code failing on checkstyle
[ https://issues.apache.org/jira/browse/SPARK-36853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420438#comment-17420438 ] Shockang commented on SPARK-36853: -- OK > Code failing on checkstyle > -- > > Key: SPARK-36853 > URL: https://issues.apache.org/jira/browse/SPARK-36853 > Project: Spark > Issue Type: Bug > Components: Build >Affects Versions: 3.3.0 >Reporter: Abhinav Kumar >Priority: Trivial > > There are more - just pasting sample > > [INFO] There are 32 errors reported by Checkstyle 8.43 with > dev/checkstyle.xml ruleset. > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF11.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 107). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF12.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 116). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF13.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 104). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF13.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 125). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF14.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 109). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF14.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 134). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF15.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 114). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF15.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 143). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF16.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 119). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF16.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 152). 
> [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF17.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 124). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF17.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 161). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF18.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 129). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF18.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 170). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF19.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 134). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF19.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 179). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF20.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 139). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF20.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 188). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF21.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 144). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF21.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 197). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF22.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 149). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF22.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 206). > [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[44,25] > (naming) MethodName: Method name 'ProcessingTime' must match pattern > '^[a-z][a-z0-9][a-zA-Z0-9_]*$'. 
> [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[60,25] > (naming) MethodName: Method name 'ProcessingTime' must match pattern > '^[a-z][a-z0-9][a-zA-Z0-9_]*$'. > [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[75,25] > (naming) MethodName: Method name 'ProcessingTime' must match pattern > '^[a-z][a-z0-9][a-zA-Z0-9_]*$'. > [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[88,25] > (naming) MethodName: Method name 'ProcessingTime' must match pattern > '^[a-z][a-z0-9][a-zA-Z0-9_]*$'. > [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[100,25] > (naming) MethodName: Method name 'Once' must match pattern > '^[a-z][a-z0-9][a-zA-Z0-9_]*$'. > [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[110,25] > (naming) MethodName: Method name 'AvailableNow' must match pattern > '^[a-z][a-z0-9][a-zA-Z0-9_]*$'. > [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[120,25] >
[jira] [Reopened] (SPARK-36853) Code failing on checkstyle
[ https://issues.apache.org/jira/browse/SPARK-36853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reopened SPARK-36853: -- > Code failing on checkstyle > -- > > Key: SPARK-36853 > URL: https://issues.apache.org/jira/browse/SPARK-36853 > Project: Spark > Issue Type: Bug > Components: Build >Affects Versions: 3.3.0 >Reporter: Abhinav Kumar >Priority: Trivial > > There are more - just pasting sample > > [INFO] There are 32 errors reported by Checkstyle 8.43 with > dev/checkstyle.xml ruleset. > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF11.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 107). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF12.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 116). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF13.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 104). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF13.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 125). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF14.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 109). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF14.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 134). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF15.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 114). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF15.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 143). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF16.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 119). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF16.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 152). 
> [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF17.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 124). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF17.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 161). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF18.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 129). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF18.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 170). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF19.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 134). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF19.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 179). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF20.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 139). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF20.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 188). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF21.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 144). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF21.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 197). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF22.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 149). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF22.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 206). > [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[44,25] > (naming) MethodName: Method name 'ProcessingTime' must match pattern > '^[a-z][a-z0-9][a-zA-Z0-9_]*$'. 
> [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[60,25] > (naming) MethodName: Method name 'ProcessingTime' must match pattern > '^[a-z][a-z0-9][a-zA-Z0-9_]*$'. > [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[75,25] > (naming) MethodName: Method name 'ProcessingTime' must match pattern > '^[a-z][a-z0-9][a-zA-Z0-9_]*$'. > [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[88,25] > (naming) MethodName: Method name 'ProcessingTime' must match pattern > '^[a-z][a-z0-9][a-zA-Z0-9_]*$'. > [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[100,25] > (naming) MethodName: Method name 'Once' must match pattern > '^[a-z][a-z0-9][a-zA-Z0-9_]*$'. > [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[110,25] > (naming) MethodName: Method name 'AvailableNow' must match pattern > '^[a-z][a-z0-9][a-zA-Z0-9_]*$'. > [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[120,25] > (naming) MethodName: Method name
[jira] [Commented] (SPARK-36853) Code failing on checkstyle
[ https://issues.apache.org/jira/browse/SPARK-36853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420417#comment-17420417 ] Hyukjin Kwon commented on SPARK-36853: -- Okay, they look valid. I don't know why they are not caught in CI, though. Feel free to make a PR and go ahead. > Code failing on checkstyle > -- > > Key: SPARK-36853 > URL: https://issues.apache.org/jira/browse/SPARK-36853 > Project: Spark > Issue Type: Bug > Components: Build >Affects Versions: 3.3.0 >Reporter: Abhinav Kumar >Priority: Trivial > > There are more - just pasting a sample > > [INFO] There are 32 errors reported by Checkstyle 8.43 with > dev/checkstyle.xml ruleset. > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF11.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 107). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF12.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 116). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF13.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 104). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF13.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 125). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF14.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 109). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF14.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 134). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF15.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 114). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF15.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 143). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF16.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 119). 
> [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF16.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 152). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF17.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 124). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF17.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 161). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF18.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 129). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF18.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 170). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF19.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 134). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF19.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 179). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF20.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 139). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF20.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 188). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF21.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 144). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF21.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 197). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF22.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 149). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF22.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 206). 
> [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[44,25] > (naming) MethodName: Method name 'ProcessingTime' must match pattern > '^[a-z][a-z0-9][a-zA-Z0-9_]*$'. > [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[60,25] > (naming) MethodName: Method name 'ProcessingTime' must match pattern > '^[a-z][a-z0-9][a-zA-Z0-9_]*$'. > [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[75,25] > (naming) MethodName: Method name 'ProcessingTime' must match pattern > '^[a-z][a-z0-9][a-zA-Z0-9_]*$'. > [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[88,25] > (naming) MethodName: Method name 'ProcessingTime' must match pattern > '^[a-z][a-z0-9][a-zA-Z0-9_]*$'. > [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[100,25] > (naming) MethodName: Method name 'Once' must match pattern > '^[a-z][a-z0-9][a-zA-Z0-9_]*$'. > [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[110,25] > (naming) MethodName: Method name 'AvailableNow' must match pattern >
[jira] [Assigned] (SPARK-36859) Upgrade kubernetes-client to 5.8.0
[ https://issues.apache.org/jira/browse/SPARK-36859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-36859: - Assignee: Dongjoon Hyun > Upgrade kubernetes-client to 5.8.0 > -- > > Key: SPARK-36859 > URL: https://issues.apache.org/jira/browse/SPARK-36859 > Project: Spark > Issue Type: Improvement > Components: Build, Kubernetes >Affects Versions: 3.3.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > > This issue aims to support Kubernetes Model v1.22.1 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-36859) Upgrade kubernetes-client to 5.8.0
[ https://issues.apache.org/jira/browse/SPARK-36859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-36859. --- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34109 [https://github.com/apache/spark/pull/34109] > Upgrade kubernetes-client to 5.8.0 > -- > > Key: SPARK-36859 > URL: https://issues.apache.org/jira/browse/SPARK-36859 > Project: Spark > Issue Type: Improvement > Components: Build, Kubernetes >Affects Versions: 3.3.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Fix For: 3.3.0 > > > This issue aims to support Kubernetes Model v1.22.1 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36802) Incorrect writing the string, containing symbols like "\" to Hive
[ https://issues.apache.org/jira/browse/SPARK-36802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420397#comment-17420397 ] Vladislav commented on SPARK-36802: --- [~hyukjin.kwon], 3.x has the same problem. > Incorrect writing the string, containing symbols like "\" to Hive > -- > > Key: SPARK-36802 > URL: https://issues.apache.org/jira/browse/SPARK-36802 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.3.0 >Reporter: Vladislav >Priority: Minor > > After writing strings containing symbols like "\" to Hive, the resulting > record in the Hive table doesn't contain that symbol. It happens when using the > standard method pyspark.sql.readwriter.DataFrameWriter.saveAsTable as well > as insertInto. > For example, running the query > spark.sql("select '\d\{4}' as code").write.saveAsTable('db.table') > I got the following result in Hive: > spark.table('db.table').collect()[0][0] > >>"d\{4}" > But expected the following: > >> "\d\{4}" > Spark version: '2.3.0.2.6.5.0-292' > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
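The disappearing backslash above is consistent with the SQL parser treating backslash as a C-style escape character inside string literals (in Spark this behavior is governed by the config `spark.sql.parser.escapedStringLiterals`; treat that as an assumption to verify for your version). A minimal sketch of why `\d{4}` loses its backslash, written in plain Python rather than Spark's actual parser:

```python
def unescape_sql_literal(s: str) -> str:
    """Mimic C-style escape processing in a SQL string literal.

    Known escapes (\n, \t, ...) map to control characters; an unknown
    escape such as \d simply drops the backslash, which is exactly the
    symptom reported in this issue.
    """
    known = {'n': '\n', 't': '\t', 'r': '\r', '\\': '\\', "'": "'", '"': '"'}
    out, i = [], 0
    while i < len(s):
        if s[i] == '\\' and i + 1 < len(s):
            out.append(known.get(s[i + 1], s[i + 1]))  # unknown escape: keep only the char
            i += 2
        else:
            out.append(s[i])
            i += 1
    return ''.join(out)

print(unescape_sql_literal(r'\d{4}'))   # d{4}   (backslash lost)
print(unescape_sql_literal(r'\\d{4}'))  # \d{4}  (doubled backslash survives)
```

Under this model, writing `'\\d{4}'` in the SQL text, or enabling `spark.sql.parser.escapedStringLiterals=true`, would preserve the backslash; whether that matches the reporter's Spark build is something to check against that version's docs.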
[jira] [Commented] (SPARK-36859) Upgrade kubernetes-client to 5.8.0
[ https://issues.apache.org/jira/browse/SPARK-36859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420383#comment-17420383 ] Apache Spark commented on SPARK-36859: -- User 'dongjoon-hyun' has created a pull request for this issue: https://github.com/apache/spark/pull/34109 > Upgrade kubernetes-client to 5.8.0 > -- > > Key: SPARK-36859 > URL: https://issues.apache.org/jira/browse/SPARK-36859 > Project: Spark > Issue Type: Improvement > Components: Build, Kubernetes >Affects Versions: 3.3.0 >Reporter: Dongjoon Hyun >Priority: Major > > This issue aims to support Kubernetes Model v1.22.1 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36859) Upgrade kubernetes-client to 5.8.0
[ https://issues.apache.org/jira/browse/SPARK-36859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36859: Assignee: (was: Apache Spark) > Upgrade kubernetes-client to 5.8.0 > -- > > Key: SPARK-36859 > URL: https://issues.apache.org/jira/browse/SPARK-36859 > Project: Spark > Issue Type: Improvement > Components: Build, Kubernetes >Affects Versions: 3.3.0 >Reporter: Dongjoon Hyun >Priority: Major > > This issue aims to support Kubernetes Model v1.22.1 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36859) Upgrade kubernetes-client to 5.8.0
[ https://issues.apache.org/jira/browse/SPARK-36859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36859: Assignee: Apache Spark > Upgrade kubernetes-client to 5.8.0 > -- > > Key: SPARK-36859 > URL: https://issues.apache.org/jira/browse/SPARK-36859 > Project: Spark > Issue Type: Improvement > Components: Build, Kubernetes >Affects Versions: 3.3.0 >Reporter: Dongjoon Hyun >Assignee: Apache Spark >Priority: Major > > This issue aims to support Kubernetes Model v1.22.1 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36859) Upgrade kubernetes-client to 5.8.0
[ https://issues.apache.org/jira/browse/SPARK-36859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420382#comment-17420382 ] Apache Spark commented on SPARK-36859: -- User 'dongjoon-hyun' has created a pull request for this issue: https://github.com/apache/spark/pull/34109 > Upgrade kubernetes-client to 5.8.0 > -- > > Key: SPARK-36859 > URL: https://issues.apache.org/jira/browse/SPARK-36859 > Project: Spark > Issue Type: Improvement > Components: Build, Kubernetes >Affects Versions: 3.3.0 >Reporter: Dongjoon Hyun >Priority: Major > > This issue aims to support Kubernetes Model v1.22.1 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36859) Upgrade kubernetes-client to 5.8.0
Dongjoon Hyun created SPARK-36859: - Summary: Upgrade kubernetes-client to 5.8.0 Key: SPARK-36859 URL: https://issues.apache.org/jira/browse/SPARK-36859 Project: Spark Issue Type: Improvement Components: Build, Kubernetes Affects Versions: 3.3.0 Reporter: Dongjoon Hyun This issue aims to support Kubernetes Model v1.22.1 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36858) Spark API to apply same function to multiple columns
Armand BERGES created SPARK-36858: - Summary: Spark API to apply same function to multiple columns Key: SPARK-36858 URL: https://issues.apache.org/jira/browse/SPARK-36858 Project: Spark Issue Type: New Feature Components: Spark Core Affects Versions: 3.1.2, 2.4.8 Reporter: Armand BERGES Hi, My team and I regularly need to apply the same function to multiple columns at once. For example, we want to remove all non-alphanumeric characters from each column of our dataframes. When we first hit this use case, some people on my team were using this kind of code: {code:java} val colListToClean = ## Generate some list, could be very long. val dfToClean: DataFrame = ... ## This is the dataframe we want to clean def cleanFunction(colName: String): Column = ... ## Write some function to manipulate column based on its name. val dfCleaned = colListToClean.foldLeft(dfToClean)((df, colName) => df.withColumn(colName, cleanFunction(colName))){code} When applied to a large set of columns, this kind of code overloaded our driver (because a new DataFrame is generated for each column to clean). Based on this issue, we developed some code to add two functions: * One to apply the same function to multiple columns * One to rename multiple columns based on a Map. I wonder if you have ever been asked to add this kind of API? If you did, did you run into any issues regarding the implementation? If you didn't, is this an idea you could add to Spark? Best regards, LvffY -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
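The difference between the per-column foldLeft shape and a single batched projection can be sketched outside Spark with a toy dict-based "frame" (purely illustrative; in Spark itself the batched form corresponds to one `df.select(...)` over all columns, which avoids building a new analyzed plan per cleaned column, an approach worth benchmarking on your workload):

```python
from functools import reduce

# Hypothetical stand-in for a DataFrame: column name -> list of values.
df = {"a": ["x1!", "y2?"], "b": ["z3#", "w4$"]}

def clean(value: str) -> str:
    # Drop non-alphanumeric characters, like the cleanFunction in the report.
    return "".join(ch for ch in value if ch.isalnum())

# foldLeft-style: one new "frame" per column (the shape that overloads the driver).
folded = reduce(lambda acc, c: {**acc, c: [clean(v) for v in acc[c]]}, df, df)

# batch-style: one pass that builds the whole projection at once.
batched = {c: [clean(v) for v in vals] for c, vals in df.items()}

assert folded == batched  # same result, very different amount of intermediate state
```

Both produce identical output; the batched form just never materializes a frame per column.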
[jira] [Comment Edited] (SPARK-36694) Unable to build Spark on Azure DevOps with ubuntu-latest
[ https://issues.apache.org/jira/browse/SPARK-36694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420327#comment-17420327 ] Armand BERGES edited comment on SPARK-36694 at 9/26/21, 4:28 PM: - Didn't see the last answer from [~sarutak], just answering now. was (Author: lvffy): Didn't last answer from [~sarutak], just answer now. > Unable to build Spark on Azure DevOps with ubuntu-latest > > > Key: SPARK-36694 > URL: https://issues.apache.org/jira/browse/SPARK-36694 > Project: Spark > Issue Type: Bug > Components: Build >Affects Versions: 2.4.8 >Reporter: Armand BERGES >Priority: Minor > > Hello > With my team we're currently trying to set up a test environment using > Spark on Kubernetes. > For this purpose, we're following your (great) documentation to [build > spark|https://spark.apache.org/docs/2.4.8/building-spark.html] and [build > spark docker > images|https://spark.apache.org/docs/latest/running-on-kubernetes.html#docker-images]. > > To make our build, we're using Azure DevOps. > I don't know if it's a known bug or a requirement I missed, but I found > that I couldn't build Spark on the Azure agent *ubuntu-latest*, which I believe > to be *ubuntu-20.04*. The exact same build works on *ubuntu-18.04*. 
> My maven build always failed on building *spark-unsafe_2.11* with the > following error : > {code:java} > [error] warning: [options] bootstrap class path not set in conjunction with > -source 8 > [error] > /home/vsts/work/1/s/spark/common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java:25: > error: cannot find symbol [error] import sun.misc.Cleaner; > [error] ^ > [error] symbol: class Cleaner > [error] location: package sun.misc > [error] > /home/vsts/work/1/s/spark/common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java:172: > error: cannot find symbol > [error] Cleaner cleaner = Cleaner.create(buffer, () -> freeMemory(memory)); > [error] ^ > [error] symbol: class Cleaner [error] location: class Platform > [error] > /home/vsts/work/1/s/spark/common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java:172: > error: cannot find symbol > [error] Cleaner cleaner = Cleaner.create(buffer, () -> freeMemory(memory)); > [error] ^ > [error] symbol: variable Cleaner > [error] location: class Platform > [error] 3 errors > [error] 1 warning > [error] Compile failed at Sep 8, 2021 10:37:02 AM [1.126s]{code} > > Please tell me if I miss anything, > Best regards -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Reopened] (SPARK-36694) Unable to build Spark on Azure DevOps with ubuntu-latest
[ https://issues.apache.org/jira/browse/SPARK-36694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Armand BERGES reopened SPARK-36694: --- Didn't see the last answer from [~sarutak], just answering now. > Unable to build Spark on Azure DevOps with ubuntu-latest > > > Key: SPARK-36694 > URL: https://issues.apache.org/jira/browse/SPARK-36694 > Project: Spark > Issue Type: Bug > Components: Build >Affects Versions: 2.4.8 >Reporter: Armand BERGES >Priority: Minor > > Hello > With my team we're currently trying to set up a test environment using > Spark on Kubernetes. > For this purpose, we're following your (great) documentation to [build > spark|https://spark.apache.org/docs/2.4.8/building-spark.html] and [build > spark docker > images|https://spark.apache.org/docs/latest/running-on-kubernetes.html#docker-images]. > > To make our build, we're using Azure DevOps. > I don't know if it's a known bug or a requirement I missed, but I found > that I couldn't build Spark on the Azure agent *ubuntu-latest*, which I believe > to be *ubuntu-20.04*. The exact same build works on *ubuntu-18.04*. 
> My maven build always failed on building *spark-unsafe_2.11* with the > following error : > {code:java} > [error] warning: [options] bootstrap class path not set in conjunction with > -source 8 > [error] > /home/vsts/work/1/s/spark/common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java:25: > error: cannot find symbol [error] import sun.misc.Cleaner; > [error] ^ > [error] symbol: class Cleaner > [error] location: package sun.misc > [error] > /home/vsts/work/1/s/spark/common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java:172: > error: cannot find symbol > [error] Cleaner cleaner = Cleaner.create(buffer, () -> freeMemory(memory)); > [error] ^ > [error] symbol: class Cleaner [error] location: class Platform > [error] > /home/vsts/work/1/s/spark/common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java:172: > error: cannot find symbol > [error] Cleaner cleaner = Cleaner.create(buffer, () -> freeMemory(memory)); > [error] ^ > [error] symbol: variable Cleaner > [error] location: class Platform > [error] 3 errors > [error] 1 warning > [error] Compile failed at Sep 8, 2021 10:37:02 AM [1.126s]{code} > > Please tell me if I miss anything, > Best regards -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36694) Unable to build Spark on Azure DevOps with ubuntu-latest
[ https://issues.apache.org/jira/browse/SPARK-36694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420326#comment-17420326 ] Armand BERGES commented on SPARK-36694: --- [~sarutak], is there any command you want me to run to verify this? The code used to build Spark is the following snippet: {noformat} - task: Maven@3 displayName: 'Build spark ${{ parameters.spark_checkout_reference }}' inputs: mavenPomFile: 'spark/pom.xml' goals: 'clean package' options: -DskipTests --activate-profiles ${{ parameters.spark_profile_to_activate }} --no-transfer-progress jdkVersionOption: '1.8' ## This should ensure that Maven is running under Java 8. mavenOptions: $(MAVEN_OPTS){noformat} > Unable to build Spark on Azure DevOps with ubuntu-latest > > > Key: SPARK-36694 > URL: https://issues.apache.org/jira/browse/SPARK-36694 > Project: Spark > Issue Type: Bug > Components: Build >Affects Versions: 2.4.8 >Reporter: Armand BERGES >Priority: Minor > > Hello > With my team we're currently trying to set up a test environment using > Spark on Kubernetes. > For this purpose, we're following your (great) documentation to [build > spark|https://spark.apache.org/docs/2.4.8/building-spark.html] and [build > spark docker > images|https://spark.apache.org/docs/latest/running-on-kubernetes.html#docker-images]. > > To make our build, we're using Azure DevOps. > I don't know if it's a known bug or a requirement I missed, but I found > that I couldn't build Spark on the Azure agent *ubuntu-latest*, which I believe > to be *ubuntu-20.04*. The exact same build works on *ubuntu-18.04*. 
> My maven build always failed on building *spark-unsafe_2.11* with the > following error : > {code:java} > [error] warning: [options] bootstrap class path not set in conjunction with > -source 8 > [error] > /home/vsts/work/1/s/spark/common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java:25: > error: cannot find symbol [error] import sun.misc.Cleaner; > [error] ^ > [error] symbol: class Cleaner > [error] location: package sun.misc > [error] > /home/vsts/work/1/s/spark/common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java:172: > error: cannot find symbol > [error] Cleaner cleaner = Cleaner.create(buffer, () -> freeMemory(memory)); > [error] ^ > [error] symbol: class Cleaner [error] location: class Platform > [error] > /home/vsts/work/1/s/spark/common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java:172: > error: cannot find symbol > [error] Cleaner cleaner = Cleaner.create(buffer, () -> freeMemory(memory)); > [error] ^ > [error] symbol: variable Cleaner > [error] location: class Platform > [error] 3 errors > [error] 1 warning > [error] Compile failed at Sep 8, 2021 10:37:02 AM [1.126s]{code} > > Please tell me if I miss anything, > Best regards -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
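The `cannot find symbol: sun.misc.Cleaner` error above is the classic signature of compiling Spark 2.4.x's spark-unsafe module on a JDK newer than 8: sun.misc.Cleaner was removed from the JDK's accessible classes after JDK 8. A likely cause (an assumption worth verifying on the agent) is that ubuntu-20.04 resolves a newer default JDK than ubuntu-18.04 despite `jdkVersionOption`. A small helper for checking which JDK a `java -version` banner actually reports:

```python
import re

def java_major_version(version_line: str) -> int:
    """Extract the major JDK version from a `java -version` banner line.

    Pre-JDK-9 banners use the "1.8.0_292" style, where the major version is
    the second field; from JDK 9 on, the first field is the major version.
    Anything >= 9 here would reproduce the sun.misc.Cleaner build failure
    for Spark 2.4.x.
    """
    m = re.search(r'version "(\d+)(?:\.(\d+))?', version_line)
    major = int(m.group(1))
    return int(m.group(2)) if major == 1 and m.group(2) else major

print(java_major_version('openjdk version "1.8.0_292"'))  # 8
print(java_major_version('openjdk version "11.0.11"'))    # 11
```

Running `java -version` and `mvn -version` inside the pipeline and checking the parsed major version against 8 would confirm or rule out this hypothesis.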
[jira] [Commented] (SPARK-36826) CVEs in libraries used in bundled jars
[ https://issues.apache.org/jira/browse/SPARK-36826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420296#comment-17420296 ] Sean R. Owen commented on SPARK-36826: -- A few things here - Static analysis output like this has value, but it only tells you that some vulnerability affects some part of potentially large dependencies. It doesn't mean it affects Spark. That said, it's always safer to just update the dependencies if it's easy. Most of these come from Hadoop, which is somewhat tricky to update in a maintenance branch without breaking things. But I believe all of these are updated already, directly or indirectly, in Spark 3.2.0. You'll want to check the code in master for things like this. See https://github.com/apache/spark/blob/master/dev/deps/spark-deps-hadoop-3.2-hive-2.3 for example. > CVEs in libraries used in bundled jars > -- > > Key: SPARK-36826 > URL: https://issues.apache.org/jira/browse/SPARK-36826 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.1.2 >Reporter: Carlos Rodríguez Hernández >Priority: Major > > Hi, I found several CVEs in dependency libraries bundled in the > _aws-java-sdk-bundle_ jar. 
> We are using Spark 3.1.2, which bundles _hadoop-*_ jars version 3.2.0: > {code:bash} > $ curl -JLO > "https://ftp.cixug.es/apache/spark/spark-3.1.2/spark-3.1.2-bin-hadoop3.2.tgz; > $ tar xzf spark-3.1.2-bin-hadoop3.2.tgz > $ find spark-3.1.2-bin-hadoop3.2/jars -wholename '*/hadoop-*' > spark-3.1.2-bin-hadoop3.2/jars/hadoop-client-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-mapreduce-client-core-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-common-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-mapreduce-client-jobclient-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-auth-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-yarn-server-common-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-yarn-api-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-yarn-registry-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-annotations-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-yarn-client-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-hdfs-client-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-mapreduce-client-common-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-yarn-common-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-yarn-server-web-proxy-3.2.0.jar > {code} > There is a dependency between _hadoop-aws_, _hadoop-common_, and > _hadoop-project_ versions, as well, the _aws-java-sdk_ one should match the > required by _hadoop-project_, due to this dependencies we are including > _hadoop-aws-3.2.0_ and _aws-java-sdk-bundle-1.11.375_: > {code:bash} > $ find spark-3.1.2-bin-hadoop3.2/jars -wholename > spark-3.1.2-bin-hadoop3.2/jars/hadoop-aws-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/aws-java-sdk-bundle-1.11.375.jar > {code} > Taking a look at the _hadoop-project_ pom, the _aws-java-sdk_ version is the > correct one: > {code:bash} > $ curl -JLO > "https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-project/3.2.0/hadoop-project-3.2.0.pom; > $ cat hadoop-project-3.2.0.pom | grep aws-java-sdk > 1.11.375 > aws-java-sdk-bundle > 
${aws-java-sdk.version} > {code} > Do you think it would be possible to update the versions of the jars to solve > the vulnerabilities? > > Please see below the CVE report for _jars/aws-java-sdk-bundle-1.11.375.jar_: > ||LIBRARY||VULNERABILITY ID||SEVERITY||INSTALLED VERSION||FIXED > VERSION||TITLE|| > |com.fasterxml.jackson.core:jackson-databind|CVE-2017-15095|CRITICAL|2.6.7.1|2.9.4, > 2.8.11|jackson-databind: Unsafe| > |com.fasterxml.jackson.core:jackson-databind|CVE-2017-17485|CRITICAL|2.6.7.1|2.8.11, > 2.9.4|jackson-databind: Unsafe| > |com.fasterxml.jackson.core:jackson-databind|CVE-2018-11307|CRITICAL|2.6.7.1|2.8.11.2, > 2.7.9.4, 2.9.6|jackson-databind: Potential| > |com.fasterxml.jackson.core:jackson-databind|CVE-2018-14718|CRITICAL|2.6.7.1|2.7.9.5, > 2.8.11.3, 2.9.7|jackson-databind: arbitrary code| > |com.fasterxml.jackson.core:jackson-databind|CVE-2018-14719|CRITICAL|2.6.7.1|2.7.9.5, > 2.8.11.3, 2.9.7|jackson-databind: arbitrary| > |com.fasterxml.jackson.core:jackson-databind|CVE-2018-14720|CRITICAL|2.6.7.1|2.6.7.2, > 2.9.7|jackson-databind: exfiltration/XXE| > |com.fasterxml.jackson.core:jackson-databind|CVE-2018-14721|CRITICAL|2.6.7.1|2.6.7.2, > 2.9.7|jackson-databind: server-side request| > |com.fasterxml.jackson.core:jackson-databind|CVE-2018-19360|CRITICAL|2.6.7.1|2.6.7.3, > 2.7.9.5, 2.8.11.3|jackson-databind: improper| > |com.fasterxml.jackson.core:jackson-databind|CVE-2018-19361|CRITICAL|2.6.7.1|2.6.7.3, > 2.7.9.5, 2.8.11.3|jackson-databind: improper| > |com.fasterxml.jackson.core:jackson-databind|CVE-2018-19362|CRITICAL|2.6.7.1|2.6.7.3, > 2.7.9.5,
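When auditing a distribution like the one inventoried above, a small script can map each bundled jar to its version for comparison against a scanner report or against `dev/deps/spark-deps-*` manifests. This is a hedged helper, not part of Spark; it only assumes the common `artifact-<version>.jar` naming convention seen in the `find` output above:

```python
import re
import tempfile
from pathlib import Path

def bundled_versions(jars_dir, prefix):
    """Map artifact name -> version for jars whose filename starts with `prefix`."""
    versions = {}
    for jar in Path(jars_dir).glob(f"{prefix}*.jar"):
        # Split "hadoop-client-3.2.0.jar" into ("hadoop-client", "3.2.0").
        m = re.match(r"(.+?)-(\d[\w.]*)\.jar$", jar.name)
        if m:
            versions[m.group(1)] = m.group(2)
    return versions

# Demo on a throwaway directory using jar names from the report above.
demo = Path(tempfile.mkdtemp())
for name in ["hadoop-client-3.2.0.jar", "hadoop-common-3.2.0.jar",
             "aws-java-sdk-bundle-1.11.375.jar"]:
    (demo / name).touch()

hadoop = bundled_versions(demo, "hadoop-")
print(hadoop)  # {'hadoop-client': '3.2.0', 'hadoop-common': '3.2.0'} (key order may vary)
```

Pointing `bundled_versions` at a real `spark-3.1.2-bin-hadoop3.2/jars` directory would reproduce the reporter's inventory in one call.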
[jira] [Resolved] (SPARK-36826) CVEs in libraries used in bundled jars
[ https://issues.apache.org/jira/browse/SPARK-36826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-36826. -- Resolution: Not A Problem > CVEs in libraries used in bundled jars > -- > > Key: SPARK-36826 > URL: https://issues.apache.org/jira/browse/SPARK-36826 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.1.2 >Reporter: Carlos Rodríguez Hernández >Priority: Major > > Hi, I found several CVEs in dependency libraries bundled in the > _aws-java-sdk-bundle_ jar. > We are using Spark 3.1.2, which bundles _hadoop-*_ jars version 3.2.0: > {code:bash} > $ curl -JLO > "https://ftp.cixug.es/apache/spark/spark-3.1.2/spark-3.1.2-bin-hadoop3.2.tgz; > $ tar xzf spark-3.1.2-bin-hadoop3.2.tgz > $ find spark-3.1.2-bin-hadoop3.2/jars -wholename '*/hadoop-*' > spark-3.1.2-bin-hadoop3.2/jars/hadoop-client-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-mapreduce-client-core-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-common-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-mapreduce-client-jobclient-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-auth-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-yarn-server-common-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-yarn-api-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-yarn-registry-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-annotations-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-yarn-client-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-hdfs-client-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-mapreduce-client-common-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-yarn-common-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/hadoop-yarn-server-web-proxy-3.2.0.jar > {code} > There is a dependency between _hadoop-aws_, _hadoop-common_, and > _hadoop-project_ versions, as well, the _aws-java-sdk_ one should match the > required by _hadoop-project_, due to this dependencies we are including > _hadoop-aws-3.2.0_ and 
_aws-java-sdk-bundle-1.11.375_: > {code:bash} > $ find spark-3.1.2-bin-hadoop3.2/jars -wholename > spark-3.1.2-bin-hadoop3.2/jars/hadoop-aws-3.2.0.jar > spark-3.1.2-bin-hadoop3.2/jars/aws-java-sdk-bundle-1.11.375.jar > {code} > Taking a look at the _hadoop-project_ pom, the _aws-java-sdk_ version is the > correct one: > {code:bash} > $ curl -JLO > "https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-project/3.2.0/hadoop-project-3.2.0.pom; > $ cat hadoop-project-3.2.0.pom | grep aws-java-sdk > 1.11.375 > aws-java-sdk-bundle > ${aws-java-sdk.version} > {code} > Do you think it would be possible to update the versions of the jars to solve > the vulnerabilities? > > Please see below the CVE report for _jars/aws-java-sdk-bundle-1.11.375.jar_: > ||LIBRARY||VULNERABILITY ID||SEVERITY||INSTALLED VERSION||FIXED > VERSION||TITLE|| > |com.fasterxml.jackson.core:jackson-databind|CVE-2017-15095|CRITICAL|2.6.7.1|2.9.4, > 2.8.11|jackson-databind: Unsafe| > |com.fasterxml.jackson.core:jackson-databind|CVE-2017-17485|CRITICAL|2.6.7.1|2.8.11, > 2.9.4|jackson-databind: Unsafe| > |com.fasterxml.jackson.core:jackson-databind|CVE-2018-11307|CRITICAL|2.6.7.1|2.8.11.2, > 2.7.9.4, 2.9.6|jackson-databind: Potential| > |com.fasterxml.jackson.core:jackson-databind|CVE-2018-14718|CRITICAL|2.6.7.1|2.7.9.5, > 2.8.11.3, 2.9.7|jackson-databind: arbitrary code| > |com.fasterxml.jackson.core:jackson-databind|CVE-2018-14719|CRITICAL|2.6.7.1|2.7.9.5, > 2.8.11.3, 2.9.7|jackson-databind: arbitrary| > |com.fasterxml.jackson.core:jackson-databind|CVE-2018-14720|CRITICAL|2.6.7.1|2.6.7.2, > 2.9.7|jackson-databind: exfiltration/XXE| > |com.fasterxml.jackson.core:jackson-databind|CVE-2018-14721|CRITICAL|2.6.7.1|2.6.7.2, > 2.9.7|jackson-databind: server-side request| > |com.fasterxml.jackson.core:jackson-databind|CVE-2018-19360|CRITICAL|2.6.7.1|2.6.7.3, > 2.7.9.5, 2.8.11.3|jackson-databind: improper| > |com.fasterxml.jackson.core:jackson-databind|CVE-2018-19361|CRITICAL|2.6.7.1|2.6.7.3, > 2.7.9.5, 
2.8.11.3|jackson-databind: improper| > |com.fasterxml.jackson.core:jackson-databind|CVE-2018-19362|CRITICAL|2.6.7.1|2.6.7.3, > 2.7.9.5, 2.8.11.3|jackson-databind: improper| > |com.fasterxml.jackson.core:jackson-databind|CVE-2018-7489|CRITICAL|2.6.7.1|2.8.11.1, > 2.9.5|jackson-databind: incomplete fix| > |com.fasterxml.jackson.core:jackson-databind|CVE-2019-14379|CRITICAL|2.6.7.1|2.9.9.2|jackson-databind: > default| > |com.fasterxml.jackson.core:jackson-databind|CVE-2019-14540|CRITICAL|2.6.7.1|2.9.10|jackson-databind:| > |com.fasterxml.jackson.core:jackson-databind|CVE-2019-14892|CRITICAL|2.6.7.1|2.9.10, > 2.8.11.5, 2.6.7.3|jackson-databind: Serialization| > |com.fasterxml.jackson.core:jackson-databind|CVE-2019-14893|CRITICAL|2.6.7.1|2.8.11.5, > 2.9.10|jackson-databind:| >
[jira] [Commented] (SPARK-31602) memory leak of JobConf
[ https://issues.apache.org/jira/browse/SPARK-31602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420286#comment-17420286 ] muhong commented on SPARK-31602: We hit the same problem on the Spark driver side, but haven't found the answer. We found the leaked JobConf objects are referenced from DistributedFileSystem instances stored in the FileSystem$Cache; it seems the DistributedFileSystem instances were never closed. > memory leak of JobConf > -- > > Key: SPARK-31602 > URL: https://issues.apache.org/jira/browse/SPARK-31602 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.4.0 >Reporter: angerszhu >Priority: Major > Labels: bulk-closed > Attachments: image-2020-04-29-14-34-39-496.png, > image-2020-04-29-14-35-55-986.png > > > !image-2020-04-29-14-34-39-496.png! > !image-2020-04-29-14-35-55-986.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36851) Incorrect parsing of negative ANSI typed interval literals
[ https://issues.apache.org/jira/browse/SPARK-36851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang reassigned SPARK-36851: -- Assignee: Peng Lei > Incorrect parsing of negative ANSI typed interval literals > -- > > Key: SPARK-36851 > URL: https://issues.apache.org/jira/browse/SPARK-36851 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Max Gekk >Assignee: Peng Lei >Priority: Major > Fix For: 3.2.0 > > > If the start field and the end field are the same, the parser doesn't take the sign before the interval literal string into account. For example: > Works fine: > {code:sql} > spark-sql> select interval -'1-1' year to month; > -1-1 > {code} > Incorrect result: > {code:sql} > spark-sql> select interval -'1' year; > 1-0 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-36851) Incorrect parsing of negative ANSI typed interval literals
[ https://issues.apache.org/jira/browse/SPARK-36851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-36851. Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 34107 [https://github.com/apache/spark/pull/34107] > Incorrect parsing of negative ANSI typed interval literals > -- > > Key: SPARK-36851 > URL: https://issues.apache.org/jira/browse/SPARK-36851 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Max Gekk >Priority: Major > Fix For: 3.2.0 > > > If the start field and the end field are the same, the parser doesn't take the sign before the interval literal string into account. For example: > Works fine: > {code:sql} > spark-sql> select interval -'1-1' year to month; > -1-1 > {code} > Incorrect result: > {code:sql} > spark-sql> select interval -'1' year; > 1-0 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
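The bug above boils down to the unary minus written before the quoted literal being dropped when the start and end fields are the same. A simplified sketch of the corrected handling (illustrative only, not Spark's actual parser code; the function name and representation in months are assumptions for the example):

```python
def parse_year_literal(sign: str, literal: str) -> int:
    """Return the value in months for `interval <sign>'<literal>' year`.

    The essence of the fix: the sign written before the quoted string must
    be applied to the parsed value instead of being silently ignored.
    """
    months = int(literal) * 12
    return -months if sign == "-" else months

print(parse_year_literal("-", "1"))  # -12, i.e. '-1-0' in year-to-month notation
print(parse_year_literal("+", "1"))  # 12
```

The buggy behavior in the report corresponds to returning `months` unconditionally, which yields `1-0` instead of `-1-0` for `interval -'1' year`.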
[jira] [Updated] (SPARK-36857) structured streaming support backpressure for kafka source
[ https://issues.apache.org/jira/browse/SPARK-36857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] baizhendong updated SPARK-36857: Description: Spark Streaming supports backpressure for Kafka, but Structured Streaming does not support backpressure for the Kafka source. Can someone explain why it is not supported? (was: Spark streaming support backpressure for kafka, but in structured streaming, not support backpressure for kafka.) > structured streaming support backpressure for kafka source > -- > > Key: SPARK-36857 > URL: https://issues.apache.org/jira/browse/SPARK-36857 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming >Affects Versions: 2.4.8, 3.1.2 >Reporter: baizhendong >Priority: Major > > Spark Streaming supports backpressure for Kafka, but Structured Streaming > does not support backpressure for the Kafka source. Can someone explain why it is not supported? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36857) structured streaming support backpressure for kafka source
baizhendong created SPARK-36857: --- Summary: structured streaming support backpressure for kafka source Key: SPARK-36857 URL: https://issues.apache.org/jira/browse/SPARK-36857 Project: Spark Issue Type: Improvement Components: Structured Streaming Affects Versions: 3.1.2, 2.4.8 Reporter: baizhendong Spark streaming support backpressure for kafka, but in structured streaming, not support backpressure for kafka. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
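Structured Streaming has no PID-based backpressure like DStreams' `spark.streaming.backpressure.enabled`, but the Kafka source does expose static rate limiting through the `maxOffsetsPerTrigger` option (check the Structured Streaming + Kafka integration guide for your version). The core idea, capping the records per micro-batch and spreading the budget across partitions proportionally to their backlog, can be sketched as follows (simplified illustration, not Spark's exact code):

```python
def cap_offsets(start, latest, max_offsets):
    """Cap end offsets for one micro-batch, in the spirit of maxOffsetsPerTrigger.

    `start` and `latest` map partition id -> offset; the budget `max_offsets`
    is split across partitions proportionally to each partition's backlog.
    """
    backlog = {p: latest[p] - start[p] for p in start}
    total = sum(backlog.values())
    if total <= max_offsets:
        return latest  # backlog fits in the budget: consume everything available
    return {p: start[p] + int(backlog[p] * max_offsets / total) for p in start}

# Partition 1 has 3x the backlog of partition 0, so it gets 3x the budget.
print(cap_offsets({0: 0, 1: 0}, {0: 100, 1: 300}, 40))  # {0: 10, 1: 30}
```

In a real job, the equivalent knob is set on the reader, e.g. `.option("maxOffsetsPerTrigger", 40)`; the sketch only illustrates why that option bounds batch size without reacting dynamically to processing time the way DStream backpressure does.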
[jira] [Commented] (SPARK-36853) Code failing on checkstyle
[ https://issues.apache.org/jira/browse/SPARK-36853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420258#comment-17420258 ] Abhinav Kumar commented on SPARK-36853: --- These errors are thrown on Windows during the Maven install phase. The build succeeds, but with these errors. > Code failing on checkstyle > -- > > Key: SPARK-36853 > URL: https://issues.apache.org/jira/browse/SPARK-36853 > Project: Spark > Issue Type: Bug > Components: Build >Affects Versions: 3.3.0 >Reporter: Abhinav Kumar >Priority: Trivial > > There are more; just pasting a sample > > [INFO] There are 32 errors reported by Checkstyle 8.43 with > dev/checkstyle.xml ruleset. > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF11.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 107). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF12.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 116). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF13.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 104). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF13.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 125). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF14.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 109). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF14.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 134). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF15.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 114). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF15.java:[29] (sizes) > LineLength: Line is longer than 100 characters (found 143). > [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF16.java:[28] (sizes) > LineLength: Line is longer than 100 characters (found 119). 
> [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF16.java:[29] (sizes) LineLength: Line is longer than 100 characters (found 152).
> [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF17.java:[28] (sizes) LineLength: Line is longer than 100 characters (found 124).
> [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF17.java:[29] (sizes) LineLength: Line is longer than 100 characters (found 161).
> [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF18.java:[28] (sizes) LineLength: Line is longer than 100 characters (found 129).
> [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF18.java:[29] (sizes) LineLength: Line is longer than 100 characters (found 170).
> [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF19.java:[28] (sizes) LineLength: Line is longer than 100 characters (found 134).
> [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF19.java:[29] (sizes) LineLength: Line is longer than 100 characters (found 179).
> [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF20.java:[28] (sizes) LineLength: Line is longer than 100 characters (found 139).
> [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF20.java:[29] (sizes) LineLength: Line is longer than 100 characters (found 188).
> [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF21.java:[28] (sizes) LineLength: Line is longer than 100 characters (found 144).
> [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF21.java:[29] (sizes) LineLength: Line is longer than 100 characters (found 197).
> [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF22.java:[28] (sizes) LineLength: Line is longer than 100 characters (found 149).
> [ERROR] src\main\java\org\apache\spark\sql\api\java\UDF22.java:[29] (sizes) LineLength: Line is longer than 100 characters (found 206).
> [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[44,25] (naming) MethodName: Method name 'ProcessingTime' must match pattern '^[a-z][a-z0-9][a-zA-Z0-9_]*$'.
> [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[60,25] (naming) MethodName: Method name 'ProcessingTime' must match pattern '^[a-z][a-z0-9][a-zA-Z0-9_]*$'.
> [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[75,25] (naming) MethodName: Method name 'ProcessingTime' must match pattern '^[a-z][a-z0-9][a-zA-Z0-9_]*$'.
> [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[88,25] (naming) MethodName: Method name 'ProcessingTime' must match pattern '^[a-z][a-z0-9][a-zA-Z0-9_]*$'.
> [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[100,25] (naming) MethodName: Method name 'Once' must match pattern '^[a-z][a-z0-9][a-zA-Z0-9_]*$'.
> [ERROR] src\main\java\org\apache\spark\sql\streaming\Trigger.java:[110,25] (naming) MethodName: Method name 'AvailableNow' must match pattern >
[jira] [Created] (SPARK-36856) Building by "./build/mvn" may be stuck on MacOS
copperybean created SPARK-36856:
---
Summary: Building by "./build/mvn" may be stuck on MacOS
Key: SPARK-36856
URL: https://issues.apache.org/jira/browse/SPARK-36856
Project: Spark
Issue Type: Improvement
Components: Build
Affects Versions: 3.0.0, 3.3.0
Environment: MacOS 11.4
Reporter: copperybean

The command "./build/mvn" gets stuck on my MacOS 11.4 because it uses the wrong Java home. On my Mac, "/usr/bin/java" is a real file instead of a symbolic link, so the Java home is resolved to the path "/usr", and the launched Maven process gets stuck with this wrong Java home.
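The failure mode the reporter describes can be sketched as follows. A symlink-following heuristic derives the Java home from the location of the `java` binary; this works when the binary is a symlink chain into a JDK, but collapses to "/usr" when the binary is a plain file, as on macOS. This is a minimal illustrative sketch (the function name and logic are simplifications for illustration, not the actual `build/mvn` script):

```python
import os

def resolve_java_home(java_bin):
    """Derive a Java home from a `java` binary path:
    resolve its symlink chain, then strip the trailing "/bin/java".

    Works for Linux-style symlinks into a JDK; for a plain stub file
    like macOS's /usr/bin/java it collapses to the grandparent
    directory ("/usr"), which is the reported bug.
    """
    real = os.path.realpath(java_bin)              # follow any symlink chain
    return os.path.dirname(os.path.dirname(real))  # drop "bin/java"
```

On macOS the robust alternative is to ask `/usr/libexec/java_home` for the JDK location instead of inspecting `/usr/bin/java`.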
[jira] [Updated] (SPARK-36855) Uniformly execute the mergeContinuousShuffleBlockIdsIfNeeded method first
[ https://issues.apache.org/jira/browse/SPARK-36855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jinhai updated SPARK-36855:
---
Description: In the ShuffleBlockFetcherIterator.partitionBlocksByFetchMode method, both local and host-local blocks have the mergeContinuousShuffleBlockIdsIfNeeded method executed first, but remote blocks execute the method many times in the createFetchRequests method. Can the merging of blocks be executed only once in advance?

(was: In the ShuffleBlockFetcherIterator.partitionBlocksByFetchMode method, both local and host-local execute the mergeContinuousShuffleBlockIdsIfNeeded method first, but remote blocks executes the method many times in the createFetchRequests method. Can merge blocks be executed only once in advance?)

> Uniformly execute the mergeContinuousShuffleBlockIdsIfNeeded method first
> -
>
> Key: SPARK-36855
> URL: https://issues.apache.org/jira/browse/SPARK-36855
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 3.1.0, 3.1.1, 3.1.2
> Reporter: jinhai
> Priority: Major
>
> In the ShuffleBlockFetcherIterator.partitionBlocksByFetchMode method, both local and host-local blocks execute the mergeContinuousShuffleBlockIdsIfNeeded method first, but remote blocks execute the method many times in the createFetchRequests method. Can the merging of blocks be executed only once in advance?
[jira] [Updated] (SPARK-36855) Uniformly execute the mergeContinuousShuffleBlockIdsIfNeeded method first
[ https://issues.apache.org/jira/browse/SPARK-36855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jinhai updated SPARK-36855:
---
Description: In the ShuffleBlockFetcherIterator.partitionBlocksByFetchMode method, both local and host-local blocks have the mergeContinuousShuffleBlockIdsIfNeeded method executed first, but remote blocks execute the method many times in the createFetchRequests method. Can the merging of blocks be executed only once in advance?

(was: In the ShuffleBlockFetcherIterator.partitionBlocksByFetchMode method, both local and host-local execute the mergeContinuousShuffleBlockIdsIfNeeded method first, but remote blocks executes the mergeContinuousShuffleBlockIdsIfNeeded method many times in the createFetchRequests method. Can merge blocks be executed only once in advance?)

> Uniformly execute the mergeContinuousShuffleBlockIdsIfNeeded method first
> -
>
> Key: SPARK-36855
> URL: https://issues.apache.org/jira/browse/SPARK-36855
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 3.1.0, 3.1.1, 3.1.2
> Reporter: jinhai
> Priority: Major
>
> In the ShuffleBlockFetcherIterator.partitionBlocksByFetchMode method, both local and host-local blocks execute the mergeContinuousShuffleBlockIdsIfNeeded method first, but remote blocks execute the method many times in the createFetchRequests method. Can the merging of blocks be executed only once in advance?
[jira] [Created] (SPARK-36855) Uniformly execute the mergeContinuousShuffleBlockIdsIfNeeded method first
jinhai created SPARK-36855:
--
Summary: Uniformly execute the mergeContinuousShuffleBlockIdsIfNeeded method first
Key: SPARK-36855
URL: https://issues.apache.org/jira/browse/SPARK-36855
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 3.1.2, 3.1.1, 3.1.0
Reporter: jinhai

In the ShuffleBlockFetcherIterator.partitionBlocksByFetchMode method, both local and host-local blocks execute the mergeContinuousShuffleBlockIdsIfNeeded method first, but remote blocks execute the mergeContinuousShuffleBlockIdsIfNeeded method many times in the createFetchRequests method. Can the merging of blocks be executed only once in advance?
[jira] [Commented] (SPARK-36638) Generalize OptimizeSkewedJoin
[ https://issues.apache.org/jira/browse/SPARK-36638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420215#comment-17420215 ]

Apache Spark commented on SPARK-36638:
--
User 'zhengruifeng' has created a pull request for this issue: https://github.com/apache/spark/pull/34108

> Generalize OptimizeSkewedJoin
> -
>
> Key: SPARK-36638
> URL: https://issues.apache.org/jira/browse/SPARK-36638
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: zhengruifeng
> Priority: Major