[jira] [Created] (SPARK-45782) Add DataFrame API df.explainString()
Khalid Mammadov created SPARK-45782:
------------------------------------

Summary: Add DataFrame API df.explainString()
Key: SPARK-45782
URL: https://issues.apache.org/jira/browse/SPARK-45782
Project: Spark
Issue Type: Improvement
Components: Connect, PySpark, Spark Core
Affects Versions: 4.0.0
Reporter: Khalid Mammadov

This is a frequently needed feature for performance-optimization purposes. Users often want to inspect this output on running systems, and would like to save/extract it from those systems for later analysis. The current API is only provided for Scala, i.e. {{df.queryExecution.toString()}}, and it is not located in an intuitive place where the average Spark user (i.e. a non-expert, non-Scala developer) can find it immediately. It would also save users from workarounds that capture the printed output, such as:

{code:python}
import io
from contextlib import redirect_stdout

with io.StringIO() as buf, redirect_stdout(buf):
    df.explain(True)
    plan = buf.getvalue()
{code}

So it would help users a lot to have this output available as df.explainString(), i.e. right next to df.explain(), so users can easily locate and use it.
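For reference, the plan string can already be reached through PySpark's py4j handle today; a minimal sketch, noting that _jdf is a private attribute that may change between releases, and that the explainString call shown last is the hypothetical API this ticket proposes:

{code:python}
# Works today, but leans on a private attribute: _jdf is the underlying
# JVM Dataset exposed through py4j (df is an existing DataFrame).
plan_text = df._jdf.queryExecution().toString()

# Hypothetical usage once this ticket lands (signature assumed to mirror
# df.explain(extended=...)):
# plan_text = df.explainString(extended=True)
{code}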
[jira] [Created] (SPARK-45716) Python parity method StructType.treeString
Khalid Mammadov created SPARK-45716:
------------------------------------

Summary: Python parity method StructType.treeString
Key: SPARK-45716
URL: https://issues.apache.org/jira/browse/SPARK-45716
Project: Spark
Issue Type: Improvement
Components: Connect, PySpark
Affects Versions: 4.0.0
Reporter: Khalid Mammadov

Add the missing parity method StructType.treeString from Scala to Python.
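For illustration, Scala's StructType.treeString renders the schema as an indented tree; a hypothetical Python parity call could look like this (the method does not exist on the Python StructType yet, and the output shown mirrors the Scala side):

{code:python}
from pyspark.sql.types import StructType, StructField, IntegerType, StringType

schema = StructType([
    StructField("id", IntegerType()),
    StructField("name", StringType()),
])

# Proposed parity method (hypothetical; name taken from this ticket):
print(schema.treeString())
# root
#  |-- id: integer (nullable = true)
#  |-- name: string (nullable = true)
{code}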
[jira] [Created] (SPARK-43243) Add Level param to df.printSchema for Python API
Khalid Mammadov created SPARK-43243:
------------------------------------

Summary: Add Level param to df.printSchema for Python API
Key: SPARK-43243
URL: https://issues.apache.org/jira/browse/SPARK-43243
Project: Spark
Issue Type: Improvement
Components: Connect, PySpark
Affects Versions: 3.5.0
Reporter: Khalid Mammadov

printSchema in the Python DataFrame API is missing the level parameter that is available in the Scala API. This is to add it.
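A sketch of the intended behaviour, assuming the Python signature mirrors Scala's printSchema(level: Int); the sample schema and the truncated output are illustrative:

{code:python}
df = spark.createDataFrame([(1, (2, 3))], "a INT, b STRUCT<x: INT, y: INT>")

# Hypothetical once the parity parameter is added:
df.printSchema(1)   # print only the first level of nesting
# root
#  |-- a: integer (nullable = true)
#  |-- b: struct (nullable = true)
{code}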
[jira] [Updated] (SPARK-42437) Pyspark catalog.cacheTable allow to specify storage level Connect add support Storagelevel
[ https://issues.apache.org/jira/browse/SPARK-42437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Khalid Mammadov updated SPARK-42437:
------------------------------------
Target Version/s: (was: 3.5.0)
Affects Version/s: 3.4.0 (was: 3.5.0)
Summary: Pyspark catalog.cacheTable allow to specify storage level Connect add support Storagelevel (was: Pyspark catalog.cacheTable allow to specify storage level)

> Pyspark catalog.cacheTable allow to specify storage level Connect add support Storagelevel
> -------------------------------------------------------------------------------------------
>
> Key: SPARK-42437
> URL: https://issues.apache.org/jira/browse/SPARK-42437
> Project: Spark
> Issue Type: Improvement
> Components: Connect, PySpark
> Affects Versions: 3.4.0
> Reporter: Khalid Mammadov
> Priority: Major
>
> Currently the PySpark version of the catalog.cacheTable function does not support specifying a storage level. This is to add that.
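A sketch of the requested call, assuming the new parameter mirrors the Scala Catalog API; the storageLevel argument is the addition this ticket proposes:

{code:python}
from pyspark import StorageLevel

# Hypothetical signature, assumed from the ticket:
spark.catalog.cacheTable("customer", storageLevel=StorageLevel.DISK_ONLY)
{code}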
[jira] [Created] (SPARK-42437) Pyspark catalog.cacheTable allow to specify storage level
Khalid Mammadov created SPARK-42437:
------------------------------------

Summary: Pyspark catalog.cacheTable allow to specify storage level
Key: SPARK-42437
URL: https://issues.apache.org/jira/browse/SPARK-42437
Project: Spark
Issue Type: Improvement
Components: Connect, PySpark
Affects Versions: 3.5.0
Reporter: Khalid Mammadov

Currently the PySpark version of the catalog.cacheTable function does not support specifying a storage level. This is to add that.
[jira] [Created] (SPARK-42400) Code clean up in org.apache.spark.storage
Khalid Mammadov created SPARK-42400:
------------------------------------

Summary: Code clean up in org.apache.spark.storage
Key: SPARK-42400
URL: https://issues.apache.org/jira/browse/SPARK-42400
Project: Spark
Issue Type: Improvement
Components: Block Manager
Affects Versions: 3.4.0
Reporter: Khalid Mammadov
[jira] [Created] (SPARK-42257) Remove unused variable in ExternalSorter
Khalid Mammadov created SPARK-42257:
------------------------------------

Summary: Remove unused variable in ExternalSorter
Key: SPARK-42257
URL: https://issues.apache.org/jira/browse/SPARK-42257
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 3.4.0
Reporter: Khalid Mammadov

The nextPartitionId variable is not used anywhere in the writePartitionedMapOutput method.
[jira] [Commented] (SPARK-37946) Use error classes in the execution errors related to partitions
[ https://issues.apache.org/jira/browse/SPARK-37946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17626866#comment-17626866 ]

Khalid Mammadov commented on SPARK-37946:
------------------------------------------

Hi [~maxgekk], I see this one is not done yet: partitionColumnNotFoundInSchemaError. Can I look into it?

Also, there are some more waiting to be done in QueryExecutionErrors.scala, e.g.:
* stateNotDefinedOrAlreadyRemovedError
* cannotSetTimeoutDurationError
* cannotGetEventTimeWatermarkError
* cannotSetTimeoutTimestampError
* batchMetadataFileNotFoundError

Shall I look into these as well?

> Use error classes in the execution errors related to partitions
> ----------------------------------------------------------------
>
> Key: SPARK-37946
> URL: https://issues.apache.org/jira/browse/SPARK-37946
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: Max Gekk
> Priority: Major
>
> Migrate the following errors in QueryExecutionErrors:
> * unableToDeletePartitionPathError
> * unableToCreatePartitionPathError
> * unableToRenamePartitionPathError
> * notADatasourceRDDPartitionError
> * cannotClearPartitionDirectoryError
> * failedToCastValueToDataTypeForPartitionColumnError
> * unsupportedPartitionTransformError
> * cannotCreateJDBCTableWithPartitionsError
> * requestedPartitionsMismatchTablePartitionsError
> * dynamicPartitionKeyNotAmongWrittenPartitionPathsError
> * cannotRemovePartitionDirError
> * alterTableWithDropPartitionAndPurgeUnsupportedError
> * invalidPartitionFilterError
> * getPartitionMetadataByFilterError
> * illegalLocationClauseForViewPartitionError
> * partitionColumnNotFoundInSchemaError
> * cannotAddMultiPartitionsOnNonatomicPartitionTableError
> * cannotDropMultiPartitionsOnNonatomicPartitionTableError
> * truncateMultiPartitionUnsupportedError
> * dynamicPartitionOverwriteUnsupportedByTableError
> * writePartitionExceedConfigSizeWhenDynamicPartitionError
> onto error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryExecutionErrorsSuite.
[jira] [Comment Edited] (SPARK-37945) Use error classes in the execution errors of arithmetic ops
[ https://issues.apache.org/jira/browse/SPARK-37945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17618038#comment-17618038 ]

Khalid Mammadov edited comment on SPARK-37945 at 10/15/22 8:43 AM:
--------------------------------------------------------------------

I was [about to finish|https://github.com/apache/spark/pull/38266] this, but I can see you set these error classes to _LEGACY, whose background I don't know, so I will leave this to you, [~maxgekk]. Please let me know if there is one I can work on.

was (Author: JIRAUSER284054):
I was [about to finish|https://github.com/apache/spark/pull/38266] this, but I can see you set these error classes to _LEGACY, whose background I don't know, so I will leave this to you, [~maxgekk].

> Use error classes in the execution errors of arithmetic ops
> ------------------------------------------------------------
>
> Key: SPARK-37945
> URL: https://issues.apache.org/jira/browse/SPARK-37945
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: Max Gekk
> Priority: Major
>
> Migrate the following errors in QueryExecutionErrors:
> * overflowInSumOfDecimalError
> * overflowInIntegralDivideError
> * arithmeticOverflowError
> * unaryMinusCauseOverflowError
> * binaryArithmeticCauseOverflowError
> * unscaledValueTooLargeForPrecisionError
> * decimalPrecisionExceedsMaxPrecisionError
> * outOfDecimalTypeRangeError
> * integerOverflowError
> onto error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryExecutionErrorsSuite.
[jira] [Commented] (SPARK-37945) Use error classes in the execution errors of arithmetic ops
[ https://issues.apache.org/jira/browse/SPARK-37945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17618038#comment-17618038 ]

Khalid Mammadov commented on SPARK-37945:
------------------------------------------

I was [about to finish|https://github.com/apache/spark/pull/38266] this, but I can see you set these error classes to _LEGACY, whose background I don't know, so I will leave this to you, [~maxgekk].

> Use error classes in the execution errors of arithmetic ops
> ------------------------------------------------------------
>
> Key: SPARK-37945
> URL: https://issues.apache.org/jira/browse/SPARK-37945
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: Max Gekk
> Priority: Major
>
> Migrate the following errors in QueryExecutionErrors:
> * overflowInSumOfDecimalError
> * overflowInIntegralDivideError
> * arithmeticOverflowError
> * unaryMinusCauseOverflowError
> * binaryArithmeticCauseOverflowError
> * unscaledValueTooLargeForPrecisionError
> * decimalPrecisionExceedsMaxPrecisionError
> * outOfDecimalTypeRangeError
> * integerOverflowError
> onto error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryExecutionErrorsSuite.
[jira] [Commented] (SPARK-37945) Use error classes in the execution errors of arithmetic ops
[ https://issues.apache.org/jira/browse/SPARK-37945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17615361#comment-17615361 ]

Khalid Mammadov commented on SPARK-37945:
------------------------------------------

[~maxgekk] I see you have already fixed most of these; I can pick up (and have already started) the ones below, if OK?
* unscaledValueTooLargeForPrecisionError
* decimalPrecisionExceedsMaxPrecisionError
* outOfDecimalTypeRangeError
* integerOverflowError

PS: Looks fairly straightforward and shouldn't take long.

> Use error classes in the execution errors of arithmetic ops
> ------------------------------------------------------------
>
> Key: SPARK-37945
> URL: https://issues.apache.org/jira/browse/SPARK-37945
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: Max Gekk
> Priority: Major
>
> Migrate the following errors in QueryExecutionErrors:
> * overflowInSumOfDecimalError
> * overflowInIntegralDivideError
> * arithmeticOverflowError
> * unaryMinusCauseOverflowError
> * binaryArithmeticCauseOverflowError
> * unscaledValueTooLargeForPrecisionError
> * decimalPrecisionExceedsMaxPrecisionError
> * outOfDecimalTypeRangeError
> * integerOverflowError
> onto error classes. Throw an implementation of SparkThrowable. Also write a test for every error in QueryExecutionErrorsSuite.
[jira] [Commented] (SPARK-38465) Use error classes in org.apache.spark.launcher
[ https://issues.apache.org/jira/browse/SPARK-38465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611896#comment-17611896 ]

Khalid Mammadov commented on SPARK-38465:
------------------------------------------

Hi [~bozhang] [~maxgekk], I would like to look into this, if there are no objections.

> Use error classes in org.apache.spark.launcher
> -----------------------------------------------
>
> Key: SPARK-38465
> URL: https://issues.apache.org/jira/browse/SPARK-38465
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core
> Affects Versions: 3.3.0
> Reporter: Bo Zhang
> Priority: Major
[jira] [Created] (SPARK-40620) Deduplication of WorkerOffer build in CoarseGrainedSchedulerBackend
Khalid Mammadov created SPARK-40620:
------------------------------------

Summary: Deduplication of WorkerOffer build in CoarseGrainedSchedulerBackend
Key: SPARK-40620
URL: https://issues.apache.org/jira/browse/SPARK-40620
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 3.4.0
Reporter: Khalid Mammadov

The WorkerOffer build in CoarseGrainedSchedulerBackend is repeated in two different places with exactly the same parameters. We can deduplicate and improve readability by moving it into a private function.
[jira] [Created] (SPARK-40210) Fix math atan2, hypot, pow and pmod float argument call
Khalid Mammadov created SPARK-40210:
------------------------------------

Summary: Fix math atan2, hypot, pow and pmod float argument call
Key: SPARK-40210
URL: https://issues.apache.org/jira/browse/SPARK-40210
Project: Spark
Issue Type: Improvement
Components: PySpark
Affects Versions: 3.4.0
Reporter: Khalid Mammadov

The PySpark atan2, hypot, pow and pmod functions are documented as accepting float arguments, but they produce an error when both arguments are passed as plain floats.
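A reproduction sketch based on the ticket's description; the exact exception text varies by version, and the claim that two float literals fail comes from this ticket:

{code:python}
from pyspark.sql import functions as F

df = spark.range(1)  # assumes an active SparkSession named spark

df.select(F.pow(F.col("id"), 2.0))   # column + float literal: works
df.select(F.pow(2.0, 3.0))           # two plain floats: reportedly raised an error
{code}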
[jira] [Updated] (SPARK-40009) Add missing doc string info to DataFrame API
[ https://issues.apache.org/jira/browse/SPARK-40009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Khalid Mammadov updated SPARK-40009:
------------------------------------
Description:
Some of the docstrings in the Python DataFrame API are incomplete; for example, some are missing the Parameters, Returns, or Examples sections. It would help users if we provided this missing information for all methods/functions.
(was: Provide examples for DataFrame union and unionAll functions for PySpark. Also document parameters)

> Add missing doc string info to DataFrame API
> ---------------------------------------------
>
> Key: SPARK-40009
> URL: https://issues.apache.org/jira/browse/SPARK-40009
> Project: Spark
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 3.4.0
> Reporter: Khalid Mammadov
> Priority: Minor
>
> Some of the docstrings in the Python DataFrame API are incomplete; for example, some are missing the Parameters, Returns, or Examples sections. It would help users if we provided this missing information for all methods/functions.
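As an illustration of the sections in question, a sketch of the numpydoc-style layout PySpark docstrings follow; the wording below is illustrative, not the actual union docstring:

{code:python}
def union(self, other):
    """Return a new :class:`DataFrame` containing the union of rows
    in this and another :class:`DataFrame`.

    Parameters
    ----------
    other : :class:`DataFrame`
        Another :class:`DataFrame`, unioned by column position.

    Returns
    -------
    :class:`DataFrame`
        The combined :class:`DataFrame`; duplicates are preserved
        (use :func:`distinct` to drop them).

    Examples
    --------
    >>> df1.union(df2).distinct().count()
    """
    ...
{code}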
[jira] [Updated] (SPARK-40009) Add missing doc string info to DataFrame API
[ https://issues.apache.org/jira/browse/SPARK-40009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Khalid Mammadov updated SPARK-40009:
------------------------------------
Summary: Add missing doc string info to DataFrame API (was: Add doc string to DataFrame union and unionAll)

> Add missing doc string info to DataFrame API
> ---------------------------------------------
>
> Key: SPARK-40009
> URL: https://issues.apache.org/jira/browse/SPARK-40009
> Project: Spark
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 3.4.0
> Reporter: Khalid Mammadov
> Priority: Minor
>
> Provide examples for DataFrame union and unionAll functions for PySpark. Also document parameters.
[jira] [Created] (SPARK-40009) Add doc string to DataFrame union and unionAll
Khalid Mammadov created SPARK-40009:
------------------------------------

Summary: Add doc string to DataFrame union and unionAll
Key: SPARK-40009
URL: https://issues.apache.org/jira/browse/SPARK-40009
Project: Spark
Issue Type: Improvement
Components: Documentation
Affects Versions: 3.4.0
Reporter: Khalid Mammadov

Provide examples for DataFrame union and unionAll functions for PySpark. Also document parameters.
[jira] [Created] (SPARK-39982) StructType.fromJson method missing documentation
Khalid Mammadov created SPARK-39982:
------------------------------------

Summary: StructType.fromJson method missing documentation
Key: SPARK-39982
URL: https://issues.apache.org/jira/browse/SPARK-39982
Project: Spark
Issue Type: Improvement
Components: PySpark
Affects Versions: 3.3.0
Reporter: Khalid Mammadov

The StructType.fromJson method does not have any documentation. It would be good to add some that explains how to use it.
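For context, a short example of the undocumented method; StructType.fromJson exists in pyspark.sql.types and accepts the dict shape that StructType.jsonValue() produces:

{code:python}
from pyspark.sql.types import StructType

schema_dict = {
    "type": "struct",
    "fields": [
        {"name": "id", "type": "integer", "nullable": True, "metadata": {}},
        {"name": "name", "type": "string", "nullable": True, "metadata": {}},
    ],
}

schema = StructType.fromJson(schema_dict)
print(schema.simpleString())  # struct<id:int,name:string>
{code}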
[jira] [Created] (SPARK-38261) Sync missing R packages with CI
Khalid Mammadov created SPARK-38261:
------------------------------------

Summary: Sync missing R packages with CI
Key: SPARK-38261
URL: https://issues.apache.org/jira/browse/SPARK-38261
Project: Spark
Issue Type: Github Integration
Components: Build
Affects Versions: 3.2.1
Reporter: Khalid Mammadov

The current GitHub workflow job *Linters, licenses, dependencies and documentation generation* is missing the R packages needed to complete the documentation and API build. *Build and test* is not failing because these packages are installed in the base image. IMO we need to keep them in sync with the base image, for an easy switch back to the ubuntu runner when ready.

The missing R packages are *markdown* and *e1071*.

Reference: base image - https://hub.docker.com/layers/dongjoon/apache-spark-github-action-image/20220207/images/sha256-af09d172ff8e2cbd71df9a1bc5384a47578c4a4cc293786c539333cafaf4a7ce?context=explore
[jira] [Created] (SPARK-38210) Spark documentation build README is stale
Khalid Mammadov created SPARK-38210:
------------------------------------

Summary: Spark documentation build README is stale
Key: SPARK-38210
URL: https://issues.apache.org/jira/browse/SPARK-38210
Project: Spark
Issue Type: Documentation
Components: Documentation
Affects Versions: 3.2.1
Reporter: Khalid Mammadov

I was following docs/README.md to build the documentation and found that it is not complete. I had to install additional packages that are not documented but are available in the [CI/CD phase|https://github.com/apache/spark/blob/c8b34ab7340265f1f2bec2afa694c10f174b222c/.github/workflows/build_and_test.yml#L526], plus a few more, to finish the build process. I will file a PR to change README.md to include these packages and improve the guide.
[jira] [Created] (SPARK-38120) HiveExternalCatalog.listPartitions is failing when partition column name is upper case and dot in partition value
Khalid Mammadov created SPARK-38120:
------------------------------------

Summary: HiveExternalCatalog.listPartitions is failing when partition column name is upper case and dot in partition value
Key: SPARK-38120
URL: https://issues.apache.org/jira/browse/SPARK-38120
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.2.1
Reporter: Khalid Mammadov

The HiveExternalCatalog.listPartitions method call fails when a partition column name is upper case and the partition value contains a dot. It's related to this change: https://github.com/apache/spark/commit/f18b905f6cace7686ef169fda7de474079d0af23
The test case in that PR does not reproduce the issue because its partition column name is lower case.

Below is how to reproduce the issue:

{code:scala}
scala> import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.catalyst.TableIdentifier

scala> spark.sql("CREATE TABLE customer(id INT, name STRING) PARTITIONED BY (partCol1 STRING, partCol2 STRING)")

scala> spark.sql("INSERT INTO customer PARTITION (partCol1 = 'CA', partCol2 = 'i.j') VALUES (100, 'John')")

scala> spark.sessionState.catalog.listPartitions(TableIdentifier("customer"), Some(Map("partCol2" -> "i.j"))).foreach(println)
java.util.NoSuchElementException: key not found: partcol2
  at scala.collection.immutable.Map$Map2.apply(Map.scala:227)
  at org.apache.spark.sql.catalyst.catalog.ExternalCatalogUtils$.$anonfun$isPartialPartitionSpec$1(ExternalCatalogUtils.scala:205)
  at org.apache.spark.sql.catalyst.catalog.ExternalCatalogUtils$.$anonfun$isPartialPartitionSpec$1$adapted(ExternalCatalogUtils.scala:202)
  at scala.collection.immutable.Map$Map1.forall(Map.scala:196)
  at org.apache.spark.sql.catalyst.catalog.ExternalCatalogUtils$.isPartialPartitionSpec(ExternalCatalogUtils.scala:202)
  at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$listPartitions$6(HiveExternalCatalog.scala:1312)
  at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$listPartitions$6$adapted(HiveExternalCatalog.scala:1312)
  at scala.collection.TraversableLike.$anonfun$filterImpl$1(TraversableLike.scala:304)
  at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
  at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
  at scala.collection.TraversableLike.filterImpl(TraversableLike.scala:303)
  at scala.collection.TraversableLike.filterImpl$(TraversableLike.scala:297)
  at scala.collection.AbstractTraversable.filterImpl(Traversable.scala:108)
  at scala.collection.TraversableLike.filter(TraversableLike.scala:395)
  at scala.collection.TraversableLike.filter$(TraversableLike.scala:395)
  at scala.collection.AbstractTraversable.filter(Traversable.scala:108)
  at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$listPartitions$1(HiveExternalCatalog.scala:1312)
  at org.apache.spark.sql.hive.HiveExternalCatalog.withClientWrappingException(HiveExternalCatalog.scala:114)
  at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:103)
  at org.apache.spark.sql.hive.HiveExternalCatalog.listPartitions(HiveExternalCatalog.scala:1296)
  at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.listPartitions(ExternalCatalogWithListener.scala:254)
  at org.apache.spark.sql.catalyst.catalog.SessionCatalog.listPartitions(SessionCatalog.scala:1251)
  ... 47 elided
{code}
[jira] [Commented] (SPARK-37996) Contribution guide is stale
[ https://issues.apache.org/jira/browse/SPARK-37996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17486139#comment-17486139 ]

Khalid Mammadov commented on SPARK-37996:
------------------------------------------

Raised PR: https://github.com/apache/spark-website/pull/378

It makes the following changes:
* Describes, in the Pull Request section of the Contributing page, the actual procedure, taking a contributor through a step-by-step process.
* Removes the optional "Running tests in your forked repository" section on the Developer Tools page, which is now obsolete and no longer reflects reality: it says tests can be run by clicking the "Run workflow" button, but that button is no longer available because the workflow no longer uses the "workflow_dispatch" event trigger, which was removed in [[SPARK-35048][INFRA] Distribute GitHub Actions workflows to fork repositories to share the resources spark#32092|https://github.com/apache/spark/pull/32092].
* Instead, documents the new procedure that the above PR introduced: contributors need to use their own free GitHub workflow credits to test the changes they are proposing, and a Spark Actions workflow expects that to be completed before marking a PR ready for review.
* Some general wording was copied from the "Running tests in your forked repository" section on the Developer Tools page, but the main content was rewritten to meet the objective.
* Also fixed the URL to developer-tools.html so it is resolved by the parser (which converts it into a relative URI) instead of being a hard-coded absolute URL.

> Contribution guide is stale
> ----------------------------
>
> Key: SPARK-37996
> URL: https://issues.apache.org/jira/browse/SPARK-37996
> Project: Spark
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 3.2.0
> Reporter: Khalid Mammadov
> Priority: Minor
>
> The contribution guide mentions the link below for testing a local repo before raising a PR, but the process has changed and the documentation does not reflect it:
> https://spark.apache.org/developer-tools.html#github-workflow-tests
> Only by digging into the git log of [.github/workflows/build_and_test.yml|https://github.com/apache/spark/commit/2974b70d1efd4b1c5cfe7e2467766f0a9a1fec82#diff-48c0ee97c53013d18d6bbae44648f7fab9af2e0bf5b0dc1ca761e18ec5c478f2] did I manage to find what the new process is. It was changed in [https://github.com/apache/spark/pull/32092] but the documentation was not updated.
> I am happy to contribute a fix, but apparently [https://spark.apache.org/developer-tools.html] is hosted on the Apache website rather than in the Spark source code.
[jira] [Commented] (SPARK-37996) Contribution guide is stale
[ https://issues.apache.org/jira/browse/SPARK-37996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17482946#comment-17482946 ]

Khalid Mammadov commented on SPARK-37996:
------------------------------------------

Sure, working on it. Thanks for the repo link!

> Contribution guide is stale
> ----------------------------
>
> Key: SPARK-37996
> URL: https://issues.apache.org/jira/browse/SPARK-37996
> Project: Spark
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 3.2.0
> Reporter: Khalid Mammadov
> Priority: Minor
>
> The contribution guide mentions the link below for testing a local repo before raising a PR, but the process has changed and the documentation does not reflect it:
> https://spark.apache.org/developer-tools.html#github-workflow-tests
> Only by digging into the git log of [.github/workflows/build_and_test.yml|https://github.com/apache/spark/commit/2974b70d1efd4b1c5cfe7e2467766f0a9a1fec82#diff-48c0ee97c53013d18d6bbae44648f7fab9af2e0bf5b0dc1ca761e18ec5c478f2] did I manage to find what the new process is. It was changed in [https://github.com/apache/spark/pull/32092] but the documentation was not updated.
> I am happy to contribute a fix, but apparently [https://spark.apache.org/developer-tools.html] is hosted on the Apache website rather than in the Spark source code.
[jira] [Created] (SPARK-38025) Improve test suite ExternalCatalogSuite
Khalid Mammadov created SPARK-38025:
------------------------------------

Summary: Improve test suite ExternalCatalogSuite
Key: SPARK-38025
URL: https://issues.apache.org/jira/browse/SPARK-38025
Project: Spark
Issue Type: Improvement
Components: Tests
Affects Versions: 3.3
Reporter: Khalid Mammadov

The test suite *ExternalCatalogSuite.scala* can be simplified by replacing repetitive code with an already-available utility function, with some minor changes. This will reduce redundant code, simplify the suite, and improve readability.
[jira] [Created] (SPARK-37996) Contribution guide is stale
Khalid Mammadov created SPARK-37996:
------------------------------------

Summary: Contribution guide is stale
Key: SPARK-37996
URL: https://issues.apache.org/jira/browse/SPARK-37996
Project: Spark
Issue Type: Improvement
Components: Documentation
Affects Versions: 3.2.0
Reporter: Khalid Mammadov

The contribution guide mentions the link below for testing a local repo before raising a PR, but the process has changed and the documentation does not reflect it:
https://spark.apache.org/developer-tools.html#github-workflow-tests

Only by digging into the git log of [.github/workflows/build_and_test.yml|https://github.com/apache/spark/commit/2974b70d1efd4b1c5cfe7e2467766f0a9a1fec82#diff-48c0ee97c53013d18d6bbae44648f7fab9af2e0bf5b0dc1ca761e18ec5c478f2] did I manage to find what the new process is. It was changed in [https://github.com/apache/spark/pull/32092] but the documentation was not updated.

I am happy to contribute a fix, but apparently [https://spark.apache.org/developer-tools.html] is hosted on the Apache website rather than in the Spark source code.
[jira] [Created] (SPARK-37991) Improve test case inside SQL\Catalyst\Catalog
Khalid Mammadov created SPARK-37991:
------------------------------------

Summary: Improve test case inside SQL\Catalyst\Catalog
Key: SPARK-37991
URL: https://issues.apache.org/jira/browse/SPARK-37991
Project: Spark
Issue Type: Improvement
Components: Tests
Affects Versions: 3.2.0
Reporter: Khalid Mammadov

The test case *basic create and list partitions* inside *ExternalCatalogSuite.scala* can be simplified by replacing its set-up code with a test utility function from the same suite: *newBasicCatalog()*.