[jira] [Resolved] (SPARK-47289) Allow extensions to log extended information in explain plan
[ https://issues.apache.org/jira/browse/SPARK-47289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-47289.
Fix Version/s: 4.0.0
Resolution: Fixed

Issue resolved by pull request 45488
[https://github.com/apache/spark/pull/45488]

Key: SPARK-47289
URL: https://issues.apache.org/jira/browse/SPARK-47289
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 4.0.0
Reporter: Parth Chandra
Assignee: Parth Chandra
Priority: Major
Labels: pull-request-available
Fix For: 4.0.0

With session extensions, Spark planning can be extended to apply additional rules and modify the execution plan. If an extension replaces a node in the plan, the new node is displayed in the plan. However, it is sometimes useful for an extension to provide extended information to the end user to explain its impact. For instance, an extension may automatically enable or disable some feature that it provides, and it can surface that decision in the plan. The proposal is to optionally turn on extended plan information from extensions. Extensions can add additional planning information via a new interface that internally uses a new TreeNodeTag, say 'explainPlan'.
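For context, a minimal PySpark sketch of how a user might surface such extension-provided plan information. `DataFrame.explain(mode="extended")` is standard API; the configuration key below is hypothetical, since the ticket does not name the proposed opt-in flag:

```
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("explain-plan-demo")
    # Hypothetical key: the ticket proposes an optional flag but does not name it.
    .config("spark.sql.extensions.explainPlanInfo.enabled", "true")
    .getOrCreate()
)

df = spark.range(10).selectExpr("id", "id * 2 AS doubled")
# mode="extended" prints the parsed/analyzed/optimized/physical plans; a session
# extension that tags plan nodes could append its notes to this output.
df.explain(mode="extended")
```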
[jira] [Assigned] (SPARK-47289) Allow extensions to log extended information in explain plan
[ https://issues.apache.org/jira/browse/SPARK-47289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan reassigned SPARK-47289:
Assignee: Parth Chandra

Key: SPARK-47289
URL: https://issues.apache.org/jira/browse/SPARK-47289
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 4.0.0
Reporter: Parth Chandra
Assignee: Parth Chandra
Priority: Major
Labels: pull-request-available

With session extensions, Spark planning can be extended to apply additional rules and modify the execution plan. If an extension replaces a node in the plan, the new node is displayed in the plan. However, it is sometimes useful for an extension to provide extended information to the end user to explain its impact. For instance, an extension may automatically enable or disable some feature that it provides, and it can surface that decision in the plan. The proposal is to optionally turn on extended plan information from extensions. Extensions can add additional planning information via a new interface that internally uses a new TreeNodeTag, say 'explainPlan'.
[jira] [Updated] (SPARK-47718) .sql() does not recognize watermark defined upstream
[ https://issues.apache.org/jira/browse/SPARK-47718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-47718:
Labels: pull-request-available (was: )

Key: SPARK-47718
URL: https://issues.apache.org/jira/browse/SPARK-47718
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 3.5.1
Reporter: Chloe He
Priority: Blocker
Labels: pull-request-available

I have a data pipeline set up in such a way that it reads data from a Kafka source, does some transformation on the data using PySpark, then writes the output into a sink (Kafka, Redis, etc.).

My entire pipeline is written in SQL, so I wish to use the .sql() method to execute SQL on my streaming source directly.

However, I'm running into an issue where my watermark is not being recognized by the downstream query executed via the .sql() method.

```
Python 3.11.8 | packaged by conda-forge | (main, Feb 16 2024, 20:49:36) [Clang 16.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyspark
>>> print(pyspark.__version__)
3.5.1
>>> from pyspark.sql import SparkSession
>>> session = SparkSession.builder \
...     .config("spark.jars.packages", "org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.1") \
...     .getOrCreate()
>>> from pyspark.sql.functions import col, from_json
>>> from pyspark.sql.types import StructField, StructType, TimestampType, LongType, DoubleType, IntegerType
>>> schema = StructType(
...     [
...         StructField('createTime', TimestampType(), True),
...         StructField('orderId', LongType(), True),
...         StructField('payAmount', DoubleType(), True),
...         StructField('payPlatform', IntegerType(), True),
...         StructField('provinceId', IntegerType(), True),
...     ])
>>> streaming_df = session.readStream \
...     .format("kafka") \
...     .option("kafka.bootstrap.servers", "localhost:9092") \
...     .option("subscribe", "payment_msg") \
...     .option("startingOffsets", "earliest") \
...     .load() \
...     .select(from_json(col("value").cast("string"), schema).alias("parsed_value")) \
...     .select("parsed_value.*") \
...     .withWatermark("createTime", "10 seconds")
>>> streaming_df.createOrReplaceTempView("streaming_df")
>>> session.sql("""
... SELECT
...     window.start, window.end, provinceId, sum(payAmount) as totalPayAmount
... FROM streaming_df
... GROUP BY provinceId, window('createTime', '1 hour', '30 minutes')
... ORDER BY window.start
... """) \
...     .writeStream \
...     .format("kafka") \
...     .option("checkpointLocation", "checkpoint") \
...     .option("kafka.bootstrap.servers", "localhost:9092") \
...     .option("topic", "sink") \
...     .start()
```

This throws the exception:

```
pyspark.errors.exceptions.captured.AnalysisException: Append output mode not supported when there are streaming aggregations on streaming DataFrames/DataSets without watermark; line 6 pos 4;
```
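For reference, one workaround under the assumptions of the report above: run the aggregation through the DataFrame API, where the watermark attached by withWatermark is carried along. The ORDER BY is dropped (sorting is not supported for append-mode streams), and a `value` column is added because the Kafka sink requires one:

```
from pyspark.sql.functions import col, sum as sum_, window

# streaming_df is the watermarked stream defined in the report above.
agg_df = (
    streaming_df
    .groupBy(col("provinceId"), window(col("createTime"), "1 hour", "30 minutes"))
    .agg(sum_("payAmount").alias("totalPayAmount"))
    .select("window.start", "window.end", "provinceId", "totalPayAmount")
)

# The Kafka sink expects a string/binary "value" column.
kafka_ready = agg_df.selectExpr("to_json(struct(*)) AS value")

query = (
    kafka_ready.writeStream
    .format("kafka")
    .option("checkpointLocation", "checkpoint")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("topic", "sink")
    .start()
)
```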
[jira] [Updated] (SPARK-47735) Make pyspark.testing.connectutils compatible with pyspark-connect
[ https://issues.apache.org/jira/browse/SPARK-47735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-47735:
Labels: pull-request-available (was: )

Key: SPARK-47735
URL: https://issues.apache.org/jira/browse/SPARK-47735
Project: Spark
Issue Type: Sub-task
Components: PySpark, Tests
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon
Priority: Major
Labels: pull-request-available
[jira] [Created] (SPARK-47735) Make pyspark.testing.connectutils compatible with pyspark-connect
Hyukjin Kwon created SPARK-47735:

Summary: Make pyspark.testing.connectutils compatible with pyspark-connect
Key: SPARK-47735
URL: https://issues.apache.org/jira/browse/SPARK-47735
Project: Spark
Issue Type: Sub-task
Components: PySpark, Tests
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon
[jira] [Resolved] (SPARK-47734) Fix flaky pyspark.sql.dataframe.DataFrame.writeStream doctest by stopping streaming query
[ https://issues.apache.org/jira/browse/SPARK-47734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-47734.
Fix Version/s: 4.0.0, 3.5.2
Resolution: Fixed

Fixed in https://github.com/apache/spark/pull/45885

Key: SPARK-47734
URL: https://issues.apache.org/jira/browse/SPARK-47734
Project: Spark
Issue Type: Improvement
Components: PySpark, Tests
Affects Versions: 4.0.0
Reporter: Josh Rosen
Assignee: Josh Rosen
Priority: Major
Labels: pull-request-available
Fix For: 4.0.0, 3.5.2

https://issues.apache.org/jira/browse/SPARK-47199 didn't fix the flakiness in the pyspark.sql.dataframe.DataFrame.writeStream doctest: the problem is not a collision between tests but rather that the test starts a background thread to write to a directory and then deletes that directory from the main test thread, which is inherently race-prone.

The fix is simple: stop the streaming query in the doctest itself, as other streaming doctest examples do.
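A minimal sketch of the fix pattern described above, assuming `df` is a streaming DataFrame; stopping the query inside the test removes the race between the background writer and directory cleanup:

```
import tempfile

with tempfile.TemporaryDirectory() as d:
    query = (
        df.writeStream.format("parquet")  # df: a streaming DataFrame
        .option("checkpointLocation", d + "/checkpoint")
        .start(d + "/out")
    )
    try:
        query.processAllAvailable()  # drain available input deterministically
    finally:
        query.stop()  # no background writer survives the directory cleanup
```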
[jira] [Updated] (SPARK-47733) Add operational metrics for TWS operators
[ https://issues.apache.org/jira/browse/SPARK-47733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhan updated SPARK-47733:
Description:

Add metrics to improve observability for the newly added TransformWithState operator and for the changes made to RocksDB.

Proposed metrics to add (see the sketch after this list):
* on the RocksDB StateStore metrics side:
** num external col families
** num internal col families
* on the operator side:
** number of state vars
** count of state vars by type
** output mode
** timeout mode
** registered timers in batch
** expired timers in batch
** initial state enabled or not
** number of state vars removed in batch

Key: SPARK-47733
URL: https://issues.apache.org/jira/browse/SPARK-47733
Project: Spark
Issue Type: Task
Components: Structured Streaming
Affects Versions: 4.0.0
Reporter: Jing Zhan
Priority: Major
Labels: pull-request-available
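If the proposed metrics land as operator-level custom metrics, they would presumably be visible through the standard streaming progress events; a hedged sketch, assuming `query` is a running StreamingQuery over a transformWithState pipeline:

```
# query: a running StreamingQuery; lastProgress is a dict of the latest
# progress event, including per-operator state metrics.
progress = query.lastProgress
if progress:
    for op in progress["stateOperators"]:
        # customMetrics is where operator-specific counters surface today;
        # whether the proposed TWS metrics appear here is an assumption.
        print(op.get("operatorName"), op.get("customMetrics", {}))
```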
[jira] [Updated] (SPARK-47734) Fix flaky pyspark.sql.dataframe.DataFrame.writeStream doctest by stopping streaming query
[ https://issues.apache.org/jira/browse/SPARK-47734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-47734:
Labels: pull-request-available (was: )

Key: SPARK-47734
URL: https://issues.apache.org/jira/browse/SPARK-47734
Project: Spark
Issue Type: Improvement
Components: PySpark, Tests
Affects Versions: 4.0.0
Reporter: Josh Rosen
Assignee: Josh Rosen
Priority: Major
Labels: pull-request-available

https://issues.apache.org/jira/browse/SPARK-47199 didn't fix the flakiness in the pyspark.sql.dataframe.DataFrame.writeStream doctest: the problem is not a collision between tests but rather that the test starts a background thread to write to a directory and then deletes that directory from the main test thread, which is inherently race-prone.

The fix is simple: stop the streaming query in the doctest itself, as other streaming doctest examples do.
[jira] [Created] (SPARK-47734) Fix flaky pyspark.sql.dataframe.DataFrame.writeStream doctest by stopping streaming query
Josh Rosen created SPARK-47734:

Summary: Fix flaky pyspark.sql.dataframe.DataFrame.writeStream doctest by stopping streaming query
Key: SPARK-47734
URL: https://issues.apache.org/jira/browse/SPARK-47734
Project: Spark
Issue Type: Improvement
Components: PySpark, Tests
Affects Versions: 4.0.0
Reporter: Josh Rosen
Assignee: Josh Rosen

https://issues.apache.org/jira/browse/SPARK-47199 didn't fix the flakiness in the pyspark.sql.dataframe.DataFrame.writeStream doctest: the problem is not a collision between tests but rather that the test starts a background thread to write to a directory and then deletes that directory from the main test thread, which is inherently race-prone.

The fix is simple: stop the streaming query in the doctest itself, as other streaming doctest examples do.
[jira] [Updated] (SPARK-47592) Connector module: Migrate logError with variables to structured logging framework
[ https://issues.apache.org/jira/browse/SPARK-47592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-47592:
Labels: pull-request-available (was: )

Key: SPARK-47592
URL: https://issues.apache.org/jira/browse/SPARK-47592
Project: Spark
Issue Type: Sub-task
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Gengliang Wang
Priority: Major
Labels: pull-request-available
[jira] [Resolved] (SPARK-47598) MLLib: Migrate logError with variables to structured logging framework
[ https://issues.apache.org/jira/browse/SPARK-47598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gengliang Wang resolved SPARK-47598.
Fix Version/s: 4.0.0
Resolution: Fixed

Issue resolved by pull request 45837
[https://github.com/apache/spark/pull/45837]

Key: SPARK-47598
URL: https://issues.apache.org/jira/browse/SPARK-47598
Project: Spark
Issue Type: Sub-task
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Gengliang Wang
Assignee: BingKun Pan
Priority: Major
Labels: pull-request-available
Fix For: 4.0.0
[jira] [Updated] (SPARK-26875) Add an option on FileStreamSource to include modified files
[ https://issues.apache.org/jira/browse/SPARK-26875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-26875:
Labels: pull-request-available (was: )

Key: SPARK-26875
URL: https://issues.apache.org/jira/browse/SPARK-26875
Project: Spark
Issue Type: New Feature
Components: SQL
Affects Versions: 3.0.0
Reporter: Mike Dias
Priority: Minor
Labels: pull-request-available

The current behavior checks only the filename to determine whether a file should be processed. I propose adding an option to also test whether the file's timestamp is greater than the last time it was processed, as an indication that it has been modified and has different content. This is useful when the source producer occasionally overwrites files with new content.
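A sketch of how such an option might look from the user side; the option name below is a hypothetical placeholder, since the ticket does not fix a name:

```
# Hypothetical usage sketch: "includeModifiedFiles" is a placeholder for
# whatever option name the proposal would ship, not an existing option.
stream = (
    spark.readStream
    .format("text")
    .option("includeModifiedFiles", "true")  # hypothetical option name
    .load("/data/incoming")
)
```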
[jira] [Updated] (SPARK-47731) Fix the 2b+ rows in a single rowgroup for row_index in Parquet reader
[ https://issues.apache.org/jira/browse/SPARK-47731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thang Long Vu updated SPARK-47731:
Description:

The Parquet reader in Spark has a bug where a file containing 2b+ rows in a single rowgroup causes the row index to overflow the `Integer` range. This prevents Delta Parquet readers from exposing the row_index field as a metadata field. It would be great to have this fixed so that 2b+ rows in a single rowgroup are supported and the row_index field can safely be used in the Delta Parquet readers for any functionality that depends on it.

Link to the comment in the code: https://github.com/delta-io/delta/blob/e3a481bd6c42a4f91686377d78ec9d9c934e27ee/spark/src/main/scala/org/apache/spark/sql/delta/DeltaParquetFileFormat.scala#L200

(was: the same text without the link to the code comment)

Key: SPARK-47731
URL: https://issues.apache.org/jira/browse/SPARK-47731
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.5.0, 4.0.0
Reporter: Thang Long Vu
Priority: Major
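For context, a minimal sketch of the affected feature: recent Spark releases can expose a per-file row index as a hidden metadata column when reading Parquet, and this index computation is what the report says overflows at 2b+ rows per rowgroup:

```
# Reading the per-file row index via the hidden _metadata column
# (available for file sources in recent Spark releases).
df = spark.read.parquet("/data/table").select("*", "_metadata.row_index")
df.show(5)
```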
[jira] [Created] (SPARK-47731) Fix the 2b+ rows in a single rowgroup for row_index in Parquet reader
Thang Long Vu created SPARK-47731:

Summary: Fix the 2b+ rows in a single rowgroup for row_index in Parquet reader
Key: SPARK-47731
URL: https://issues.apache.org/jira/browse/SPARK-47731
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.5.0, 4.0.0
Reporter: Thang Long Vu

The Parquet reader in Spark has a bug where a file containing 2b+ rows in a single rowgroup causes the row index to overflow the `Integer` range. This prevents Delta Parquet readers from exposing the row_index field as a metadata field. It would be great to have this fixed so that 2b+ rows in a single rowgroup are supported and the row_index field can safely be used in the Delta Parquet readers for any functionality that depends on it.
[jira] [Created] (SPARK-47730) Support APP_ID and EXECUTOR_ID placeholder in labels
Xi Chen created SPARK-47730:

Summary: Support APP_ID and EXECUTOR_ID placeholder in labels
Key: SPARK-47730
URL: https://issues.apache.org/jira/browse/SPARK-47730
Project: Spark
Issue Type: Improvement
Components: k8s
Affects Versions: 3.5.1
Reporter: Xi Chen
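A hedged sketch of the requested behavior: the `spark.kubernetes.executor.label.*` config prefix is standard, but the placeholder expansion shown in the values is the proposal, not shipped API:

```
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    # Proposed behavior, not shipped API: expand {{APP_ID}} / {{EXECUTOR_ID}}
    # placeholders inside executor label values.
    .config("spark.kubernetes.executor.label.spark-app", "{{APP_ID}}")
    .config("spark.kubernetes.executor.label.spark-exec", "{{EXECUTOR_ID}}")
    .getOrCreate()
)
```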
[jira] [Resolved] (SPARK-47728) Document G1 Concurrent GC metrics
[ https://issues.apache.org/jira/browse/SPARK-47728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun resolved SPARK-47728.
Fix Version/s: 4.0.0
Resolution: Fixed

Issue resolved by pull request 45874
[https://github.com/apache/spark/pull/45874]

Key: SPARK-47728
URL: https://issues.apache.org/jira/browse/SPARK-47728
Project: Spark
Issue Type: Documentation
Components: Documentation
Affects Versions: 4.0.0
Reporter: Luca Canali
Assignee: Luca Canali
Priority: Minor
Labels: pull-request-available
Fix For: 4.0.0

This is to document the G1 Concurrent GC metrics introduced with https://issues.apache.org/jira/browse/SPARK-44162
[jira] [Assigned] (SPARK-47728) Document G1 Concurrent GC metrics
[ https://issues.apache.org/jira/browse/SPARK-47728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun reassigned SPARK-47728:
Assignee: Luca Canali

Key: SPARK-47728
URL: https://issues.apache.org/jira/browse/SPARK-47728
Project: Spark
Issue Type: Documentation
Components: Documentation
Affects Versions: 4.0.0
Reporter: Luca Canali
Assignee: Luca Canali
Priority: Minor
Labels: pull-request-available

This is to document the G1 Concurrent GC metrics introduced with https://issues.apache.org/jira/browse/SPARK-44162
[jira] [Resolved] (SPARK-47729) Get the proper default port for pyspark-connect testcases
[ https://issues.apache.org/jira/browse/SPARK-47729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun resolved SPARK-47729.
Fix Version/s: 4.0.0
Resolution: Fixed

Issue resolved by pull request 45875
[https://github.com/apache/spark/pull/45875]

Key: SPARK-47729
URL: https://issues.apache.org/jira/browse/SPARK-47729
Project: Spark
Issue Type: Sub-task
Components: PySpark, Tests
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon
Assignee: Hyukjin Kwon
Priority: Minor
Labels: pull-request-available
Fix For: 4.0.0
[jira] [Resolved] (SPARK-47565) PySpark workers dying in daemon mode idle queue fail query
[ https://issues.apache.org/jira/browse/SPARK-47565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-47565.
Fix Version/s: 4.0.0
Resolution: Fixed

Issue resolved by pull request 45635
[https://github.com/apache/spark/pull/45635]

Key: SPARK-47565
URL: https://issues.apache.org/jira/browse/SPARK-47565
Project: Spark
Issue Type: Improvement
Components: PySpark
Affects Versions: 3.4.2, 3.5.1, 3.3.4
Reporter: Sebastian Hillig
Assignee: Nikita Awasthi
Priority: Major
Labels: pull-request-available
Fix For: 4.0.0

PySpark workers may die after entering the idle queue in `PythonWorkerFactory`, either because of code that runs in the process or because of external factors. When such a worker is later drawn from the warm pool, it causes an I/O exception on the first read or write.
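A possible mitigation (not the fix itself) is to disable Python worker reuse, so tasks never draw a possibly-dead worker from the idle queue; `spark.python.worker.reuse` is standard configuration, at the cost of a fresh worker per task:

```
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    # Standard config; defaults to "true". Turning reuse off sidesteps dead
    # workers in the idle queue by never reusing workers across tasks.
    .config("spark.python.worker.reuse", "false")
    .getOrCreate()
)
```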
[jira] [Assigned] (SPARK-47565) PySpark workers dying in daemon mode idle queue fail query
[ https://issues.apache.org/jira/browse/SPARK-47565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reassigned SPARK-47565:
Assignee: Nikita Awasthi

Key: SPARK-47565
URL: https://issues.apache.org/jira/browse/SPARK-47565
Project: Spark
Issue Type: Improvement
Components: PySpark
Affects Versions: 3.4.2, 3.5.1, 3.3.4
Reporter: Sebastian Hillig
Assignee: Nikita Awasthi
Priority: Major
Labels: pull-request-available

PySpark workers may die after entering the idle queue in `PythonWorkerFactory`, either because of code that runs in the process or because of external factors. When such a worker is later drawn from the warm pool, it causes an I/O exception on the first read or write.
[jira] [Assigned] (SPARK-47694) Make max message size configurable on client side
[ https://issues.apache.org/jira/browse/SPARK-47694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot reassigned SPARK-47694:
Assignee: Martin Grund (was: Apache Spark)

Key: SPARK-47694
URL: https://issues.apache.org/jira/browse/SPARK-47694
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.1
Reporter: Robert Dillitz
Assignee: Martin Grund
Priority: Major
Labels: pull-request-available
Fix For: 3.4.3

Follow-up to SPARK-42816: Make the limit configurable on the client side.
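For context, the underlying knob being exposed: gRPC clients cap inbound message size via a channel option. The raw grpc-python call below is standard API; how the Spark Connect client surfaces it is exactly what this ticket adds, so treat the wiring as an illustration:

```
import grpc

MAX_BYTES = 256 * 1024 * 1024  # lift the 4 MiB default inbound cap

# Plain gRPC channel to a Spark Connect endpoint (15002 is the default port);
# the Spark Connect client would pass an option like this internally.
channel = grpc.insecure_channel(
    "localhost:15002",
    options=[("grpc.max_receive_message_length", MAX_BYTES)],
)
```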
[jira] [Assigned] (SPARK-47694) Make max message size configurable on client side
[ https://issues.apache.org/jira/browse/SPARK-47694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot reassigned SPARK-47694:
Assignee: Apache Spark (was: Martin Grund)

Key: SPARK-47694
URL: https://issues.apache.org/jira/browse/SPARK-47694
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.1
Reporter: Robert Dillitz
Assignee: Apache Spark
Priority: Major
Labels: pull-request-available
Fix For: 3.4.3

Follow-up to SPARK-42816: Make the limit configurable on the client side.
[jira] [Assigned] (SPARK-47359) StringTranslate (all collations)
[ https://issues.apache.org/jira/browse/SPARK-47359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot reassigned SPARK-47359:
Assignee: (was: Apache Spark)

Key: SPARK-47359
URL: https://issues.apache.org/jira/browse/SPARK-47359
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 4.0.0
Reporter: Uroš Bojanić
Priority: Major
Labels: pull-request-available

Enable collation support for the *StringTranslate* built-in string function in Spark. First confirm the expected behaviour for this function when given collated strings, then move on to implementation and testing. One way to go about this is to consider using _StringSearch_, an efficient ICU service for string matching. Implement the corresponding unit tests (CollationStringExpressionsSuite) and E2E tests (CollationSuite) to reflect how this function should be used with collation in Spark SQL, and feel free to use your chosen Spark SQL editor to experiment with the existing functions to learn more about how they work. In addition, look into the possible use cases and implementations of similar functions in other open-source DBMSs, such as [PostgreSQL|https://www.postgresql.org/docs/].

The goal for this Jira ticket is to implement the *StringTranslate* function so it supports all collation types currently supported in Spark. To understand what changes were introduced to enable full collation support for other existing functions in Spark, take a look at the Spark PRs and Jira tickets for completed tasks in this parent (for example: Contains, StartsWith, EndsWith).

Read more about ICU [Collation Concepts|http://example.com/] and the [Collator|http://example.com/] class, as well as _StringSearch_, in the [ICU user guide|https://unicode-org.github.io/icu/userguide/collation/string-search.html] and [ICU docs|https://unicode-org.github.io/icu-docs/apidoc/released/icu4j/com/ibm/icu/text/StringSearch.html]. Also refer to the Unicode Technical Standard for string [searching|https://www.unicode.org/reports/tr10/#Searching] and [collation|https://www.unicode.org/reports/tr35/tr35-collation.html#Collation_Type_Fallback].
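A hedged sketch of the target behavior in SQL: the COLLATE syntax follows the Spark 4.0 collation work, but the exact collation name and the semantics of translate under a case-insensitive collation are assumptions this ticket is meant to pin down, not confirmed output:

```
# Assumed collation name (early Spark 4.0 used UTF8_BINARY_LCASE); the
# matching semantics shown are what the ticket defines, not confirmed.
spark.sql(
    "SELECT translate('Translate' COLLATE UTF8_BINARY_LCASE, 'rnlt', '123')"
).show()
```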
[jira] [Assigned] (SPARK-47359) StringTranslate (all collations)
[ https://issues.apache.org/jira/browse/SPARK-47359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot reassigned SPARK-47359:
Assignee: Apache Spark

Key: SPARK-47359
URL: https://issues.apache.org/jira/browse/SPARK-47359
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 4.0.0
Reporter: Uroš Bojanić
Assignee: Apache Spark
Priority: Major
Labels: pull-request-available

Enable collation support for the *StringTranslate* built-in string function in Spark. First confirm the expected behaviour for this function when given collated strings, then move on to implementation and testing. One way to go about this is to consider using _StringSearch_, an efficient ICU service for string matching. Implement the corresponding unit tests (CollationStringExpressionsSuite) and E2E tests (CollationSuite) to reflect how this function should be used with collation in Spark SQL, and feel free to use your chosen Spark SQL editor to experiment with the existing functions to learn more about how they work. In addition, look into the possible use cases and implementations of similar functions in other open-source DBMSs, such as [PostgreSQL|https://www.postgresql.org/docs/].

The goal for this Jira ticket is to implement the *StringTranslate* function so it supports all collation types currently supported in Spark. To understand what changes were introduced to enable full collation support for other existing functions in Spark, take a look at the Spark PRs and Jira tickets for completed tasks in this parent (for example: Contains, StartsWith, EndsWith).

Read more about ICU [Collation Concepts|http://example.com/] and the [Collator|http://example.com/] class, as well as _StringSearch_, in the [ICU user guide|https://unicode-org.github.io/icu/userguide/collation/string-search.html] and [ICU docs|https://unicode-org.github.io/icu-docs/apidoc/released/icu4j/com/ibm/icu/text/StringSearch.html]. Also refer to the Unicode Technical Standard for string [searching|https://www.unicode.org/reports/tr10/#Searching] and [collation|https://www.unicode.org/reports/tr35/tr35-collation.html#Collation_Type_Fallback].
[jira] [Assigned] (SPARK-47567) StringLocate
[ https://issues.apache.org/jira/browse/SPARK-47567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot reassigned SPARK-47567:
Assignee: Apache Spark

Key: SPARK-47567
URL: https://issues.apache.org/jira/browse/SPARK-47567
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 4.0.0
Reporter: Milan Dankovic
Assignee: Apache Spark
Priority: Major
Labels: pull-request-available

Enable collation support for the *StringLocate* built-in string function in Spark. First confirm the expected behaviour for this function when given collated strings, then move on to implementation and testing. One way to go about this is to consider using _StringSearch_, an efficient ICU service for string matching. Implement the corresponding unit tests (CollationStringExpressionsSuite) and E2E tests (CollationSuite) to reflect how this function should be used with collation in Spark SQL, and feel free to use your chosen Spark SQL editor to experiment with the existing functions to learn more about how they work. In addition, look into the possible use cases and implementations of similar functions in other open-source DBMSs, such as [PostgreSQL|https://www.postgresql.org/docs/].

The goal for this Jira ticket is to implement the *StringLocate* function so that it supports all collation types currently supported in Spark. To understand what changes were introduced to enable full collation support for other existing functions in Spark, take a look at the Spark PRs and Jira tickets for completed tasks in this parent (for example: Contains, StartsWith, EndsWith).

Read more about ICU [Collation Concepts|http://example.com/] and the [Collator|http://example.com/] class, as well as _StringSearch_, in the [ICU user guide|https://unicode-org.github.io/icu/userguide/collation/string-search.html] and [ICU docs|https://unicode-org.github.io/icu-docs/apidoc/released/icu4j/com/ibm/icu/text/StringSearch.html]. Also refer to the Unicode Technical Standard for string [searching|https://www.unicode.org/reports/tr10/#Searching] and [collation|https://www.unicode.org/reports/tr35/tr35-collation.html#Collation_Type_Fallback].
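A similar hedged SQL sketch for locate under a case-insensitive collation; finding 'B' in 'abc' here is the behavior this ticket is meant to pin down, not confirmed semantics:

```
# Assumed collation name and matching behavior; locate(substr, str) itself
# is a standard Spark SQL function.
spark.sql(
    "SELECT locate('B', 'abc' COLLATE UTF8_BINARY_LCASE)"
).show()
```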
[jira] [Assigned] (SPARK-47567) StringLocate
[ https://issues.apache.org/jira/browse/SPARK-47567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot reassigned SPARK-47567:
Assignee: (was: Apache Spark)

Key: SPARK-47567
URL: https://issues.apache.org/jira/browse/SPARK-47567
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 4.0.0
Reporter: Milan Dankovic
Priority: Major
Labels: pull-request-available

Enable collation support for the *StringLocate* built-in string function in Spark. First confirm the expected behaviour for this function when given collated strings, then move on to implementation and testing. One way to go about this is to consider using _StringSearch_, an efficient ICU service for string matching. Implement the corresponding unit tests (CollationStringExpressionsSuite) and E2E tests (CollationSuite) to reflect how this function should be used with collation in Spark SQL, and feel free to use your chosen Spark SQL editor to experiment with the existing functions to learn more about how they work. In addition, look into the possible use cases and implementations of similar functions in other open-source DBMSs, such as [PostgreSQL|https://www.postgresql.org/docs/].

The goal for this Jira ticket is to implement the *StringLocate* function so that it supports all collation types currently supported in Spark. To understand what changes were introduced to enable full collation support for other existing functions in Spark, take a look at the Spark PRs and Jira tickets for completed tasks in this parent (for example: Contains, StartsWith, EndsWith).

Read more about ICU [Collation Concepts|http://example.com/] and the [Collator|http://example.com/] class, as well as _StringSearch_, in the [ICU user guide|https://unicode-org.github.io/icu/userguide/collation/string-search.html] and [ICU docs|https://unicode-org.github.io/icu-docs/apidoc/released/icu4j/com/ibm/icu/text/StringSearch.html]. Also refer to the Unicode Technical Standard for string [searching|https://www.unicode.org/reports/tr10/#Searching] and [collation|https://www.unicode.org/reports/tr35/tr35-collation.html#Collation_Type_Fallback].
[jira] [Updated] (SPARK-47593) Connector module: Migrate logWarn with variables to structured logging framework
[ https://issues.apache.org/jira/browse/SPARK-47593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-47593:
Labels: pull-request-available (was: )

Key: SPARK-47593
URL: https://issues.apache.org/jira/browse/SPARK-47593
Project: Spark
Issue Type: Sub-task
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Gengliang Wang
Priority: Major
Labels: pull-request-available
[jira] [Updated] (SPARK-47586) Hive module: Migrate logError with variables to structured logging framework
[ https://issues.apache.org/jira/browse/SPARK-47586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-47586:
Labels: pull-request-available (was: )

Key: SPARK-47586
URL: https://issues.apache.org/jira/browse/SPARK-47586
Project: Spark
Issue Type: Sub-task
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Gengliang Wang
Priority: Major
Labels: pull-request-available
[jira] [Updated] (SPARK-47727) Make SparkConf root level for both SparkSession and SparkContext
[ https://issues.apache.org/jira/browse/SPARK-47727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-47727:
Labels: pull-request-available (was: )

Key: SPARK-47727
URL: https://issues.apache.org/jira/browse/SPARK-47727
Project: Spark
Issue Type: Sub-task
Components: PySpark
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon
Priority: Major
Labels: pull-request-available
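Reading the summary literally, a minimal sketch of the intent: SparkConf importable from the package root and usable to configure both entry points (all standard PySpark API):

```
from pyspark import SparkConf
from pyspark.sql import SparkSession

conf = SparkConf().setAppName("conf-demo").set("spark.ui.showConsoleProgress", "false")

spark = SparkSession.builder.config(conf=conf).getOrCreate()  # SparkSession path
sc = spark.sparkContext                                       # SparkContext path
```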
[jira] [Updated] (SPARK-47728) Document G1 Concurrent GC metrics
[ https://issues.apache.org/jira/browse/SPARK-47728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-47728:
Labels: pull-request-available (was: )

Key: SPARK-47728
URL: https://issues.apache.org/jira/browse/SPARK-47728
Project: Spark
Issue Type: Documentation
Components: Documentation
Affects Versions: 4.0.0
Reporter: Luca Canali
Priority: Minor
Labels: pull-request-available

This is to document the G1 Concurrent GC metrics introduced with https://issues.apache.org/jira/browse/SPARK-44162
[jira] [Created] (SPARK-47728) Document G1 Concurrent GC metrics
Luca Canali created SPARK-47728:

Summary: Document G1 Concurrent GC metrics
Key: SPARK-47728
URL: https://issues.apache.org/jira/browse/SPARK-47728
Project: Spark
Issue Type: Documentation
Components: Documentation
Affects Versions: 4.0.0
Reporter: Luca Canali

This is to document the G1 Concurrent GC metrics introduced with https://issues.apache.org/jira/browse/SPARK-44162
[jira] [Created] (SPARK-47727) Make SparkConf root level for both SparkSession and SparkContext
Hyukjin Kwon created SPARK-47727:

Summary: Make SparkConf root level for both SparkSession and SparkContext
Key: SPARK-47727
URL: https://issues.apache.org/jira/browse/SPARK-47727
Project: Spark
Issue Type: Sub-task
Components: PySpark
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon
[jira] [Updated] (SPARK-47726) Document push-based shuffle metrics
[ https://issues.apache.org/jira/browse/SPARK-47726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-47726:
Labels: pull-request-available (was: )

Key: SPARK-47726
URL: https://issues.apache.org/jira/browse/SPARK-47726
Project: Spark
Issue Type: Documentation
Components: Documentation
Affects Versions: 3.4.2, 4.0.0, 3.5.1
Reporter: Luca Canali
Priority: Minor
Labels: pull-request-available

This is to add documentation for the metrics related to push-based shuffle. It is a follow-up documentation ticket from https://issues.apache.org/jira/browse/SPARK-36620. Related to this, note also https://issues.apache.org/jira/browse/SPARK-42203.
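For context, push-based shuffle (whose metrics this ticket documents) is enabled with standard configuration and, in current releases, requires YARN with the external shuffle service:

```
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .config("spark.shuffle.push.enabled", "true")     # client-side opt-in
    .config("spark.shuffle.service.enabled", "true")  # external shuffle service
    .getOrCreate()
)
```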
[jira] [Created] (SPARK-47726) Document push-based shuffle metrics
Luca Canali created SPARK-47726:

Summary: Document push-based shuffle metrics
Key: SPARK-47726
URL: https://issues.apache.org/jira/browse/SPARK-47726
Project: Spark
Issue Type: Documentation
Components: Documentation
Affects Versions: 3.5.1, 3.4.2, 4.0.0
Reporter: Luca Canali

This is to add documentation for the metrics related to push-based shuffle. It is a follow-up documentation ticket from https://issues.apache.org/jira/browse/SPARK-36620. Related to this, note also https://issues.apache.org/jira/browse/SPARK-42203.