[jira] [Commented] (SPARK-44782) Adjust Pull Request Template to incorporate the ASF Generative Tooling Guidance recommendations
[ https://issues.apache.org/jira/browse/SPARK-44782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17754482#comment-17754482 ]

Maciej Szymkiewicz commented on SPARK-44782:

Created a pull request for this issue:
https://github.com/apache/spark/pull/42469

> Adjust Pull Request Template to incorporate the ASF Generative Tooling Guidance recommendations
> ---
>
> Key: SPARK-44782
> URL: https://issues.apache.org/jira/browse/SPARK-44782
> Project: Spark
> Issue Type: Improvement
> Components: Project Infra
> Affects Versions: 3.3.2, 3.4.1
> Reporter: Maciej Szymkiewicz
> Priority: Major
>
> The recently released [ASF Generative Tooling Guidance|https://www.apache.org/legal/generative-tooling.html] recommends keeping track of the generative AI tools used to author patches:
> ??When providing contributions authored using generative AI tooling, a recommended practice is for contributors to indicate the tooling used to create the contribution. This should be included as a token in the source control commit message, for example including the phrase “Generated-by: ”. This allows for future release tooling to be considered that pulls this content into a machine parsable Tooling-Provenance file.??
> We should adjust the PR template accordingly.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-44782) Adjust Pull Request Template to incorporate the ASF Generative Tooling Guidance recommendations
Maciej Szymkiewicz created SPARK-44782:
--

Summary: Adjust Pull Request Template to incorporate the ASF Generative Tooling Guidance recommendations
Key: SPARK-44782
URL: https://issues.apache.org/jira/browse/SPARK-44782
Project: Spark
Issue Type: Improvement
Components: Project Infra
Affects Versions: 3.4.1, 3.3.2
Reporter: Maciej Szymkiewicz

The recently released [ASF Generative Tooling Guidance|https://www.apache.org/legal/generative-tooling.html] recommends keeping track of the generative AI tools used to author patches:

??When providing contributions authored using generative AI tooling, a recommended practice is for contributors to indicate the tooling used to create the contribution. This should be included as a token in the source control commit message, for example including the phrase “Generated-by: ”. This allows for future release tooling to be considered that pulls this content into a machine parsable Tooling-Provenance file.??

We should adjust the PR template accordingly.
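The guidance quoted above mentions a "Generated-by: " token in the commit message that future tooling could collect. As a rough illustration only, the sketch below pulls such tokens out of a commit message. The function name `find_generated_by`, the one-token-per-line layout, the case-insensitive match, and the sample message (including the placeholder ticket number) are all assumptions made for this example; neither the guidance nor this issue defines a parser.

```python
import re
from typing import List

# Assumption: the token sits on its own line as "Generated-by: <tool>",
# following the example phrase in the quoted guidance.
_GENERATED_BY = re.compile(
    r"^Generated-by:[ \t]*(.*)$", re.IGNORECASE | re.MULTILINE
)


def find_generated_by(commit_message: str) -> List[str]:
    """Return the non-empty tool names found on Generated-by lines, in order."""
    return [m.strip() for m in _GENERATED_BY.findall(commit_message) if m.strip()]


# Hypothetical commit message; the ticket number and tool name are placeholders.
message = """[SPARK-00000][INFRA] Example change

Body of the commit message.

Generated-by: ExampleTool 1.0
"""

print(find_generated_by(message))
```

Lines where the token is present but empty are skipped, so a PR template that ships a blank "Generated-by: " placeholder would not produce spurious entries.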
[jira] [Commented] (SPARK-42910) Generic annotation of class attribute in abstract class is NOT initalized in inherited classes
[ https://issues.apache.org/jira/browse/SPARK-42910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17705063#comment-17705063 ]

Maciej Szymkiewicz commented on SPARK-42910:

[~gurwls223] After further investigation, it looks like this is a {{cloudpickle}} issue that was resolved somewhere between 2.0.0 and 2.2.0. I guess we could just backport SPARK-40991.

> Generic annotation of class attribute in abstract class is NOT initalized in inherited classes
> --
>
> Key: SPARK-42910
> URL: https://issues.apache.org/jira/browse/SPARK-42910
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 3.3.0, 3.3.2
> Environment: Tested in two environments:
> # Databricks
> PySpark Version: 3.3.0
> Python Version: 3.9.15
> # Local
> PySpark Version: 3.3.2
> Python Version: 3.3.10
> Reporter: Jon Farzanfar
> Priority: Minor
>
> We are trying to leverage generics to better type our code base. The example below shows the problem we are having: without generics this works completely fine in PySpark, but with generics it fails, even though the same code works locally without PySpark.
> Output for local:
>
> {code:java}
> {code}
>
> TraceBack for pyspark:
> {code:java}
> AttributeError: type object 'C' has no attribute 'base_record'
>     at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:559)
>     at org.apache.spark.api.python.PythonRunner$$anon$3.read(PythonRunner.scala:765)
>     at org.apache.spark.api.python.PythonRunner$$anon$3.read(PythonRunner.scala:747)
>     at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:512)
>     at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
>     at scala.collection.Iterator.foreach(Iterator.scala:943)
>     at scala.collection.Iterator.foreach$(Iterator.scala:943)
>     at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
>     at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:62)
>     at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:53)
>     at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:105)
>     at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:49)
>     at scala.collection.TraversableOnce.to(TraversableOnce.scala:366)
>     at scala.collection.TraversableOnce.to$(TraversableOnce.scala:364)
>     at org.apache.spark.InterruptibleIterator.to(InterruptibleIterator.scala:28)
>     at scala.collection.TraversableOnce.toBuffer(TraversableOnce.scala:358)
>     at scala.collection.TraversableOnce.toBuffer$(TraversableOnce.scala:358)
>     at org.apache.spark.InterruptibleIterator.toBuffer(InterruptibleIterator.scala:28)
>     at scala.collection.TraversableOnce.toArray(TraversableOnce.scala:345)
>     at scala.collection.TraversableOnce.toArray$(TraversableOnce.scala:339)
>     at org.apache.spark.InterruptibleIterator.toArray(InterruptibleIterator.scala:28)
>     at org.apache.spark.rdd.RDD.$anonfun$collect$2(RDD.scala:1021)
>     at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2268)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
>     at org.apache.spark.scheduler.Task.run(Task.scala:136)
>     at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
>     at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     ... 1 more
> {code}
>
> Code:
>
> {code:java}
> from abc import ABC
> from typing import Generic, TypeVar, Callable
> from operator import add
> from pyspark.sql import SparkSession
> T = TypeVar("T")
> class Foo:
>     ...
> class A(ABC, Generic[T]):
>     base_record: Callable[..., T]
> class B(A):
>     base_record = Foo
> class C(B):
>     ...
> def f(_: int) -> int:
>     print(C.base_record)
>     return 1
> spark = SparkSession\
>     .builder\
>     .appName("schema_test")\
>     .getOrCreate()
> spark.sparkContext.parallelize(range(1, 100)).map(f).reduce(add)
> {code}
[jira] [Commented] (SPARK-42910) Generic annotation of class attribute in abstract class is NOT initalized in inherited classes
[ https://issues.apache.org/jira/browse/SPARK-42910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17704575#comment-17704575 ]

Maciej Szymkiewicz commented on SPARK-42910:

It is no longer generic, so that cannot be a problem.

Additionally, the issue seems to disappear when classes are defined externally:
{code:python}
# foo.py
from abc import ABC
from typing import Generic, TypeVar, Callable

T = TypeVar("T")

class Foo:
    ...

class A(ABC, Generic[T]):
    base_record: Callable[..., T]

class B(A):
    base_record = Foo

class C(B):
    ...

def f(_: int) -> int:
    print(C.base_record)
    return 1
{code}
and then
{code:python}
from operator import add
from foo import C, f
from pyspark.sql import SparkSession

spark = SparkSession\
    .builder\
    .appName("schema_test")\
    .getOrCreate()
spark.sparkContext.parallelize(range(1, 100)).map(f).reduce(add)
{code}
so it makes sense to focus further investigation on how we prepare locally defined classes for shipping over the wire.
[jira] [Comment Edited] (SPARK-42910) Generic annotation of class attribute in abstract class is NOT initalized in inherited classes
[ https://issues.apache.org/jira/browse/SPARK-42910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17704575#comment-17704575 ]

Maciej Szymkiewicz edited comment on SPARK-42910 at 3/24/23 11:15 AM:
--

It is no longer generic, so that cannot be a problem.

Additionally, the issue seems to disappear when classes are defined externally:
{code:python}
# foo.py
from abc import ABC
from typing import Generic, TypeVar, Callable

T = TypeVar("T")

class Foo:
    ...

class A(ABC, Generic[T]):
    base_record: Callable[..., T]

class B(A):
    base_record = Foo

class C(B):
    ...

def f(_: int) -> int:
    print(C.base_record)
    return 1
{code}
and then
{code:python}
from operator import add
from foo import C, f
from pyspark.sql import SparkSession

spark = SparkSession\
    .builder\
    .appName("schema_test")\
    .getOrCreate()
spark.sparkContext.parallelize(range(1, 100)).map(f).reduce(add)
{code}
so it makes sense to focus further investigation on how we prepare locally defined classes for shipping over the wire.

was (Author: zero323):
It is no longer generic, so that cannot be a problem.

Additionally, the issue seems to disappear when classes are defined externally:
{code:python}
# foo.py
from abc import ABC
from typing import Generic, TypeVar, Callable

T = TypeVar("T")

class Foo:
    ...

class A(ABC, Generic[T]):
    base_record: Callable[..., T]

class B(A):
    base_record = Foo

class C(B):
    ...

def f(_: int) -> int:
    print(C.base_record)
    return 1
{code}
and then
{code: python}
from operator import add
from foo import C, f
from pyspark.sql import SparkSession

spark = SparkSession\
    .builder\
    .appName("schema_test")\
    .getOrCreate()
spark.sparkContext.parallelize(range(1, 100)).map(f).reduce(add)
{code}
so it makes sense to focus further investigation on how we prepare locally defined classes for shipping over the wire.
[jira] [Commented] (SPARK-42910) Generic annotation of class attribute in abstract class is NOT initalized in inherited classes
[ https://issues.apache.org/jira/browse/SPARK-42910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17704572#comment-17704572 ]

Maciej Szymkiewicz commented on SPARK-42910:

Thanks [~gurwls223].

I only glanced over this, but an obvious observation is that the type hierarchy is messed up on the worker. {{C.mro()}} is (to the module)
{code:python}
[__main__.C, __main__.B, __main__.A, abc.ABC, typing.Generic, object]
{code}
at the point of definition / import, and
{code:python}
[, , , ]
{code}
on the worker. This seems to be consistent across {{serializers}}, as far as I can tell.

It seems to me that {{B}} should be properly initialized as {{A[Foo]}}, i.e.
{code:python}
class B(A[Foo]):
    base_record = Foo
{code}
This also adjusts the worker-side {{mro}} to
{code:python}
[, , , , ]
{code}
but I don't see the problem with the {{C}} definition.
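The {{A[Foo]}} suggestion in the comment above can be checked on the driver side with plain Python. The sketch below only shows the hierarchy as seen where the classes are defined; it does not involve Spark or any serializer, so it neither reproduces nor explains the worker-side result, whose listing is not recoverable from this archive.

```python
from abc import ABC
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class Foo:
    ...


class A(ABC, Generic[T]):
    base_record: Callable[..., T]


# Parameterized base, as suggested in the comment: B(A[Foo]) instead of B(A).
class B(A[Foo]):
    base_record = Foo


class C(B):
    ...


# Driver-side view only: class names in method resolution order.
print([cls.__name__ for cls in C.mro()])
# The attribute set on B is inherited by C at the point of definition.
print(C.base_record is Foo)
```

Parameterizing the base does not add an extra class to the local MRO, because the subscripted alias resolves back to {{A}} when the class is created.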
[jira] [Created] (SPARK-41267) Add unpivot / melt to SparkR
Maciej Szymkiewicz created SPARK-41267:
--

Summary: Add unpivot / melt to SparkR
Key: SPARK-41267
URL: https://issues.apache.org/jira/browse/SPARK-41267
Project: Spark
Issue Type: Improvement
Components: R, SQL
Affects Versions: 3.4.0
Reporter: Maciej Szymkiewicz

Unpivot / melt operations have been implemented for Scala {{Dataset}} and core Python {{DataFrame}}, but are missing from SparkR.

We should add these to achieve feature parity.
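For readers unfamiliar with the operation, the sketch below shows what unpivot / melt does to a small wide table, using plain Python dictionaries. It is not the Spark or SparkR API: the function name `unpivot`, its parameters, and the default output column names are assumptions chosen for this illustration.

```python
from typing import Any, Dict, List, Sequence

Row = Dict[str, Any]


def unpivot(
    rows: Sequence[Row],
    ids: Sequence[str],
    values: Sequence[str],
    variable_name: str = "variable",
    value_name: str = "value",
) -> List[Row]:
    """Turn each value column of each input row into its own output row."""
    out: List[Row] = []
    for row in rows:
        for col in values:
            new_row = {k: row[k] for k in ids}  # identifier columns are kept
            new_row[variable_name] = col        # name of the melted column
            new_row[value_name] = row[col]      # its value in this row
            out.append(new_row)
    return out


wide = [{"id": 1, "x": 10, "y": 20}, {"id": 2, "x": 30, "y": 40}]
for r in unpivot(wide, ids=["id"], values=["x", "y"]):
    print(r)
```

Each input row yields one output row per value column, so a table with two rows and two value columns produces four long-format rows.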
[jira] [Commented] (SPARK-40273) Fix the documents "Contributing and Maintaining Type Hints".
[ https://issues.apache.org/jira/browse/SPARK-40273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17601213#comment-17601213 ]

Maciej Szymkiewicz commented on SPARK-40273:

Issue resolved by pull request 37724
https://github.com/apache/spark/pull/37724

> Fix the documents "Contributing and Maintaining Type Hints".
>
> Key: SPARK-40273
> URL: https://issues.apache.org/jira/browse/SPARK-40273
> Project: Spark
> Issue Type: Test
> Components: Documentation, PySpark
> Affects Versions: 3.4.0
> Reporter: Haejoon Lee
> Assignee: Haejoon Lee
> Priority: Major
> Fix For: 3.4.0
>
> Since we don't use `*.pyi` for type hinting anymore (it's all ported as inline type hints in the `*.py` files), we should also fix the related documents accordingly
> (https://spark.apache.org/docs/latest/api/python/development/contributing.html#contributing-and-maintaining-type-hints)
[jira] [Resolved] (SPARK-40273) Fix the documents "Contributing and Maintaining Type Hints".
[ https://issues.apache.org/jira/browse/SPARK-40273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maciej Szymkiewicz resolved SPARK-40273.

Fix Version/s: 3.4.0
Assignee: Haejoon Lee
Resolution: Fixed

> Fix the documents "Contributing and Maintaining Type Hints".
>
> Key: SPARK-40273
> URL: https://issues.apache.org/jira/browse/SPARK-40273
> Project: Spark
> Issue Type: Test
> Components: Documentation, PySpark
> Affects Versions: 3.4.0
> Reporter: Haejoon Lee
> Assignee: Haejoon Lee
> Priority: Major
> Fix For: 3.4.0
>
> Since we don't use `*.pyi` for type hinting anymore (it's all ported as inline type hints in the `*.py` files), we should also fix the related documents accordingly
> (https://spark.apache.org/docs/latest/api/python/development/contributing.html#contributing-and-maintaining-type-hints)
[jira] [Resolved] (SPARK-40166) Add array_sort(column, comparator) to PySpark
[ https://issues.apache.org/jira/browse/SPARK-40166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maciej Szymkiewicz resolved SPARK-40166.

Fix Version/s: 3.4.0
Resolution: Fixed

Issue resolved by pull request 37600
[https://github.com/apache/spark/pull/37600]

> Add array_sort(column, comparator) to PySpark
> -
>
> Key: SPARK-40166
> URL: https://issues.apache.org/jira/browse/SPARK-40166
> Project: Spark
> Issue Type: Improvement
> Components: PySpark, SQL
> Affects Versions: 3.4.0
> Reporter: Maciej Szymkiewicz
> Assignee: Maciej Szymkiewicz
> Priority: Minor
> Fix For: 3.4.0
>
> SPARK-39925 exposed array_sort(column, comparator) on JVM. It should be available in Python as well.
[jira] [Assigned] (SPARK-40166) Add array_sort(column, comparator) to PySpark
[ https://issues.apache.org/jira/browse/SPARK-40166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maciej Szymkiewicz reassigned SPARK-40166:
--

Assignee: Maciej Szymkiewicz

> Add array_sort(column, comparator) to PySpark
> -
>
> Key: SPARK-40166
> URL: https://issues.apache.org/jira/browse/SPARK-40166
> Project: Spark
> Issue Type: Improvement
> Components: PySpark, SQL
> Affects Versions: 3.4.0
> Reporter: Maciej Szymkiewicz
> Assignee: Maciej Szymkiewicz
> Priority: Minor
>
> SPARK-39925 exposed array_sort(column, comparator) on JVM. It should be available in Python as well.
[jira] [Resolved] (SPARK-40167) Add array_sort(column, comparator) to SparkR
[ https://issues.apache.org/jira/browse/SPARK-40167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maciej Szymkiewicz resolved SPARK-40167.

Fix Version/s: 3.4.0
Resolution: Fixed

Issue resolved by pull request 37600
[https://github.com/apache/spark/pull/37600]

> Add array_sort(column, comparator) to SparkR
>
> Key: SPARK-40167
> URL: https://issues.apache.org/jira/browse/SPARK-40167
> Project: Spark
> Issue Type: Improvement
> Components: R, SQL
> Affects Versions: 3.4.0
> Reporter: Maciej Szymkiewicz
> Assignee: Maciej Szymkiewicz
> Priority: Minor
> Fix For: 3.4.0
>
> SPARK-39925 exposed array_sort(column, comparator) on JVM. It should be available in R as well.
[jira] [Assigned] (SPARK-40167) Add array_sort(column, comparator) to SparkR
[ https://issues.apache.org/jira/browse/SPARK-40167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maciej Szymkiewicz reassigned SPARK-40167:
--

Assignee: Maciej Szymkiewicz

> Add array_sort(column, comparator) to SparkR
>
> Key: SPARK-40167
> URL: https://issues.apache.org/jira/browse/SPARK-40167
> Project: Spark
> Issue Type: Improvement
> Components: R, SQL
> Affects Versions: 3.4.0
> Reporter: Maciej Szymkiewicz
> Assignee: Maciej Szymkiewicz
> Priority: Minor
>
> SPARK-39925 exposed array_sort(column, comparator) on JVM. It should be available in R as well.
[jira] [Created] (SPARK-40167) Add array_sort(column, comparator) to SparkR
Maciej Szymkiewicz created SPARK-40167:
--

Summary: Add array_sort(column, comparator) to SparkR
Key: SPARK-40167
URL: https://issues.apache.org/jira/browse/SPARK-40167
Project: Spark
Issue Type: Improvement
Components: R, SQL
Affects Versions: 3.4.0
Reporter: Maciej Szymkiewicz

SPARK-39925 exposed array_sort(column, comparator) on JVM. It should be available in R as well.
[jira] [Created] (SPARK-40166) Add array_sort(column, comparator) to PySpark
Maciej Szymkiewicz created SPARK-40166:
--

Summary: Add array_sort(column, comparator) to PySpark
Key: SPARK-40166
URL: https://issues.apache.org/jira/browse/SPARK-40166
Project: Spark
Issue Type: Improvement
Components: PySpark, SQL
Affects Versions: 3.4.0
Reporter: Maciej Szymkiewicz

SPARK-39925 exposed array_sort(column, comparator) on JVM. It should be available in Python as well.
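As background for the {{array_sort(column, comparator)}} requests above, the sketch below illustrates comparator-based sorting in plain Python: the comparator takes two elements and returns a negative number, zero, or a positive number. It does not call Spark; `sort_with_comparator` and `desc_nulls_last` are hypothetical names used only for this illustration, and the ordering chosen (descending, with missing values last) is just one example of what a custom comparator can express.

```python
from functools import cmp_to_key
from typing import Callable, List, Optional


def desc_nulls_last(x: Optional[int], y: Optional[int]) -> int:
    """Comparator: descending order, with None placed after all numbers."""
    if x is None and y is None:
        return 0
    if x is None:
        return 1   # x sorts after y
    if y is None:
        return -1  # x sorts before y
    return (y > x) - (y < x)  # negative when x is larger, so larger comes first


def sort_with_comparator(
    values: List[Optional[int]],
    comparator: Callable[[Optional[int], Optional[int]], int],
) -> List[Optional[int]]:
    """Sort a list using a two-argument comparator instead of a key function."""
    return sorted(values, key=cmp_to_key(comparator))


print(sort_with_comparator([3, None, 1, 2], desc_nulls_last))
```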
[jira] [Resolved] (SPARK-39832) regexp_replace should support column arguments
[ https://issues.apache.org/jira/browse/SPARK-39832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maciej Szymkiewicz resolved SPARK-39832.

Fix Version/s: 3.4.0
Resolution: Fixed

Issue resolved by pull request 37329
[https://github.com/apache/spark/pull/37329]

> regexp_replace should support column arguments
> --
>
> Key: SPARK-39832
> URL: https://issues.apache.org/jira/browse/SPARK-39832
> Project: Spark
> Issue Type: Improvement
> Components: PySpark
> Affects Versions: 3.3.0
> Reporter: Brian Schaefer
> Assignee: Brian Schaefer
> Priority: Major
> Labels: starter
> Fix For: 3.4.0
>
> {{F.regexp_replace}} in PySpark currently only supports strings for the second and third argument:
> [https://github.com/apache/spark/blob/1df6006ea977ae3b8c53fe33630e277e8c1bc49c/python/pyspark/sql/functions.py#L3265]
> In Scala, columns are also supported:
> [https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L2836|https://github.com/apache/spark/blob/1df6006ea977ae3b8c53fe33630e277e8c1bc49c/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L2836]
> The desire to use columns as arguments for the function has been raised previously on StackExchange:
> [https://stackoverflow.com/questions/64613761/in-pyspark-using-regexp-replace-how-to-replace-a-group-with-value-from-another|https://stackoverflow.com/questions/64613761/in-pyspark-using-regexp-replace-how-to-replace-a-group-with-value-from-another,],
> where the suggested fix was to use {{F.expr}}.
> It should be relatively straightforward to support in PySpark the two function signatures supported in Scala.
[jira] [Assigned] (SPARK-39832) regexp_replace should support column arguments
[ https://issues.apache.org/jira/browse/SPARK-39832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maciej Szymkiewicz reassigned SPARK-39832:
--

Assignee: Brian Schaefer

> regexp_replace should support column arguments
> --
>
> Key: SPARK-39832
> URL: https://issues.apache.org/jira/browse/SPARK-39832
> Project: Spark
> Issue Type: Improvement
> Components: PySpark
> Affects Versions: 3.3.0
> Reporter: Brian Schaefer
> Assignee: Brian Schaefer
> Priority: Major
> Labels: starter
>
> {{F.regexp_replace}} in PySpark currently only supports strings for the second and third argument:
> [https://github.com/apache/spark/blob/1df6006ea977ae3b8c53fe33630e277e8c1bc49c/python/pyspark/sql/functions.py#L3265]
> In Scala, columns are also supported:
> [https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L2836|https://github.com/apache/spark/blob/1df6006ea977ae3b8c53fe33630e277e8c1bc49c/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L2836]
> The desire to use columns as arguments for the function has been raised previously on StackExchange:
> [https://stackoverflow.com/questions/64613761/in-pyspark-using-regexp-replace-how-to-replace-a-group-with-value-from-another|https://stackoverflow.com/questions/64613761/in-pyspark-using-regexp-replace-how-to-replace-a-group-with-value-from-another,],
> where the suggested fix was to use {{F.expr}}.
> It should be relatively straightforward to support in PySpark the two function signatures supported in Scala.
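To make the difference between string arguments and column arguments concrete, the sketch below applies a per-row pattern and replacement using Python's standard `re` module on plain dictionaries. It is only an analogy: it does not use PySpark, the function name `regexp_replace_rowwise` and the sample rows are invented for the example, and Python's regex and replacement syntax is not identical to the Java regex engine Spark uses.

```python
import re
from typing import Dict, List

# With plain string arguments, one pattern and one replacement apply to every
# row. With column arguments, each row can carry its own pattern and replacement.
rows: List[Dict[str, str]] = [
    {"text": "abc-123", "pattern": r"\d+", "replacement": "N"},
    {"text": "abc-123", "pattern": r"[a-z]+", "replacement": "X"},
]


def regexp_replace_rowwise(data: List[Dict[str, str]]) -> List[str]:
    """Apply each row's own pattern and replacement to that row's text."""
    return [re.sub(r["pattern"], r["replacement"], r["text"]) for r in data]


print(regexp_replace_rowwise(rows))
```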
[jira] [Resolved] (SPARK-37015) Inline type hints for python/pyspark/streaming/dstream.py
[ https://issues.apache.org/jira/browse/SPARK-37015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37015. Fix Version/s: 3.3.0 Resolution: Fixed > Inline type hints for python/pyspark/streaming/dstream.py > - > > Key: SPARK-37015 > URL: https://issues.apache.org/jira/browse/SPARK-37015 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37015) Inline type hints for python/pyspark/streaming/dstream.py
[ https://issues.apache.org/jira/browse/SPARK-37015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37015: -- Assignee: dch nguyen > Inline type hints for python/pyspark/streaming/dstream.py > - > > Key: SPARK-37015 > URL: https://issues.apache.org/jira/browse/SPARK-37015 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Assignee: dch nguyen >Priority: Major > Fix For: 3.3.0
[jira] [Resolved] (SPARK-37093) Inline type hints python/pyspark/streaming
[ https://issues.apache.org/jira/browse/SPARK-37093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37093. Fix Version/s: 3.3.0 Resolution: Fixed > Inline type hints python/pyspark/streaming > -- > > Key: SPARK-37093 > URL: https://issues.apache.org/jira/browse/SPARK-37093 > Project: Spark > Issue Type: Umbrella > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Assignee: dch nguyen >Priority: Major > Fix For: 3.3.0
[jira] [Commented] (SPARK-37015) Inline type hints for python/pyspark/streaming/dstream.py
[ https://issues.apache.org/jira/browse/SPARK-37015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17523761#comment-17523761 ] Maciej Szymkiewicz commented on SPARK-37015: Issue resolved by pull request 34324. https://github.com/apache/spark/pull/34324 > Inline type hints for python/pyspark/streaming/dstream.py > - > > Key: SPARK-37015 > URL: https://issues.apache.org/jira/browse/SPARK-37015 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Priority: Major
[jira] [Resolved] (SPARK-37395) Inline type hint files for files in python/pyspark/ml
[ https://issues.apache.org/jira/browse/SPARK-37395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37395. Fix Version/s: 3.3.0 Resolution: Fixed > Inline type hint files for files in python/pyspark/ml > - > > Key: SPARK-37395 > URL: https://issues.apache.org/jira/browse/SPARK-37395 > Project: Spark > Issue Type: Umbrella > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Currently there are type hint stub files ({{*.pyi}}) to show the expected > types for functions, but we can also take advantage of static type checking > within the functions by inlining the type hints.
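As a minimal sketch of the difference described in the umbrella issue: a stub file only declares a signature (for example {{def mean(values: List[float]) -> float: ...}} in a {{.pyi}} file), while inline hints sit next to the implementation, so a type checker can also check the function body. The function {{mean}} below is a hypothetical example, not taken from PySpark.

```python
from typing import List, get_type_hints


def mean(values: List[float]) -> float:
    """Average of a non-empty list, with hints inlined in the .py file."""
    # With inline hints a checker can verify this body as well:
    # `total` is declared float, and the return expression must match `float`.
    total: float = sum(values)
    return total / len(values)


# Inline annotations are also available at runtime, unlike stub-only hints.
hints = get_type_hints(mean)
print(hints)
```

Returning, say, a string from {{mean}} would then be flagged by the checker inside the function itself, which a separate stub declaration alone would not catch.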
[jira] [Resolved] (SPARK-37405) Inline type hints for python/pyspark/ml/feature.py
[ https://issues.apache.org/jira/browse/SPARK-37405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37405. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35530 [https://github.com/apache/spark/pull/35530] > Inline type hints for python/pyspark/ml/feature.py > -- > > Key: SPARK-37405 > URL: https://issues.apache.org/jira/browse/SPARK-37405 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Apache Spark >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/ml/feature.pyi to > python/pyspark/ml/feature.py
[jira] [Resolved] (SPARK-37014) Inline type hints for python/pyspark/streaming/context.py
[ https://issues.apache.org/jira/browse/SPARK-37014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37014. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34293 [https://github.com/apache/spark/pull/34293] > Inline type hints for python/pyspark/streaming/context.py > - > > Key: SPARK-37014 > URL: https://issues.apache.org/jira/browse/SPARK-37014 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Assignee: dch nguyen >Priority: Major > Fix For: 3.3.0
[jira] [Assigned] (SPARK-37014) Inline type hints for python/pyspark/streaming/context.py
[ https://issues.apache.org/jira/browse/SPARK-37014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37014: -- Assignee: dch nguyen > Inline type hints for python/pyspark/streaming/context.py > - > > Key: SPARK-37014 > URL: https://issues.apache.org/jira/browse/SPARK-37014 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Assignee: dch nguyen >Priority: Major
[jira] [Resolved] (SPARK-37424) Inline type hints for python/pyspark/mllib/random.py
[ https://issues.apache.org/jira/browse/SPARK-37424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37424. Fix Version/s: 3.3.0 Assignee: Maciej Szymkiewicz Resolution: Fixed Issue resolved by pull request 35576 https://github.com/apache/spark/pull/35576 > Inline type hints for python/pyspark/mllib/random.py > > > Key: SPARK-37424 > URL: https://issues.apache.org/jira/browse/SPARK-37424 > Project: Spark > Issue Type: Sub-task > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/mllib/random.pyi to > python/pyspark/mllib/random.py
[jira] [Assigned] (SPARK-37396) Inline type hint files for files in python/pyspark/mllib
[ https://issues.apache.org/jira/browse/SPARK-37396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37396: -- Assignee: Maciej Szymkiewicz > Inline type hint files for files in python/pyspark/mllib > > > Key: SPARK-37396 > URL: https://issues.apache.org/jira/browse/SPARK-37396 > Project: Spark > Issue Type: Umbrella > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > > Currently there are type hint stub files ({{*.pyi}}) to show the expected > types for functions, but we can also take advantage of static type checking > within the functions by inlining the type hints.
[jira] [Resolved] (SPARK-37396) Inline type hint files for files in python/pyspark/mllib
[ https://issues.apache.org/jira/browse/SPARK-37396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37396. Fix Version/s: 3.3.0 Resolution: Fixed > Inline type hint files for files in python/pyspark/mllib > > > Key: SPARK-37396 > URL: https://issues.apache.org/jira/browse/SPARK-37396 > Project: Spark > Issue Type: Umbrella > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Currently there are type hint stub files ({{*.pyi}}) to show the expected > types for functions, but we can also take advantage of static type checking > within the functions by inlining the type hints.
[jira] [Assigned] (SPARK-37402) Inline type hints for python/pyspark/mllib/clustering.py
[ https://issues.apache.org/jira/browse/SPARK-37402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37402: -- Assignee: Maciej Szymkiewicz > Inline type hints for python/pyspark/mllib/clustering.py > > > Key: SPARK-37402 > URL: https://issues.apache.org/jira/browse/SPARK-37402 > Project: Spark > Issue Type: Sub-task > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > > Inline type hints from python/pyspark/mllib/clustering.pyi to > python/pyspark/mllib/clustering.py.
[jira] [Resolved] (SPARK-37402) Inline type hints for python/pyspark/mllib/clustering.py
[ https://issues.apache.org/jira/browse/SPARK-37402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37402. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35578 [https://github.com/apache/spark/pull/35578] > Inline type hints for python/pyspark/mllib/clustering.py > > > Key: SPARK-37402 > URL: https://issues.apache.org/jira/browse/SPARK-37402 > Project: Spark > Issue Type: Sub-task > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/mllib/clustering.pyi to > python/pyspark/mllib/clustering.py.
[jira] [Resolved] (SPARK-37234) Inline type hints for python/pyspark/mllib/stat/_statistics.py
[ https://issues.apache.org/jira/browse/SPARK-37234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37234. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34513 [https://github.com/apache/spark/pull/34513] > Inline type hints for python/pyspark/mllib/stat/_statistics.py > -- > > Key: SPARK-37234 > URL: https://issues.apache.org/jira/browse/SPARK-37234 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Assignee: dch nguyen >Priority: Major > Fix For: 3.3.0
[jira] [Assigned] (SPARK-37234) Inline type hints for python/pyspark/mllib/stat/_statistics.py
[ https://issues.apache.org/jira/browse/SPARK-37234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37234: -- Assignee: dch nguyen > Inline type hints for python/pyspark/mllib/stat/_statistics.py > -- > > Key: SPARK-37234 > URL: https://issues.apache.org/jira/browse/SPARK-37234 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Assignee: dch nguyen >Priority: Major
[jira] [Assigned] (SPARK-37395) Inline type hint files for files in python/pyspark/ml
[ https://issues.apache.org/jira/browse/SPARK-37395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37395: -- Assignee: Maciej Szymkiewicz > Inline type hint files for files in python/pyspark/ml > - > > Key: SPARK-37395 > URL: https://issues.apache.org/jira/browse/SPARK-37395 > Project: Spark > Issue Type: Umbrella > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > > Currently there are type hint stub files ({{*.pyi}}) to show the expected > types for functions, but we can also take advantage of static type checking > within the functions by inlining the type hints.
[jira] [Resolved] (SPARK-37398) Inline type hints for python/pyspark/ml/classification.py
[ https://issues.apache.org/jira/browse/SPARK-37398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37398. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 36071 [https://github.com/apache/spark/pull/36071] > Inline type hints for python/pyspark/ml/classification.py > - > > Key: SPARK-37398 > URL: https://issues.apache.org/jira/browse/SPARK-37398 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: alper tankut turker >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/ml/classification.pyi to > python/pyspark/ml/classification.py.
[jira] [Assigned] (SPARK-37398) Inline type hints for python/pyspark/ml/classification.py
[ https://issues.apache.org/jira/browse/SPARK-37398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37398: -- Assignee: alper tankut turker > Inline type hints for python/pyspark/ml/classification.py > - > > Key: SPARK-37398 > URL: https://issues.apache.org/jira/browse/SPARK-37398 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: alper tankut turker >Priority: Major > > Inline type hints from python/pyspark/ml/classification.pyi to > python/pyspark/ml/classification.py.
[jira] [Assigned] (SPARK-37423) Inline type hints for python/pyspark/mllib/fpm.py
[ https://issues.apache.org/jira/browse/SPARK-37423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37423: -- Assignee: dch nguyen > Inline type hints for python/pyspark/mllib/fpm.py > - > > Key: SPARK-37423 > URL: https://issues.apache.org/jira/browse/SPARK-37423 > Project: Spark > Issue Type: Sub-task > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: dch nguyen >Priority: Major > > Inline type hints from python/pyspark/mllib/fpm.pyi to > python/pyspark/mllib/fpm.py
[jira] [Resolved] (SPARK-37423) Inline type hints for python/pyspark/mllib/fpm.py
[ https://issues.apache.org/jira/browse/SPARK-37423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37423. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35067 [https://github.com/apache/spark/pull/35067] > Inline type hints for python/pyspark/mllib/fpm.py > - > > Key: SPARK-37423 > URL: https://issues.apache.org/jira/browse/SPARK-37423 > Project: Spark > Issue Type: Sub-task > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: dch nguyen >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/mllib/fpm.pyi to > python/pyspark/mllib/fpm.py
[jira] [Resolved] (SPARK-37425) Inline type hints for python/pyspark/mllib/recommendation.py
[ https://issues.apache.org/jira/browse/SPARK-37425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37425. Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 35766 [https://github.com/apache/spark/pull/35766] > Inline type hints for python/pyspark/mllib/recommendation.py > > > Key: SPARK-37425 > URL: https://issues.apache.org/jira/browse/SPARK-37425 > Project: Spark > Issue Type: Sub-task > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: dch nguyen >Priority: Major > Fix For: 3.4.0 > > > Inline type hints from python/pyspark/mllib/recommendation.pyi to > python/pyspark/mllib/recommendation.py
[jira] [Assigned] (SPARK-37425) Inline type hints for python/pyspark/mllib/recommendation.py
[ https://issues.apache.org/jira/browse/SPARK-37425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37425: -- Assignee: dch nguyen > Inline type hints for python/pyspark/mllib/recommendation.py > > > Key: SPARK-37425 > URL: https://issues.apache.org/jira/browse/SPARK-37425 > Project: Spark > Issue Type: Sub-task > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: dch nguyen >Priority: Major > > Inline type hints from python/pyspark/mllib/recommendation.pyi to > python/pyspark/mllib/recommendation.py
[jira] [Resolved] (SPARK-37430) Inline type hints for python/pyspark/mllib/linalg/distributed.py
[ https://issues.apache.org/jira/browse/SPARK-37430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37430. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35739 [https://github.com/apache/spark/pull/35739] > Inline type hints for python/pyspark/mllib/linalg/distributed.py > > > Key: SPARK-37430 > URL: https://issues.apache.org/jira/browse/SPARK-37430 > Project: Spark > Issue Type: Sub-task > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: alper tankut turker >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/mllib/linalg/distributed.pyi to > python/pyspark/mllib/linalg/distributed.py
[jira] [Assigned] (SPARK-37430) Inline type hints for python/pyspark/mllib/linalg/distributed.py
[ https://issues.apache.org/jira/browse/SPARK-37430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37430: -- Assignee: alper tankut turker > Inline type hints for python/pyspark/mllib/linalg/distributed.py > > > Key: SPARK-37430 > URL: https://issues.apache.org/jira/browse/SPARK-37430 > Project: Spark > Issue Type: Sub-task > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: alper tankut turker >Priority: Major > > Inline type hints from python/pyspark/mllib/linalg/distributed.pyi to > python/pyspark/mllib/linalg/distributed.py
[jira] [Resolved] (SPARK-37421) Inline type hints for python/pyspark/mllib/evaluation.py
[ https://issues.apache.org/jira/browse/SPARK-37421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37421. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34680 [https://github.com/apache/spark/pull/34680] > Inline type hints for python/pyspark/mllib/evaluation.py > > > Key: SPARK-37421 > URL: https://issues.apache.org/jira/browse/SPARK-37421 > Project: Spark > Issue Type: Sub-task > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: dch nguyen >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/mllib/evaluation.pyi to > python/pyspark/mllib/evaluation.py
[jira] [Assigned] (SPARK-37421) Inline type hints for python/pyspark/mllib/evaluation.py
[ https://issues.apache.org/jira/browse/SPARK-37421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37421: -- Assignee: dch nguyen > Inline type hints for python/pyspark/mllib/evaluation.py > > > Key: SPARK-37421 > URL: https://issues.apache.org/jira/browse/SPARK-37421 > Project: Spark > Issue Type: Sub-task > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: dch nguyen >Priority: Major > > Inline type hints from python/pyspark/mllib/evaluation.pyi to > python/pyspark/mllib/evaluation.py
[jira] [Created] (SPARK-38424) Disallow unused casts and ignores
Maciej Szymkiewicz created SPARK-38424: -- Summary: Disallow unused casts and ignores Key: SPARK-38424 URL: https://issues.apache.org/jira/browse/SPARK-38424 Project: Spark Issue Type: Improvement Components: PySpark Affects Versions: 3.3.0 Reporter: Maciej Szymkiewicz Now that we have almost full typing coverage, we should consider setting the following mypy options: {code} warn_unused_ignores = True warn_redundant_casts = True {code}
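For context on SPARK-38424, the sketch below shows the two kinds of leftovers these mypy options are meant to catch. The functions {{total}} and {{label}} are hypothetical and not taken from the Spark code base. The code runs unchanged at runtime; the difference only shows up when mypy is run with the options enabled.

```python
from typing import List, cast


def total(values: List[int]) -> int:
    # `values` is already declared as List[int], so this cast adds nothing.
    # With warn_redundant_casts = True, mypy reports it as a redundant cast.
    checked = cast(List[int], values)
    return sum(checked)


def label(n: int) -> str:
    # This line has no type error, so the ignore comment suppresses nothing.
    # With warn_unused_ignores = True, mypy reports the comment as unused.
    return str(n)  # type: ignore


print(total([1, 2, 3]), label(7))
```

Such casts and ignores tend to accumulate while hints are being migrated, and the two options make the checker point out the ones that are no longer needed.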
[jira] [Assigned] (SPARK-37426) Inline type hints for python/pyspark/mllib/regression.py
[ https://issues.apache.org/jira/browse/SPARK-37426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37426: -- Assignee: Maciej Szymkiewicz > Inline type hints for python/pyspark/mllib/regression.py > > > Key: SPARK-37426 > URL: https://issues.apache.org/jira/browse/SPARK-37426 > Project: Spark > Issue Type: Sub-task > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > > Inline type hints from python/pyspark/mllib/regression.pyi to > python/pyspark/mllib/regression.py
[jira] [Resolved] (SPARK-37426) Inline type hints for python/pyspark/mllib/regression.py
[ https://issues.apache.org/jira/browse/SPARK-37426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37426. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35585 [https://github.com/apache/spark/pull/35585] > Inline type hints for python/pyspark/mllib/regression.py > > > Key: SPARK-37426 > URL: https://issues.apache.org/jira/browse/SPARK-37426 > Project: Spark > Issue Type: Sub-task > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/mllib/regression.pyi to > python/pyspark/mllib/regression.py
[jira] [Resolved] (SPARK-37400) Inline type hints for python/pyspark/mllib/classification.py
[ https://issues.apache.org/jira/browse/SPARK-37400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37400. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35585 [https://github.com/apache/spark/pull/35585] > Inline type hints for python/pyspark/mllib/classification.py > > > Key: SPARK-37400 > URL: https://issues.apache.org/jira/browse/SPARK-37400 > Project: Spark > Issue Type: Sub-task > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/mllib/classification.pyi to > python/pyspark/mllib/classification.py.
[jira] [Assigned] (SPARK-37400) Inline type hints for python/pyspark/mllib/classification.py
[ https://issues.apache.org/jira/browse/SPARK-37400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37400: -- Assignee: Maciej Szymkiewicz > Inline type hints for python/pyspark/mllib/classification.py > > > Key: SPARK-37400 > URL: https://issues.apache.org/jira/browse/SPARK-37400 > Project: Spark > Issue Type: Sub-task > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > > Inline type hints from python/pyspark/mllib/classification.pyi to > python/pyspark/mllib/classification.py.
[jira] [Resolved] (SPARK-37422) Inline type hints for python/pyspark/mllib/feature.py
[ https://issues.apache.org/jira/browse/SPARK-37422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37422. Fix Version/s: 3.3.0 Assignee: Maciej Szymkiewicz Resolution: Fixed Issue resolved by pull request 35546 https://github.com/apache/spark/pull/35546 > Inline type hints for python/pyspark/mllib/feature.py > - > > Key: SPARK-37422 > URL: https://issues.apache.org/jira/browse/SPARK-37422 > Project: Spark > Issue Type: Sub-task > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/mllib/feature.pyi to > python/pyspark/mllib/feature.py
[jira] [Resolved] (SPARK-37427) Inline type hints for python/pyspark/mllib/tree.py
[ https://issues.apache.org/jira/browse/SPARK-37427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37427. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35545 [https://github.com/apache/spark/pull/35545] > Inline type hints for python/pyspark/mllib/tree.py > -- > > Key: SPARK-37427 > URL: https://issues.apache.org/jira/browse/SPARK-37427 > Project: Spark > Issue Type: Sub-task > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/mllib/tree.pyi to > python/pyspark/mllib/tree.py
[jira] [Assigned] (SPARK-37427) Inline type hints for python/pyspark/mllib/tree.py
[ https://issues.apache.org/jira/browse/SPARK-37427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37427: -- Assignee: Maciej Szymkiewicz > Inline type hints for python/pyspark/mllib/tree.py > -- > > Key: SPARK-37427 > URL: https://issues.apache.org/jira/browse/SPARK-37427 > Project: Spark > Issue Type: Sub-task > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > > Inline type hints from python/pyspark/mllib/tree.pyi to > python/pyspark/mllib/tree.py
[jira] [Resolved] (SPARK-37428) Inline type hints for python/pyspark/mllib/util.py
[ https://issues.apache.org/jira/browse/SPARK-37428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37428. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35532 [https://github.com/apache/spark/pull/35532] > Inline type hints for python/pyspark/mllib/util.py > -- > > Key: SPARK-37428 > URL: https://issues.apache.org/jira/browse/SPARK-37428 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/mllib/util.pyi to > python/pyspark/mllib/util.py
[jira] [Assigned] (SPARK-37428) Inline type hints for python/pyspark/mllib/util.py
[ https://issues.apache.org/jira/browse/SPARK-37428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37428: -- Assignee: Maciej Szymkiewicz > Inline type hints for python/pyspark/mllib/util.py > -- > > Key: SPARK-37428 > URL: https://issues.apache.org/jira/browse/SPARK-37428 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > > Inline type hints from python/pyspark/mllib/util.pyi to > python/pyspark/mllib/util.py
[jira] [Commented] (SPARK-37430) Inline type hints for python/pyspark/mllib/linalg/distributed.py
[ https://issues.apache.org/jira/browse/SPARK-37430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17494939#comment-17494939 ] Maciej Szymkiewicz commented on SPARK-37430: [~jyoti_08] The hard prerequisites are now resolved, so if you still want to work on this, go ahead. If not, please drop a line, so someone else can resolve this. TIA :) > Inline type hints for python/pyspark/mllib/linalg/distributed.py > > > Key: SPARK-37430 > URL: https://issues.apache.org/jira/browse/SPARK-37430 > Project: Spark > Issue Type: Sub-task > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Priority: Major > > Inline type hints from python/pyspark/mllib/linalg/distributed.pyi to > python/pyspark/mllib/linalg/distributed.py
[jira] [Resolved] (SPARK-37094) Inline type hints for files in python/pyspark
[ https://issues.apache.org/jira/browse/SPARK-37094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37094. Fix Version/s: 3.3.0 Resolution: Fixed > Inline type hints for files in python/pyspark > - > > Key: SPARK-37094 > URL: https://issues.apache.org/jira/browse/SPARK-37094 > Project: Spark > Issue Type: Umbrella > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dch nguyen >Assignee: dch nguyen >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-37154) Inline type hints for python/pyspark/rdd.py
[ https://issues.apache.org/jira/browse/SPARK-37154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37154. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35252 [https://github.com/apache/spark/pull/35252] > Inline type hints for python/pyspark/rdd.py > --- > > Key: SPARK-37154 > URL: https://issues.apache.org/jira/browse/SPARK-37154 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Byron Hsu >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37154) Inline type hints for python/pyspark/rdd.py
[ https://issues.apache.org/jira/browse/SPARK-37154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37154: -- Assignee: Maciej Szymkiewicz > Inline type hints for python/pyspark/rdd.py > --- > > Key: SPARK-37154 > URL: https://issues.apache.org/jira/browse/SPARK-37154 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Byron Hsu >Assignee: Maciej Szymkiewicz >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38243) Unintended exception thrown in pyspark.ml.LogisticRegression.getThreshold
[ https://issues.apache.org/jira/browse/SPARK-38243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maciej Szymkiewicz updated SPARK-38243:
---
Description:
If {{LogisticRegression.getThreshold}} is called on a model with multiple thresholds, we are supposed to raise an exception:
{code:python}
ValueError: Logistic Regression getThreshold only applies to binary classification ...
{code}
However, {{thresholds}} ({{List[float]}}) are incorrectly passed to {{str.join}}, resulting in an unintended {{TypeError}}:
{code:python}
>>> from pyspark.ml.classification import LogisticRegression
>>> model = LogisticRegression(thresholds=[1.0, 2.0, 3.0])
>>> model.getThreshold()
Traceback (most recent call last):
  Input In [7] in <module>
    model.getThreshold()
  File /path/to/spark/python/pyspark/ml/classification.py:1003 in getThreshold
    + ",".join(ts)
TypeError: sequence item 0: expected str instance, float found
{code}

was:
If {{LogisticRegression.getThreshold}} is called on a model with multiple thresholds, we are supposed to raise an exception:
{code:python}
ValueError: Logistic Regression getThreshold only applies to binary classification ...
{code}
However, {{thresholds}} ({{List[float]}}) are incorrectly passed to {{str.join}}, resulting in an unintended {{TypeError}}:
{code:python}
>>> from pyspark.ml.classification import LogisticRegression
>>> model = LogisticRegression(thresholds=[1.0, 2.0, 3.0])
>>> model.getThreshold()
Traceback (most recent call last):
  Input In [7] in <module>
    model.getThreshold()
  File ~/Workspace/spark/python/pyspark/ml/classification.py:1003 in getThreshold
    + ",".join(ts)
TypeError: sequence item 0: expected str instance, float found
{code}

> Unintended exception thrown in pyspark.ml.LogisticRegression.getThreshold
> -
>
> Key: SPARK-38243
> URL: https://issues.apache.org/jira/browse/SPARK-38243
> Project: Spark
> Issue Type: Bug
> Components: ML, PySpark
> Affects Versions: 2.4.0, 3.1.0, 3.2.0, 3.3.0
> Reporter: Maciej Szymkiewicz
> Priority: Minor
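The failure mode described above can be reproduced without Spark. The following is a minimal sketch, not the actual patch: `format_thresholds` and `check_binary_thresholds` are hypothetical helpers, and the `ValueError` text is a paraphrase, since the report elides the full message.

```python
from typing import List, Optional

thresholds: List[float] = [1.0, 2.0, 3.0]

# str.join accepts only strings, so joining floats raises TypeError
# before the intended ValueError can ever be constructed.
join_error: Optional[str] = None
try:
    ",".join(thresholds)  # type: ignore[arg-type]
except TypeError as e:
    join_error = str(e)


def format_thresholds(ts: List[float]) -> str:
    """Hypothetical helper: convert each threshold to str before joining."""
    return ",".join(map(str, ts))


def check_binary_thresholds(ts: List[float]) -> None:
    """Illustrative stand-in for the intended check, not the PySpark code."""
    if len(ts) != 2:
        # Message paraphrased; the original text is elided in the report.
        raise ValueError(
            "Logistic Regression getThreshold only applies to binary "
            "classification; thresholds: " + format_thresholds(ts)
        )
```

With the conversion in place, calling `check_binary_thresholds([1.0, 2.0, 3.0])` raises the intended `ValueError` instead of a `TypeError`.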
[jira] [Updated] (SPARK-38243) Unintended exception thrown in pyspark.ml.LogisticRegression.getThreshold
[ https://issues.apache.org/jira/browse/SPARK-38243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maciej Szymkiewicz updated SPARK-38243:
---
Description:
If {{LogisticRegression.getThreshold}} is called on a model with multiple thresholds, we are supposed to raise an exception:
{code:python}
ValueError: Logistic Regression getThreshold only applies to binary classification ...
{code}
However, {{thresholds}} ({{List[float]}}) are incorrectly passed to {{str.join}}, resulting in an unintended {{TypeError}}:
{code:python}
>>> from pyspark.ml.classification import LogisticRegression
>>> model = LogisticRegression(thresholds=[1.0, 2.0, 3.0])
>>> model.getThreshold()
Traceback (most recent call last):
  Input In [7] in <module>
    model.getThreshold()
  File ~/Workspace/spark/python/pyspark/ml/classification.py:1003 in getThreshold
    + ",".join(ts)
TypeError: sequence item 0: expected str instance, float found
{code}

was:
If {{LogisticRegression.getThreshold}} is called on a model with multiple thresholds, we are supposed to raise an exception:
{code:python}
ValueError: Logistic Regression getThreshold only applies to binary classification ...
{code}
However, {{thresholds}} ({{List[float]}}) are incorrectly passed to {{str.format}}, resulting in an unintended {{TypeError}}:
{code:python}
>>> from pyspark.ml.classification import LogisticRegression
>>> model = LogisticRegression(thresholds=[1.0, 2.0, 3.0])
>>> model.getThreshold()
Traceback (most recent call last):
  Input In [7] in <module>
    model.getThreshold()
  File ~/Workspace/spark/python/pyspark/ml/classification.py:1003 in getThreshold
    + ",".join(ts)
TypeError: sequence item 0: expected str instance, float found
{code}

> Unintended exception thrown in pyspark.ml.LogisticRegression.getThreshold
> -
>
> Key: SPARK-38243
> URL: https://issues.apache.org/jira/browse/SPARK-38243
> Project: Spark
> Issue Type: Bug
> Components: ML, PySpark
> Affects Versions: 2.4.0, 3.1.0, 3.2.0, 3.3.0
> Reporter: Maciej Szymkiewicz
> Priority: Minor
[jira] [Created] (SPARK-38243) Unintended exception thrown in pyspark.ml.LogisticRegression.getThreshold
Maciej Szymkiewicz created SPARK-38243:
--
Summary: Unintended exception thrown in pyspark.ml.LogisticRegression.getThreshold
Key: SPARK-38243
URL: https://issues.apache.org/jira/browse/SPARK-38243
Project: Spark
Issue Type: Bug
Components: ML, PySpark
Affects Versions: 3.2.0, 3.1.0, 2.4.0, 3.3.0
Reporter: Maciej Szymkiewicz

If {{LogisticRegression.getThreshold}} is called on a model with multiple thresholds, we are supposed to raise an exception:
{code:python}
ValueError: Logistic Regression getThreshold only applies to binary classification ...
{code}
However, {{thresholds}} ({{List[float]}}) are incorrectly passed to {{str.format}}, resulting in an unintended {{TypeError}}:
{code:python}
>>> from pyspark.ml.classification import LogisticRegression
>>> model = LogisticRegression(thresholds=[1.0, 2.0, 3.0])
>>> model.getThreshold()
Traceback (most recent call last):
  Input In [7] in <module>
    model.getThreshold()
  File ~/Workspace/spark/python/pyspark/ml/classification.py:1003 in getThreshold
    + ",".join(ts)
TypeError: sequence item 0: expected str instance, float found
{code}
[jira] [Created] (SPARK-38239) AttributeError: 'LogisticRegressionModel' object has no attribute '_call_java'
Maciej Szymkiewicz created SPARK-38239:
--
Summary: AttributeError: 'LogisticRegressionModel' object has no attribute '_call_java'
Key: SPARK-38239
URL: https://issues.apache.org/jira/browse/SPARK-38239
Project: Spark
Issue Type: Bug
Components: MLlib, PySpark
Affects Versions: 3.2.0, 3.1.0, 3.0.0, 2.4.0, 3.3.0
Reporter: Maciej Szymkiewicz

Trying to invoke {{__repr__}} on {{pyspark.mllib.classification.LogisticRegressionModel}} leads to an {{AttributeError}}:
{code:python}
>>> type(model)
>>> model
Traceback (most recent call last):
  File /path/to/python3.9/site-packages/IPython/core/formatters.py:698 in __call__
    return repr(obj)
  File /path/to/spark/python/pyspark/mllib/classification.py:281 in __repr__
    return self._call_java("toString")
AttributeError: 'LogisticRegressionModel' object has no attribute '_call_java'
{code}
This problem was introduced in SPARK-14712, where the method was added, with the same implementation, for both {{ml}} and {{mllib}}.
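A Spark-free sketch of how this kind of error can arise when a `__repr__` written for one class hierarchy is copied into another. Both class names are hypothetical; only `_call_java` and the `"toString"` argument come from the traceback above.

```python
class JavaBackedModel:
    """Hypothetical stand-in for a wrapper that delegates to a JVM object."""

    def _call_java(self, name: str) -> str:
        # A real wrapper would forward the call to the JVM; here we fake it.
        return f"<result of {name}()>"

    def __repr__(self) -> str:
        return self._call_java("toString")


class PurePythonModel:
    """Hypothetical stand-in for a model that has no JVM counterpart."""

    def __repr__(self) -> str:
        # Same body as above, but this class never defines _call_java,
        # so evaluating repr() raises AttributeError.
        return self._call_java("toString")
```

Because interactive shells call `repr` on the last expression, simply typing the variable name of such an object at the prompt is enough to trigger the error.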
[jira] [Assigned] (SPARK-37410) Inline type hints for python/pyspark/ml/recommendation.py
[ https://issues.apache.org/jira/browse/SPARK-37410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37410: -- Assignee: Maciej Szymkiewicz > Inline type hints for python/pyspark/ml/recommendation.py > - > > Key: SPARK-37410 > URL: https://issues.apache.org/jira/browse/SPARK-37410 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > > Inline type hints from python/pyspark/ml/recommendation.pyi to > python/pyspark/ml/recommendation.py. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-37410) Inline type hints for python/pyspark/ml/recommendation.py
[ https://issues.apache.org/jira/browse/SPARK-37410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37410. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35429 [https://github.com/apache/spark/pull/35429] > Inline type hints for python/pyspark/ml/recommendation.py > - > > Key: SPARK-37410 > URL: https://issues.apache.org/jira/browse/SPARK-37410 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/ml/recommendation.pyi to > python/pyspark/ml/recommendation.py. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-37411) Inline type hints for python/pyspark/ml/regression.py
[ https://issues.apache.org/jira/browse/SPARK-37411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37411. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35427 [https://github.com/apache/spark/pull/35427] > Inline type hints for python/pyspark/ml/regression.py > - > > Key: SPARK-37411 > URL: https://issues.apache.org/jira/browse/SPARK-37411 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/ml/regression.pyi to > python/pyspark/ml/regression.py. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37411) Inline type hints for python/pyspark/ml/regression.py
[ https://issues.apache.org/jira/browse/SPARK-37411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37411: -- Assignee: Maciej Szymkiewicz > Inline type hints for python/pyspark/ml/regression.py > - > > Key: SPARK-37411 > URL: https://issues.apache.org/jira/browse/SPARK-37411 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > > Inline type hints from python/pyspark/ml/regression.pyi to > python/pyspark/ml/regression.py. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37156) Inline type hints for python/pyspark/storagelevel.py
[ https://issues.apache.org/jira/browse/SPARK-37156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37156: -- Assignee: dch nguyen (was: Apache Spark) > Inline type hints for python/pyspark/storagelevel.py > > > Key: SPARK-37156 > URL: https://issues.apache.org/jira/browse/SPARK-37156 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Byron Hsu >Assignee: dch nguyen >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37155) Inline type hints for python/pyspark/statcounter.py
[ https://issues.apache.org/jira/browse/SPARK-37155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37155: -- Assignee: (was: Byron Hsu) > Inline type hints for python/pyspark/statcounter.py > --- > > Key: SPARK-37155 > URL: https://issues.apache.org/jira/browse/SPARK-37155 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Byron Hsu >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-37155) Inline type hints for python/pyspark/statcounter.py
[ https://issues.apache.org/jira/browse/SPARK-37155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37155: -- Assignee: dch nguyen > Inline type hints for python/pyspark/statcounter.py > --- > > Key: SPARK-37155 > URL: https://issues.apache.org/jira/browse/SPARK-37155 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Byron Hsu >Assignee: dch nguyen >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37425) Inline type hints for python/pyspark/mllib/recommendation.py
[ https://issues.apache.org/jira/browse/SPARK-37425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17492616#comment-17492616 ] Maciej Szymkiewicz commented on SPARK-37425: Hi [~amirkdv] Just FYI: most of the blockers are already resolved. It should be possible to pick up the pending changes from SPARK-37428 and SPARK-37154 and complete this, or any of the remaining ones in mllib. > Inline type hints for python/pyspark/mllib/recommendation.py > > > Key: SPARK-37425 > URL: https://issues.apache.org/jira/browse/SPARK-37425 > Project: Spark > Issue Type: Sub-task > Components: MLlib, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Priority: Major > > Inline type hints from python/pyspark/mllib/recommendation.pyi to > python/pyspark/mllib/recommendation.py
[jira] [Assigned] (SPARK-37413) Inline type hints for python/pyspark/ml/tree.py
[ https://issues.apache.org/jira/browse/SPARK-37413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37413: -- Assignee: dch nguyen > Inline type hints for python/pyspark/ml/tree.py > --- > > Key: SPARK-37413 > URL: https://issues.apache.org/jira/browse/SPARK-37413 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: dch nguyen >Priority: Major > > Inline type hints from python/pyspark/ml/tree.pyi to > python/pyspark/ml/tree.py. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-37413) Inline type hints for python/pyspark/ml/tree.py
[ https://issues.apache.org/jira/browse/SPARK-37413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37413. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35420 [https://github.com/apache/spark/pull/35420] > Inline type hints for python/pyspark/ml/tree.py > --- > > Key: SPARK-37413 > URL: https://issues.apache.org/jira/browse/SPARK-37413 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: dch nguyen >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/ml/tree.pyi to > python/pyspark/ml/tree.py. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-37405) Inline type hints for python/pyspark/ml/feature.py
[ https://issues.apache.org/jira/browse/SPARK-37405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17491363#comment-17491363 ] Maciej Szymkiewicz commented on SPARK-37405: I am gonna handle this one. > Inline type hints for python/pyspark/ml/feature.py > -- > > Key: SPARK-37405 > URL: https://issues.apache.org/jira/browse/SPARK-37405 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Priority: Major > > Inline type hints from python/pyspark/ml/feature.pyi to > python/pyspark/ml/feature.py -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-38174) SparkR documentation build fails in CI
[ https://issues.apache.org/jira/browse/SPARK-38174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17490877#comment-17490877 ]

Maciej Szymkiewicz commented on SPARK-38174:

I did some more digging, and I believe the failure was caused by a malformed response from [https://cran.rstudio.com/web/packages/packages.rds].

The {{histogram}} example loads {{ggplot2}}

https://github.com/apache/spark/blob/d4a2e5c55d127218f6ae42925443f7d0588d5875/R/pkg/R/DataFrame.R#L3645

which caused an attempt to fetch data from CRAN:

https://github.com/r-lib/downlit/blob/d5622d665ebe4d3529b297aa926fe3c0132c8b81/R/metadata.R#L132-L137

and {{CRAN_package_db}} loads an RDS object:
{code:r}
> tools::CRAN_package_db
function ()
as.data.frame(read_CRAN_object(CRAN_baseurl_for_web_area(), "web/packages/packages.rds"), stringsAsFactors = FALSE)
{code}
If the problem repeats, we might check whether setting {{deploy.install_metadata: true}} in pkgdown and installing {{ggplot2}} (and other incidental doc build dependencies) resolves the issue.

> SparkR documentation build fails in CI
> --
>
> Key: SPARK-38174
> URL: https://issues.apache.org/jira/browse/SPARK-38174
> Project: Spark
> Issue Type: Test
> Components: Build, SparkR
> Affects Versions: 3.3.0
> Reporter: Hyukjin Kwon
> Priority: Major
>
> SparkR documentation job in GitHub Actions seems to be broken now as below (https://github.com/apache/spark/runs/5138914521?check_suite_focus=true):
> {code}
> Writing 'reference/head.html'
> Reading 'man/hint.Rd'
> Writing 'reference/hint.html'
> Reading 'man/histogram.Rd'
> Error in .f(.x[[i]], ...) : Failed to parse Rd in histogram.Rd
> ℹ ReadItem: unknown type 73, perhaps written by later version of R
> Caused by error in `readRDS()`:
> ! ReadItem: unknown type 73, perhaps written by later version of R
> Error: histogram.Rd
> ℹ ReadItem: unknown type 73, perhaps written by later version of R
> Caused by error in `readRDS()`:
> ! ReadItem: unknown type 73, perhaps written by later version of R>
> -->
> Failed to parse Rd in histogram.Rd
> ℹ ReadItem: unknown type 73, perhaps written by later version of R
> Caused by error in `readRDS()`:
> ! ReadItem: unknown type 73, perhaps written by later version of R>
> in process 13417
> Stack trace:
> Process 13346:
> 1. pkgdown::build_site("..")
> 2. pkgdown:::build_site_external(pkg = pkg, examples = examples, ...
> 3. callr::r(function(..., crayon_enabled, crayon_colors, pkgdown_internet) { ...
> 4. callr:::get_result(output = out, options)
> 5. throw(newerr, parent = remerr[[2]])
> x callr subprocess failed: Failed to parse Rd in histogram.Rd
> ℹ ReadItem: unknown type 73, perhaps written by later version of R
> Caused by error in `readRDS()`:
> ! ReadItem: unknown type 73, perhaps written by later version of R
> Process 13417:
> 17. (function (..., crayon_enabled, crayon_colors, pkgdown_internet) ...
> 18. pkgdown::build_site(...)
> 19. pkgdown:::build_site_local(pkg = pkg, examples = examples, run_do ...
> 20. pkgdown:::build_reference(pkg, lazy = lazy, examples = examples, ...
> 21. purrr::map(topics, build_reference_topic, pkg = pkg, lazy = lazy, ...
> {code}
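One way to test the "malformed response" hypothesis after the fact is to look at the first bytes of the downloaded file. The sketch below is only a heuristic and rests on assumptions not stated in the thread: that the RDS file is served in one of the compression formats R's `saveRDS` can produce (gzip, bzip2, xz) or as an uncompressed serialization stream. The function name `sniff_payload` is hypothetical.

```python
def sniff_payload(data: bytes) -> str:
    """Guess the container format of a downloaded file from its first bytes."""
    if data.startswith(b"\x1f\x8b"):
        return "gzip"
    if data.startswith(b"BZh"):
        return "bzip2"
    if data.startswith(b"\xfd7zXZ\x00"):
        return "xz"
    # Assumed headers of an uncompressed R serialization stream
    # (XDR, ASCII, or native binary).
    if data[:2] in (b"X\n", b"A\n", b"B\n"):
        return "uncompressed R serialization"
    # An HTML or XML error page served in place of the file starts with '<'.
    if data.lstrip()[:1] == b"<":
        return "markup (possibly an error page)"
    return "unknown"
```

A result of `"markup (possibly an error page)"` or `"unknown"` would be consistent with a bad response, while a recognized compression header would not rule out a truncated or otherwise corrupted payload.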
[jira] [Commented] (SPARK-38174) SparkR documentation build fails in CI
[ https://issues.apache.org/jira/browse/SPARK-38174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17490574#comment-17490574 ]

Maciej Szymkiewicz commented on SPARK-38174:

I also compared the last failing log with the next one that passed (https://github.com/apache/spark/runs/5144322617?check_suite_focus=true), but nothing really stands out, and the virtual image version is still 20220207.1.

Histogram itself looks unremarkable and, since files are written for each build, we can reject "later version" as a plausible explanation. The issue could arise if the files were somehow corrupted, but I cannot see why that would happen for this particular one. It seems to have failed pretty deep in core R utilities
{code}
45. base:::as.data.frame(read_CRAN_object(CRAN_baseurl_for_web_area() ...
46. tools:::read_CRAN_object(CRAN_baseurl_for_web_area(), "web/packag ...
47. base:::readRDS(con)
48. base:::.handleSimpleError(function (err) ...
49. pkgdown:::h(simpleError(msg, call))
50. rlang:::abort(msg, parent = err)
51. rlang:::signal_abort(cnd, .file)
52. base:::signalCondition(cnd)
53. (function (e) ...
{code}
and none of the packages that we install specifically for the R docs build is likely to be the culprit; these are mostly font utils.

It doesn't seem like the CRAN deb repos provide changelogs, so I am not sure if something changed there.

I guess it is something to monitor, unless anyone has other ideas about where to look for the source of the problem.
[jira] [Commented] (SPARK-38174) SparkR documentation build fails in CI
[ https://issues.apache.org/jira/browse/SPARK-38174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17490344#comment-17490344 ] Maciej Szymkiewicz commented on SPARK-38174: And now, it is green again :/
[jira] [Commented] (SPARK-38174) SparkR documentation build fails in CI
[ https://issues.apache.org/jira/browse/SPARK-38174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17490227#comment-17490227 ] Maciej Szymkiewicz commented on SPARK-38174: I ran the R doc builds against 3d285c11b611e63d6ebb0b209f52d6ec7a61debe using R 4.0.3, 4.1.1 and 4.1.2 conda envs, but so far I couldn't reproduce the error.
[jira] [Commented] (SPARK-38174) SparkR documentation build fails in CI
[ https://issues.apache.org/jira/browse/SPARK-38174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17490180#comment-17490180 ] Maciej Szymkiewicz commented on SPARK-38174: The most obvious change, though possibly unrelated, is the switch from {code} Environment: ubuntu-20.04 Version: 20220131.1 {code} to {code} Environment: ubuntu-20.04 Version: 20220207.1 {code}
[jira] [Commented] (SPARK-38174) SparkR documentation build fails in CI
[ https://issues.apache.org/jira/browse/SPARK-38174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17490173#comment-17490173 ] Maciej Szymkiewicz commented on SPARK-38174: Also, all R dependencies for the documentation build seem to be unchanged between https://github.com/apache/spark/runs/5136885171?check_suite_focus=true and https://github.com/apache/spark/runs/5138402728?check_suite_focus=true
[jira] [Commented] (SPARK-38174) SparkR documentation build fails in CI
[ https://issues.apache.org/jira/browse/SPARK-38174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17490163#comment-17490163 ] Maciej Szymkiewicz commented on SPARK-38174: Not a clue, but I am trying to reproduce this. Stating the obvious: no changes were made to the R tree lately and pkgdown has a pinned version, so the cause is either another package or some system dependency.
[jira] [Resolved] (SPARK-37401) Inline type hints for python/pyspark/ml/clustering.py
[ https://issues.apache.org/jira/browse/SPARK-37401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37401. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35439 [https://github.com/apache/spark/pull/35439] > Inline type hints for python/pyspark/ml/clustering.py > - > > Key: SPARK-37401 > URL: https://issues.apache.org/jira/browse/SPARK-37401 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/ml/clustering.pyi to > python/pyspark/ml/clustering.py.
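For readers unfamiliar with this family of sub-tasks: "inlining" here means moving the annotations from a `.pyi` stub file into the matching `.py` module, so that signatures and implementation live in one place. The following is only a minimal sketch of that idea. The `ClusterSummary` class and the `summary.pyi` stub are hypothetical names invented for illustration and are not taken from `pyspark.ml.clustering`.

```python
from typing import List, Optional

# Before inlining, a hypothetical stub file (summary.pyi) would carry only signatures:
#
#     class ClusterSummary:
#         def __init__(self, sizes: List[int], label: Optional[str] = ...) -> None: ...
#         def total(self) -> int: ...
#
# After inlining, the same annotations sit next to the implementation below
# and the separate stub file is no longer needed.


class ClusterSummary:
    """Hypothetical example class; not part of PySpark."""

    def __init__(self, sizes: List[int], label: Optional[str] = None) -> None:
        self.sizes = sizes
        self.label = label

    def total(self) -> int:
        # Sum of all cluster sizes.
        return sum(self.sizes)


summary = ClusterSummary([3, 5, 2], label="demo")
print(summary.total())
# Inlined annotations are visible at runtime, unlike those in a stub file.
print(ClusterSummary.total.__annotations__)
```

One practical consequence of inlining is that a type checker also checks the method bodies against the annotations, which a stub-only setup does not do.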
[jira] [Assigned] (SPARK-37401) Inline type hints for python/pyspark/ml/clustering.py
[ https://issues.apache.org/jira/browse/SPARK-37401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37401: -- Assignee: Maciej Szymkiewicz > Inline type hints for python/pyspark/ml/clustering.py > - > > Key: SPARK-37401 > URL: https://issues.apache.org/jira/browse/SPARK-37401 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > > Inline type hints from python/pyspark/ml/clustering.pyi to > python/pyspark/ml/clustering.py.
[jira] [Resolved] (SPARK-37406) Inline type hints for python/pyspark/ml/fpm.py
[ https://issues.apache.org/jira/browse/SPARK-37406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37406. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35407 https://github.com/apache/spark/pull/35407 > Inline type hints for python/pyspark/ml/fpm.py > -- > > Key: SPARK-37406 > URL: https://issues.apache.org/jira/browse/SPARK-37406 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/ml/fpm.pyi to python/pyspark/ml/fpm.py.
[jira] [Assigned] (SPARK-37406) Inline type hints for python/pyspark/ml/fpm.py
[ https://issues.apache.org/jira/browse/SPARK-37406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37406: -- Assignee: Maciej Szymkiewicz > Inline type hints for python/pyspark/ml/fpm.py > -- > > Key: SPARK-37406 > URL: https://issues.apache.org/jira/browse/SPARK-37406 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > > Inline type hints from python/pyspark/ml/fpm.pyi to python/pyspark/ml/fpm.py.
[jira] [Assigned] (SPARK-37414) Inline type hints for python/pyspark/ml/tuning.py
[ https://issues.apache.org/jira/browse/SPARK-37414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37414: -- Assignee: Maciej Szymkiewicz > Inline type hints for python/pyspark/ml/tuning.py > - > > Key: SPARK-37414 > URL: https://issues.apache.org/jira/browse/SPARK-37414 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/ml/tuning.pyi to > python/pyspark/ml/tuning.py.
[jira] [Resolved] (SPARK-37414) Inline type hints for python/pyspark/ml/tuning.py
[ https://issues.apache.org/jira/browse/SPARK-37414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37414. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35406 [https://github.com/apache/spark/pull/35406] > Inline type hints for python/pyspark/ml/tuning.py > - > > Key: SPARK-37414 > URL: https://issues.apache.org/jira/browse/SPARK-37414 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/ml/tuning.pyi to > python/pyspark/ml/tuning.py.
[jira] [Resolved] (SPARK-37404) Inline type hints for python/pyspark/ml/evaluation.py
[ https://issues.apache.org/jira/browse/SPARK-37404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37404. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35403 [https://github.com/apache/spark/pull/35403] > Inline type hints for python/pyspark/ml/evaluation.py > - > > Key: SPARK-37404 > URL: https://issues.apache.org/jira/browse/SPARK-37404 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/ml/evaluation.pyi to > python/pyspark/ml/evaluation.py.
[jira] [Assigned] (SPARK-37404) Inline type hints for python/pyspark/ml/evaluation.py
[ https://issues.apache.org/jira/browse/SPARK-37404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37404: -- Assignee: Maciej Szymkiewicz > Inline type hints for python/pyspark/ml/evaluation.py > - > > Key: SPARK-37404 > URL: https://issues.apache.org/jira/browse/SPARK-37404 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > > Inline type hints from python/pyspark/ml/evaluation.pyi to > python/pyspark/ml/evaluation.py.
[jira] [Created] (SPARK-38139) ml.recommendation.ALS doctests failures
Maciej Szymkiewicz created SPARK-38139: -- Summary: ml.recommendation.ALS doctests failures Key: SPARK-38139 URL: https://issues.apache.org/jira/browse/SPARK-38139 Project: Spark Issue Type: Bug Components: ML, PySpark Affects Versions: 3.3.0 Reporter: Maciej Szymkiewicz In my dev setups, the ml.recommendation ALS doctest consistently converges to a value lower than expected and fails with: {code:python} File "/path/to/spark/python/pyspark/ml/recommendation.py", line 322, in __main__.ALS Failed example: predictions[0] Expected: Row(user=0, item=2, newPrediction=0.69291...) Got: Row(user=0, item=2, newPrediction=0.6929099559783936) {code} I can correct for that locally, but it creates some noise, so if anyone else experiences this, we could drop a digit from the expected results {code} diff --git a/python/pyspark/ml/recommendation.py b/python/pyspark/ml/recommendation.py index f0628fb922..b8e2a6097d 100644 --- a/python/pyspark/ml/recommendation.py +++ b/python/pyspark/ml/recommendation.py @@ -320,7 +320,7 @@ class ALS(JavaEstimator, _ALSParams, JavaMLWritable, JavaMLReadable): >>> test = spark.createDataFrame([(0, 2), (1, 0), (2, 0)], ["user", "item"]) >>> predictions = sorted(model.transform(test).collect(), key=lambda r: r[0]) >>> predictions[0] -Row(user=0, item=2, newPrediction=0.69291...) +Row(user=0, item=2, newPrediction=0.6929...) >>> predictions[1] Row(user=1, item=0, newPrediction=3.47356...) >>> predictions[2] {code}
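The failure reported above comes down to prefix matching: `0.6929099559783936` does not start with `0.69291`, but it does start with `0.6929`. A minimal sketch that reproduces this with the standard library alone, assuming the doctest runs with the `ELLIPSIS` option enabled (which the `...` in the expected output suggests). The helper name `matches` is made up for this illustration.

```python
import doctest

# The checker doctest uses to compare expected and actual output.
checker = doctest.OutputChecker()

# Actual output and the two expected strings, copied from the report above.
got = "Row(user=0, item=2, newPrediction=0.6929099559783936)"
old_expected = "Row(user=0, item=2, newPrediction=0.69291...)"
new_expected = "Row(user=0, item=2, newPrediction=0.6929...)"


def matches(expected: str, actual: str) -> bool:
    """Return True if `actual` satisfies `expected` under ELLIPSIS matching."""
    return checker.check_output(expected, actual, doctest.ELLIPSIS)


# "0.69291" is not a prefix of "0.69290995...", so the old expectation fails.
print(matches(old_expected, got))
# "0.6929" is a prefix, so dropping one digit makes the doctest pass.
print(matches(new_expected, got))
```

Dropping a digit widens the accepted range from [0.69291, 0.69292) to [0.6929, 0.6930), which is enough to absorb the small difference in convergence described in the report.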
[jira] [Commented] (SPARK-37428) Inline type hints for python/pyspark/mllib/util.py
[ https://issues.apache.org/jira/browse/SPARK-37428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17488794#comment-17488794 ] Maciej Szymkiewicz commented on SPARK-37428: I am going to handle this one. > Inline type hints for python/pyspark/mllib/util.py > -- > > Key: SPARK-37428 > URL: https://issues.apache.org/jira/browse/SPARK-37428 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Priority: Major > > Inline type hints from python/pyspark/mllib/util.pyi to > python/pyspark/mllib/util.py
[jira] [Resolved] (SPARK-37412) Inline type hints for python/pyspark/ml/stat.py
[ https://issues.apache.org/jira/browse/SPARK-37412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37412. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35401 [https://github.com/apache/spark/pull/35401] > Inline type hints for python/pyspark/ml/stat.py > --- > > Key: SPARK-37412 > URL: https://issues.apache.org/jira/browse/SPARK-37412 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/ml/stat.pyi to > python/pyspark/ml/stat.py.
[jira] [Assigned] (SPARK-37412) Inline type hints for python/pyspark/ml/stat.py
[ https://issues.apache.org/jira/browse/SPARK-37412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37412: -- Assignee: Maciej Szymkiewicz > Inline type hints for python/pyspark/ml/stat.py > --- > > Key: SPARK-37412 > URL: https://issues.apache.org/jira/browse/SPARK-37412 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > > Inline type hints from python/pyspark/ml/stat.pyi to > python/pyspark/ml/stat.py.
[jira] [Resolved] (SPARK-37409) Inline type hints for python/pyspark/ml/pipeline.py
[ https://issues.apache.org/jira/browse/SPARK-37409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37409. Fix Version/s: 3.3.0 Assignee: Maciej Szymkiewicz Resolution: Fixed Issue resolved by pull request 35408 https://github.com/apache/spark/pull/35408 > Inline type hints for python/pyspark/ml/pipeline.py > --- > > Key: SPARK-37409 > URL: https://issues.apache.org/jira/browse/SPARK-37409 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/ml/pipeline.pyi to > python/pyspark/ml/pipeline.py.
[jira] [Commented] (SPARK-37410) Inline type hints for python/pyspark/ml/recommendation.py
[ https://issues.apache.org/jira/browse/SPARK-37410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17488245#comment-17488245 ] Maciej Szymkiewicz commented on SPARK-37410: I'll handle this one. > Inline type hints for python/pyspark/ml/recommendation.py > - > > Key: SPARK-37410 > URL: https://issues.apache.org/jira/browse/SPARK-37410 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Priority: Major > > Inline type hints from python/pyspark/ml/recommendation.pyi to > python/pyspark/ml/recommendation.py.
[jira] [Commented] (SPARK-37401) Inline type hints for python/pyspark/ml/clustering.py
[ https://issues.apache.org/jira/browse/SPARK-37401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17488243#comment-17488243 ] Maciej Szymkiewicz commented on SPARK-37401: I'll handle this one. > Inline type hints for python/pyspark/ml/clustering.py > - > > Key: SPARK-37401 > URL: https://issues.apache.org/jira/browse/SPARK-37401 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Priority: Major > > Inline type hints from python/pyspark/ml/clustering.pyi to > python/pyspark/ml/clustering.py.
[jira] [Commented] (SPARK-37411) Inline type hints for python/pyspark/ml/regression.py
[ https://issues.apache.org/jira/browse/SPARK-37411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17488042#comment-17488042 ] Maciej Szymkiewicz commented on SPARK-37411: I'll handle this one. > Inline type hints for python/pyspark/ml/regression.py > - > > Key: SPARK-37411 > URL: https://issues.apache.org/jira/browse/SPARK-37411 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Priority: Major > > Inline type hints from python/pyspark/ml/regression.pyi to > python/pyspark/ml/regression.py.
[jira] [Resolved] (SPARK-37416) Inline type hints for python/pyspark/ml/wrapper.py
[ https://issues.apache.org/jira/browse/SPARK-37416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37416. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35399 [https://github.com/apache/spark/pull/35399] > Inline type hints for python/pyspark/ml/wrapper.py > -- > > Key: SPARK-37416 > URL: https://issues.apache.org/jira/browse/SPARK-37416 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/ml/wrapper.pyi to > python/pyspark/ml/wrapper.py.
[jira] [Assigned] (SPARK-37416) Inline type hints for python/pyspark/ml/wrapper.py
[ https://issues.apache.org/jira/browse/SPARK-37416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37416: -- Assignee: Maciej Szymkiewicz > Inline type hints for python/pyspark/ml/wrapper.py > -- > > Key: SPARK-37416 > URL: https://issues.apache.org/jira/browse/SPARK-37416 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > > Inline type hints from python/pyspark/ml/wrapper.pyi to > python/pyspark/ml/wrapper.py.
[jira] [Resolved] (SPARK-37415) Inline type hints for python/pyspark/ml/util.py
[ https://issues.apache.org/jira/browse/SPARK-37415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37415. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35367 [https://github.com/apache/spark/pull/35367] > Inline type hints for python/pyspark/ml/util.py > --- > > Key: SPARK-37415 > URL: https://issues.apache.org/jira/browse/SPARK-37415 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/ml/util.pyi to > python/pyspark/ml/util.py.
[jira] [Assigned] (SPARK-37415) Inline type hints for python/pyspark/ml/util.py
[ https://issues.apache.org/jira/browse/SPARK-37415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz reassigned SPARK-37415: -- Assignee: Maciej Szymkiewicz > Inline type hints for python/pyspark/ml/util.py > --- > > Key: SPARK-37415 > URL: https://issues.apache.org/jira/browse/SPARK-37415 > Project: Spark > Issue Type: Sub-task > Components: ML, PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > > Inline type hints from python/pyspark/ml/util.pyi to > python/pyspark/ml/util.py.
[jira] [Resolved] (SPARK-37417) Inline type hints for python/pyspark/ml/linalg/__init__.py
[ https://issues.apache.org/jira/browse/SPARK-37417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-37417. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35380 [https://github.com/apache/spark/pull/35380] > Inline type hints for python/pyspark/ml/linalg/__init__.py > -- > > Key: SPARK-37417 > URL: https://issues.apache.org/jira/browse/SPARK-37417 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Maciej Szymkiewicz >Priority: Major > Fix For: 3.3.0 > > > Inline type hints from python/pyspark/ml/linalg/__init__.pyi to > python/pyspark/ml/linalg/__init__.py.