[jira] [Created] (SPARK-38659) PySpark ResourceWarning: unclosed socket

2022-03-25 Thread Gergely Kalmar (Jira)
Gergely Kalmar created SPARK-38659:
--

 Summary: PySpark ResourceWarning: unclosed socket
 Key: SPARK-38659
 URL: https://issues.apache.org/jira/browse/SPARK-38659
 Project: Spark
  Issue Type: Bug
  Components: PySpark
Affects Versions: 3.2.1
Reporter: Gergely Kalmar


Create a file called `spark.py` with the following contents:

```
from pyspark.sql import SparkSession

with SparkSession.builder.getOrCreate() as spark:
    spark.read.csv('test.csv').collect()
```

Also create a `test.csv` file containing any data. Running 
`python -Wall spark.py` then produces the following warning:

```
/usr/lib/python3.8/socket.py:740: ResourceWarning: unclosed 
  self._sock = None
ResourceWarning: Enable tracemalloc to get the object allocation traceback
```
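
The last line of the warning suggests enabling tracemalloc. A minimal sketch of that idea, independent of PySpark, might look like the following: it deliberately leaks a plain socket, records the resulting `ResourceWarning`, and asks tracemalloc where the object was allocated. The function name `find_unclosed_sockets` is hypothetical, and the leaked socket is only a stand-in for whatever socket PySpark leaves open.

```python
import gc
import socket
import tracemalloc
import warnings


def find_unclosed_sockets():
    """Leak one socket on purpose and report the ResourceWarning it triggers.

    Hypothetical helper for illustration; it does not touch PySpark.
    Returns a list of (warning message, allocation traceback lines) pairs.
    """
    tracemalloc.start()  # must be tracing before the socket is allocated
    try:
        with warnings.catch_warnings(record=True) as caught:
            # ResourceWarning is ignored by default, so enable it explicitly.
            warnings.simplefilter("always", ResourceWarning)
            leaked = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            del leaked    # drop the last reference without calling close()
            gc.collect()  # make sure the finalizer has run

        reports = []
        for w in caught:
            if not issubclass(w.category, ResourceWarning):
                continue
            tb = None
            if w.source is not None:
                tb = tracemalloc.get_object_traceback(w.source)
            reports.append((str(w.message), tb.format() if tb else []))
        return reports
    finally:
        tracemalloc.stop()


if __name__ == "__main__":
    for message, allocation in find_unclosed_sockets():
        print(message)
        for line in allocation:
            print(line)
```

For the actual reproduction, running `python -X tracemalloc=25 -Wall spark.py` should make the interpreter print the allocation traceback together with the warning, which would show where the unclosed socket is created.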



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-38660) PySpark DeprecationWarning: distutils Version classes are deprecated

2022-03-25 Thread Gergely Kalmar (Jira)
Gergely Kalmar created SPARK-38660:
--

 Summary: PySpark DeprecationWarning: distutils Version classes are deprecated
 Key: SPARK-38660
 URL: https://issues.apache.org/jira/browse/SPARK-38660
 Project: Spark
  Issue Type: Bug
  Components: PySpark
Affects Versions: 3.2.1
Reporter: Gergely Kalmar


When executing `spark.read.csv(f'{gcs_bucket}/{data_file}', inferSchema=True, 
header=True)` I'm getting the following warning:
{noformat}
.../lib/python3.8/site-packages/pyspark/sql/pandas/conversion.py:62: in toPandas
    require_minimum_pandas_version()
.../lib/python3.8/site-packages/pyspark/sql/pandas/utils.py:35: in require_minimum_pandas_version
    if LooseVersion(pandas.__version__) < LooseVersion(minimum_pandas_version):
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <[AttributeError("'LooseVersion' object has no attribute 'vstring'") raised in repr()] LooseVersion object at 0x7f2319fc0f70>, vstring = '1.4.1'

    def __init__ (self, vstring=None):
>       warnings.warn(
            "distutils Version classes are deprecated. "
            "Use packaging.version instead.",
            DeprecationWarning,
            stacklevel=2,
        )
E       DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.

.../lib/python3.8/site-packages/setuptools/_distutils/version.py:53: DeprecationWarning
{noformat}
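
The warning text itself points to `packaging.version` as the replacement. As a rough sketch of what a version check without `LooseVersion` could look like (this is not PySpark's actual code; `meets_minimum` is a hypothetical helper, and the `"1.0.0"` minimum is made up for illustration):

```python
import re

try:
    # Third-party "packaging" distribution; assumed to be installed.
    from packaging.version import Version as parse_version
except ImportError:
    def parse_version(text):
        # Crude stand-in used only when "packaging" is missing:
        # keep up to three numeric components, e.g. "1.4.1" -> (1, 4, 1).
        return tuple(int(n) for n in re.findall(r"\d+", text)[:3])


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum` (hypothetical helper)."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    # "1.4.1" is the pandas version shown in the traceback above;
    # "1.0.0" is an assumed minimum, not the value PySpark uses.
    print(meets_minimum("1.4.1", "1.0.0"))
```

Comparing parsed versions this way avoids importing `distutils.version` altogether, so the `DeprecationWarning` would not be triggered by the check.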


