[ https://issues.apache.org/jira/browse/SPARK-38883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-38883.
----------------------------------
    Resolution: Invalid

Let's use the Spark mailing list for questions.

> smaller pyspark install if not using streaming?
> -----------------------------------------------
>
>                 Key: SPARK-38883
>                 URL: https://issues.apache.org/jira/browse/SPARK-38883
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 3.2.1
>            Reporter: t oo
>            Priority: Minor
>
> h3. Describe the feature
> I am trying to include pyspark in my Docker image, but the size is around 300MB.
> The largest jar is rocksdbjni-6.20.3.jar at 35MB.
> Is it safe to remove this jar if I have no need for SparkStreaming?
> Is there any advice on getting the install smaller? Perhaps a map of which
> jars are needed for batch vs sql vs streaming?
> h3. Use Case
> A smaller Python package means I can pack more concurrent pods onto my EKS
> workers.