Yes, Spark in local mode works :) One tip: if you just start it with the defaults, you get one core and 1 GB of RAM.
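You can verify what you actually got by inspecting the running session. A minimal sketch (the exact defaults depend on how Spark was launched and on any spark-defaults.conf):

from pyspark.sql import SparkSession

# Start a session with no explicit settings, then inspect the result.
spark = SparkSession.builder.getOrCreate()
print(spark.sparkContext.master)  # the master URL, e.g. local[*]
print(spark.sparkContext.getConf().get("spark.driver.memory", "1g"))  # falls back to 1g when unset

spark.stop()  # stop it before creating a new session with different settings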
I'm using this function to start Spark in local mode with all cores and the
maximum amount of RAM:

import multiprocessing
import os

from pyspark import SparkConf
from pyspark.sql import SparkSession

# Detect the number of CPU cores and the total physical RAM.
# Note: os.sysconf is POSIX-only, so this will not work on Windows.
number_cores = int(multiprocessing.cpu_count())
mem_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")  # e.g. 4015976448
memory_gb = int(mem_bytes / (1024.0**3))  # e.g. 3


def get_spark_session(app_name: str, conf: SparkConf):
    conf.setMaster("local[{}]".format(number_cores))
    conf.set("spark.driver.memory", "{}g".format(memory_gb))
    conf.set("spark.sql.adaptive.enabled", "true")
    conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    conf.set("spark.sql.repl.eagerEval.maxNumRows", "100")
    return SparkSession.builder.appName(app_name).config(conf=conf).getOrCreate()


spark = get_spark_session("My_app", SparkConf())
# "sc.setLogLevel" is not a valid conf key; set the log level on the context instead.
spark.sparkContext.setLogLevel("ERROR")

Now when you type spark you will see something like this:

SparkSession - in-memory
SparkContext

Spark UI

Version    v3.4.0-SNAPSHOT
Master     local[16]
AppName    My_app

On Mon, Oct 31, 2022 at 2:50 PM Sean Owen <sro...@gmail.com> wrote:

> Sure, as stable and available as your machine is. If you don't need fault
> tolerance or scale beyond one machine, sure.
>
> On Mon, Oct 31, 2022 at 8:43 AM 张健BJ <zhangjia...@datagrand.com> wrote:
>
>> Dear developers:
>>     I have a question about the pyspark local mode. Can it be used in
>> production, and will it cause unexpected problems? The scenario is as
>> follows:
>>
>> Our team wants to develop an ETL component based on Python. Data can be
>> transferred between various data sources.
>>
>> If there is no YARN environment, can we read data from database A and
>> write it to database B in local mode? Will this function be guaranteed
>> to be stable and available?
>>
>> Thanks, look forward to your reply
>>

--
Bjørn Jørgensen
Vestre Aspehaug 4, 6010 Ålesund
Norge
+47 480 94 297