Re: spark - local question

2022-11-05 Thread Bjørn Jørgensen
> .withColumnRenamed("category", "cateid") \
> .withColumnRenamed('weight', 'score').withColumnRenamed('tag', 'item_tags') \
> .withColumnRenamed('modify_time', 'item_modify_time').withColumnRenamed('start_time', 'dg_start_time') \
> .withCo

Re: spark - local question

2022-11-04 Thread Bjørn Jørgensen
Yes, Spark in local mode works :) One tip: if you just start it, the default settings are one core and 1 GB. I'm using this function to start Spark in local mode to get all cores and max RAM:
import multiprocessing
import os
from pyspark.sql import SparkSession
from pyspark import SparkConf,
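The message above is cut off before the function itself, so what follows is not the author's code. It is only a minimal sketch of the same idea, assuming a Linux or macOS host (it relies on `os.sysconf`) and a plain Python launch where the JVM has not started yet. The names `local_spark_settings`, `get_local_spark`, and the `memory_fraction` parameter are hypothetical, chosen here for illustration.

```python
import multiprocessing
import os


def local_spark_settings(memory_fraction=0.8):
    """Derive local-mode settings from the host machine.

    Hypothetical helper: memory_fraction is an assumed safety margin so the
    driver does not claim every byte of physical RAM.
    """
    cores = multiprocessing.cpu_count()
    # Total physical memory in bytes (POSIX only; not available on Windows).
    total_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    driver_gb = max(1, int(total_bytes * memory_fraction / 1024 ** 3))
    return {
        "spark.master": f"local[{cores}]",
        "spark.driver.memory": f"{driver_gb}g",
    }


def get_local_spark(app_name="local-etl"):
    """Build a SparkSession from the derived settings (requires pyspark)."""
    # Imported lazily so the settings helper works without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in local_spark_settings().items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Calling `get_local_spark()` once at the start of a script should be enough; driver memory generally has to be set before the JVM starts, so changing it on an already running session is not expected to take effect.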

Re: spark - local question

2022-10-31 Thread Sean Owen
Sure, as stable and available as your machine is. If you don't need fault tolerance or scale beyond one machine, sure.
On Mon, Oct 31, 2022 at 8:43 AM 张健BJ wrote:
> Dear developers:
> I have a question about the pyspark local mode. Can it be used in production and Will it cause

spark - local question

2022-10-31 Thread 张健BJ
Dear developers: I have a question about PySpark local mode. Can it be used in production, and will it cause unexpected problems? The scenario is as follows: our team wants to develop an ETL component in Python. Data can be transferred between various data sources. If there