Hi,

*TL;DR:* I generate a code string on the fly and execute it. If an error occurs I need the full traceback, but for the executed code I only get a partial one: the content of the line that caused the error is missing.
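For illustration, a minimal pure-Python sketch of this setup, with no Spark involved. It assumes the generated code is compiled under a synthetic filename and then registered with the standard `linecache` module so that `traceback` can look up the source text. The names `GENERATED`, `FILENAME`, `load` and `format_failure` are hypothetical and exist only for this example.

```python
import linecache
import traceback

# Hypothetical generated source; in practice this string is built on the fly.
GENERATED = '''\
def fun():
    def inner():
        raise RuntimeError("boom")
    return inner()
'''

FILENAME = "<generated-demo>"  # hypothetical synthetic filename


def load(source, filename, register):
    """Compile and exec `source`; optionally register it with linecache."""
    if register:
        # Entry layout is (size, mtime, lines, fullname). mtime=None keeps
        # linecache.checkcache() from evicting an entry that has no file on disk.
        linecache.cache[filename] = (
            len(source),
            None,
            source.splitlines(keepends=True),
            filename,
        )
    scope = {}
    exec(compile(source, filename, "exec"), scope, scope)
    return scope["fun"]


def format_failure(func):
    """Call `func` and return the formatted traceback of the exception it raises."""
    try:
        func()
    except Exception:
        return traceback.format_exc()
    return ""


if __name__ == "__main__":
    # With the entry registered, each frame is followed by its source line.
    print(format_failure(load(GENERATED, FILENAME, register=True)))
```

Note that `linecache.cache` is an ordinary module-level dictionary, so an entry registered this way exists only in the process that registered it. That may matter for the UDF case below, since PySpark runs Python UDFs in separate worker processes.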
Consider the minimal reproducible example below:

    def fun():
        from pyspark.sql import SparkSession
        from pyspark.sql.functions import col, udf
        from pyspark.sql.types import StringType

        spark = SparkSession.builder.appName("some_name").getOrCreate()
        columns = ["Seqno", "Name"]
        data = [("1", "john jones"), ("2", "tracey smith"), ("3", "amy sanders")]
        df = spark.createDataFrame(data=data, schema=columns)

        def errror_func(str):
            def internal_error_method():
                raise RuntimeError

            return internal_error_method()

        # Converting function to UDF
        errror_func_udf = udf(lambda z: errror_func(z), StringType())
        df.select(col("Seqno"), errror_func_udf(col("Name")).alias("Name")).show(truncate=False)

    fun()

This gives the traceback shown below. Notice that it also includes the content of each line that caused the error:

> Traceback (most recent call last):
>   File "temp.py", line 28, in <module>
>     fun()
>   File "temp.py", line 25, in fun
>     df.select(col("Seqno"), errror_func_udf(col("Name")).alias("Name")).show(truncate=False)
>   File "/home/indivar/corridor/code/corridor-platforms/venv/lib/python3.8/site-packages/pyspark/sql/dataframe.py", line 502, in show
>     print(self._jdf.showString(n, int_truncate, vertical))
>   File "/home/indivar/corridor/code/corridor-platforms/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 1321, in __call__
>     return_value = get_return_value(
>   File "/home/indivar/corridor/code/corridor-platforms/venv/lib/python3.8/site-packages/pyspark/sql/utils.py", line 117, in deco
>     raise converted from None
> pyspark.sql.utils.PythonException:
>   An exception was thrown from the Python worker. Please see the stack trace below.
> Traceback (most recent call last):
>   File "temp.py", line 23, in <lambda>
>     errror_func_udf = udf(lambda z: errror_func(z), StringType())
>   File "temp.py", line 20, in errror_func
>     return internal_error_method()
>   File "temp.py", line 18, in internal_error_method
>     raise RuntimeError
> RuntimeError

But if I run the same code through exec, I lose the line content in the traceback, although the line numbers are still there:

    import linecache

    code = """
    def fun():
        from pyspark.sql import SparkSession
        from pyspark.sql.functions import col, udf
        from pyspark.sql.types import StringType

        spark = SparkSession.builder.appName("some_name").getOrCreate()
        columns = ["Seqno", "Name"]
        data = [("1", "john jones"), ("2", "tracey smith"), ("3", "amy sanders")]
        df = spark.createDataFrame(data=data, schema=columns)

        def errror_func(str):
            def internal_error_method():
                raise RuntimeError

            return internal_error_method()

        # Converting function to UDF
        errror_func_udf = udf(lambda z: errror_func(z), StringType())
        df.select(col("Seqno"), errror_func_udf(col("Name")).alias("Name")).show(truncate=False)
    """

    scope = {}
    filename = "<tmpfile-q231231>"
    compiled_code = compile(code, filename, "exec")

    # Register the generated source so that traceback can look up line content
    if filename not in linecache.cache:
        linecache.cache[filename] = (
            len(scope),
            None,
            code.splitlines(keepends=True),
            filename,
        )

    exec(compiled_code, scope, scope)
    fun = scope["fun"]
    fun()

The traceback of this code is:

> Traceback (most recent call last):
>   File "temp.py", line 74, in <module>
>     fun()
>   File "<tmpfile-q231231>", line 23, in fun
>   File "/home/indivar/corridor/code/corridor-platforms/venv/lib/python3.8/site-packages/pyspark/sql/dataframe.py", line 502, in show
>     print(self._jdf.showString(n, int_truncate, vertical))
>   File "/home/indivar/corridor/code/corridor-platforms/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 1321, in __call__
>     return_value = get_return_value(
>   File "/home/indivar/corridor/code/corridor-platforms/venv/lib/python3.8/site-packages/pyspark/sql/utils.py", line 117, in deco
>     raise converted from None
> pyspark.sql.utils.PythonException:
>   An exception was thrown from the Python worker. Please see the stack trace below.
> Traceback (most recent call last):
>   File "<tmpfile-q231231>", line 21, in <lambda>
>   File "<tmpfile-q231231>", line 18, in errror_func
>   File "<tmpfile-q231231>", line 16, in internal_error_method
> RuntimeError

As you can see, the line content is missing. Initially I thought this was a Python issue, so I did some reading. Python internally seems to use the linecache module to get the content of a line. Up to Python 3.12, Python had the same issue with exec, which they have fixed in Python 3.13 (issue reference for details):

Support multi-line error locations in traceback and other related improvements (PEP-657, 3.11) · Issue #106922 · python/cpython
<https://github.com/python/cpython/issues/106922>

This was a known issue for me as well, so I was re-massaging the traceback message using linecache. That works with simple Python definitions, because I explicitly update linecache when creating the exec. But it seems that when I create a UDF, once execution steps inside the UDF, linecache becomes empty (I checked this by printing linecache.cache after every step in the code string above). Because of this, I am not able to get the content of the line from which the error originates.

I was wondering if you could help with this.

Other reference: How can i pass linecache over an exec mthod local/global scope - Python Help - Discussions on Python.org
<https://discuss.python.org/t/how-can-i-pass-linecache-over-an-exec-mthod-local-global-scope/51192/2>

Thanks,
Indivar