Re: UnspecifiedDistribution Error using AQE

2021-08-03 Thread Mich Talebzadeh
Hi, there have been reports of errors when the following is set: spark.conf.set("spark.sql.adaptive.enabled", "true"). Some were reported in this forum. Please search the email list for spark.sql.adaptive.enabled, as you have not specified the nature of the query causing the error with this

UnspecifiedDistribution Error using AQE

2021-08-03 Thread Jesse Lord
Hello Spark users, I have an error that I would like to report as a Spark 3.1.1 bug, but I do not know how to create a reproducible example. I can provide a full stack trace if desired, but the most useful information seems to be E py4j.protocol.Py4JJavaError: An error occurred

Re: Unsubscribe

2021-08-03 Thread Howard Yang
Unsubscribe On Tue, Aug 3, 2021 at 4:15 PM, Edward Wu wrote: > Unsubscribe >

Unsubscribe

2021-08-03 Thread Edward Wu
Unsubscribe

Re: Reading the last line of each file in a set of text files

2021-08-03 Thread Artemis User
Assuming you are running Linux, an easy option would be to use the Linux tail command to extract the last line (or last few lines) of each file and save them to a different file/directory before feeding them to Spark. It shouldn't be hard to write a shell script that executes tail on
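For anyone who would rather do this pre-processing step in Python than in a shell script, here is a minimal sketch of the same idea. It is not what the poster wrote: the function names (last_line, write_last_lines), the "*.txt" pattern, and the tab-separated output layout are all assumptions made for illustration. It reads each file backwards in chunks, so large files are not loaded in full.

```python
import os
from pathlib import Path


def last_line(path, chunk_size=4096):
    """Return the last non-empty line of a text file, reading from the end."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        end = f.tell()
        buffer = b""
        # Walk backwards one chunk at a time until the buffer holds a
        # newline that precedes the final line (or the file start is reached).
        while end > 0:
            start = max(0, end - chunk_size)
            f.seek(start)
            buffer = f.read(end - start) + buffer
            end = start
            if b"\n" in buffer.rstrip(b"\r\n"):
                break
        # Drop trailing line terminators, then keep what follows the last newline.
        return buffer.rstrip(b"\r\n").rsplit(b"\n", 1)[-1].decode("utf-8")


def write_last_lines(input_dir, output_file, pattern="*.txt"):
    """Write '<file name><TAB><last line>' for every matching file.

    Hypothetical helper: the output layout is an assumption, chosen so the
    result can be loaded by Spark as a small delimited text file.
    """
    paths = sorted(Path(input_dir).glob(pattern))
    with open(output_file, "w", encoding="utf-8") as out:
        for path in paths:
            out.write(f"{path.name}\t{last_line(path)}\n")
    return len(paths)
```

The resulting file is small (one row per input file), so it could then be read by Spark as ordinary delimited text.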

Re: Collecting list of errors across executors

2021-08-03 Thread Abdeali Kothari
You could create a custom accumulator backed by a linked list or a similar structure. Some examples that could help: https://towardsdatascience.com/custom-pyspark-accumulators-310f63ca3c8c https://stackoverflow.com/questions/34798578/how-to-create-custom-list-accumulator-i-e-listint-int On Tue, Aug 3, 2021 at 1:23
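As a rough illustration of the custom-accumulator suggestion, the sketch below defines a list-valued accumulator with PySpark's AccumulatorParam and uses it inside rdd.foreach. It is one possible shape, not the poster's code: ListAccumulatorParam, collect_errors, and the body of do_something are hypothetical, and the fallback stub class exists only so the merge logic can be tried on a machine without PySpark installed.

```python
try:
    from pyspark import AccumulatorParam
except ImportError:
    # Assumption: stand-in base class for environments without PySpark.
    # On a real cluster the import above succeeds and this is never used.
    class AccumulatorParam:
        pass


class ListAccumulatorParam(AccumulatorParam):
    """Accumulator parameter whose value is a Python list."""

    def zero(self, initial_value):
        # Every task starts from an empty list.
        return []

    def addInPlace(self, left, right):
        # Merge two partial lists; used on executors and on the driver.
        left.extend(right)
        return left


def do_something(record):
    # Hypothetical stand-in for the real function: rejects negative input.
    if record < 0:
        raise ValueError(f"negative value: {record}")


def collect_errors(sc, records):
    """Run do_something over records and return the error messages seen.

    `sc` is an existing SparkContext supplied by the caller.
    """
    errors = sc.accumulator([], ListAccumulatorParam())

    def process(record):
        try:
            do_something(record)
        except Exception as exc:
            # Add a one-element list so addInPlace can extend the total.
            errors.add([str(exc)])

    sc.parallelize(records).foreach(process)
    # The accumulated value is only readable on the driver.
    return errors.value
```

With a live SparkContext, collect_errors(sc, [1, -2, 3, -4]) would be expected to return the two "negative value" messages, in no guaranteed order. Keep in mind that every message is shipped back to the driver, so this suits a modest number of errors rather than one per record in a very large dataset.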

Collecting list of errors across executors

2021-08-03 Thread Sachit Murarka
Hi Team, We are using rdd.foreach(lambda x: do_something(x)). Our use case requires collecting into a list the error messages that come up in the exception block of the method do_something. Since this will be running on executors, a global list won't work here. As the state needs to be