OK thanks for that.

I am using spark-submit with PySpark as follows:

 spark-submit --version
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.1.1
      /_/

Using Scala version 2.12.9, Java HotSpot(TM) 64-Bit Server VM, 1.8.0_201
Branch HEAD
Compiled by user ubuntu on 2021-02-22T01:33:19Z


spark-submit --master yarn --deploy-mode client \
  --conf spark.pyspark.virtualenv.enabled=true \
  --conf spark.pyspark.virtualenv.type=native \
  --conf spark.pyspark.virtualenv.requirements=/home/hduser/dba/bin/python/requirements.txt \
  --conf spark.pyspark.virtualenv.bin.path=/usr/src/Python-3.7.3/airflow_virtualenv \
  --conf spark.pyspark.python=/usr/src/Python-3.7.3/airflow_virtualenv/bin/python3 \
  --driver-memory 16G --executor-memory 8G --num-executors 4 --executor-cores 2 \
  xyz.py

This enables the virtual environment.


That works fine in client mode with any job that does not use Structured
Streaming.
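
For reference, the Structured Streaming part of xyz.py is along these lines (a
minimal sketch, not the exact code; the broker and topic names are
placeholders):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("xyz").getOrCreate()

# Reading from Kafka goes through spark-sql-kafka and its companion jars
# (kafka-clients, commons-pool2, the token provider); a version mismatch
# among them surfaces as errors like the NoSuchMethodError discussed below.
df = (spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker1:9092")  # placeholder
        .option("subscribe", "mytopic")                      # placeholder
        .load())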


Running on a local node with:


spark-submit --master local[4] \
  --conf spark.pyspark.virtualenv.enabled=true \
  --conf spark.pyspark.virtualenv.type=native \
  --conf spark.pyspark.virtualenv.requirements=/home/hduser/dba/bin/python/requirements.txt \
  --conf spark.pyspark.virtualenv.bin.path=/usr/src/Python-3.7.3/airflow_virtualenv \
  --conf spark.pyspark.python=/usr/src/Python-3.7.3/airflow_virtualenv/bin/python3 \
  xyz.py


This works fine with the same Spark version and the same $SPARK_HOME/jars.
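
As a sanity check, the Kafka-related jars on each node can be listed with
something like:

ls $SPARK_HOME/jars | grep -E 'spark-sql-kafka|spark-token-provider|kafka-clients|commons-pool'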


Cheers







On Tue, 6 Apr 2021 at 13:20, Sean Owen <sro...@gmail.com> wrote:

> You may be compiling your app against 3.0.1 JARs but submitting to 3.1.1.
> You do not in general modify the Spark libs. You need to package libs like
> this with your app at the correct version.
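>
> A minimal sketch of this (assuming the artifact is resolved from Maven
> Central rather than copied into $SPARK_HOME/jars):
>
> spark-submit --master yarn --deploy-mode client \
>   --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.1.1 \
>   xyz.py
>
> --packages should pull the matching kafka-clients, commons-pool2 and
> token-provider jars in transitively, keeping all the versions consistent.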
>
> On Tue, Apr 6, 2021 at 6:42 AM Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> Thanks Gabor.
>>
>> All nodes are running Spark from spark-3.1.1-bin-hadoop3.2.
>>
>> So $SPARK_HOME/jars contains all the required jars, including
>> commons-pool2-2.9.0.jar.
>>
>> They are installed identically on all nodes.
>>
>> I have looked at the Spark environment for the classpath. Still, I don't
>> see why Spark 3.1.1 fails with spark-sql-kafka-0-10_2.12-3.1.1.jar but
>> works OK with spark-sql-kafka-0-10_2.12-3.0.1.jar.
>>
>> Anyway, I unzipped the tarball for Spark 3.1.1 and it does not even
>> contain spark-sql-kafka-0-10_2.12-3.0.1.jar.
>>
>> I had to add spark-sql-kafka-0-10_2.12-3.0.1.jar to make it work. Then I
>> checked Maven for a newer version, which pointed to
>> *spark-sql-kafka-0-10_2.12-3.1.1.jar*.
>>
>> So, to confirm, Spark out of the tarball does not have any:
>>
>> ls -ltr spark-sql-kafka-*
>> ls: cannot access spark-sql-kafka-*: No such file or directory
>>
>>
>> For SSS (Structured Streaming), I had to add these:
>>
>> - commons-pool2-2.9.0.jar (the one shipped is commons-pool-1.5.4.jar!)
>>
>> - kafka-clients-2.7.0.jar (the tarball did not ship any)
>>
>> - spark-sql-kafka-0-10_2.12-3.0.1.jar (the tarball did not ship any)
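>>
>> If I understand Gabor's point correctly, one more jar goes with the Kafka
>> connector and has to match the other three:
>> spark-token-provider-kafka-0-10_2.12, which is where KafkaTokenUtil lives.
>> Something like this confirms which jar provides the class (the jar name is
>> my assumption based on the package in the stack trace):
>>
>> unzip -l spark-token-provider-kafka-0-10_2.12-3.1.1.jar | grep KafkaTokenUtil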
>>
>> I gather from your second mail that there seems to be an issue with
>> spark-sql-kafka-0-10_2.12-3.*1*.1.jar?
>>
>> HTH
>>
>>
>>
>>
>>
>>
>>
>> On Tue, 6 Apr 2021 at 11:54, Gabor Somogyi <gabor.g.somo...@gmail.com>
>> wrote:
>>
>>> Since you've not shared many details, I presume you've updated only the
>>> spark-sql-kafka jar. KafkaTokenUtil is in the token provider jar.
>>>
>>> As a general note, if I'm right, please update Spark as a whole on all
>>> nodes rather than updating individual jars independently.
>>>
>>> BR,
>>> G
>>>
>>>
>>> On Tue, Apr 6, 2021 at 10:21 AM Mich Talebzadeh <
>>> mich.talebza...@gmail.com> wrote:
>>>
>>>>
>>>> Hi,
>>>>
>>>>
>>>> Any chance of someone testing the latest
>>>> spark-sql-kafka-0-10_2.12-3.1.1.jar for Spark? It throws
>>>>
>>>>
>>>> java.lang.NoSuchMethodError:
>>>> org.apache.spark.kafka010.KafkaTokenUtil$.needTokenUpdate(Ljava/util/Map;Lscala/Option;)Z
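>>>>
>>>> A way to check whether a given token-provider jar has this method, using
>>>> standard JDK tooling (the jar name is my guess from the package in the
>>>> stack trace):
>>>>
>>>> javap -classpath spark-token-provider-kafka-0-10_2.12-3.1.1.jar \
>>>>     'org.apache.spark.kafka010.KafkaTokenUtil$' | grep needTokenUpdate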
>>>>
>>>>
>>>> However, the previous version spark-sql-kafka-0-10_2.12-3.0.1.jar
>>>> works fine
>>>>
>>>>
>>>> Thanks
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
