Not sure what you mean by "not working". You've added 3.1.1 to --packages, and that artifact uses:

* kafka-clients 2.6.0: https://github.com/apache/spark/blob/1d550c4e90275ab418b9161925049239227f3dc9/pom.xml#L136
* commons-pool2 2.6.2: https://github.com/apache/spark/blob/1d550c4e90275ab418b9161925049239227f3dc9/pom.xml#L183

I think it's worth an end-to-end dependency-tree analysis of what is really happening on the cluster...
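One quick data point for that analysis: ask the running driver JVM where it actually loads the class from. A sketch from a pyspark shell on the cluster (sparkContext._jvm is an internal py4j handle, so treat this as a diagnostic hack rather than an API, and it only shows the driver side; the class name is the one in the NoSuchMethodError quoted below):

# run inside a pyspark shell on the affected cluster
jvm = spark.sparkContext._jvm
cls = jvm.java.lang.Class.forName("org.apache.spark.kafka010.KafkaTokenUtil")
# prints the jar (or directory) the driver classloader resolved the class from
print(cls.getProtectionDomain().getCodeSource().getLocation())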
G

On Wed, Apr 7, 2021 at 11:11 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Hi Gabor et al.,
>
> To be honest, I am not convinced this package (--packages
> org.apache.spark:spark-sql-kafka-0-10_2.12:3.1.1) is really working!
>
> I know for definite that spark-sql-kafka-0-10_2.12-3.1.0.jar works fine.
> I reported the package working before because under $SPARK_HOME/jars on
> all nodes there was a copy of the 3.0.1 jar file. Also, in
> $SPARK_HOME/conf we had the following entries:
>
> spark.yarn.archive=hdfs://rhes75:9000/jars/spark-libs.jar
> spark.driver.extraClassPath $SPARK_HOME/jars/*.jar
> spark.executor.extraClassPath $SPARK_HOME/jars/*.jar
>
> So the jar file was picked up first anyway.
>
> The concern I have is that the package pulls in older versions of the
> jar files, namely the following in .ivy2/jars:
>
> -rw-r--r-- 1 hduser hadoop 6407352 Dec 19 13:14 com.github.luben_zstd-jni-1.4.8-1.jar
> -rw-r--r-- 1 hduser hadoop  129174 Apr  6  2019 org.apache.commons_commons-pool2-2.6.2.jar
> -rw-r--r-- 1 hduser hadoop 3754508 Jul 28  2020 org.apache.kafka_kafka-clients-2.6.0.jar
> -rw-r--r-- 1 hduser hadoop  387494 Feb 22 03:57 org.apache.spark_spark-sql-kafka-0-10_2.12-3.1.1.jar
> -rw-r--r-- 1 hduser hadoop   55766 Feb 22 03:58 org.apache.spark_spark-token-provider-kafka-0-10_2.12-3.1.1.jar
> -rw-r--r-- 1 hduser hadoop  649950 Jan 18  2020 org.lz4_lz4-java-1.7.1.jar
> -rw-r--r-- 1 hduser hadoop   41472 Dec 16  2019 org.slf4j_slf4j-api-1.7.30.jar
> -rw-r--r-- 1 hduser hadoop    2777 Oct 22  2014 org.spark-project.spark_unused-1.0.0.jar
> -rw-r--r-- 1 hduser hadoop 1969177 Nov 28 18:10 org.xerial.snappy_snappy-java-1.1.8.2.jar
>
> So I am not sure. Hence I want someone to verify this independently, in
> anger.
>
> Cheers,
>
> Mich
>
> On Wed, 7 Apr 2021 at 07:51, Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
>
>> Good to hear it's working.
>> Happy Spark usage.
>>
>> G
>>
>> On Tue, 6 Apr 2021, 21:56 Mich Talebzadeh, <mich.talebza...@gmail.com> wrote:
>>
>>> OK, we found the root cause of this issue.
>>>
>>> We were writing to Redis from Spark and had downloaded a recently
>>> compiled version of the Redis jar built with Scala 2.12:
>>>
>>> spark-redis_2.12-2.4.1-SNAPSHOT-jar-with-dependencies.jar
>>>
>>> It was giving grief, so we removed it. The job now runs with either
>>>
>>> spark-sql-kafka-0-10_2.12-3.1.0.jar
>>>
>>> or, as packages, through
>>>
>>> spark-submit .. --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.1.1
>>>
>>> We will follow the solution suggested in the doc.
>>>
>>> Batch: 18
>>> -------------------------------------------
>>> +--------------------+------+-------------------+------+
>>> |              rowkey|ticker|         timeissued| price|
>>> +--------------------+------+-------------------+------+
>>> |b539cb54-3ddd-47c...|  ORCL|2021-04-06 20:53:37| 41.32|
>>> |2d4bae2d-649e-4b8...|   VOD|2021-04-06 20:53:37|317.48|
>>> |2f51f188-6da4-4bb...|   MKS|2021-04-06 20:53:37|376.63|
>>> |1a4c4645-8dc7-4ef...|    BP|2021-04-06 20:53:37| 571.5|
>>> |45c9e738-ead7-4e5...|  SBRY|2021-04-06 20:53:37|244.76|
>>> |48f93c13-43ad-422...|   SAP|2021-04-06 20:53:37| 58.71|
>>> |ed4d89b1-7fc1-420...|   IBM|2021-04-06 20:53:37|105.91|
>>> |44b3f0ce-27b8-4a9...|   MRW|2021-04-06 20:53:37|297.85|
>>> |4441b0b5-32c1-4cb...|  MSFT|2021-04-06 20:53:37| 27.83|
>>> |143398a4-13b5-494...|  TSCO|2021-04-06 20:53:37|183.42|
>>> +--------------------+------+-------------------+------+
>>>
>>> Now we need to go back to the drawing board and see how to integrate
>>> Redis.
>>>
>>> Thanks,
>>>
>>> Mich
>>>
>>> On Tue, 6 Apr 2021 at 17:52, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>
>>>> Fine.
>>>>
>>>> Just to clarify, please: with SBT assembly and Scala I would create
>>>> an uber jar file and use that with spark-submit.
>>>>
>>>> As I understand it (and stand to be corrected), with PySpark one can
>>>> only run spark-submit in client mode, by directly passing a .py file?
>>>> Hence:
>>>>
>>>> spark-submit --master local[4] --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.1.1 <python_file>
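>>>>
>>>> For reference, a minimal sketch of the kind of driver script that
>>>> <python_file> stands for here (a sketch only; broker and topic names
>>>> are placeholders):
>>>>
>>>> # minimal Kafka smoke test
>>>> from pyspark.sql import SparkSession
>>>>
>>>> spark = SparkSession.builder.appName("kafka-smoke-test").getOrCreate()
>>>>
>>>> df = (spark.readStream
>>>>       .format("kafka")
>>>>       .option("kafka.bootstrap.servers", "host1:9092")  # placeholder broker
>>>>       .option("subscribe", "test-topic")                # placeholder topic
>>>>       .load())
>>>>
>>>> # dump the raw values to the console, as in the batches shown in this thread
>>>> (df.selectExpr("CAST(value AS STRING)")
>>>>    .writeStream
>>>>    .format("console")
>>>>    .start()
>>>>    .awaitTermination())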
>>>>
>>>> On Tue, 6 Apr 2021 at 17:39, Sean Owen <sro...@gmail.com> wrote:
>>>>
>>>>> Gabor's point is that these are not libraries you typically install
>>>>> in your cluster itself. You package them with your app.
>>>>>
>>>>> On Tue, Apr 6, 2021 at 11:35 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>>
>>>>>> Hi G,
>>>>>>
>>>>>> Thanks for the heads-up.
>>>>>>
>>>>>> In a thread on 3rd of March I reported that 3.1.1 works in yarn mode:
>>>>>>
>>>>>> Spark 3.1.1 Preliminary results (mainly to do with Spark Structured
>>>>>> Streaming) (mail-archive.com)
>>>>>> <https://www.mail-archive.com/user@spark.apache.org/msg75979.html>
>>>>>>
>>>>>> From that mail:
>>>>>>
>>>>>> The jar files needed by version 3.1.1 to read from Kafka and write
>>>>>> to BigQuery are as follows, all under $SPARK_HOME/jars on all nodes.
>>>>>> These are the latest available jar files:
>>>>>>
>>>>>> - commons-pool2-2.9.0.jar
>>>>>> - spark-token-provider-kafka-0-10_2.12-3.1.0.jar
>>>>>> - spark-sql-kafka-0-10_2.12-3.1.0.jar
>>>>>> - kafka-clients-2.7.0.jar
>>>>>> - spark-bigquery-latest_2.12.jar
>>>>>>
>>>>>> I just tested it, and in local mode (single JVM) it works fine
>>>>>> without the addition of the package (--packages
>>>>>> org.apache.spark:spark-sql-kafka-0-10_2.12:3.1.1), BUT including
>>>>>> all of the above jar files:
>>>>>>
>>>>>> Batch: 17
>>>>>> -------------------------------------------
>>>>>> +--------------------+------+-------------------+------+
>>>>>> |              rowkey|ticker|         timeissued| price|
>>>>>> +--------------------+------+-------------------+------+
>>>>>> |54651f0d-1be0-4d7...|   IBM|2021-04-06 17:17:04| 91.92|
>>>>>> |8aa1ad79-4792-466...|   SAP|2021-04-06 17:17:04| 34.93|
>>>>>> |8567f327-cfec-43d...|  TSCO|2021-04-06 17:17:04| 324.5|
>>>>>> |138a1278-2f54-45b...|   VOD|2021-04-06 17:17:04| 241.4|
>>>>>> |e02793c3-8e78-47e...|  ORCL|2021-04-06 17:17:04|  17.6|
>>>>>> |0ab456fb-bd22-465...|  SBRY|2021-04-06 17:17:04|350.45|
>>>>>> |74588e92-a3e2-48c...|  MSFT|2021-04-06 17:17:04| 44.58|
>>>>>> |1e7203c6-6938-4ea...|    BP|2021-04-06 17:17:04| 588.0|
>>>>>> |1e55021a-148d-4aa...|   MRW|2021-04-06 17:17:04|171.21|
>>>>>> |229ad6f9-e4ed-475...|   MKS|2021-04-06 17:17:04|439.17|
>>>>>> +--------------------+------+-------------------+------+
>>>>>>
>>>>>> However, if I exclude the jar file
>>>>>> spark-sql-kafka-0-10_2.12-3.1.0.jar and include the packages as
>>>>>> suggested in the link:
>>>>>>
>>>>>> spark-submit --master local[4] \
>>>>>>   --conf spark.pyspark.virtualenv.enabled=true \
>>>>>>   --conf spark.pyspark.virtualenv.type=native \
>>>>>>   --conf spark.pyspark.virtualenv.requirements=/home/hduser/dba/bin/python/requirements.txt \
>>>>>>   --conf spark.pyspark.virtualenv.bin.path=/usr/src/Python-3.7.3/airflow_virtualenv \
>>>>>>   --conf spark.pyspark.python=/usr/src/Python-3.7.3/airflow_virtualenv/bin/python3 \
>>>>>>   --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.1.1 \
>>>>>>   xyz.py
>>>>>>
>>>>>> it cannot fetch the data:
>>>>>>
>>>>>> root
>>>>>>  |-- parsed_value: struct (nullable = true)
>>>>>>  |    |-- rowkey: string (nullable = true)
>>>>>>  |    |-- ticker: string (nullable = true)
>>>>>>  |    |-- timeissued: timestamp (nullable = true)
>>>>>>  |    |-- price: float (nullable = true)
>>>>>>
>>>>>> {'message': 'Initializing sources', 'isDataAvailable': False, 'isTriggerActive': False}
>>>>>> -------------------------------------------
>>>>>> Batch: 0
>>>>>> -------------------------------------------
>>>>>> +------+------+----------+-----+
>>>>>> |rowkey|ticker|timeissued|price|
>>>>>> +------+------+----------+-----+
>>>>>> +------+------+----------+-----+
>>>>>>
>>>>>> 2021-04-06 17:20:11,492 ERROR util.Utils: Aborting task
>>>>>> java.lang.NoSuchMethodError: org.apache.spark.kafka010.KafkaTokenUtil$.needTokenUpdate(Ljava/util/Map;Lscala/Option;)Z
>>>>>>   at org.apache.spark.sql.kafka010.consumer.KafkaDataConsumer.getOrRetrieveConsumer(KafkaDataConsumer.scala:549)
>>>>>>   at org.apache.spark.sql.kafka010.consumer.KafkaDataConsumer.$anonfun$get$1(KafkaDataConsumer.scala:291)
>>>>>>   at org.apache.spark.util.UninterruptibleThread.runUninterruptibly(UninterruptibleThread.scala:77)
>>>>>>   at org.apache.spark.sql.kafka010.consumer.KafkaDataConsumer.runUninterruptiblyIfPossible(KafkaDataConsumer.scala:604)
>>>>>>   at org.apache.spark.sql.kafka010.consumer.KafkaDataConsumer.get(KafkaDataConsumer.scala:287)
>>>>>>   at org.apache.spark.sql.kafka010.KafkaBatchPartitionReader.next(KafkaBatchPartitionReader.scala:63)
>>>>>>   at org.apache.spark.sql.execution.datasources.v2.PartitionIterator.hasNext(DataSourceRDD.scala:79)
>>>>>>   at org.apache.spark.sql.execution.datasources.v2.MetricsIterator.hasNext(DataSourceRDD.scala:112)
>>>>>>   at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
>>>>>>   at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
>>>>>>   at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
>>>>>>   at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>>>>>>   at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:755)
>>>>>>   at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
>>>>>>   at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.$anonfun$run$1(WriteToDataSourceV2Exec.scala:413)
>>>>>>   at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1473)
>>>>>>   at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:452)
>>>>>>   at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.$anonfun$writeWithV2$2(WriteToDataSourceV2Exec.scala:360)
>>>>>>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
>>>>>>   at org.apache.spark.scheduler.Task.run(Task.scala:131)
>>>>>>   at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
>>>>>>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
>>>>>>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
>>>>>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>>>>   at java.lang.Thread.run(Thread.java:748)
>>>>>> 2021-04-06 17:20:11,492 ERROR util.Utils: Aborting task
>>>>>> java.lang.NoSuchMethodError: org.apache.spark.kafka010.KafkaTokenUtil$.needTokenUpdate(Ljava/util/Map;Lscala/Option;)Z
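>>>>>>
>>>>>> A NoSuchMethodError like this generally means mixed versions on the
>>>>>> classpath: here, presumably, 3.1.1 spark-sql-kafka code calling a
>>>>>> needTokenUpdate signature that an older spark-token-provider-kafka
>>>>>> jar on the classpath does not have. A sketch to list every jar on
>>>>>> the box that carries KafkaTokenUtil (the two directories are assumed
>>>>>> from this thread; adjust as needed):
>>>>>>
>>>>>> import glob, os, zipfile
>>>>>>
>>>>>> dirs = [os.path.join(os.environ["SPARK_HOME"], "jars"),
>>>>>>         os.path.expanduser("~/.ivy2/jars")]
>>>>>> for d in dirs:
>>>>>>     for jar in sorted(glob.glob(os.path.join(d, "*.jar"))):
>>>>>>         with zipfile.ZipFile(jar) as z:
>>>>>>             # any class file mentioning KafkaTokenUtil marks a candidate jar
>>>>>>             if any("KafkaTokenUtil" in n for n in z.namelist()):
>>>>>>                 print(jar)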
>>>>>>
>>>>>> Now I deleted the ~/.ivy2 directory and ran the job again:
>>>>>>
>>>>>> Ivy Default Cache set to: /home/hduser/.ivy2/cache
>>>>>> The jars for the packages stored in: /home/hduser/.ivy2/jars
>>>>>> org.apache.spark#spark-sql-kafka-0-10_2.12 added as a dependency
>>>>>> :: resolving dependencies :: org.apache.spark#spark-submit-parent-2bab6bd2-3136-4783-b044-810f0800ef0e;1.0
>>>>>>
>>>>>> Let us go and have a look at the .ivy2/jars directory:
>>>>>>
>>>>>> /home/hduser/.ivy2/jars> ltr
>>>>>> total 13108
>>>>>> -rw-r--r-- 1 hduser hadoop    2777 Oct 22  2014 org.spark-project.spark_unused-1.0.0.jar
>>>>>> -rw-r--r-- 1 hduser hadoop  129174 Apr  6  2019 org.apache.commons_commons-pool2-2.6.2.jar
>>>>>> -rw-r--r-- 1 hduser hadoop   41472 Dec 16  2019 org.slf4j_slf4j-api-1.7.30.jar
>>>>>> -rw-r--r-- 1 hduser hadoop  649950 Jan 18  2020 org.lz4_lz4-java-1.7.1.jar
>>>>>> -rw-r--r-- 1 hduser hadoop 3754508 Jul 28  2020 org.apache.kafka_kafka-clients-2.6.0.jar
>>>>>> -rw-r--r-- 1 hduser hadoop 1969177 Nov 28 18:10 org.xerial.snappy_snappy-java-1.1.8.2.jar
>>>>>> -rw-r--r-- 1 hduser hadoop 6407352 Dec 19 13:14 com.github.luben_zstd-jni-1.4.8-1.jar
>>>>>> -rw-r--r-- 1 hduser hadoop  387494 Feb 22 03:57 org.apache.spark_spark-sql-kafka-0-10_2.12-3.1.1.jar
>>>>>> -rw-r--r-- 1 hduser hadoop   55766 Feb 22 03:58 org.apache.spark_spark-token-provider-kafka-0-10_2.12-3.1.1.jar
>>>>>> drwxr-xr-x 4 hduser hadoop    4096 Apr  6 17:25 ..
>>>>>> drwxr-xr-x 2 hduser hadoop    4096 Apr  6 17:25 .
>>>>>>
>>>>>> Strangely, jar files like org.apache.kafka_kafka-clients-2.6.0.jar
>>>>>> and org.apache.commons_commons-pool2-2.6.2.jar seem to be out of
>>>>>> date.
>>>>>>
>>>>>> Very confusing. It sounds like we have changed something in the
>>>>>> cluster: as reported on 3rd March it used to work with those jar
>>>>>> files, and now it is not working.
>>>>>>
>>>>>> So, in summary, without those jar files added to $SPARK_HOME/jars it
>>>>>> fails totally, even with the packages added.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Mich
>>>>>>
>>>>>> On Tue, 6 Apr 2021 at 15:44, Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
>>>>>>
>>>>>>> > Anyway I unzipped the tarball for Spark-3.1.1 and there is no
>>>>>>> > spark-sql-kafka-0-10_2.12-3.0.1.jar even
>>>>>>>
>>>>>>> Please see how a Structured Streaming app with Kafka needs to be
>>>>>>> deployed, here:
>>>>>>> https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html#deploying
>>>>>>> I don't see the --packages option...
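>>>>>>>
>>>>>>> For reference, the deploying section of that page recommends
>>>>>>> submitting the connector as a package, with the version matching
>>>>>>> the running Spark, along the lines of:
>>>>>>>
>>>>>>> ./bin/spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.1.1 ...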
>>>>>>>
>>>>>>> G
>>>>>>>
>>>>>>> On Tue, Apr 6, 2021 at 2:40 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>>>>
>>>>>>>> OK, thanks for that.
>>>>>>>>
>>>>>>>> I am using spark-submit with PySpark as follows:
>>>>>>>>
>>>>>>>> spark-submit --version
>>>>>>>> Welcome to
>>>>>>>>       ____              __
>>>>>>>>      / __/__  ___ _____/ /__
>>>>>>>>     _\ \/ _ \/ _ `/ __/ '_/
>>>>>>>>    /___/ .__/\_,_/_/ /_/\_\   version 3.1.1
>>>>>>>>       /_/
>>>>>>>>
>>>>>>>> Using Scala version 2.12.9, Java HotSpot(TM) 64-Bit Server VM, 1.8.0_201
>>>>>>>> Branch HEAD
>>>>>>>> Compiled by user ubuntu on 2021-02-22T01:33:19Z
>>>>>>>>
>>>>>>>> spark-submit --master yarn --deploy-mode client \
>>>>>>>>   --conf spark.pyspark.virtualenv.enabled=true \
>>>>>>>>   --conf spark.pyspark.virtualenv.type=native \
>>>>>>>>   --conf spark.pyspark.virtualenv.requirements=/home/hduser/dba/bin/python/requirements.txt \
>>>>>>>>   --conf spark.pyspark.virtualenv.bin.path=/usr/src/Python-3.7.3/airflow_virtualenv \
>>>>>>>>   --conf spark.pyspark.python=/usr/src/Python-3.7.3/airflow_virtualenv/bin/python3 \
>>>>>>>>   --driver-memory 16G --executor-memory 8G --num-executors 4 --executor-cores 2 \
>>>>>>>>   xyz.py
>>>>>>>>
>>>>>>>> enabling a virtual environment.
>>>>>>>>
>>>>>>>> That works fine, in client mode, with any job that does not do
>>>>>>>> Structured Streaming.
>>>>>>>>
>>>>>>>> Running on the local node with
>>>>>>>>
>>>>>>>> spark-submit --master local[4] \
>>>>>>>>   --conf spark.pyspark.virtualenv.enabled=true \
>>>>>>>>   --conf spark.pyspark.virtualenv.type=native \
>>>>>>>>   --conf spark.pyspark.virtualenv.requirements=/home/hduser/dba/bin/python/requirements.txt \
>>>>>>>>   --conf spark.pyspark.virtualenv.bin.path=/usr/src/Python-3.7.3/airflow_virtualenv \
>>>>>>>>   --conf spark.pyspark.python=/usr/src/Python-3.7.3/airflow_virtualenv/bin/python3 \
>>>>>>>>   xyz.py
>>>>>>>>
>>>>>>>> works fine with the same Spark version and $SPARK_HOME/jars.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> Mich
>>>>>>>>
>>>>>>>> On Tue, 6 Apr 2021 at 13:20, Sean Owen <sro...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> You may be compiling your app against 3.0.1 JARs but submitting
>>>>>>>>> to 3.1.1. You do not in general modify the Spark libs. You need
>>>>>>>>> to package libs like this with your app, at the correct version.
>>>>>>>>>
>>>>>>>>> On Tue, Apr 6, 2021 at 6:42 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks Gabor.
>>>>>>>>>>
>>>>>>>>>> All nodes are running Spark spark-3.1.1-bin-hadoop3.2.
>>>>>>>>>>
>>>>>>>>>> So $SPARK_HOME/jars contains all the required jars on all
>>>>>>>>>> nodes, including commons-pool2-2.9.0.jar as well.
>>>>>>>>>>
>>>>>>>>>> They are installed identically on all nodes.
>>>>>>>>>>
>>>>>>>>>> I have looked at the Spark environment for the classpath.
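>>>>>>>>>>
>>>>>>>>>> (For what it's worth, the running JVM can also be asked
>>>>>>>>>> directly which kafka-clients build it actually loaded. A sketch
>>>>>>>>>> using PySpark's internal py4j handle, driver side only;
>>>>>>>>>> AppInfoParser ships inside kafka-clients:)
>>>>>>>>>>
>>>>>>>>>> # prints the kafka-clients version the driver JVM resolved
>>>>>>>>>> jvm = spark.sparkContext._jvm
>>>>>>>>>> print(jvm.org.apache.kafka.common.utils.AppInfoParser.getVersion())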
Still I >>>>>>>>>> don't see the reason why Spark 3.1.1 fails with >>>>>>>>>> spark-sql-kafka-0-10_2.12-3.1.1.jar >>>>>>>>>> but works ok with spark-sql-kafka-0-10_2.12-3.1.0.jar >>>>>>>>>> >>>>>>>>>> Anyway I unzipped the tarball for Spark-3.1.1 and there is >>>>>>>>>> no spark-sql-kafka-0-10_2.12-3.0.1.jar even >>>>>>>>>> >>>>>>>>>> I had to add spark-sql-kafka-0-10_2.12-3.0.1.jar to make it work. >>>>>>>>>> Then I enquired the availability of new version from Maven that >>>>>>>>>> pointed to >>>>>>>>>> *spark-sql-kafka-0-10_2.12-3.1.1.jar* >>>>>>>>>> >>>>>>>>>> So to confirm Spark out of the tarball does not have any >>>>>>>>>> >>>>>>>>>> ltr spark-sql-kafka-* >>>>>>>>>> ls: cannot access spark-sql-kafka-*: No such file or directory >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> For SSS, I had to add these >>>>>>>>>> >>>>>>>>>> add commons-pool2-2.9.0.jar. The one shipped is >>>>>>>>>> commons-pool-1.5.4.jar! >>>>>>>>>> >>>>>>>>>> add kafka-clients-2.7.0.jar Did not have any >>>>>>>>>> >>>>>>>>>> add spark-sql-kafka-0-10_2.12-3.0.1.jar Did not have any >>>>>>>>>> >>>>>>>>>> I gather from your second mail, there seems to be an issue with >>>>>>>>>> spark-sql-kafka-0-10_2.12-3.*1*.1.jar ? >>>>>>>>>> >>>>>>>>>> HTH >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> view my Linkedin profile >>>>>>>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> *Disclaimer:* Use it at your own risk. Any and all >>>>>>>>>> responsibility for any loss, damage or destruction of data or any >>>>>>>>>> other >>>>>>>>>> property which may arise from relying on this email's technical >>>>>>>>>> content is >>>>>>>>>> explicitly disclaimed. The author will in no case be liable for any >>>>>>>>>> monetary damages arising from such loss, damage or destruction. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, 6 Apr 2021 at 11:54, Gabor Somogyi < >>>>>>>>>> gabor.g.somo...@gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Since you've not shared too much details I presume you've >>>>>>>>>>> updated the spark-sql-kafka jar only. >>>>>>>>>>> KafkaTokenUtil is in the token provider jar. >>>>>>>>>>> >>>>>>>>>>> As a general note if I'm right, please update Spark as a whole >>>>>>>>>>> on all nodes and not just jars independently. >>>>>>>>>>> >>>>>>>>>>> BR, >>>>>>>>>>> G >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Tue, Apr 6, 2021 at 10:21 AM Mich Talebzadeh < >>>>>>>>>>> mich.talebza...@gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Hi, >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Any chance of someone testing the latest >>>>>>>>>>>> spark-sql-kafka-0-10_2.12-3.1.1.jar >>>>>>>>>>>> for Spark. It throws >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> java.lang.NoSuchMethodError: >>>>>>>>>>>> org.apache.spark.kafka010.KafkaTokenUtil$.needTokenUpdate(Ljava/util/Map;Lscala/Option;)Z >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> However, the previous version >>>>>>>>>>>> spark-sql-kafka-0-10_2.12-3.0.1.jar works fine >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> view my Linkedin profile >>>>>>>>>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> *Disclaimer:* Use it at your own risk. Any and all >>>>>>>>>>>> responsibility for any loss, damage or destruction of data or any >>>>>>>>>>>> other >>>>>>>>>>>> property which may arise from relying on this email's technical >>>>>>>>>>>> content is >>>>>>>>>>>> explicitly disclaimed. 