If you look at the dependencies of the 5.0.0-HBase-2.0 artifact
https://mvnrepository.com/artifact/org.apache.phoenix/phoenix-spark/5.0.0-HBase-2.0
it was built against Spark 2.3.0 and Scala 2.11.8.

You may need to check with the Phoenix community whether your setup with
Spark 3.4.1 is supported by something like
https://github.com/apache/phoenix-connectors/tree/master/phoenix5-spark3
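
For Spark 3.x, connectors from that repo are usually pulled in at session or
submit time rather than by dropping a jar in place. The artifact coordinate and
version below are assumptions for illustration only; check the
phoenix-connectors releases for the one matching your Phoenix and HBase
versions. A sketch:

```python
from pyspark.sql import SparkSession

# Sketch only: "org.apache.phoenix:phoenix5-spark3:6.0.0" is a hypothetical
# coordinate/version; verify it against the phoenix-connectors releases
# before using it.
spark = (
    SparkSession.builder
    .appName("phoenix-write")
    .config("spark.jars.packages", "org.apache.phoenix:phoenix5-spark3:6.0.0")
    .getOrCreate()
)
```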



On Mon, Aug 21, 2023 at 6:12 PM Kal Stevens <kalgstev...@gmail.com> wrote:

> Sorry for being so dense, and thank you for your help.
>
> I was using this version
> phoenix-spark-5.0.0-HBase-2.0.jar
>
> Because it was the latest in this repo
> https://mvnrepository.com/artifact/org.apache.phoenix/phoenix-spark
>
>
> On Mon, Aug 21, 2023 at 5:07 PM Sean Owen <sro...@gmail.com> wrote:
>
>> It is. But you have a third party library in here which seems to require
>> a different version.
>>
>> On Mon, Aug 21, 2023, 7:04 PM Kal Stevens <kalgstev...@gmail.com> wrote:
>>
>>> OK, it was my impression that Scala was packaged with Spark to avoid a
>>> mismatch
>>> https://spark.apache.org/downloads.html
>>>
>>> It looks like Spark 3.4.1 (my version) uses Scala 2.12.
>>> How do I specify the Scala version?
>>>
>>> On Mon, Aug 21, 2023 at 4:47 PM Sean Owen <sro...@gmail.com> wrote:
>>>
>>>> That's a mismatch between the Scala version your library uses and the
>>>> one Spark uses.
>>>>
>>>> On Mon, Aug 21, 2023, 6:46 PM Kal Stevens <kalgstev...@gmail.com>
>>>> wrote:
>>>>
>>>>> I am having a hard time figuring out what I am doing wrong here.
>>>>> I am not sure if I have an incompatible version of something installed
>>>>> or something else.
>>>>> I cannot find anything relevant on Google to figure out what I am
>>>>> doing wrong.
>>>>> I am using *spark 3.4.1* and *python3.10*
>>>>>
>>>>> This is my code to save my dataframe
>>>>> urls = []
>>>>> pull_sitemap_xml(robot, urls)
>>>>> df = spark.createDataFrame(data=urls, schema=schema)
>>>>> df.write.format("org.apache.phoenix.spark") \
>>>>>     .mode("overwrite") \
>>>>>     .option("table", "property") \
>>>>>     .option("zkUrl", "192.168.1.162:2181") \
>>>>>     .save()
>>>>>
>>>>> urls is an array of maps, containing a "url" and a "last_mod" field.
>>>>>
>>>>> Here is the error that I am getting
>>>>>
>>>>> Traceback (most recent call last):
>>>>>   File "/home/kal/real-estate/pullhttp/pull_properties.py", line 65, in main
>>>>>     .save()
>>>>>   File "/hadoop/spark/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 1396, in save
>>>>>     self._jwrite.save()
>>>>>   File "/hadoop/spark/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 1322, in __call__
>>>>>     return_value = get_return_value(
>>>>>   File "/hadoop/spark/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py", line 169, in deco
>>>>>     return f(*a, **kw)
>>>>>   File "/hadoop/spark/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py", line 326, in get_return_value
>>>>>     raise Py4JJavaError(
>>>>> py4j.protocol.Py4JJavaError: An error occurred while calling o636.save.
>>>>> : java.lang.NoSuchMethodError: 'scala.collection.mutable.ArrayOps scala.Predef$.refArrayOps(java.lang.Object[])'
>>>>> at org.apache.phoenix.spark.DataFrameFunctions.getFieldArray(DataFrameFunctions.scala:76)
>>>>> at org.apache.phoenix.spark.DataFrameFunctions.saveToPhoenix(DataFrameFunctions.scala:35)
>>>>> at org.apache.phoenix.spark.DataFrameFunctions.saveToPhoenix(DataFrameFunctions.scala:28)
>>>>> at org.apache.phoenix.spark.DefaultSource.createRelation(DefaultSource.scala:47)
>>>>> at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:47)
>>>>> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)
>>>>> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)
>>>>
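
The NoSuchMethodError in the trace above is the classic symptom of a Scala
binary mismatch: the binary signature of scala.Predef.refArrayOps differs
between Scala 2.11 and 2.12, so a connector compiled against 2.11.8 cannot
link against the Scala 2.12 runtime that Spark 3.4.x ships. The rule of thumb
can be sketched as a tiny check (hypothetical helper, not a Spark or Phoenix
API):

```python
# Hypothetical helper, illustration only: Scala binary compatibility holds
# only within the same major.minor line (2.11.x vs 2.12.x vs 2.13.x).
def scala_binary_compatible(artifact_scala: str, spark_scala: str) -> bool:
    return artifact_scala.split(".")[:2] == spark_scala.split(".")[:2]

# phoenix-spark 5.0.0-HBase-2.0 was built against Scala 2.11.8,
# while Spark 3.4.x is built against Scala 2.12.
print(scala_binary_compatible("2.11.8", "2.12.17"))   # → False
print(scala_binary_compatible("2.12.10", "2.12.17"))  # → True
```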
