Hi! Thanks Sean and Kent! By reading your answers I have also learnt something new.
@Mich Talebzadeh <mich.talebza...@gmail.com>: you can see the commit content by prefixing the hash with https://github.com/apache/spark/commit/. So in your case:
https://github.com/apache/spark/commit/1d550c4e90275ab418b9161925049239227f3dc9

Best Regards,
Attila

On Sun, Mar 21, 2021 at 5:02 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Hi Kent,
>
> Thanks for the links.
>
> You have to excuse my ignorance, but what is the connection between these
> links and the ability to establish a Spark build version?
>
> view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
> Disclaimer: Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
> On Sun, 21 Mar 2021 at 15:55, Kent Yao <yaooq...@qq.com> wrote:
>
>> Please refer to
>> http://spark.apache.org/docs/latest/api/sql/index.html#version
>>
>> Kent Yao
>> @ Data Science Center, Hangzhou Research Institute, NetEase Corp.
>> a spark enthusiast
>> kyuubi <https://github.com/yaooqinn/kyuubi> is a unified multi-tenant
>> JDBC interface for large-scale data processing and analytics, built on
>> top of Apache Spark <http://spark.apache.org/>.
>> spark-authorizer <https://github.com/yaooqinn/spark-authorizer> A Spark
>> SQL extension which provides SQL Standard Authorization for Apache Spark.
>> spark-postgres <https://github.com/yaooqinn/spark-postgres> A library
>> for reading data from and transferring data to Postgres / Greenplum with
>> Spark SQL and DataFrames, 10~100x faster.
>> spark-func-extras <https://github.com/yaooqinn/spark-func-extras> A
>> library that brings excellent and useful functions from various modern
>> database management systems to Apache Spark.
>>
>> On 03/21/2021 23:28, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>
>> Many thanks
>>
>> spark-sql> SELECT version();
>> 3.1.1 1d550c4e90275ab418b9161925049239227f3dc9
>>
>> What does 1d550c4e90275ab418b9161925049239227f3dc9 signify, please?
>>
>> On Sun, 21 Mar 2021 at 15:14, Sean Owen <sro...@gmail.com> wrote:
>>
>>> I believe you can run "SELECT version()" in Spark SQL to see the build
>>> version.
>>>
>>> On Sun, Mar 21, 2021 at 4:41 AM Mich Talebzadeh
>>> <mich.talebza...@gmail.com> wrote:
>>>
>>>> Thanks for the detailed info.
>>>>
>>>> I was hoping one could find a simpler answer to the Spark version
>>>> question than doing a forensic examination of the base code, so to
>>>> speak.
>>>>
>>>> The primer for this verification is that on GCP Dataproc, originally
>>>> built on 3.1.1-rc2, there was an issue with running Spark Structured
>>>> Streaming (SSS), which I reported to this forum before.
>>>>
>>>> After a while, and after I reported it to Google, they have now
>>>> upgraded the base to Spark 3.1.1 itself. I am not privy to how they
>>>> did the upgrade.
>>>>
>>>> In the meantime we installed 3.1.1 on-premises and ran it with the
>>>> same Python code for SSS. It worked fine.
>>>>
>>>> However, when I run the same code on GCP Dataproc upgraded to 3.1.1,
>>>> occasionally I see this error:
>>>>
>>>> 21/03/18 16:53:38 ERROR org.apache.spark.scheduler.AsyncEventQueue:
>>>> Listener EventLoggingListener threw an exception
>>>> java.util.ConcurrentModificationException
>>>>         at java.util.Hashtable$Enumerator.next(Hashtable.java:1387)
>>>>
>>>> This may be for other reasons, or a consequence of upgrading from
>>>> 3.1.1-rc2 to 3.1.1?
>>>>
>>>> On Sat, 20 Mar 2021 at 22:41, Attila Zsolt Piros
>>>> <piros.attila.zs...@gmail.com> wrote:
>>>>
>>>>> Hi!
>>>>>
>>>>> I would check out the Spark source and then diff those two RCs
>>>>> (first just take a look at the list of changed files):
>>>>>
>>>>> $ git diff v3.1.1-rc1..v3.1.1-rc2 --stat
>>>>> ...
>>>>>
>>>>> The shell scripts in the release can be checked very easily:
>>>>>
>>>>> $ git diff v3.1.1-rc1..v3.1.1-rc2 --stat | grep ".sh "
>>>>>  bin/docker-image-tool.sh            | 6 +-
>>>>>  dev/create-release/release-build.sh | 2 +-
>>>>>
>>>>> We are lucky, as docker-image-tool.sh is part of the released
>>>>> version. Is it from v3.1.1-rc2 or v3.1.1-rc1?
>>>>>
>>>>> Of course this only works if docker-image-tool.sh was not changed
>>>>> back from v3.1.1-rc2 to v3.1.1-rc1.
>>>>> So let's continue with the Python (and later the R) files:
>>>>>
>>>>> $ git diff v3.1.1-rc1..v3.1.1-rc2 --stat | grep ".py "
>>>>>  python/pyspark/sql/avro/functions.py               |   4 +-
>>>>>  python/pyspark/sql/dataframe.py                    |   1 +
>>>>>  python/pyspark/sql/functions.py                    | 285 +++++------
>>>>>  .../pyspark/sql/tests/test_pandas_cogrouped_map.py |  12 +
>>>>>  python/pyspark/sql/tests/test_pandas_map.py        |   8 +
>>>>>  ...
>>>>>
>>>>> After you have enough proof you can stop (what counts as enough is
>>>>> for you to decide).
>>>>> Finally, you can use javap / scalap on the classes from the jars and
>>>>> check some code changes, which is harder to analyze than a simple
>>>>> text file.
>>>>>
>>>>> Best Regards,
>>>>> Attila
>>>>>
>>>>> On Thu, Mar 18, 2021 at 4:09 PM Mich Talebzadeh
>>>>> <mich.talebza...@gmail.com> wrote:
>>>>>
>>>>>> Hi
>>>>>>
>>>>>> What would be a signature in the Spark version or binaries that
>>>>>> confirms the release is built on Spark 3.1.1, as opposed to
>>>>>> 3.1.1-RC-1 or RC-2?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Mich
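[Editor's note] Attila's tip of prefixing the hash from SELECT version() with the GitHub commit URL can be done programmatically. A minimal sketch in Python, assuming the two-token "<version> <commit-sha>" output format shown in the thread (commit_url is a hypothetical helper name):

```python
# Minimal sketch: map the output of Spark SQL's version() to a browsable
# GitHub commit URL, following the prefixing tip from the thread.
# Assumes the output is exactly "<version> <commit-sha>".
def commit_url(version_output: str) -> str:
    version, sha = version_output.split()  # e.g. ("3.1.1", "1d550c4e...")
    return f"https://github.com/apache/spark/commit/{sha}"

print(commit_url("3.1.1 1d550c4e90275ab418b9161925049239227f3dc9"))
# https://github.com/apache/spark/commit/1d550c4e90275ab418b9161925049239227f3dc9
```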
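[Editor's note] On the ConcurrentModificationException in the Dataproc log above: it is Java's fail-fast signal that a Hashtable was structurally modified while being enumerated. A rough Python analogue of the same failure mode, for illustration only (this is not Spark's actual code path):

```python
# Illustration only: Python dicts raise a comparable fail-fast error when
# the mapping is mutated mid-iteration, much as Java's
# Hashtable$Enumerator.next throws ConcurrentModificationException.
def mutate_during_iteration() -> str:
    d = {"a": 1, "b": 2}
    try:
        for key in d:
            d["c"] = 3  # structural modification while iterating
    except RuntimeError as exc:
        return type(exc).__name__
    return "no error"

print(mutate_during_iteration())
# RuntimeError
```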